* PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: jingrui @ 2020-07-21 11:19 UTC
To: tj, Lizefan, hannes, hannes, mhocko, vdavydov.dev
Cc: akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

Cc: Johannes Weiner <hannes@cmpxchg.org>; Michal Hocko <mhocko@kernel.org>; Vladimir Davydov <vdavydov.dev@gmail.com>

Thanks.

---
PROBLEM: cgroup costs too much memory when transferring small files to tmpfs.

keywords: cgroup PERCPU/memory cost too much.

description:

We send small files from node-A to the tmpfs /tmp directory on node-B using
sftp. On node-B, systemd is enabled in the PAM configuration as shown below.

  cat /etc/pam.d/password-auth | grep systemd
  -session optional pam_systemd.so

So each time a file is transferred, a systemd session is created, which means
a cgroup is created, and the file saved in /tmp is associated with that cgroup
object. After the transfer, the session and the cgroup directory are removed,
but the file in /tmp is still associated with the cgroup object. The PERCPU
memory in the cgroup/css object costs a lot (about 0.5MB per cgroup object) on
a 200-CPU machine.

When a lot of small files are transferred to tmpfs, the memory cost of the
cgroup/css objects becomes huge in this scenario.

systemd related issue: https://github.com/systemd/systemd/issues/16499

kernel version: 4.19+

Problem:

1. Is there any way to decrease the cgroup memory cost in this case?
2. When the user removes the cgroup directory, is it possible to re-associate
   the file memory with the root cgroup?
3. Can we provide an option to not associate memory with a cgroup in tmpfs?
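The figures in the report above can be turned into a rough estimate of how much memory ends up pinned. This is only a back-of-the-envelope sketch: it assumes every transfer leaves exactly one lingering cgroup object behind, and it reads the reported "0.5MB" as 0.5 MiB. The function names are made up for illustration.

```python
MIB = 1024 * 1024

# Figures taken from the report: about 0.5 MB of per-cpu memory per cgroup
# object on a 200-CPU machine. Treating "MB" as MiB is an assumption.
PER_CGROUP_BYTES = 0.5 * MIB
CPUS = 200


def pinned_bytes(lingering_cgroups: int,
                 per_cgroup_bytes: float = PER_CGROUP_BYTES) -> float:
    """Total memory held by removed-but-still-referenced cgroup objects."""
    if lingering_cgroups < 0:
        raise ValueError("lingering_cgroups must be non-negative")
    return lingering_cgroups * per_cgroup_bytes


def per_cpu_share(per_cgroup_bytes: float = PER_CGROUP_BYTES,
                  cpus: int = CPUS) -> float:
    """Approximate per-CPU slice of one cgroup object's per-cpu memory."""
    return per_cgroup_bytes / cpus
```

Under these assumptions, 10,000 transferred files would pin about 5,000 MiB (roughly 4.9 GiB), even though each CPU's share per cgroup is only a few KiB.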
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Shakeel Butt @ 2020-07-21 14:45 UTC
To: jingrui
Cc: tj, Lizefan, hannes, mhocko, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

On Tue, Jul 21, 2020 at 4:20 AM jingrui <jingrui@huawei.com> wrote:
> [...]
> We send small files from node-A to the tmpfs /tmp directory on node-B using
> sftp. On node-B, systemd is enabled in the PAM configuration as shown below.
>
> cat /etc/pam.d/password-auth | grep systemd
> -session optional pam_systemd.so
>
> So each time a file is transferred, a systemd session is created, which means
> a cgroup is created, and the file saved in /tmp is associated with that cgroup
> object. After the transfer, the session and the cgroup directory are removed,
> but the file in /tmp is still associated with the cgroup object.

Is there a way for you to re-use the cgroup instead of creating and
deleting a cgroup for each individual file transfer session?

> The PERCPU memory in the cgroup/css object costs a lot (about 0.5MB per
> cgroup object) on a 200-CPU machine.
>
> [...]
>
> 1. Is there any way to decrease the cgroup memory cost in this case?
> 2. When the user removes the cgroup directory, is it possible to re-associate
>    the file memory with the root cgroup?

No, the memory remains associated with the cgroup, and the cgroup
becomes a zombie on deletion.

> 3. Can we provide an option to not associate memory with a cgroup in tmpfs?

The only way, if you don't want to disable memcg, is to move the file
receiver process to the root cgroup.
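Moving the receiver process to the root cgroup, as suggested above, amounts to writing its PID into the root cgroup's `cgroup.procs` file. The sketch below assumes a cgroup v1 memory hierarchy mounted at the conventional `/sys/fs/cgroup/memory` path; the helper name is hypothetical, the path may differ on a given system, and the write needs root privileges. Memory already charged to the old cgroup would normally stay there, so the move has to happen before the files are written.

```python
from pathlib import Path

# Assumption: cgroup v1 memory controller at its conventional mount point.
# On a cgroup v2 system the root file would be /sys/fs/cgroup/cgroup.procs.
ROOT_MEMCG_PROCS = Path("/sys/fs/cgroup/memory/cgroup.procs")


def move_to_root_memcg(pid: int, procs_file: Path = ROOT_MEMCG_PROCS) -> None:
    """Attach `pid` to the root memory cgroup by writing it to cgroup.procs.

    New page allocations by that process, including tmpfs pages it writes,
    are then charged to the root cgroup rather than to a session cgroup.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    with open(procs_file, "w") as f:
        f.write(f"{pid}\n")
```

A wrapper around the file receiver could call `move_to_root_memcg(os.getpid())` before accepting data. Whether that is acceptable depends on how much the session accounting is relied on elsewhere.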
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Johannes Weiner @ 2020-07-21 17:41 UTC
To: jingrui
Cc: tj, Lizefan, mhocko, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N), guro

On Tue, Jul 21, 2020 at 11:19:52AM +0000, jingrui wrote:
> [...]
> So each time a file is transferred, a systemd session is created, which means
> a cgroup is created, and the file saved in /tmp is associated with that cgroup
> object. After the transfer, the session and the cgroup directory are removed,
> but the file in /tmp is still associated with the cgroup object. The PERCPU
> memory in the cgroup/css object costs a lot (about 0.5MB per cgroup object) on
> a 200-CPU machine.

CC Roman who had a patch series to free all this extended (percpu)
memory upon cgroup deletion:

https://lore.kernel.org/patchwork/cover/1050508/

It looks like it never got merged for some reason.
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Roman Gushchin @ 2020-07-21 18:49 UTC
To: Johannes Weiner
Cc: jingrui, tj, Lizefan, mhocko, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N), guro

On Tue, Jul 21, 2020 at 01:41:26PM -0400, Johannes Weiner wrote:
> On Tue, Jul 21, 2020 at 11:19:52AM +0000, jingrui wrote:
> > [...]
> > but the file in /tmp is still associated with the cgroup object. The PERCPU
> > memory in the cgroup/css object costs a lot (about 0.5MB per cgroup object)
> > on a 200-CPU machine.
>
> CC Roman who had a patch series to free all this extended (percpu)
> memory upon cgroup deletion:
>
> https://lore.kernel.org/patchwork/cover/1050508/
>
> It looks like it never got merged for some reason.

The mentioned patchset can make the problem less noticeable, but can't solve
it completely. It has never been merged, because the dying cgroup problem was
mostly solved by other methods: slab memory reparenting and various reclaim
fixes. So there was no more reason to complicate the code to release the memcg
memory early.

The overhead of creating and destroying a new memory cgroup for a transfer of
a small file will be noticeable anyway. So IMO the solution is to use a single
cgroup for all transfers. I don't know if systemd supports such a mode out of
the box, but it shouldn't be hard to add it.

But also I wonder if we need a special tmpfs mount option, something like
"noaccount". Not only for this specific case, but also for the case when tmpfs
is extensively shared between multiple cgroups, or if it's used to pass some
data from one cgroup to another, or if we care about performance more than
about accounting; in other words, for cases where the accounting does more
harm than good.

Thanks!
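To see how many dying cgroups a machine is carrying, one rough approach is to compare the memory controller's `num_cgroups` column in `/proc/cgroups`, which also counts removed cgroups that are still pinned, with the number of directories still visible in the hierarchy. The sketch below assumes a cgroup v1 memory hierarchy at `/sys/fs/cgroup/memory`; the function names are made up for illustration, and the result is an estimate since cgroups can come and go between the two reads.

```python
import os


def parse_num_cgroups(proc_cgroups_text: str, subsys: str = "memory") -> int:
    """Return the num_cgroups column for `subsys` from /proc/cgroups text."""
    for line in proc_cgroups_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the column-header line
        name, _hierarchy, num_cgroups, _enabled = line.split()
        if name == subsys:
            return int(num_cgroups)
    raise KeyError(f"controller {subsys!r} not listed")


def count_visible_cgroups(root: str) -> int:
    """Count cgroup directories under `root`, including `root` itself."""
    return sum(1 for _ in os.walk(root))


def estimate_dying_cgroups(proc_cgroups_text: str,
                           root: str = "/sys/fs/cgroup/memory") -> int:
    """Kernel-side cgroup count minus the directories still visible."""
    return parse_num_cgroups(proc_cgroups_text) - count_visible_cgroups(root)
```

On a live system this would be called as `estimate_dying_cgroups(open("/proc/cgroups").read())`. On cgroup v2, the `nr_dying_descendants` field in `cgroup.stat` reports a similar number directly.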
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Shakeel Butt @ 2020-07-21 19:12 UTC
To: Roman Gushchin
Cc: Johannes Weiner, jingrui, tj, Lizefan, mhocko, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N), guro

On Tue, Jul 21, 2020 at 11:51 AM Roman Gushchin <guro@fb.com> wrote:
> [...]
> The mentioned patchset can make the problem less noticeable, but can't solve
> it completely. It has never been merged, because the dying cgroup problem was
> mostly solved by other methods: slab memory reparenting and various reclaim
> fixes. So there was no more reason to complicate the code to release the
> memcg memory early.
>
> The overhead of creating and destroying a new memory cgroup for a transfer of
> a small file will be noticeable anyway. So IMO the solution is to use a
> single cgroup for all transfers. I don't know if systemd supports such a mode
> out of the box, but it shouldn't be hard to add it.
>
> But also I wonder if we need a special tmpfs mount option, something like
> "noaccount". Not only for this specific case, but also for the case when
> tmpfs is extensively shared between multiple cgroups, or if it's used to pass
> some data from one cgroup to another, or if we care about performance more
> than about accounting; in other words, for cases where the accounting does
> more harm than good.

Internally we actually have a tmpfs mount option "memcg=" which
charges all the memory of the tmpfs files on that mount to the given
memcg, and the motivation is tmpfs files shared between multiple
cgroups. One concrete use-case is the shared memory used for
communication between the application and the user space network
driver [1]. "memcg=root" can be used as a "noaccount" option.

[1] https://sosp19.rcs.uwaterloo.ca/slides/marty.pdf
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Roman Gushchin @ 2020-07-21 19:27 UTC
To: Shakeel Butt
Cc: Johannes Weiner, jingrui, tj, Lizefan, mhocko, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N), guro

On Tue, Jul 21, 2020 at 12:12:58PM -0700, Shakeel Butt wrote:
> On Tue, Jul 21, 2020 at 11:51 AM Roman Gushchin <guro@fb.com> wrote:
> > [...]
> > But also I wonder if we need a special tmpfs mount option, something like
> > "noaccount". Not only for this specific case, but also for the case when
> > tmpfs is extensively shared between multiple cgroups, or if it's used to
> > pass some data from one cgroup to another, or if we care about performance
> > more than about accounting; in other words, for cases where the accounting
> > does more harm than good.
>
> Internally we actually have a tmpfs mount option "memcg=" which
> charges all the memory of the tmpfs files on that mount to the given
> memcg, and the motivation is tmpfs files shared between multiple
> cgroups. One concrete use-case is the shared memory used for
> communication between the application and the user space network
> driver [1]. "memcg=root" can be used as a "noaccount" option.

It sounds like a good idea to me. I'm slightly worried about possible security
implications of allowing a custom cgroup to be passed, but I guess we can
start with supporting the root cgroup only.

Thanks!
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Michal Hocko @ 2020-07-24 7:55 UTC
To: jingrui
Cc: tj, Lizefan, hannes, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

On Tue 21-07-20 11:19:52, jingrui wrote:
[...]
> systemd related issue: https://github.com/systemd/systemd/issues/16499

Well, I would be really careful with one-off and short-lived cgroups.
Firstly, there are charges which cannot be easily reparented, and
secondly, even if the memory footprint is reduced, there would still be
memcgs standing in the way.

[...]
> 1. Is there any way to decrease the cgroup memory cost in this case?

Others have already commented on this.

> 2. When the user removes the cgroup directory, is it possible to re-associate
>    the file memory with the root cgroup?

We used to do that in the past but removed it by b2052564e66d ("mm:
memcontrol: continue cache reclaim from offlined groups"). Please read
through the changelog for the reasoning behind it.

> 3. Can we provide an option to not associate memory with a cgroup in tmpfs?

What is the reason to run under !root cgroup in those sessions if you do
not care about accounting anyway? tmpfs is a persistent charge until the
file is removed. So if those outlive the session then you either want
them to be charged to somebody or you do not care about accounting at
all, no? Or could you explain your usecase some more?

--
Michal Hocko
SUSE Labs
* Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: jingrui @ 2020-07-24 9:35 UTC
To: Michal Hocko
Cc: tj, Lizefan, hannes, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

On Friday, July 24, 2020 3:55 PM, Michal Hocko wrote:
> What is the reason to run under !root cgroup in those sessions if you do
> not care about accounting anyway?

systemd does not support running those sessions under the root cgroup.
Disabling pam_systemd means no session/cgroup is created, but this is not
safe and makes systemd-logind stop working.

> tmpfs is a persistent charge until the file is removed. So if those
> outlive the session then you either want them to be charged to somebody
> or you do not care about accounting at all, no? Or could you explain
> your usecase some more?

In some use cases we don't have a disk and keep files in memory. We treat
tmpfs just like a disk, so we don't care about tmpfs accounting at all.

--
Jingrui
BR.
* Re: Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Michal Hocko @ 2020-07-24 11:35 UTC
To: jingrui
Cc: tj, Lizefan, hannes, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

On Fri 24-07-20 09:35:26, jingrui wrote:
>
> On Friday, July 24, 2020 3:55 PM, Michal Hocko wrote:
> > What is the reason to run under !root cgroup in those sessions if you do
> > not care about accounting anyway?
>
> systemd does not support running those sessions under the root cgroup.
> Disabling pam_systemd means no session/cgroup is created, but this is not
> safe and makes systemd-logind stop working.

Could you be more specific please?

--
Michal Hocko
SUSE Labs
* RE: Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: jingrui @ 2020-07-27 3:14 UTC
To: Michal Hocko
Cc: tj, Lizefan, hannes, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N), Fangxiuning (Jack, EulerOS)

Cc Fangxiuning

Michal Hocko wrote:
> On Fri 24-07-20 09:35:26, jingrui wrote:
> >
> > On Friday, July 24, 2020 3:55 PM, Michal Hocko wrote:
> > > What is the reason to run under !root cgroup in those sessions if you do
> > > not care about accounting anyway?
> >
> > systemd does not support running those sessions under the root cgroup.
> > Disabling pam_systemd means no session/cgroup is created, but this is not
> > safe and makes systemd-logind stop working.
>
> Could you be more specific please?

As far as I know, when a user calls the sftp client to send files, the server
calls the pam_systemd.so library to create a session and a cgroup. We can skip
the pam_systemd.so call by dropping the line "-session optional pam_systemd.so"
from /etc/pam.d/password-auth. But this configuration is global and will affect
other services, such as ssh login. We have not found a way to skip creating the
cgroup directory for sftp only.

@Xiuning Would you please take a look and give some suggestions?

> --
> Michal Hocko
> SUSE Labs
* Re: Re: PROBLEM: cgroup cost too much memory when transfer small files to tmpfs
From: Fangxiuning (Jack, EulerOS) @ 2020-07-27 13:40 UTC
To: jingrui, Michal Hocko
Cc: tj, Lizefan, hannes, vdavydov.dev, akpm, linux-mm, cgroups, linux-kernel, caihaomin, Weiwei (N)

> @Xiuning Would you please take a look and give some suggestions?

I don't suggest skipping the pam_systemd.so call as a long-term fix for this
issue. Having sftp call pam_systemd.so to create a session when it sends files
lets resources be managed more reasonably, and this is the direction in which
upstream systemd is evolving. systemd doesn't have a better solution, so the
kernel cgroup side may be able to offer a better one for this issue.