* oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-20  2:51 Chris Murphy
From: Chris Murphy @ 2022-08-20  2:51 UTC (permalink / raw)
  To: cgroups-u79uwXL29TY76Z2rM5mHXA

Hi,

Tracking a downstream bug in Fedora Rawhide testing, where 6.0-rc1 has landed, and we're seeing various GNOME components getting killed off by systemd-oomd, with the stats showing suspiciously high values:

https://bugzilla.redhat.com/show_bug.cgi?id=2119518

e.g.

Killed /user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.Shell-r28gBBs99rhXz5zEmyOJwQ@public.gmane.org due to memory pressure for /user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org being 27925460729.27% > 50.00% for > 20s with reclaim activity

I'm not seeing evidence of high memory pressure in /proc/pressure, though, whereas oomd is reporting absurdly high memory pressure and a cumulative stall time that makes no sense at all:

>>Sep 09 03:01:05 fedora systemd-oomd[604]:                 Pressure: Avg10: 1255260529528.42 Avg60: 325612.68 Avg300: 757127258245.62 Total: 2month 4w 2d 8h 15min 12s

It's been up for about 2 minutes at this point, not 3 months.
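
In case it helps anyone reproduce this, here's a minimal sketch of reading the PSI memory file and flagging a reading over the 50% limit from the kill message above; the same format applies to the per-cgroup memory.pressure files oomd watches. Checking the "full" line is an assumption and the ">20s" duration tracking oomd does is omitted, so this is not systemd-oomd's actual logic, just a hand check of the numbers:

/* Rough PSI sanity check -- not systemd-oomd's implementation. */
#include <stdio.h>
#include <string.h>

int main(void)
{
	const char *path = "/proc/pressure/memory";
	char kind[8];
	double avg10, avg60, avg300;
	unsigned long long total_us;
	FILE *f = fopen(path, "r");

	if (!f) {
		perror(path);
		return 1;
	}
	while (fscanf(f, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		      kind, &avg10, &avg60, &avg300, &total_us) == 5) {
		/* flag anything over the 50% limit from the kill message above */
		if (!strcmp(kind, "full") && avg10 > 50.0)
			printf("full avg10=%.2f%% exceeds 50%% (total=%llu us)\n",
			       avg10, total_us);
	}
	fclose(f);
	return 0;
}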

Thanks,



* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-23 18:27   ` Tejun Heo
From: Tejun Heo @ 2022-08-23 18:27 UTC (permalink / raw)
  To: Chris Murphy; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner

(cc'ing Johannes for visibility)

On Fri, Aug 19, 2022 at 10:51:27PM -0400, Chris Murphy wrote:
> Hi,
> 
> Tracking a downstream bug in Fedora Rawhide testing, where 6.0-rc1 has landed, and we're seeing various GNOME components getting killed off by systemd-oomd, with the stats showing suspiciously high values:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=2119518
> 
> e.g.
> 
> Killed /user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.Shell-r28gBBs99rhXz5zEmyOJwQ@public.gmane.org due to memory pressure for /user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org being 27925460729.27% > 50.00% for > 20s with reclaim activity
> 
> I'm not seeing evidence of high memory pressure in /proc/pressure, though, whereas oomd is reporting absurdly high memory pressure and a cumulative stall time that makes no sense at all:
> 
> >>Sep 09 03:01:05 fedora systemd-oomd[604]:                 Pressure: Avg10: 1255260529528.42 Avg60: 325612.68 Avg300: 757127258245.62 Total: 2month 4w 2d 8h 15min 12s
> 
> It's been up for about 2 minutes at this point, not 3 months.
> 
> Thanks,
> 
> 
> --
> Chris Murphy

-- 
tejun


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-23 18:39       ` Chris Murphy
From: Chris Murphy @ 2022-08-23 18:39 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner

Another example, with 6.0-rc2:

# oomctl
Dry Run: no
Swap Used Limit: 90.00%
Default Memory Pressure Limit: 60.00%
Default Memory Pressure Duration: 20s
System Context:
        Memory: Used: 1.7G Total: 3.8G
        Swap: Used: 0B Total: 3.8G
Swap Monitored CGroups:
        Path: /
                Swap Usage: (see System Context)
Memory Pressure Monitored CGroups:
        Path: /user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org
                Memory Pressure Limit: 50.00%
                Pressure: Avg10: 0.00 Avg60: 0.00 Avg300: 0.10 Total: 140y 3month 18h 24min 20s
                Current Memory Usage: 1.5G
                Memory Min: 250.0M
                Memory Low: 0B
                Pgscan: 0
                Last Pgscan: 0
# cat /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/memory.pressure
some avg10=0.00 avg60=0.00 avg300=0.18 total=8367757640799118
full avg10=0.00 avg60=0.00 avg300=0.07 total=4426019660641432

# grep -R . /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/
https://drive.google.com/file/d/1Ro6rKnEx1CCapmO3rz6SDysjP1Bs4_Re/view?usp=sharing 

Actual uptime is ~10 minutes.
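
For scale: the total= field in those pressure files is cumulative stall time in microseconds, so a quick conversion (rough sketch, totals copied from the cat output above) shows how far off they are for a ~10 minute uptime:

#include <stdio.h>

int main(void)
{
	/* totals copied from the memory.pressure output above, in microseconds */
	unsigned long long totals_us[] = {
		8367757640799118ULL,	/* some */
		4426019660641432ULL,	/* full */
	};

	for (int i = 0; i < (int)(sizeof(totals_us) / sizeof(totals_us[0])); i++) {
		double sec = totals_us[i] / 1e6;

		printf("%llu us = %.1f days = %.1f years\n",
		       totals_us[i], sec / 86400.0, sec / (365.25 * 86400.0));
	}
	return 0;
}

That comes out to roughly 265 and 140 years of accumulated stall, which also lines up with the "140y 3month ..." total oomctl prints above.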

-- 
Chris Murphy


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-23 18:59           ` Chris Murphy
From: Chris Murphy @ 2022-08-23 18:59 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner

Same VM but a different boot:

Excerpts:

/sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:some avg10=3031575.41 avg60=56713935870.67 avg300=624837039080.83 total=18446621498826359
/sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:full avg10=3031575.41 avg60=56713935870.80 avg300=624837039080.99 total=16045481047390973

None of that seems possible.

io is also affected:

/sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.SettingsDaemon.Smartcard.service/io.pressure:full avg10=0.00 avg60=0.13 avg300=626490311370.87 total=16045481047397307

# oomctl
# grep -R . /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/
https://drive.google.com/file/d/1JoUxjQ2ribDvn5jmydCWXJdg0daaNScG/view?usp=sharing

We're going to try reverting 5f69a6577bc33d8f6d6bbe02bccdeb357b287f56 and see if it helps.


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-23 20:12               ` Tejun Heo
From: Tejun Heo @ 2022-08-23 20:12 UTC (permalink / raw)
  To: Chris Murphy; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner

On Tue, Aug 23, 2022 at 02:59:29PM -0400, Chris Murphy wrote:
> Same VM but a different boot:
> 
> Excerpts:
> 
> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:some avg10=3031575.41 avg60=56713935870.67 avg300=624837039080.83 total=18446621498826359
> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:full avg10=3031575.41 avg60=56713935870.80 avg300=624837039080.99 total=16045481047390973
> 
> None of that seems possible.
> 
> io is also affected:
> 
> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.SettingsDaemon.Smartcard.service/io.pressure:full avg10=0.00 avg60=0.13 avg300=626490311370.87 total=16045481047397307
> 
> # oomctl
> # grep -R . /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/
> https://drive.google.com/file/d/1JoUxjQ2ribDvn5jmydCWXJdg0daaNScG/view?usp=sharing
> 
> We're going to try reverting 5f69a6577bc33d8f6d6bbe02bccdeb357b287f56 and see if it helps.

Can you see whether the following helps?
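
The kmalloc() there leaves the new psi_group's running stall totals and the averages computed from them holding whatever stale bytes happened to be in that memory, which would explain readings like the ones you posted; kzalloc() starts everything from zero. A toy sketch of that failure mode (illustrative only, not the kernel code -- the struct, field names and decay factor below are made up):

#include <stdio.h>
#include <string.h>

/* toy stand-in for a psi_group-like accumulator -- not the kernel struct */
struct toy_group {
	unsigned long long total_us;	/* cumulative stall time, microseconds */
	double avg10;			/* decayed average, in percent */
};

static void sample(struct toy_group *g, unsigned long long stall_us,
		   unsigned long long period_us)
{
	g->total_us += stall_us;
	/* crude exponential decay toward the newest sample */
	g->avg10 = 0.9 * g->avg10 + 0.1 * (100.0 * stall_us / period_us);
}

int main(void)
{
	struct toy_group junk, zeroed;

	memset(&junk, 0x7e, sizeof(junk));	/* stand-in for stale heap bytes */
	memset(&zeroed, 0, sizeof(zeroed));	/* what kzalloc() gives you */

	sample(&junk, 5000, 1000000);		/* 5ms stalled out of a 1s window */
	sample(&zeroed, 5000, 1000000);

	printf("garbage start: total=%llu us avg10=%g%%\n", junk.total_us, junk.avg10);
	printf("zeroed start:  total=%llu us avg10=%g%%\n", zeroed.total_us, zeroed.avg10);
	return 0;
}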

Thanks.

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index ec66b40bdd40..00d62681ea6a 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -957,7 +957,7 @@ int psi_cgroup_alloc(struct cgroup *cgroup)
 	if (static_branch_likely(&psi_disabled))
 		return 0;
 
-	cgroup->psi = kmalloc(sizeof(struct psi_group), GFP_KERNEL);
+	cgroup->psi = kzalloc(sizeof(struct psi_group), GFP_KERNEL);
 	if (!cgroup->psi)
 		return -ENOMEM;
 


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-24 11:40                   ` Chris Murphy
From: Chris Murphy @ 2022-08-24 11:40 UTC (permalink / raw)
  To: Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner



On Tue, Aug 23, 2022, at 4:12 PM, Tejun Heo wrote:
> On Tue, Aug 23, 2022 at 02:59:29PM -0400, Chris Murphy wrote:
>> Same VM but a different boot:
>> 
>> Excerpts:
>> 
>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:some avg10=3031575.41 avg60=56713935870.67 avg300=624837039080.83 total=18446621498826359
>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:full avg10=3031575.41 avg60=56713935870.80 avg300=624837039080.99 total=16045481047390973
>> 
>> None of that seems possible.
>> 
>> io is also affected:
>> 
>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.SettingsDaemon.Smartcard.service/io.pressure:full avg10=0.00 avg60=0.13 avg300=626490311370.87 total=16045481047397307
>> 
>> # oomctl
>> # grep -R . /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/
>> https://drive.google.com/file/d/1JoUxjQ2ribDvn5jmydCWXJdg0daaNScG/view?usp=sharing
>> 
>> We're going to try reverting 5f69a6577bc33d8f6d6bbe02bccdeb357b287f56 and see if it helps.
>
> Can you see whether the following helps?
>
> Thanks.
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index ec66b40bdd40..00d62681ea6a 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -957,7 +957,7 @@ int psi_cgroup_alloc(struct cgroup *cgroup)
>  	if (static_branch_likely(&psi_disabled))
>  		return 0;
> 
> -	cgroup->psi = kmalloc(sizeof(struct psi_group), GFP_KERNEL);
> +	cgroup->psi = kzalloc(sizeof(struct psi_group), GFP_KERNEL);
>  	if (!cgroup->psi)
>  		return -ENOMEM;

Looks like it's fixed in limited testing. It'll get put into Rawhide and automated testing today hopefully. Thanks!

-- 
Chris Murphy


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-26 12:06                       ` Chris Murphy
From: Chris Murphy @ 2022-08-26 12:06 UTC (permalink / raw)
  To: Chris Murphy, Tejun Heo; +Cc: cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner



On Wed, Aug 24, 2022, at 7:40 AM, Chris Murphy wrote:
> On Tue, Aug 23, 2022, at 4:12 PM, Tejun Heo wrote:
>> On Tue, Aug 23, 2022 at 02:59:29PM -0400, Chris Murphy wrote:
>>> Same VM but a different boot:
>>> 
>>> Excerpts:
>>> 
>>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:some avg10=3031575.41 avg60=56713935870.67 avg300=624837039080.83 total=18446621498826359
>>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/gvfs-goa-volume-monitor.service/io.pressure:full avg10=3031575.41 avg60=56713935870.80 avg300=624837039080.99 total=16045481047390973
>>> 
>>> None of that seems possible.
>>> 
>>> io is also affected:
>>> 
>>> /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/session.slice/org.gnome.SettingsDaemon.Smartcard.service/io.pressure:full avg10=0.00 avg60=0.13 avg300=626490311370.87 total=16045481047397307
>>> 
>>> # oomctl
>>> # grep -R . /sys/fs/cgroup/user.slice/user-1000.slice/user-HmGangybm7RTDjBF/Jpztg@public.gmane.org/
>>> https://drive.google.com/file/d/1JoUxjQ2ribDvn5jmydCWXJdg0daaNScG/view?usp=sharing
>>> 
>>> We're going to try reverting 5f69a6577bc33d8f6d6bbe02bccdeb357b287f56 and see if it helps.
>>
>> Can you see whether the following helps?
>>
>> Thanks.
>>
>> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
>> index ec66b40bdd40..00d62681ea6a 100644
>> --- a/kernel/sched/psi.c
>> +++ b/kernel/sched/psi.c
>> @@ -957,7 +957,7 @@ int psi_cgroup_alloc(struct cgroup *cgroup)
>>  	if (static_branch_likely(&psi_disabled))
>>  		return 0;
>> 
>> -	cgroup->psi = kmalloc(sizeof(struct psi_group), GFP_KERNEL);
>> +	cgroup->psi = kzalloc(sizeof(struct psi_group), GFP_KERNEL);
>>  	if (!cgroup->psi)
>>  		return -ENOMEM;
>
> Looks like it's fixed in limited testing. It'll get put into Rawhide 
> and automated testing today hopefully. 

The patch has been in Rawhide without any failures; consider it fixed. Thanks!


-- 
Chris Murphy


* Re: oomd with 6.0-rc1 has ridiculously high memory pressure stats wit
@ 2022-08-26 21:11                           ` Tejun Heo
From: Tejun Heo @ 2022-08-26 21:11 UTC (permalink / raw)
  To: Chris Murphy
  Cc: Chris Murphy, cgroups-u79uwXL29TY76Z2rM5mHXA, Johannes Weiner

On Fri, Aug 26, 2022 at 08:06:15AM -0400, Chris Murphy wrote:
> The patch has been in Rawhide without any failures; consider it fixed. Thanks!

Upstream fix is 2b97cf76289a from Hao Jia
<jiahao.os-EC8Uxl6Npydl57MIdRCFDg@public.gmane.org>, which was already
sitting in my queue by the time you reported the problem. -rc3 should have
it fixed. I should have pushed that out earlier. Sorry about that.

Thanks.

-- 
tejun

