From: Leonid Moiseichuk
Date: Tue, 14 Apr 2020 18:12:47 -0400
Subject: Re: [PATCH 0/2] memcg, vmpressure: expose vmpressure controls
To: Johannes Weiner
Cc: Michal Hocko, svc lmoiseichuk, vdavydov.dev@gmail.com, tj@kernel.org, lizefan@huawei.com, cgroups@vger.kernel.org, akpm@linux-foundation.org, rientjes@google.com, minchan@kernel.org, vinmenon@codeaurora.org, andriy.shevchenko@linux.intel.com, penberg@kernel.org, linux-mm@kvack.org
References: <20200413215750.7239-1-lmoiseichuk@magicleap.com> <20200414113730.GH4629@dhcp22.suse.cz> <20200414192329.GC136578@cmpxchg.org>
In-Reply-To: <20200414192329.GC136578@cmpxchg.org>

I do not agree with all comments, see below.
On Tue, Apr 14, 2020 at 3:23 PM Johannes Weiner <hannes@cmpxchg.org> wrote:
> On Tue, Apr 14, 2020 at 12:42:44PM -0400, Leonid Moiseichuk wrote:
> > On Tue, Apr 14, 2020 at 7:37 AM Michal Hocko <mhocko@kernel.org> wrote:
> > > On Mon 13-04-20 17:57:48, svc_lmoiseichuk@magicleap.com wrote:
> > > Anyway, I have to confess I am not a big fan of this. vmpressure turned
> > > out to be a very weak interface to measure the memory pressure. Not only
> > > it is not numa aware which makes it unusable on many systems it also
> > > gives data way too late from the practice.
>
> Yes, it's late in the game for vmpressure, and also a bit too late for
> extensive changes in cgroup1.
>
200 lines just to move functionality from one place to another, without any
change in logic? That does not look like an extensive change.

> > > Btw. why don't you use /proc/pressure/memory resp. its memcg counterpart
> > > to measure the memory pressure in the first place?
> > >
> >
> > According to our checks PSI produced numbers only when swap enabled e.g.
> > swapless device 75% RAM utilization:
> > ==> /proc/pressure/io <==
> > some avg10=0.00 avg60=1.18 avg300=1.51 total=9642648
> > full avg10=0.00 avg60=1.11 avg300=1.47 total=9271174
> >
> > ==> /proc/pressure/memory <==
> > some avg10=0.00 avg60=0.00 avg300=0.00 total=0
> > full avg10=0.00 avg60=0.00 avg300=0.00 total=0

> That doesn't look right. With total=0, there couldn't have been any
> reclaim activity, which means that vmpressure couldn't have reported
> anything either.
>
Unfortunately not: vmpressure does report reclaim activity; I shared the
numbers/calls in a parallel email. I also see kswapd+lmkd consuming quite a
lot of CPU cycles. That is the same device with swap disabled.
If I enable swap (zram-based, as Android usually does), PSI starts to show
some numbers below 0.1, which does not look like huge pressure.

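For anyone who wants to compare such readings across devices, the PSI files quoted above use a fixed `key=value` layout per line. Below is a minimal Python sketch for parsing them. The name `parse_psi` is made up for illustration and is not part of any kernel or Android tooling; the sample text is the swap-enabled /proc/pressure/memory output quoted in this thread.

```python
def parse_psi(text):
    """Parse the contents of a /proc/pressure/<resource> file.

    Returns a dict such as {"some": {"avg10": 0.0, ..., "total": 123}}.
    """
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip empty lines
        kind, pairs = fields[0], fields[1:]  # kind is "some" or "full"
        values = {}
        for pair in pairs:
            key, _, raw = pair.partition("=")
            # "total" is an integer counter; the avg* fields are decimals.
            values[key] = int(raw) if key == "total" else float(raw)
        result[kind] = values
    return result


# Sample copied from the swap-enabled sysbench run quoted in this thread.
SAMPLE = """some avg10=0.00 avg60=0.00 avg300=0.06 total=465916397
full avg10=0.00 avg60=0.00 avg300=0.00 total=368664282"""

psi = parse_psi(SAMPLE)
print(psi["some"]["avg300"], psi["full"]["total"])
```

On a live system the same function could be fed the text read from /proc/pressure/memory, assuming the kernel was built with PSI support.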

> By the time vmpressure reports a drop in reclaim efficiency, psi
> should have already been reporting time spent doing reclaim. It
> reports a superset of the information conveyed by vmpressure.


> > Probably it is possible to activate PSI by introducing high IO and swap
> > enabled but that is not a typical case for mobile devices.
> >
> > With swap-enabled case memory pressure follows IO pressure with some
> > fraction i.e. memory is io/2 ... io/10 depending on pattern.
> > Light sysbench case with swap enabled
> > ==> /proc/pressure/io <==
> > some avg10=0.00 avg60=0.00 avg300=0.11 total=155383820
> > full avg10=0.00 avg60=0.00 avg300=0.05 total=100516966
> > ==> /proc/pressure/memory <==
> > some avg10=0.00 avg60=0.00 avg300=0.06 total=465916397
> > full avg10=0.00 avg60=0.00 avg300=0.00 total=368664282
> >
> > Since not all devices have zram or swap enabled it makes sense to have
> > vmpressure tuning option possible since
> > it is well used in Android and related issues are understandable.

> Android (since 10 afaik) uses psi to make low memory / OOM
> decisions. See the introduction of the psi poll() support:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__lwn.net_Articles_782662_&d=DwIBAg&c=0ia8zh_eZtQM1JEjWgVLZg&r=dIXgSomcB34epPNJ3JPl0D4WwsDd12lPHClV0_L9Aw4&m=GJC3IQZUa2vG0cqtoa4Ma_R-S_cRvQSZGbpD389b84w&s=Kp-EqrjqguJqWJ-tefwwRPeLIZennPkko0qEV_fgIbc&e=

Android selects between PSI (primary) and vmpressure (backup); see line 2872
onward:
https://android.googlesource.com/platform/system/memory/lmkd/+/refs/heads/master/lmkd.cpp#2872

> It's true that with swap you may see a more gradual increase in
> pressure, whereas without swap you may go from idle to OOM much
> faster, depending on what type of memory is being allocated. But psi
> will still report it. You may just have to use poll() to get in-time
> notification like you do with vmpressure.
>
I expected any spike to be visible in the avg values, e.g. the 10s average.
I cannot confirm that now, but I can experiment.
If you have preferences about use cases, please let me know.

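As a starting point for such an experiment, here is a minimal Python sketch of the psi poll() mechanism mentioned above: write a trigger to the pressure file, then poll the descriptor for POLLPRI. The helper names (`psi_trigger_spec`, `wait_for_pressure`) and the threshold values are assumptions chosen for illustration; the sketch also assumes a PSI-enabled kernel with trigger support and sufficient permission to write to /proc/pressure/memory.

```python
import os
import select


def psi_trigger_spec(kind, stall_us, window_us):
    """Build a PSI trigger string such as "some 150000 1000000"."""
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    if not 0 < stall_us <= window_us:
        raise ValueError("stall threshold must be positive and fit in the window")
    return f"{kind} {stall_us} {window_us}"


def wait_for_pressure(spec, path="/proc/pressure/memory", timeout_ms=10_000):
    """Register a trigger and block until it fires or the timeout expires.

    Returns True if the kernel signalled the event, False on timeout.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        # Write the trigger as a NUL-terminated string.
        os.write(fd, spec.encode("ascii") + b"\0")
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _fd, mask in poller.poll(timeout_ms):
            if mask & select.POLLERR:
                raise OSError("PSI trigger is no longer valid")
            if mask & select.POLLPRI:
                return True
        return False
    finally:
        os.close(fd)


# Illustrative values only: notify when tasks stall on memory for more
# than 150 ms in total within any 1 s window.
SPEC = psi_trigger_spec("some", 150_000, 1_000_000)
# wait_for_pressure(SPEC)  # needs a PSI-enabled kernel and write access
```

Running `wait_for_pressure(SPEC)` on the swapless device while reproducing the workload would show whether a short spike produces a notification even when avg10 stays at 0.00.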

--
With Best Wishes,
Leonid