Subject: Re: [PATCH] mm, oom: enable rate-limiting controls for oom dumps
From: Tetsuo Handa
To: Petr Mladek
Cc: Ricardo Cañuelo, Michal Hocko, akpm@linux-foundation.org,
    kernel@collabora.com, hch@lst.de, guro@fb.com, rientjes@google.com,
    mcgrof@kernel.org, keescook@chromium.org, yzaikin@google.com,
    linux-mm@kvack.org, Sergey Senozhatsky, Steven Rostedt
Date: Tue, 13 Oct 2020 19:46:32 +0900
Message-ID: <9cb10e17-ac04-9f7d-2138-cc044e2b080b@i-love.sakura.ne.jp>
In-Reply-To: <20201013090259.GC26155@alley>
References: <20201009093014.9412-1-ricardo.canuelo@collabora.com>
    <20201012152232.GD10602@alley>
    <20201012154114.GJ29725@dhcp22.suse.cz>
    <87993bef-3f83-0527-fa52-4f2c28eb7e56@i-love.sakura.ne.jp>
    <20201013090259.GC26155@alley>

On 2020/10/13 18:02, Petr Mladek wrote:
> On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
>> On 2020/10/13 0:41, Michal Hocko wrote:
>>>> What about introducing some feedback from the printk code?
>>>>
>>>> static u64 printk_last_report_seq;
>>>>
>>>> if (consoles_seen(printk_last_report_seq)) {
>>>> 	dump_header();
>>>> 	printk_last_report_seq = printk_get_last_seq();
>>>> }
>>>>
>>>> In other words, it would skip the massive report when the consoles
>>>> were not able to see the previous one.
>>>
>>> I am pretty sure this has been discussed in the past but maybe we really
>>> want to make ratelimit work reasonably also for larger sections
>>> instead. Current implementation only really works if the rate limited
>>> operation is negligible wrt the interval. Can we have a ratelimit
>>> alternative with a scope effect (effectively lock-like semantics)?
>>> if (rate_limit_begin(&oom_rs)) {
>>> 	dump_header();
>>> 	rate_limit_end(&oom_rs);
>>> }
>>>
>>> rate_limit_begin would act like a try lock with an additional constraint
>>> on the period/cadence based on rate_limit_end marked values.
>>>
>>
>> Here is one of the past discussions.
>>
>> https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@i-love.sakura.ne.jp
>> https://lkml.kernel.org/r/20190830103504.GA28313@dhcp22.suse.cz
>> https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@i-love.sakura.ne.jp
>>
>> Michal Hocko complained about different OOM domains, and now just ignores it...
>
> How is this related to this discussion, please? AFAIK, we are
> discussing how to tune the values of the existing ratelimiting.

dump_tasks() is one of the functions called from dump_header(). Since Michal
wants OOM domains to be recognized when ratelimiting dump_tasks(), the
ratelimit for dump_header() is also expected to recognize OOM domains.

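To illustrate the point about OOM domains, here is a minimal userspace sketch of the scoped ratelimit quoted above, with one state object per OOM domain. It is not kernel code: `struct scoped_ratelimit`, `scoped_ratelimit_begin()` and `scoped_ratelimit_end()` are hypothetical names, time is passed in explicitly as an abstract tick count, and there is no locking, so a real implementation would need atomic or locked updates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-domain state: one instance per OOM domain. */
struct scoped_ratelimit {
	uint64_t interval;   /* minimum gap between the end of one report
	                      * and the start of the next, in ticks */
	uint64_t last_end;   /* tick recorded by the last ..._end() call */
	bool in_progress;    /* a report for this domain is being printed */
	bool ever_ended;     /* at least one report has completed */
};

/*
 * Try to start a report. Behaves like a trylock: it fails while a report
 * for the same domain is in progress, and also while the interval since
 * the previous report ENDED has not yet elapsed.
 */
static bool scoped_ratelimit_begin(struct scoped_ratelimit *rs, uint64_t now)
{
	if (rs->in_progress)
		return false;
	if (rs->ever_ended && now - rs->last_end < rs->interval)
		return false;
	rs->in_progress = true;
	return true;
}

/* Mark the report as finished; the interval is counted from this point. */
static void scoped_ratelimit_end(struct scoped_ratelimit *rs, uint64_t now)
{
	assert(rs->in_progress);
	rs->in_progress = false;
	rs->ever_ended = true;
	rs->last_end = now;
}
```

With one object per domain, a long dump for one domain does not suppress a report for another domain, while a second report for the same domain is refused until the interval has elapsed since the previous dump ended.
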
>
>> Proper ratelimiting for OOM messages had better not count on asynchronous printk().
>
> I am a bit confused. AFAIK, you wanted to print OOM messages
> in asynchronous ways in the past. The lockless printk ringbuffer is on
> its way into 5.10. Handling consoles in kthreads will be the next
> step of the printk rework.

What I'm proposing is synchronously printing OOM messages from a different
thread, because one dump_tasks() call can generate thousands of lines, which
may significantly delay the arrival of non-OOM-related messages at consoles
(or even cause them to be dropped because logbuf is full). I don't want to
enqueue too many OOM-related messages to logbuf, even after printk() has
become completely asynchronous.

>
> OK, the current state is that printk() is semi-synchronous. It does
> console_trylock(). The console is handled immediately when it
> succeeds. Otherwise it expects that the current console_lock owner
> would do the job.
>
> Tuning ratelimits is not trivial for a particular system. It would
> be better to have some autotuning. If the printk is synchronous,
> we could measure how long the printing took. If it is asynchronous,
> we could check whether the last report has been already flushed or
> not. We could then decide whether to print the new report.

Checking whether the last report has already been flushed also needs to
recognize OOM domains.

>
> What is the desired behavior, please?
>
> Could you please provide some examples how you would tune ratelimit
> when printing all messages to the console takes X ms and OOM
> happens every Y ms?

My proposal is to decide whether to print the new report based on whether
all OOM candidates for that OOM domain have been flushed to consoles.
There is no X and Y.
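
The flush-based decision could be sketched as follows. This is only an illustration under assumptions: it is plain userspace C, `struct oom_domain_report` and both helpers are hypothetical, and `flushed_seq` stands in for whatever counter would tell how far the consoles have caught up (the feedback helpers named earlier in this thread, `consoles_seen()` and `printk_get_last_seq()`, are themselves only proposals).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping, one instance per OOM domain. */
struct oom_domain_report {
	uint64_t last_seq;  /* sequence number of the last record of the
	                     * previous report for this domain */
	bool reported;      /* a report has been emitted before */
};

/*
 * Allow a new report only if the consoles have already flushed everything
 * that the previous report for THIS domain produced. No time interval is
 * involved, so there is nothing to tune.
 */
static bool oom_report_allowed(const struct oom_domain_report *d,
			       uint64_t flushed_seq)
{
	return !d->reported || flushed_seq >= d->last_seq;
}

/* Remember where the report that was just emitted ended. */
static void oom_report_done(struct oom_domain_report *d, uint64_t last_seq)
{
	assert(last_seq >= d->last_seq); /* sequence numbers only grow */
	d->reported = true;
	d->last_seq = last_seq;
}
```

Because the state is kept per domain, a backlog caused by one domain's dump does not block the first report for a different domain.
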