To: Baoquan He
Cc: Mike Rapoport, Dave Young, Andrew Morton,
 christian.brauner@ubuntu.com, colin.king@canonical.com, corbet@lwn.net,
 frederic@kernel.org, gpiccoli@canonical.com, john.p.donnelly@oracle.com,
 jpoimboe@redhat.com, keescook@chromium.org, linux-mm@kvack.org,
 masahiroy@kernel.org, mchehab+huawei@kernel.org, mike.kravetz@oracle.com,
 mingo@kernel.org, mm-commits@vger.kernel.org, paulmck@kernel.org,
 peterz@infradead.org, rdunlap@infradead.org, rostedt@goodmis.org,
 saeed.mirzamohammadi@oracle.com, samitolvanen@google.com, sboyd@kernel.org,
 tglx@linutronix.de, torvalds@linux-foundation.org, vgoyal@redhat.com,
 yifeifz2@illinois.edu, Michal Hocko, kasong@redhat.com
References: <2d0f53d9-51ca-da57-95a3-583dc81f35ef@redhat.com>
 <20210510045338.GB2946@localhost.localdomain>
 <4a544493-0622-ac6d-f14b-fb338e33b25e@redhat.com>
 <20210510104359.GC2946@localhost.localdomain>
 <20210511133641.GE2834@localhost.localdomain>
 <20210512145150.GG2834@localhost.localdomain>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [patch 48/91] kernel/crash_core: add crashkernel=auto for vmcore creation
Message-ID: <0ef02343-390b-9815-1666-24de4911c0b7@redhat.com>
Date: Mon, 17 May 2021 10:22:47 +0200
In-Reply-To: <20210512145150.GG2834@localhost.localdomain>

On 12.05.21 16:51, Baoquan He wrote:
> On 05/11/21 at 07:07pm, David Hildenbrand wrote:
>>>> If the way adding default value into kernel config is disliked,
>>>> this a) option looks good. We can get value with x% of system RAM, but
>>>> clamp it with CRASH_KERNEL_MIN/MAX. The CRASH_KERNEL_MIN/MAX may need be
>>>> defined with a default value for different ARCHes. It's very close to
>>>> our current implementation, and handling 'auto' in kernel.
>>>>
>>>> And kernel config provided so that people can tune the MIN/MAX value,
>>>> but no need to post patch to do the tuning each time if have to?
>>> Maybe I'm missing something, but the whole point is to avoid kernel
>>> configuration option at all. If the crashkernel=auto works good for 99% of
>>> the cases, there is no need to provide build time configuration along with
>>> it. There are plenty of ways users can control crashkernel reservations
>>> with the existing 2-4 (depending on architecture) command line options.
>>>
>>> Simply hard coding a reasonable defaults (e.g.
>>> "1G-64G:128M,64G-1T:256M,1T-:512M"), and using these defaults when
>>> crashkernel=auto is set would cover the same 99% of users you referred to.
>>
>> Right, and we can easily allocate a bit more as a safety net temporarily
>> when we can actually shrink the area later.
>>
>>>
>>> If we can resize the reservation later during boot this will also address
>>> David's concern about the wasted memory.
>>>
>>
>> Yes.
>>
>>> You mentioned that amount of memory that is required for crash kernel
>>> reservation depends on the devices present on the system. Is is possible to
>>> detect how much memory is required at late stages of boot?
>>
>> Here is my thinking:
>>
>> There seems to be some kind of formula we can roughly use to come up with
>> the final crashkernel size. Baoquan for sure knows all the dirty details, I
>> assume it's roughly "core kernel + drivers + user space".
>>
>> In the kernel, we can only come up with "core kernel + drivers" expecting
>> that we will run
>>
>> a) roughly the same kernel
>> b) with roughly the same drivers
>
> As replied to Mike, kernel size is undecided for different kernel with
> different configs. We can define a default minimal size to cover kernel
> and driver on systems with not many devices, but hardcoding the size
> into upstream is not helpful. If the size is big, users will be asked to
> check and shrink always. If the size is too small, a new value need be
> got and added to cmdline and reboot.
>

Hi Baoquan, Kairui, Dave,

so IIUC now, our "old" kernel cannot actually tell us any reliable
"crashkernel area size" because

a) it has no idea which cmdline parameters the crashkernel will be
   started with, and these can have a big impact.
b) it has no idea which drivers will be loaded in the crashkernel.
c) it has no idea what will be running in the crashkernel user space.

AFAIKS, the best we can do without further information is, therefore, to
use some heuristic to a) allocate some memory early during boot in the
kernel and b) later refine our allocation, triggered by user space (->
shrink the crashkernel area).

I dislike calling a) "auto". It provides a default based on some
heuristic (boot memory size), and that default might be very unfortunate
in some scenarios (-> wasted memory).
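To make "a default based on boot memory size" concrete, here is a minimal sketch (Python, not kernel code) of how a ranged default like the one Mike quoted above would be resolved. The names parse_size and pick_reservation are made up for illustration, and only the K/M/G/T suffixes are handled. The range start is treated as inclusive and the end as exclusive, as documented for the crashkernel= range syntax.

```python
import re

# Binary size suffixes; the kernel accepts more, this sketch handles only these.
UNITS = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert a size such as '128M' or '1T' to bytes."""
    match = re.fullmatch(r"(\d+)([KMGT]?)", text)
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    number, unit = match.groups()
    return int(number) * UNITS.get(unit, 1)


def pick_reservation(ranges, system_ram):
    """Return the reservation in bytes for system_ram, or 0 if no range matches."""
    for entry in ranges.split(","):
        span, size = entry.split(":")
        start, end = span.split("-")
        lower = parse_size(start)
        upper = parse_size(end) if end else None  # open-ended range such as '1T-'
        if system_ram >= lower and (upper is None or system_ram < upper):
            return parse_size(size)
    return 0


# Hypothetical hard-coded default, taken from the example quoted above.
DEFAULT_RANGES = "1G-64G:128M,64G-1T:256M,1T-:512M"
```

With these ranges, a 2G machine and a 63G machine both end up with the 128M entry, no matter which drivers or user space the crashkernel will actually run. The only input is the boot memory size, which is why the result can be too small or too large.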
While we could discuss calling the current approach (a) above)
"crashkernel=default", whereby the default is encoded at compile time
as determined by a distributor, I still don't quite like it
because it feels unnecessary. We already have a way to pass
something like that via the cmdline, so it's just a matter of properly
using that feature from user space.

AFAIKS, all you want is most probably a more dynamic way to construct a
kernel cmdline, with some properties specific to a kernel.

Let's assume the following:

a) When a distributor ships a kernel, he also ships some kind of
defaults file. Let's assume for simplicity

    /lib/modules/5.11.19-200.fc33.x86_64/defaults.conf

The file might contain

    CRASHKERNEL_DEFAULT=WHATEVER

b) When generating the cmdline for e.g.,
/boot/loader/entries/XXX-5.11.19-200.fc33.x86_64.conf we run some script
that consults that file in addition to /etc/default/grub. For example, if
the kdump service was installed and /etc/default/grub does not contain
"crashkernel=" (except when we encounter "crashkernel=auto" for compat
handling), we add "crashkernel=WHATEVER". Of course, we might do more
involved stuff based on the current setup, user config, etc.

c) When we install the kdump service, all we have to do is re-generate
the boot entries AFAIKS. Just like we would when adding
"crashkernel=auto" right now.

The end result would also allow for having per-kernel defaults and
changing them on kernel updates. It would require some thought on how to
make it fly in user space, how to "ship" the defaults etc.

-- 
Thanks,

David / dhildenb
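
P.S. The decision in step b) above could be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not an existing tool: read_kv and build_cmdline are hypothetical names, the base cmdline is assumed to come from GRUB_CMDLINE_LINUX in /etc/default/grub, and replacing "crashkernel=auto" with the per-kernel default is only one possible reading of the compat handling.

```python
def read_kv(text):
    """Parse shell-style KEY=VALUE lines (optional double quotes) into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        result[key.strip()] = value.strip().strip('"')
    return result


def build_cmdline(grub_cmdline, kernel_default, kdump_installed):
    """Return the cmdline for one boot entry.

    grub_cmdline:    value taken from /etc/default/grub (assumed source)
    kernel_default:  CRASHKERNEL_DEFAULT from the per-kernel defaults file
    kdump_installed: whether the kdump service is installed
    """
    args = grub_cmdline.split()
    # An explicit user setting always wins; "auto" is only kept for compat.
    explicit = [a for a in args
                if a.startswith("crashkernel=") and a != "crashkernel=auto"]
    if explicit or not kdump_installed or not kernel_default:
        return " ".join(args)
    # Assumption: compat handling means "auto" is replaced by the default.
    kept = [a for a in args if a != "crashkernel=auto"]
    return " ".join(kept + [f"crashkernel={kernel_default}"])
```

For example, with CRASHKERNEL_DEFAULT=WHATEVER in the defaults file and a base cmdline of "rhgb quiet", build_cmdline() yields "rhgb quiet crashkernel=WHATEVER" when kdump is installed, and leaves the cmdline untouched otherwise. Step c) is then only a matter of re-running this for every boot entry.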