Message-ID: <72ae5d5b-512e-4dd4-4bb0-d867fb788f60@redhat.com>
Date: Tue, 8 Feb 2022 10:19:45 +0100
To: Miaohe Lin
Cc: isimatu.yasuaki@jp.fujitsu.com, toshi.kani@hp.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton
References: <20220207135618.17231-1-linmiaohe@huawei.com> <6d4ab70e-b944-5f7d-e9a3-979ac66c70f7@redhat.com> <828c9b16-6ff0-abb7-3a16-277d2d60de81@huawei.com>
From: David Hildenbrand
Organization: Red Hat
Subject: Re: [PATCH] mm/memory_hotplug: fix kfree() of bootmem memory
In-Reply-To: <828c9b16-6ff0-abb7-3a16-277d2d60de81@huawei.com>

On 08.02.22 02:59, Miaohe Lin wrote:
> Hi:
> On 2022/2/7 22:33, David Hildenbrand wrote:
>> On 07.02.22 14:56, Miaohe Lin wrote:
>>> We can't use kfree() to release the resource as it might come from bootmem.
>>> Use release_mem_region() instead.
>>
>> How can this happen?
>> release_mem_region() is called either from
>> __add_memory() or from add_memory_driver_managed(), where we allocated
>> the region via register_memory_resource(). Both functions shouldn't ever
>> be called before the buddy is up and running.
>>
>> Do you have a backtrace of an actual instance of this issue? Or was this
>> identified as possibly broken by code inspection?
>>
>
> This is identified as possibly broken by code inspection. IIUC, alloc_resource
> is always used to allocate the resource. It has the below logic:
>
> 	if (bootmem_resource_free) {
> 		res = bootmem_resource_free;
> 		bootmem_resource_free = res->sibling;
> 	}
>
> where bootmem_resource_free is used to reuse the resource entries allocated by boot
> mem after the system is up:
>
> /*
>  * For memory hotplug, there is no way to free resource entries allocated
>  * by boot mem after the system is up. So for reusing the resource entry
>  * we need to remember the resource.
>  */
> static struct resource *bootmem_resource_free;
>
> So I think register_memory_resource() can reuse the resource allocated by bootmem.
> Or am I missing anything?

I think you're right, if we did a previous free_resource() of a resource
allocated during boot we could end up reusing that here. My best guess
is that this never really happens.

Wow, that's ugly.

It affects essentially anybody reserving+freeing a resource. E.g.,
dax/kmem.c similarly does a release_resource(res)+kfree(res)

We could either

a) Expose free_resource() and replace all kfree(res) instances by it

b) Just simplify that. I don't think we care about saving a couple of
   bytes in corner cases.
   I might be wrong (IIRC primarily ppc64 really succeeds in
   unplugging boot memory)


diff --git a/kernel/resource.c b/kernel/resource.c
index 9c08d6e9eef2..fe91a72fd951 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -56,14 +56,6 @@ struct resource_constraint {
 
 static DEFINE_RWLOCK(resource_lock);
 
-/*
- * For memory hotplug, there is no way to free resource entries allocated
- * by boot mem after the system is up. So for reusing the resource entry
- * we need to remember the resource.
- */
-static struct resource *bootmem_resource_free;
-static DEFINE_SPINLOCK(bootmem_resource_lock);
-
 static struct resource *next_resource(struct resource *p)
 {
 	if (p->child)
@@ -160,36 +152,19 @@ __initcall(ioresources_init);
 
 static void free_resource(struct resource *res)
 {
-	if (!res)
-		return;
-
-	if (!PageSlab(virt_to_head_page(res))) {
-		spin_lock(&bootmem_resource_lock);
-		res->sibling = bootmem_resource_free;
-		bootmem_resource_free = res;
-		spin_unlock(&bootmem_resource_lock);
-	} else {
+	/*
+	 * If the resource was allocated using memblock early during boot
+	 * we'll leak it here: we can only return full pages back to the
+	 * buddy and trying to be smart and reusing them eventually in
+	 * alloc_resource() overcomplicates resource handling.
+	 */
+	if (res && PageSlab(virt_to_head_page(res)))
 		kfree(res);
-	}
 }
 
 static struct resource *alloc_resource(gfp_t flags)
 {
-	struct resource *res = NULL;
-
-	spin_lock(&bootmem_resource_lock);
-	if (bootmem_resource_free) {
-		res = bootmem_resource_free;
-		bootmem_resource_free = res->sibling;
-	}
-	spin_unlock(&bootmem_resource_lock);
-
-	if (res)
-		memset(res, 0, sizeof(struct resource));
-	else
-		res = kzalloc(sizeof(struct resource), flags);
-
-	return res;
+	return kzalloc(sizeof(struct resource), flags);
 }
 
 /* Return the conflict entry if you can't request it */
-- 
Thanks,

David / dhildenb
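
For anyone who wants to poke at the reuse problem outside the kernel, here is a
rough userspace model of the two schemes. It is only a sketch and not kernel
code: struct res, boot_pool, from_boot_pool() and the old_*/new_* helpers are
all made-up names, a static array stands in for memblock memory, and an address
range check stands in for the !PageSlab(virt_to_head_page(res)) test. Locking
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for struct resource; only the field used here. */
struct res {
	struct res *sibling;
};

/* Stand-in for memblock memory: entries that must never reach free(). */
static struct res boot_pool[4];

/* Hypothetical replacement for !PageSlab(virt_to_head_page(res)). */
static bool from_boot_pool(const struct res *r)
{
	uintptr_t p = (uintptr_t)r;

	return p >= (uintptr_t)boot_pool && p < (uintptr_t)(boot_pool + 4);
}

/* Old scheme: boot entries are parked on a list and handed out again. */
static struct res *boot_free_list;

static void old_free(struct res *r)
{
	if (!r)
		return;
	if (from_boot_pool(r)) {
		r->sibling = boot_free_list;
		boot_free_list = r;
	} else {
		free(r);
	}
}

static struct res *old_alloc(void)
{
	struct res *r = boot_free_list;

	if (r) {
		boot_free_list = r->sibling;
		memset(r, 0, sizeof(*r));
		return r;
	}
	return calloc(1, sizeof(*r));
}

/* New scheme: boot entries are leaked, allocation always uses the heap. */
static void new_free(struct res *r)
{
	if (r && !from_boot_pool(r))
		free(r);
}

static struct res *new_alloc(void)
{
	return calloc(1, sizeof(struct res));
}

/* True if an allocation made after freeing a boot entry returns boot memory. */
static bool old_scheme_recycles_boot_entry(void)
{
	old_free(&boot_pool[0]);
	struct res *r = old_alloc();
	bool recycled = from_boot_pool(r);

	old_free(r);	/* a plain free(r) here would be the bug */
	return recycled;
}

static bool new_scheme_recycles_boot_entry(void)
{
	new_free(&boot_pool[0]);	/* leaked on purpose */
	struct res *r = new_alloc();
	bool recycled = from_boot_pool(r);

	new_free(r);
	return recycled;
}
```

In this model the old scheme hands the boot entry back to the next caller, so
any caller that pairs the allocation with a direct free() (the analogue of
release_resource(res)+kfree(res)) would be freeing memory that never came from
the heap. With the simplified scheme every entry returned by the allocator is
heap memory, at the cost of leaking the boot entry.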