To: Dennis Zhou
Cc: Tejun Heo, Christoph Lameter,
    Roman Gushchin, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20210419225047.3415425-1-dennis@kernel.org> <20210419225047.3415425-4-dennis@kernel.org> <20210702191140.GA3166599@roeck-us.net>
From: Guenter Roeck
Subject: Re: [PATCH 3/4] percpu: implement partial chunk depopulation
Date: Fri, 2 Jul 2021 13:28:18 -0700

On 7/2/21 12:45 PM, Dennis Zhou wrote:
> Hello,
>
> On Fri, Jul 02, 2021 at 12:11:40PM -0700, Guenter Roeck wrote:
>> Hi,
>>
>> On Mon, Apr 19, 2021 at 10:50:46PM +0000, Dennis Zhou wrote:
>>> From: Roman Gushchin
>>>
>>> This patch implements partial depopulation of percpu chunks.
>>>
>>> As of now, a chunk can be depopulated only as a part of the final
>>> destruction, if there are no more outstanding allocations. However
>>> to minimize a memory waste it might be useful to depopulate a
>>> partially filed chunk, if a small number of outstanding allocations
>>> prevents the chunk from being fully reclaimed.
>>>
>>> This patch implements the following depopulation process: it scans
>>> over the chunk pages, looks for a range of empty and populated pages
>>> and performs the depopulation.
>>> To avoid races with new allocations,
>>> the chunk is previously isolated. After the depopulation the chunk is
>>> sidelined to a special list or freed. New allocations prefer using
>>> active chunks to sidelined chunks. If a sidelined chunk is used, it is
>>> reintegrated to the active lists.
>>>
>>> The depopulation is scheduled on the free path if the chunk is all of
>>> the following:
>>>   1) has more than 1/4 of total pages free and populated
>>>   2) the system has enough free percpu pages aside of this chunk
>>>   3) isn't the reserved chunk
>>>   4) isn't the first chunk
>>> If it's already depopulated but got free populated pages, it's a good
>>> target too. The chunk is moved to a special slot,
>>> pcpu_to_depopulate_slot, chunk->isolated is set, and the balance work
>>> item is scheduled. On isolation, these pages are removed from the
>>> pcpu_nr_empty_pop_pages. It is constantly replaced to the
>>> to_depopulate_slot when it meets these qualifications.
>>>
>>> pcpu_reclaim_populated() iterates over the to_depopulate_slot until it
>>> becomes empty. The depopulation is performed in the reverse direction to
>>> keep populated pages close to the beginning. Depopulated chunks are
>>> sidelined to preferentially avoid them for new allocations. When no
>>> active chunk can suffice a new allocation, sidelined chunks are first
>>> checked before creating a new chunk.
>>>
>>> Signed-off-by: Roman Gushchin
>>> Co-developed-by: Dennis Zhou
>>> Signed-off-by: Dennis Zhou
>>
>> This patch results in a number of crashes and other odd behavior
>> when trying to boot mips images from Megasas controllers in qemu.
>> Sometimes the boot stalls, but I also see various crashes.
>> Some examples and bisect logs are attached.
>
> Ah, this doesn't look good.. Do you have a reproducer I could use to
> debug this?
>

I copied the relevant information to http://server.roeck-us.net/qemu/mips/.

run.sh - qemu command (I tried with qemu 6.0 and 4.2.1)
rootfs.ext2 - root file system
config - complete configuration
defconfig - shortened configuration
vmlinux - a crashing kernel image (v5.13-7637-g3dbdb38e2869, with above configuration)

Interestingly, the crash doesn't always happen at the same location, even
with the same image. Some memory corruption, maybe?

Hope this helps. Please let me know if I can provide anything else.

Thanks,
Guenter
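
For reference, the scheduling conditions listed in the quoted commit message can be sketched in plain C. This is only a minimal userspace illustration, not the kernel code: struct chunk_state, its field names, should_schedule_depopulation() and the empty_pop_pages_high threshold are all hypothetical names made up for the example, and treating "already depopulated" as the isolated flag is an assumption.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the per-chunk bookkeeping. */
struct chunk_state {
    int nr_pages;            /* total pages in the chunk */
    int nr_empty_pop_pages;  /* pages that are both free and populated */
    bool isolated;           /* assumed marker for an already depopulated chunk */
    bool is_first;           /* the first chunk is never a target */
    bool is_reserved;        /* the reserved chunk is never a target */
};

/*
 * Returns true when a chunk qualifies for depopulation under the rules in
 * the quoted commit message. total_empty_pop_pages is the system-wide
 * count; empty_pop_pages_high is an assumed threshold standing in for
 * "enough free percpu pages aside of this chunk".
 */
static bool should_schedule_depopulation(const struct chunk_state *c,
                                         int total_empty_pop_pages,
                                         int empty_pop_pages_high)
{
    /* Conditions 3 and 4: never the reserved chunk or the first chunk. */
    if (c->is_first || c->is_reserved)
        return false;

    /* Already depopulated, but it got free populated pages again. */
    if (c->isolated && c->nr_empty_pop_pages > 0)
        return true;

    /* Condition 1: more than 1/4 of the pages are free and populated. */
    bool quarter_free = 4 * c->nr_empty_pop_pages > c->nr_pages;

    /* Condition 2: enough free populated pages remain outside this chunk. */
    bool enough_elsewhere =
        total_empty_pop_pages - c->nr_empty_pop_pages > empty_pop_pages_high;

    return quarter_free && enough_elsewhere;
}
```

With a 16-page chunk holding 5 free populated pages, the sketch reports a target when the system has 20 such pages in total and the assumed threshold is 4, and not when the system total is only 8.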