Subject: Re: [PATCH] mm: memory_hotplug: put
 migration failure information under DEBUG_VM
To: Vlastimil Babka , Michal Hocko
Cc: akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "vinmenon@codeaurora.org"
References: <1606140196-6053-1-git-send-email-charante@codeaurora.org> <20201123141354.GQ27488@dhcp22.suse.cz> <775a56a9-b301-31bb-cd6d-8b82b1dd4d65@suse.cz>
From: Charan Teja Kalla
Message-ID: <77fcf5d8-7fae-38bb-5bd4-930715163c07@codeaurora.org>
Date: Wed, 25 Nov 2020 16:40:40 +0530
In-Reply-To: <775a56a9-b301-31bb-cd6d-8b82b1dd4d65@suse.cz>

Thanks Vlastimil!

On 11/24/2020 7:09 PM, Vlastimil Babka wrote:
> On 11/23/20 4:10 PM, Charan Teja Kalla wrote:
>>
>> Thanks Michal!
>> On 11/23/2020 7:43 PM, Michal Hocko wrote:
>>> On Mon 23-11-20 19:33:16, Charan Teja Reddy wrote:
>>>> When the pages are failed to get isolate or migrate, the page owner
>>>> information along with page info is dumped. If there are continuous
>>>> failures in migration(say page is pinned) or isolation, the log buffer
>>>> is simply getting flooded with the page owner information. As most of
>>>> the times page info is sufficient to know the causes for failures of
>>>> migration or isolation, place the page owner information under
>>>> DEBUG_VM.
>>>
>>> I do not see why this path is any different from others that call
>>> dump_page. Page owner can add a very valuable information to debug
>>> the underlying reasons for failures here. It is an opt-in debugging
>>> feature which needs to be enabled explicitly. So I would argue users
>>> are ready to accept a lot of data in the kernel log.
>>
>> Just thinking how frequently failures can happen in those paths. In the
>> memory hotplug path, we can flood the page owner logs just by making one
>> page pinned. Say If it is anonymous page, the page owner information
>
> So you say it's flooded when page_owner info is included, but not
> flooded when only the rest of __dump_page() is printed? (which is also
> not just one or two lines). That has to be very specific rate of failures.
> Anyway I don't like the solution with arbitrary config option. To
> prevent flooding we generally have ratelimiting, how about that?
>

The logs are still flooded with just the __dump_page() output, but it is
far smaller than what dump_page_owner adds. The lines look something like
this:

page:ffffffff0b070b80 refcount:3 mapcount:1 mapping:ffffff80faf118e9 index:0xc0903
anon flags: 0x800000000008042c(uptodate|dirty|active|owner_priv_1|swapbacked)
raw: 800000000008042c ffffffc047483a58 ffffffc047483a58 ffffff80faf118e9
raw: 00000000000c0903 00000000000985eb 0000000300000000 ffffff800b5f3000
page dumped because: migration failure
page->mem_cgroup:ffffff800b5f3000
page_owner tracks the page as allocated

Rate limiting the logs from both __dump_page and dump_page_owner looks
good to me. If that is okay with both you and Michal, I can send a V2.

> Also agreed with Michal that page_owner is explicitly enabled debugging
> option and if you use it in production, that's rather surprising to me,
> and possibly more rare than DEBUG_VM, which IIRC Fedora kernels use.

We enable it only on internal debug systems, never on production
kernels.

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project
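
For illustration only, here is a minimal userspace sketch of the interval/burst rate limiting discussed above. It is not the proposed V2 and not kernel code: struct rl_state, rl_allow() and dump_failures() are hypothetical names, and time is passed in as a plain tick count so the behaviour is easy to follow. In the kernel, the existing helpers in include/linux/ratelimit.h (DEFINE_RATELIMIT_STATE() and __ratelimit()) would presumably fill this role around the dump_page() call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a rate-limit state. */
struct rl_state {
    long interval;  /* window length, in arbitrary ticks */
    int burst;      /* dumps allowed per window */
    int printed;    /* dumps emitted in the current window */
    int missed;     /* dumps suppressed in the current window */
    long begin;     /* tick at which the current window started */
    bool started;   /* false until the first call opens a window */
};

/* Return true if the caller may emit a dump at tick 'now'. */
static bool rl_allow(struct rl_state *rs, long now)
{
    assert(rs->interval > 0 && rs->burst > 0);

    /* Open a fresh window on first use or once the interval has elapsed. */
    if (!rs->started || now - rs->begin >= rs->interval) {
        rs->begin = now;
        rs->printed = 0;
        rs->missed = 0;
        rs->started = true;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }

    rs->missed++;
    return false;
}

/*
 * Hypothetical failure path: 'count' migration failures, one per tick,
 * starting at tick 'start'. Returns how many dumps were actually emitted.
 */
static int dump_failures(struct rl_state *rs, long start, int count)
{
    int dumped = 0;

    for (int i = 0; i < count; i++) {
        if (rl_allow(rs, start + i)) {
            /* The page and page owner dump would be emitted here. */
            dumped++;
        }
    }
    return dumped;
}
```

With a burst of 3 per 100 ticks, a single pinned page that fails 50 times in a row produces only 3 dumps in that window instead of 50, and the suppressed count stays available for a summary line.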