Subject: Re: [PATCH v2 2/3] mm/page_reporting: Allow driver to specify threshold
From: Gavin Shan
Reply-To: Gavin Shan
To: Alexander Duyck
Cc: linux-mm, LKML, David Hildenbrand, "Michael S. Tsirkin", Andrew Morton, Anshuman Khandual, Catalin Marinas, Will Deacon, shan.gavin@gmail.com
Date: Wed, 23 Jun 2021 10:43:48 +1000
References: <20210622074926.333223-1-gshan@redhat.com> <20210622074926.333223-3-gshan@redhat.com>

On 6/23/21 3:39 AM, Alexander Duyck wrote:
> On Mon, Jun 21, 2021 at 10:48 PM Gavin Shan wrote:
>>
>> The page reporting threshold is currently sticky to @pageblock_order.
>> The page reporting can never be triggered because the freeing page
>> can't come up with a free area like that huge. The situation becomes
>> worse when the system memory becomes heavily fragmented.
>>
>> For example, the following configurations are used on ARM64 when 64KB
>> base page size is enabled. In this specific case, the page reporting
>> won't be triggered until the freeing page comes up with a 512MB free
>> area. That's hard to be met, especially when the system memory becomes
>> heavily fragmented.
>>
>>    PAGE_SIZE:        64KB
>>    HPAGE_SIZE:       512MB
>>    pageblock_order:  13 (512MB)
>>    MAX_ORDER:        14
>>
>> This allows the drivers to specify the threshold when the page
>> reporting device is registered. The threshold falls back to
>> @pageblock_order if it's not specified by the driver. The existing
>> users (hv_balloon and virtio_balloon) don't specify the threshold
>> and @pageblock_order is still taken as their page reporting order.
>> So this shouldn't introduce functional changes.
>>
>> Signed-off-by: Gavin Shan
>> ---
>>  include/linux/page_reporting.h |  3 +++
>>  mm/page_reporting.c            | 14 ++++++++++----
>>  mm/page_reporting.h            | 10 ++--------
>>  3 files changed, 15 insertions(+), 12 deletions(-)
>>
>> diff --git a/include/linux/page_reporting.h b/include/linux/page_reporting.h
>> index 3b99e0ec24f2..fe648dfa3a7c 100644
>> --- a/include/linux/page_reporting.h
>> +++ b/include/linux/page_reporting.h
>> @@ -18,6 +18,9 @@ struct page_reporting_dev_info {
>>
>>  	/* Current state of page reporting */
>>  	atomic_t state;
>> +
>> +	/* Minimal order of page reporting */
>> +	unsigned int order;
>>  };
>>
>>  /* Tear-down and bring-up for page reporting devices */
>> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
>> index df9c5054e1b4..27670360bae6 100644
>> --- a/mm/page_reporting.c
>> +++ b/mm/page_reporting.c
>
>
>
>> @@ -324,6 +324,12 @@ int page_reporting_register(struct page_reporting_dev_info *prdev)
>>  		goto err_out;
>>  	}
>>
>> +	/*
>> +	 * We need to choose the minimal order of page reporting if it's
>> +	 * not specified by the driver.
>> +	 */
>> +	prdev->order = prdev->order ? prdev->order : pageblock_order;
>> +
>>  	/* initialize state and work structures */
>>  	atomic_set(&prdev->state, PAGE_REPORTING_IDLE);
>>  	INIT_DELAYED_WORK(&prdev->work, &page_reporting_process);
>
> Rather than using prdev->order directly it might be better to have a
> reporting_order value you could export for use by
> page_reporting_notify_free. That way you avoid the overhead of having
> to make a function call per page freed.
>

Yes, I missed the point about avoiding the function call overhead.
In the next revision, I will introduce @page_reporting_order for this.
It will also be exported as a module parameter so that it can be
changed dynamically, as David suggested before.

>> diff --git a/mm/page_reporting.h b/mm/page_reporting.h
>> index 2c385dd4ddbd..d9f972e72649 100644
>> --- a/mm/page_reporting.h
>> +++ b/mm/page_reporting.h
>> @@ -10,11 +10,9 @@
>>  #include
>>  #include
>>
>> -#define PAGE_REPORTING_MIN_ORDER	pageblock_order
>> -
>>  #ifdef CONFIG_PAGE_REPORTING
>>  DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
>> -void __page_reporting_notify(void);
>> +void __page_reporting_notify(unsigned int order);
>>
>>  static inline bool page_reported(struct page *page)
>>  {
>> @@ -37,12 +35,8 @@ static inline void page_reporting_notify_free(unsigned int order)
>>  	if (!static_branch_unlikely(&page_reporting_enabled))
>>  		return;
>>
>> -	/* Determine if we have crossed reporting threshold */
>> -	if (order < PAGE_REPORTING_MIN_ORDER)
>> -		return;
>> -
>>  	/* This will add a few cycles, but should be called infrequently */
>> -	__page_reporting_notify();
>> +	__page_reporting_notify(order);
>>  }
>>  #else /* CONFIG_PAGE_REPORTING */
>>  #define page_reported(_page)	false
>
> With us making the function call per page freed we are likely to have
> a much more significant impact on performance with page reporting
> enabled. Ideally we want to limit this impact so that we only take the
> cost for the conditional check on the lower order pages.
>

Yep, thanks for the explanation, Alex.

Thanks,
Gavin
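
For illustration only, the approach Alex describes could look roughly
like the sketch below. It is a userspace mock, not the kernel code and
not the next revision of the patch. @page_reporting_order follows the
name proposed above; everything else is an assumption made for the
example: page_reporting_enabled is reduced to a plain bool instead of a
static key, page_reporting_notify_slowpath() is a hypothetical stand-in
for __page_reporting_notify(), simulate_frees() is a hypothetical
helper, and 5 is an arbitrary threshold. The point is that the order
comparison stays in the inline function, so frees below the threshold
cost one comparison and never reach the out-of-line call.

```c
#include <stdbool.h>

/* Assumption: a plain bool stands in for the kernel's static key. */
static bool page_reporting_enabled = true;

/* Name taken from the reply above; 5 is an arbitrary example value. */
static unsigned int page_reporting_order = 5;

/* Hypothetical counter recording how often the slow path is taken. */
static unsigned long slowpath_calls;

/* Hypothetical stand-in for the out-of-line __page_reporting_notify(). */
static void page_reporting_notify_slowpath(void)
{
	slowpath_calls++;
}

/* Inline fast path: low-order frees pay only for the comparison. */
static inline void page_reporting_notify_free(unsigned int order)
{
	if (!page_reporting_enabled)
		return;

	/* Determine if we have crossed the reporting threshold */
	if (order < page_reporting_order)
		return;

	page_reporting_notify_slowpath();
}

/*
 * Hypothetical helper: free one block of each order in [0, nr_orders)
 * and return how many of those frees reached the slow path.
 */
static unsigned long simulate_frees(unsigned int nr_orders)
{
	slowpath_calls = 0;

	for (unsigned int order = 0; order < nr_orders; order++)
		page_reporting_notify_free(order);

	return slowpath_calls;
}
```

With the example threshold of 5, simulate_frees(10) reaches the slow
path only for orders 5 through 9, so it returns 5.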