From: Gavin Shan
To: linux-mm@kvack.org
Cc: drjones@redhat.com, david@redhat.com, bhe@redhat.com, hannes@cmpxchg.org, guro@fb.com
Subject: [RFC PATCH] mm/vmscan: Don't round up scan size for online memory cgroup
Date: Mon, 10 Feb 2020 23:14:45 +1100
Message-Id: <20200210121445.711819-1-gshan@redhat.com>

Commit 68600f623d69 ("mm: don't miss the last page because of round-off
error") makes the scan size round up to @denominator regardless of the
memory cgroup's state, online or offline. This affects the overall
reclaiming behavior: with the former (non-roundup) formula, the
corresponding LRU list is eligible for reclaiming only when its size,
logically right shifted by @sc->priority, is bigger than zero. For
example, the inactive anonymous LRU list should have at least 0x4000
pages to be eligible for reclaiming when we have 60/12 for
swappiness/priority and without taking the scan/rotation ratio into
account. After the roundup is applied, the inactive anonymous LRU list
becomes eligible for reclaiming when its size is bigger than or equal
to 0x1000 under the same conditions.

    (0x4000 >> 12) * 60 / (60 + 140 + 1) = 1
    ((0x1000 >> 12) * 60 + 200) / (60 + 140 + 1) = 1

aarch64 has a 512MB huge page size when the base page size is 64KB. A
memory cgroup that has a huge page is always eligible for reclaiming in
that case. The reclaiming is likely to stop after the huge page is
reclaimed, meaning the subsequent @sc->priority levels and memory
cgroups will be skipped. This changes the overall reclaiming behavior.

This fixes the issue by applying the roundup to offlined memory cgroups
only, to give more preference to reclaiming memory from offlined memory
cgroups. This sounds reasonable, as that memory is likely to be useless.

The issue was found by starting up 8 VMs on an Ampere Mustang machine,
which has 8 CPUs and 16 GB of memory. Each VM is given 2 vCPUs and 2GB
of memory. 784MB of swap space is consumed after these 8 VMs are
completely up. Note that KSM is disabled while THP is enabled in the
testing. With this applied, the consumed swap space decreased to 60MB.

           total        used        free      shared  buff/cache   available
Mem:       16196       10065        2049          16        4081        3749
Swap:       8175         784        7391

           total        used        free      shared  buff/cache   available
Mem:       16196       11324        3656          24        1215        2936
Swap:       8175          60        8115

Fixes: 68600f623d69 ("mm: don't miss the last page because of round-off error")
Cc: # v4.20+
Signed-off-by: Gavin Shan
---
 mm/vmscan.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c05eb9efec07..876370565455 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2415,10 +2415,13 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 			/*
 			 * Scan types proportional to swappiness and
 			 * their relative recent reclaim efficiency.
-			 * Make sure we don't miss the last page
-			 * because of a round-off error.
+			 * Make sure we don't miss the last page on
+			 * the offlined memory cgroups because of a
+			 * round-off error.
 			 */
-			scan = DIV64_U64_ROUND_UP(scan * fraction[file],
+			scan = mem_cgroup_online(memcg) ?
+			       div64_u64(scan * fraction[file], denominator) :
+			       DIV64_U64_ROUND_UP(scan * fraction[file],
 						  denominator);
 			break;
 		case SCAN_FILE:
-- 
2.23.0
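
The eligibility thresholds quoted in the commit message (0x4000 pages
without the roundup, 0x1000 pages with it) can be checked with a small
userspace sketch. This is not the kernel code: div_down(), div_up() and
scan_target() are hypothetical stand-ins that only mirror the
arithmetic of div64_u64() and DIV64_U64_ROUND_UP(), and the fraction
and denominator (60 and 60 + 140 + 1) are the example values from the
commit message, assumed here without the scan/rotation ratio.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-ins for the kernel's div64_u64() and
 * DIV64_U64_ROUND_UP(). They reproduce the arithmetic only.
 */
static uint64_t div_down(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return n / d;
}

static uint64_t div_up(uint64_t n, uint64_t d)
{
	assert(d != 0);
	return (n + d - 1) / d;
}

/*
 * Hypothetical helper: the scan target for one LRU list, following
 * the two formulas in the commit message. lru_pages is the list size
 * in pages, priority plays the role of sc->priority.
 */
static uint64_t scan_target(uint64_t lru_pages, unsigned int priority,
			    uint64_t fraction, uint64_t denominator,
			    bool round_up)
{
	uint64_t scan = lru_pages >> priority;

	return round_up ? div_up(scan * fraction, denominator)
			: div_down(scan * fraction, denominator);
}
```

With fraction 60, denominator 201 and priority 12, scan_target() is
zero for 0x3000 pages and one for 0x4000 pages when rounding down, and
already one for 0x1000 pages when rounding up. A single 512MB huge
page on a 64KB base page size is 0x2000 pages, so it yields a nonzero
target only in the roundup case.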