From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <7cf988ff87c51d70538a99eb0f5b8181857ad341.camel@linux.intel.com>
Subject: Re: + mm-introduce-reported-pages.patch added to -mm tree
From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Dave Hansen <dave.hansen@intel.com>, David Hildenbrand
 <david@redhat.com>,  akpm@linux-foundation.org, aarcange@redhat.com,
 dan.j.williams@intel.com,  konrad.wilk@oracle.com, lcapitulino@redhat.com,
 mgorman@techsingularity.net,  mm-commits@vger.kernel.org, mst@redhat.com,
 osalvador@suse.de, pagupta@redhat.com,  pbonzini@redhat.com,
 riel@surriel.com, vbabka@suse.cz, wei.w.wang@intel.com, 
 willy@infradead.org, yang.zhang.wz@gmail.com, linux-mm@kvack.org
Date: Fri, 08 Nov 2019 08:43:35 -0800
In-Reply-To: <20191108095713.GC15658@dhcp22.suse.cz>
References: <95a78ac2-73bf-2985-9769-e269e8d13d68@intel.com>
	 <92C323F0-41BE-4988-8C39-1513FAC40458@redhat.com>
	 <ec585925-1f72-fbbe-3e24-63aed83b3836@intel.com>
	 <20191107174644.GA8314@dhcp22.suse.cz>
	 <91ccd1e4a9077e22379edbaac2fd8c16897b1f7a.camel@linux.intel.com>
	 <20191108095713.GC15658@dhcp22.suse.cz>
Content-Type: text/plain; charset="UTF-8"
User-Agent: Evolution 3.30.5 (3.30.5-1.fc29) 
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

On Fri, 2019-11-08 at 10:57 +0100, Michal Hocko wrote:
> On Thu 07-11-19 10:12:21, Alexander Duyck wrote:
> > On Thu, 2019-11-07 at 18:46 +0100, Michal Hocko wrote:
> [...]
> > > I have asked several times why there is such a push and received no
> > > answer but "this is taking too long" which I honestly do not care much.
> > > Especially when other virt people tend to agree that there is no need to
> > > rush here.
> > 
> > Part of the rush, at least from my perspective, is that I don't have
> > indefinite time to work on this.
> 
> I fully understand this! And I also feel the frustration. Been through
> that several times.
> 
> > I am sure you are aware that maintaining
> > an external patch set can be a real chore and I would prefer to have it
> > merged and then maintain it as a part of the tree.
> 
> Sure, keeping the code in sync is an additional burden. Having the code
> in just pushes the burden to everybody touching that subsystem in the
> future though. This is the maintenance cost we have to consider. Your
> approach of integrating a very narrow feature into the core allocator
> will require considering that usecase for future changes in the
> allocator. Maintaining metadata elsewhere doesn't impose that
> maintenance cost.
> 
> Can we agree on this at least? Because I feel we are circling around in
> this and previous discussions.

Not really. The problem as I see it with external metadata is that it
creates the opportunity for issues like what I ran into with compaction.
It isn't necessarily maintaining any metadata in the page, but it still
has a massive effect on things since it is essentially churning the free
lists. In my mind the main difference is just how visible the
intrusiveness is, not necessarily if it is intrusive or not. The end
result is things are becoming more ossified either way.

> > Then other changes can
> > be rebased on it instead of having to rebase it around other changes that
> > are going on.
> 
> Well, that is not a real argument because alternatives are not an
> incremental change from the allocator POV. It is a different approach of
> maintaining metadata. Sure a different approach could replace your
> implementation (if it was merged) but what is the point of merging an
> approach that would be replaced? Just because you do not want to
> maintain your implementation off tree? That is a poor argument to me.

It is more than the maintenance cost though. So one thing having the code
in the mm tree and linux-next gets me is more visibility and more review.
At this point the code just rots if I am sitting on it waiting for a
better alternative. What is the point in writing it if I am just sitting
on it? I am writing it with the goal of getting it upstream. I need to see
that there is a path for the patch set that ends in that direction or I am
just wasting time.

> I completely agree with Mel. Let's start with a simple solution first
> (using existing page isolation interfaces sounds like a good start to
> interact with the page allocator), establish a decent API for virtio
> and start optimizing from there.

This has been brought up a few times but I don't recall seeing it
discussed anywhere. How do you see the page isolation interfaces being
used to handle the free page reporting case?

> Last but not least, I would also recommend to be more explicit about
> workloads which are going to benefit from those performance optimizations.
> So far I have only seen some micro benchmarks results. Do we have any
> real workloads and see how your approach behaves so that we can compare
> that to the other approach?

I think with some of my early versions I had a fairly simple test that
demonstrated the advantage of the approach by basically just starting up
enough VMs to create an overcommit situation and then running memhog on
each one in series, and then timing the second pass through them. If I
recall it was a matter of something like 6 to 7 seconds versus 45 seconds
per VM tested. Would something like that work or do you have another
suggestion?
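
To make the shape of that test concrete, here is a minimal sketch of the
timing loop. It is not the script behind the numbers above. The names
time_passes and memhog_over_ssh, the use of ssh to reach each guest, and the
"4g" size are all assumptions made for illustration.

```python
import subprocess
import time
from typing import Callable, Dict, List, Sequence


def time_passes(
    vms: Sequence[str],
    run_on_vm: Callable[[str], None],
    passes: int = 2,
    clock: Callable[[], float] = time.monotonic,
) -> List[Dict[str, float]]:
    """Run the workload on each VM in series, repeating for `passes` passes.

    Returns one dict per pass, mapping VM name to elapsed seconds.
    """
    results: List[Dict[str, float]] = []
    for _ in range(passes):
        elapsed: Dict[str, float] = {}
        for vm in vms:
            start = clock()
            run_on_vm(vm)  # blocks until the workload on this VM finishes
            elapsed[vm] = clock() - start
        results.append(elapsed)
    return results


def memhog_over_ssh(vm: str, size: str = "4g") -> None:
    # Hypothetical runner: assumes passwordless ssh to the guest and that
    # memhog (from numactl) is installed there. The size is a placeholder.
    subprocess.run(["ssh", vm, "memhog", size], check=True)
```

Calling time_passes(guest_names, memhog_over_ssh) and comparing the second
dict between the two kernels would give the per-VM second-pass times.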

I'm still somewhat new to doing development in the mm area, so most of my
tests have consisted of variants of things from will-it-scale and the like.
I'm just wondering if there is anything you would recommend?