From: Dan Williams
Date: Tue, 18 Dec 2018 11:07:55 -0800
Subject: Re: [PATCH v5 3/5] mm: Shuffle initial free memory to improve memory-side-cache utilization
To: Mike Rapoport
Cc: Andrew Morton, Michal Hocko, Kees Cook, Dave Hansen, Peter Zijlstra, Linux MM, X86 ML, Linux Kernel Mailing List
List-ID: linux-kernel@vger.kernel.org

On Tue, Dec 18, 2018 at 1:11 AM Mike Rapoport wrote:
>
> On Mon, Dec 17, 2018 at 11:56:36AM -0800, Dan Williams wrote:
> > On Sun, Dec 16, 2018 at 4:43 AM Mike Rapoport wrote:
> > >
> > > On Fri, Dec 14, 2018 at 05:48:46PM -0800, Dan Williams wrote:
> > > > Randomization of the page allocator improves the average utilization of
> > > > a direct-mapped memory-side-cache. Memory side caching is a platform
> > > > capability that Linux has been previously exposed to in HPC
> > > > (high-performance computing) environments on specialty platforms. In
> > > > that instance it was a smaller pool of high-bandwidth-memory relative to
> > > > higher-capacity / lower-bandwidth DRAM. Now, this capability is going to
> > > > be found on general purpose server platforms where DRAM is a cache in
> > > > front of higher latency persistent memory [1].
> > [..]
> > > > diff --git a/mm/memblock.c b/mm/memblock.c
> > > > index 185bfd4e87bb..fd617928ccc1 100644
> > > > --- a/mm/memblock.c
> > > > +++ b/mm/memblock.c
> > > > @@ -834,8 +834,16 @@ int __init_memblock memblock_set_sidecache(phys_addr_t base, phys_addr_t size,
> > > >  		return ret;
> > > >
> > > >  	for (i = start_rgn; i < end_rgn; i++) {
> > > > -		type->regions[i].cache_size = cache_size;
> > > > -		type->regions[i].direct_mapped = direct_mapped;
> > > > +		struct memblock_region *r = &type->regions[i];
> > > > +
> > > > +		r->cache_size = cache_size;
> > > > +		r->direct_mapped = direct_mapped;
> > >
> > > I think this change can be merged into the previous patch
> >
> > Ok, will do.
> >
> > > > +		/*
> > > > +		 * Enable randomization for amortizing direct-mapped
> > > > +		 * memory-side-cache conflicts.
> > > > +		 */
> > > > +		if (r->size > r->cache_size && r->direct_mapped)
> > > > +			page_alloc_shuffle_enable();
> > >
> > > It seems that this is the only use for ->direct_mapped in the memblock
> > > code. Wouldn't cache_size != 0 suffice? I.e., in the code that sets the
> > > memblock region attributes, the cache_size can be set to 0 for the non
> > > direct mapped caches, isn't it?
> > >
> >
> > The HMAT specification allows for other cache-topologies, so it's not
> > sufficient to just look for non-zero size when a platform implements a
> > set-associative cache. The expectation is that a set-associative cache
> > would not need the kernel to perform memory randomization to improve
> > the cache utilization.
> >
> > The check for memory size > cache-size is a sanity check for a
> > platform BIOS or system configuration that mis-reports or mis-sizes
> > the cache.
>
> Apparently I didn't explain my point well.
>
> The acpi_numa_memory_affinity_init() already knows whether the cache is
> direct mapped or a set-associative. It can just skip calling
> memblock_set_sidecache() for the set-associative case.
>
> Another thing I've noticed only now, is that memory randomization is
> enabled if there is at least one memory region with a direct mapped side
> cache attached, and once the randomization is on, the cache size and the
> mapping mode do not matter. So, I think it's not necessary to store them in
> the memory region at all.

Fair enough. I was anticipating the case when non-ACPI systems gain
this capability, but you're right, no need to design that now.

The size sanity check has some small value, but given there is an
override and broken platform firmware would need to be fixed, I don't
think we lose much by getting rid of it.

Will re-flow without memblock integration.