Date: Thu, 6 Dec 2018 19:20:45 -0500
From: Jerome Glisse
To: Logan Gunthorpe
Cc: Dave Hansen, linux-mm@kvack.org, Andrew Morton,
	linux-kernel@vger.kernel.org, "Rafael J . Wysocki", Matthew Wilcox,
	Ross Zwisler, Keith Busch, Dan Williams, Haggai Eran, Balbir Singh,
	"Aneesh Kumar K . V", Benjamin Herrenschmidt, Felix Kuehling,
	Philip Yang, Christian König, Paul Blinzer, John Hubbard,
	Ralph Campbell, Michal Hocko, Jonathan Cameron, Mark Hairgrove,
	Vivek Kini, Mel Gorman, Dave Airlie, Ben Skeggs, Andrea Arcangeli,
	Rik van Riel, Ben Woodard, linux-acpi@vger.kernel.org
Subject: Re: [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()
Message-ID: <20181207002044.GI3544@redhat.com>
References: <20181206192050.GC3544@redhat.com>
	<20181206223935.GG3544@redhat.com>
	<935fc14d-91f2-bc2a-f8b5-665e4145e148@deltatee.com>
	<5e6c87d5-e4ef-12e7-32bf-c163f7ff58d7@intel.com>

On Thu, Dec 06, 2018 at 04:48:57PM -0700, Logan Gunthorpe wrote:
>
>
> On 2018-12-06 4:38 p.m., Dave Hansen wrote:
> > On 12/6/18 3:28 PM, Logan Gunthorpe wrote:
> >> I didn't think this was meant to describe actual real world performance
> >> between all of the links. If that's the case all of this seems like a
> >> pipe dream to me.
> >
> > The HMAT discussions (that I was a part of at least) settled on just
> > trying to describe what we called "sticker speed". Nobody had an
> > expectation that you *really* had to measure everything.
> >
> > The best we can do for any of these approaches is approximate things.
>
> Yes, though there's a lot of caveats in this assumption alone.
> Specifically with PCI: the bus may run at however many GB/s but P2P
> through a CPU's root complexes can slow down significantly (like down to
> MB/s).
>
> I've seen similar things across QPI: I can sometimes do P2P from
> PCI->QPI->PCI but the performance doesn't even come close to the sticker
> speed of any of those buses.
>
> I'm not sure how anyone is going to deal with those issues, but it does
> firmly place us in world view #2 instead of #1. But, yes, I agree
> exposing information like in #2 full out to userspace, especially
> through sysfs, seems like a nightmare and I don't see anything in HMS to
> help with that. Providing an API to ask for memory (or another resource)
> that's accessible by a set of initiators and with a set of requirements
> for capabilities seems more manageable.

Note that in #1 you have bridges that fully allow you to express those path
limitations. So what you just described can be fully reported to userspace.

I explained, and gave examples of, how programs adapt their computation to
the system topology. This exists today, and people are even developing new
programming languages with some of those ideas baked in.

So there are people out there who already rely on such information. They
just do not get it from the kernel but from a mix of various device-specific
APIs, and they have to stitch everything together themselves and develop a
database of quirks and gotchas.

My proposal is to provide a coherent kernel API where we can sanitize that
information and report it to userspace in a single, coherent description.

Cheers,
Jérôme