All of lore.kernel.org
 help / color / mirror / Atom feed
From: Logan Gunthorpe <logang@deltatee.com>
To: Jerome Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
	Andi Kleen <ak@linux.intel.com>, Linux MM <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	Haggai Eran <haggaie@mellanox.com>,
	balbirs@au1.ibm.com,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	"Kuehling, Felix" <felix.kuehling@amd.com>,
	Philip.Yang@amd.com, "Koenig,
	Christian" <christian.koenig@amd.com>,
	"Blinzer, Paul" <Paul.Blinzer@amd.com>,
	John Hubbard <jhubbard@nvidia.com>,
	rcampbell@nvidia.com
Subject: Re: [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation
Date: Tue, 4 Dec 2018 18:15:08 -0700	[thread overview]
Message-ID: <b77849e1-e05a-1071-7c48-ac93191e3134@deltatee.com> (raw)
In-Reply-To: <20181204235630.GQ2937@redhat.com>



On 2018-12-04 4:56 p.m., Jerome Glisse wrote:
> One example i have is 4 nodes (CPU socket) each nodes with 8 GPUs and
> two 8 GPUs node connected through each other with fast mesh (ie each
> GPU can peer to peer to each other at the same bandwidth). Then this
> 2 blocks are connected to the other block through a share link.
> 
> So it looks like:
>     SOCKET0----SOCKET1-----SOCKET2----SOCKET3
>     |          |           |          |
>     S0-GPU0====S1-GPU0     S2-GPU0====S1-GPU0
>     ||     \\//            ||     \\//
>     ||     //\\            ||     //\\
>     ...    ====...    -----...    ====...
>     ||     \\//            ||     \\//
>     ||     //\\            ||     //\\
>     S0-GPU7====S1-GPU7     S2-GPU7====S3-GPU7

Well the existing NUMA node stuff tells userspace which GPU belongs to
which socket (every device in sysfs already has a numa_node attribute).
And if that's not good enough we should work to improve how that works
for all devices. This problem isn't specific to GPUS or devices with
memory and seems rather orthogonal to an API to bind to device memory.

> How the above example would looks like ? I fail to see how to do it
> inside current sysfs. Maybe by creating multiple virtual device for
> each of the inter-connect ? So something like
> 
> link0 -> device:00 which itself has S0-GPU0 ... S0-GPU7 has child
> link1 -> device:01 which itself has S1-GPU0 ... S1-GPU7 has child
> link2 -> device:02 which itself has S2-GPU0 ... S2-GPU7 has child
> link3 -> device:03 which itself has S3-GPU0 ... S3-GPU7 has child

I think the "links" between GPUs themselves would be a bus. In the same
way a NUMA node is a bus. Each device in sysfs would then need a
directory or something to describe what "link bus(es)" they are a part
of. Though there are other ways to do this: a GPU driver could simply
create symlinks to other GPUs inside a "neighbours" directory under the
device path or something like that.

The point is that this seems like it is specific to GPUs and could
easily be solved in the GPU community without any new universal concepts
or big APIs.

And for applications that need topology information, a lot of it is
already there, we just need to fill in the gaps with small changes that
would be much less controversial. Then if you want to create a libhms
(or whatever) to help applications parse this information out of
existing sysfs that would make sense.

> My proposal is to do HMS behind staging for a while and also avoid
> any disruption to existing code path. See with people living on the
> bleeding edge if they get interested in that informations. If not then
> i can strip down my thing to the bare minimum which is about device
> memory.

This isn't my area or decision to make, but it seemed to me like this is
not what staging is for. Staging is for introducing *drivers* that
aren't up to the Kernel's quality level and they all reside under the
drivers/staging path. It's not meant to introduce experimental APIs
around the kernel that might be revoked at anytime.

DAX introduced itself by marking the config option as EXPERIMENTAL and
printing warnings to dmesg when someone tries to use it. But, to my
knowledge, DAX also wasn't creating APIs with the intention of changing
or revoking them -- it was introducing features using largely existing
APIs that had many broken corner cases.

Do you know of any precedents where big APIs were introduced and then
later revoked or radically changed like you are proposing to do?

Logan




  reply	other threads:[~2018-12-05  1:15 UTC|newest]

Thread overview: 171+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-12-03 23:34 [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind() jglisse
2018-12-03 23:34 ` jglisse
2018-12-03 23:34 ` [RFC PATCH 01/14] mm/hms: heterogeneous memory system (sysfs infrastructure) jglisse
2018-12-03 23:34 ` [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation jglisse
2018-12-04 17:06   ` Andi Kleen
2018-12-04 17:06     ` Andi Kleen
2018-12-04 18:24     ` Jerome Glisse
2018-12-04 18:24       ` Jerome Glisse
2018-12-04 18:31       ` Dan Williams
2018-12-04 18:57         ` Jerome Glisse
2018-12-04 18:57           ` Jerome Glisse
2018-12-04 19:11           ` Logan Gunthorpe
2018-12-04 19:22             ` Jerome Glisse
2018-12-04 19:22               ` Jerome Glisse
2018-12-04 19:41               ` Logan Gunthorpe
2018-12-04 20:13                 ` Jerome Glisse
2018-12-04 20:13                   ` Jerome Glisse
2018-12-04 20:30                   ` Logan Gunthorpe
2018-12-04 20:59                     ` Jerome Glisse
2018-12-04 20:59                       ` Jerome Glisse
2018-12-04 21:19                       ` Logan Gunthorpe
2018-12-04 21:51                         ` Jerome Glisse
2018-12-04 21:51                           ` Jerome Glisse
2018-12-04 22:16                           ` Logan Gunthorpe
2018-12-04 23:56                             ` Jerome Glisse
2018-12-04 23:56                               ` Jerome Glisse
2018-12-05  1:15                               ` Logan Gunthorpe [this message]
2018-12-05  2:31                                 ` Jerome Glisse
2018-12-05  2:31                                   ` Jerome Glisse
2018-12-05 17:41                                   ` Logan Gunthorpe
2018-12-05 18:07                                     ` Jerome Glisse
2018-12-05 18:07                                       ` Jerome Glisse
2018-12-05 18:20                                       ` Logan Gunthorpe
2018-12-05 18:33                                         ` Jerome Glisse
2018-12-05 18:33                                           ` Jerome Glisse
2018-12-05 18:48                                           ` Logan Gunthorpe
2018-12-05 18:55                                             ` Jerome Glisse
2018-12-05 18:55                                               ` Jerome Glisse
2018-12-05 19:10                                               ` Logan Gunthorpe
2018-12-05 22:58                                                 ` Jerome Glisse
2018-12-05 22:58                                                   ` Jerome Glisse
2018-12-05 23:09                                                   ` Logan Gunthorpe
2018-12-05 23:20                                                     ` Jerome Glisse
2018-12-05 23:20                                                       ` Jerome Glisse
2018-12-05 23:23                                                       ` Logan Gunthorpe
2018-12-05 23:27                                                         ` Jerome Glisse
2018-12-06  0:08                                                           ` Dan Williams
2018-12-05  2:34                                 ` Dan Williams
2018-12-05  2:37                                   ` Jerome Glisse
2018-12-05  2:37                                     ` Jerome Glisse
2018-12-05 17:25                                     ` Logan Gunthorpe
2018-12-05 18:01                                       ` Jerome Glisse
2018-12-05 18:01                                         ` Jerome Glisse
2018-12-04 20:14             ` Andi Kleen
2018-12-04 20:47               ` Logan Gunthorpe
2018-12-04 21:15                 ` Jerome Glisse
2018-12-04 21:15                   ` Jerome Glisse
2018-12-05  0:54             ` Kuehling, Felix
2018-12-04 19:19           ` Dan Williams
2018-12-04 19:32             ` Jerome Glisse
2018-12-04 19:32               ` Jerome Glisse
2018-12-04 20:12       ` Andi Kleen
2018-12-04 20:41         ` Jerome Glisse
2018-12-04 20:41           ` Jerome Glisse
2018-12-05  4:36       ` Aneesh Kumar K.V
2018-12-05  4:41         ` Jerome Glisse
2018-12-05  4:41           ` Jerome Glisse
2018-12-05 10:52   ` Mike Rapoport
2018-12-05 10:52     ` Mike Rapoport
2018-12-03 23:34 ` [RFC PATCH 03/14] mm/hms: add target memory to heterogeneous memory system infrastructure jglisse
2018-12-03 23:34 ` [RFC PATCH 04/14] mm/hms: add initiator " jglisse
2018-12-03 23:35 ` [RFC PATCH 05/14] mm/hms: add link " jglisse
2018-12-03 23:35 ` [RFC PATCH 06/14] mm/hms: add bridge " jglisse
2018-12-03 23:35 ` [RFC PATCH 07/14] mm/hms: register main memory with heterogenenous memory system jglisse
2018-12-03 23:35 ` [RFC PATCH 08/14] mm/hms: register main CPUs " jglisse
2018-12-03 23:35 ` [RFC PATCH 09/14] mm/hms: hbind() for heterogeneous memory system (aka mbind() for HMS) jglisse
2018-12-03 23:35 ` [RFC PATCH 10/14] mm/hbind: add heterogeneous memory policy tracking infrastructure jglisse
2018-12-03 23:35 ` [RFC PATCH 11/14] mm/hbind: add bind command to heterogeneous memory policy jglisse
2018-12-03 23:35 ` [RFC PATCH 12/14] mm/hbind: add migrate command to hbind() ioctl jglisse
2018-12-03 23:35 ` [RFC PATCH 13/14] drm/nouveau: register GPU under heterogeneous memory system jglisse
2018-12-03 23:35 ` [RFC PATCH 14/14] test/hms: tests for " jglisse
2018-12-04  7:44 ` [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind() Aneesh Kumar K.V
2018-12-04  7:44   ` Aneesh Kumar K.V
2018-12-04 14:44   ` Jerome Glisse
2018-12-04 14:44     ` Jerome Glisse
2018-12-04 14:44     ` Jerome Glisse
2018-12-04 18:02 ` Dave Hansen
2018-12-04 18:02   ` Dave Hansen
2018-12-04 18:49   ` Jerome Glisse
2018-12-04 18:49     ` Jerome Glisse
2018-12-04 18:49     ` Jerome Glisse
2018-12-04 18:54     ` Dave Hansen
2018-12-04 18:54       ` Dave Hansen
2018-12-04 19:11       ` Jerome Glisse
2018-12-04 19:11         ` Jerome Glisse
2018-12-04 19:11         ` Jerome Glisse
2018-12-04 21:37     ` Dave Hansen
2018-12-04 21:37       ` Dave Hansen
2018-12-04 21:57       ` Jerome Glisse
2018-12-04 21:57         ` Jerome Glisse
2018-12-04 21:57         ` Jerome Glisse
2018-12-04 23:58         ` Dave Hansen
2018-12-04 23:58           ` Dave Hansen
2018-12-05  0:29           ` Jerome Glisse
2018-12-05  0:29             ` Jerome Glisse
2018-12-05  0:29             ` Jerome Glisse
2018-12-05  1:22         ` Kuehling, Felix
2018-12-05  1:22           ` Kuehling, Felix
2018-12-05  1:22           ` Kuehling, Felix
2018-12-05 11:27     ` Aneesh Kumar K.V
2018-12-05 11:27       ` Aneesh Kumar K.V
2018-12-05 16:09       ` Jerome Glisse
2018-12-05 16:09         ` Jerome Glisse
2018-12-05 16:09         ` Jerome Glisse
2018-12-04 23:54 ` Dave Hansen
2018-12-04 23:54   ` Dave Hansen
2018-12-05  0:15   ` Jerome Glisse
2018-12-05  0:15     ` Jerome Glisse
2018-12-05  0:15     ` Jerome Glisse
2018-12-05  1:06     ` Dave Hansen
2018-12-05  1:06       ` Dave Hansen
2018-12-05  2:13       ` Jerome Glisse
2018-12-05  2:13         ` Jerome Glisse
2018-12-05  2:13         ` Jerome Glisse
2018-12-05 17:27         ` Dave Hansen
2018-12-05 17:27           ` Dave Hansen
2018-12-05 17:53           ` Jerome Glisse
2018-12-05 17:53             ` Jerome Glisse
2018-12-05 17:53             ` Jerome Glisse
2018-12-06 18:25             ` Dave Hansen
2018-12-06 18:25               ` Dave Hansen
2018-12-06 19:20               ` Jerome Glisse
2018-12-06 19:20                 ` Jerome Glisse
2018-12-06 19:20                 ` Jerome Glisse
2018-12-06 19:31                 ` Dave Hansen
2018-12-06 19:31                   ` Dave Hansen
2018-12-06 20:11                   ` Logan Gunthorpe
2018-12-06 20:11                     ` Logan Gunthorpe
2018-12-06 22:04                     ` Dave Hansen
2018-12-06 22:04                       ` Dave Hansen
2018-12-06 22:39                       ` Jerome Glisse
2018-12-06 22:39                         ` Jerome Glisse
2018-12-06 22:39                         ` Jerome Glisse
2018-12-06 23:09                         ` Dave Hansen
2018-12-06 23:09                           ` Dave Hansen
2018-12-06 23:28                           ` Logan Gunthorpe
2018-12-06 23:28                             ` Logan Gunthorpe
2018-12-06 23:34                             ` Dave Hansen
2018-12-06 23:34                               ` Dave Hansen
2018-12-06 23:38                             ` Dave Hansen
2018-12-06 23:38                               ` Dave Hansen
2018-12-06 23:48                               ` Logan Gunthorpe
2018-12-06 23:48                                 ` Logan Gunthorpe
2018-12-07  0:20                                 ` Jerome Glisse
2018-12-07  0:20                                   ` Jerome Glisse
2018-12-07  0:20                                   ` Jerome Glisse
2018-12-07 15:06                                   ` Jonathan Cameron
2018-12-07 15:06                                     ` Jonathan Cameron
2018-12-07 15:06                                     ` Jonathan Cameron
2018-12-07 19:37                                     ` Jerome Glisse
2018-12-07 19:37                                       ` Jerome Glisse
2018-12-07 19:37                                       ` Jerome Glisse
2018-12-07  0:15                           ` Jerome Glisse
2018-12-07  0:15                             ` Jerome Glisse
2018-12-07  0:15                             ` Jerome Glisse
2018-12-06 20:27                   ` Jerome Glisse
2018-12-06 20:27                     ` Jerome Glisse
2018-12-06 20:27                     ` Jerome Glisse
2018-12-06 21:46                     ` Jerome Glisse
2018-12-06 21:46                       ` Jerome Glisse
2018-12-06 21:46                       ` Jerome Glisse

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b77849e1-e05a-1071-7c48-ac93191e3134@deltatee.com \
    --to=logang@deltatee.com \
    --cc=Paul.Blinzer@amd.com \
    --cc=Philip.Yang@amd.com \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=balbirs@au1.ibm.com \
    --cc=benh@kernel.crashing.org \
    --cc=christian.koenig@amd.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=felix.kuehling@amd.com \
    --cc=haggaie@mellanox.com \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=rafael@kernel.org \
    --cc=rcampbell@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.