linux-acpi.vger.kernel.org archive mirror
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Tao Xu <tao3.xu@intel.com>
Cc: "Dan Williams" <dan.j.williams@intel.com>,
	"Linux MM" <linux-mm@kvack.org>,
	"Linux ACPI" <linux-acpi@vger.kernel.org>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Linux ARM" <linux-arm-kernel@lists.infradead.org>,
	"X86 ML" <x86@kernel.org>, "Keith Busch" <keith.busch@intel.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Rafael J . Wysocki" <rjw@rjwysocki.net>,
	Linuxarm <linuxarm@huawei.com>,
	"Andrew Morton" <akpm@linux-foundation.org>
Subject: Re: [PATCH V5 1/4] ACPI: Support Generic Initiator only domains
Date: Wed, 13 Nov 2019 17:48:45 +0000
Message-ID: <20191113174845.000009d3@huawei.com>
In-Reply-To: <77b6a6e8-9d44-1e1c-3bf0-a8d04833598d@intel.com>

On Wed, 13 Nov 2019 21:57:24 +0800
Tao Xu <tao3.xu@intel.com> wrote:

> On 11/13/2019 5:47 PM, Jonathan Cameron wrote:
> > On Tue, 12 Nov 2019 09:55:17 -0800
> > Dan Williams <dan.j.williams@intel.com> wrote:
> >   
> >> [ add Tao Xu ]
> >>
> >> On Fri, Oct 4, 2019 at 4:45 AM Jonathan Cameron
> >> <Jonathan.Cameron@huawei.com> wrote:  
> >>>
> >>> Generic Initiators are a new ACPI concept that allows for the
> >>> description of proximity domains that contain a device which
> >>> performs memory access (such as a network card) but neither
> >>> host CPU nor Memory.
> >>>
> >>> This patch has the parsing code and provides the infrastructure
> >>> for an architecture to associate these new domains with their
> >>> nearest memory processing node.  
> >>
> >> Thanks for this Jonathan. May I ask how this was tested? Tao has been
> >> working on qemu support for HMAT [1]. I have not checked if it already
> >> supports generic initiator entries, but it would be helpful to include
> >> an example of how the kernel sees these configurations in practice.
> >>
> >> [1]: http://patchwork.ozlabs.org/cover/1096737/  
> > 
> > Tested against qemu with SRAT and SLIT table overrides from an
> > initrd to actually create the node and give it distances
> > (those all turn up correctly in the normal places).  DSDT override
> > used to move an emulated network card into the GI numa node.  That
> > currently requires the PCI patch referred to in the cover letter.
> > On arm64 tested both on qemu and real hardware (overrides on tables
> > even for real hardware as I can't persuade our BIOS team to implement
> > Generic Initiators until an OS is actually using them.)
> > 
> > Main real requirement is that memory allocations then occur from one of
> > the nodes at the minimal distance when you do a devm_ allocation
> > from a device assigned. Also need to be able to query the distances
> > to allow load balancing etc.  All that works as expected.
> > 
> > It only has a fairly tangential connection to HMAT in that HMAT
> > can provide information on GI nodes.  Given HMAT code is quite happy
> > with memoryless nodes anyway it should work.  QEMU doesn't currently
> > have support to create GI SRAT entries let alone HMAT using them.
> > 
> > Whilst I could look at adding such support to QEMU, it's not
> > exactly high priority to emulate something we can test easily
> > by overriding the tables before the kernel reads them.
> > 
> > I'll look at how hard it is to build an HMAT table for my test
> > configs based on the ones I used to test your HMAT patches a while
> > back.  Should be easy if tedious.
> > 
> > Jonathan
> >   
> Indeed, HMAT can support Generic Initiators, but as far as I know, QEMU
> can only emulate a node with CPU and memory, or memory only. Even if we
> assign a node with CPU only, QEMU will raise an error. Considering
> compatibility, there is a lot of work to do in QEMU if we change the NUMA
> or SRAT table.
> 

I faked up a quick HMAT table.

Used a configuration with 3x CPU and memory nodes, 1x memory only node
and 1x GI node.  Two test cases.  In the first, the GI node is further than
the CPU containing nodes from the memory only node (the realistic case for
existing hardware).  That behaves as expected: there are no
/sys/bus/node/devices/nodeX/access0 entries for the GI node, and the
appropriate ones appear for the memory only node as normal.

The other case is more interesting: we have the memory only node nearer
to the GI node than to any of the CPUs.  In that case, for x86 at least,
the HMAT code is happy to put an access0 directory in the GI node,
with an empty access0/initiators and the memory node under access0/targets.

The memory only node is node4 and the GI node node3.

So the relevant dirs under /sys/bus/node/devices are:

node3/access0/initiators/ Empty
node3/access0/targets/node4

node4/access0/initiators/[node3 read_bandwidth write_bandwidth etc]
node4/access0/targets/ Empty
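
For anyone wanting to dump the same view on their own setup, here is a
rough user-space sketch (not part of the patch set) that walks the node
directories and lists which nodeN links sit under access0/initiators and
access0/targets.  The function name read_access_links and the default
root path are my own assumptions; the root is a parameter so it can be
pointed at a copy of the tree.

```python
import os


def read_access_links(root, access="access0"):
    """Return {node_name: {"initiators": [...], "targets": [...]}}.

    Only entries named nodeN are reported; attribute files such as
    read_bandwidth are skipped.  Nodes without an access directory
    are left out of the result.
    """
    def is_node(name):
        return name.startswith("node") and name[4:].isdigit()

    result = {}
    for entry in sorted(os.listdir(root)):
        if not is_node(entry):
            continue
        access_dir = os.path.join(root, entry, access)
        if not os.path.isdir(access_dir):
            continue
        links = {}
        for kind in ("initiators", "targets"):
            kind_dir = os.path.join(access_dir, kind)
            names = os.listdir(kind_dir) if os.path.isdir(kind_dir) else []
            links[kind] = sorted(n for n in names if is_node(n))
        result[entry] = links
    return result


if __name__ == "__main__":
    # Assumed location of the node devices; adjust if your tree differs.
    root = "/sys/bus/node/devices"
    if os.path.isdir(root):
        for node, links in read_access_links(root).items():
            print(node, links)
```

On the second test case this should report node4 as the only target of
node3, and node3 as the only initiator of node4.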

So the current result (I think - the HMAT interface still confuses
me :) is that a GI node is treated like a CPU node.  This might mean
there is no useful information available if you want to figure out
which CPU containing node is nearest to the memory when the GI node is
nearer still.

Is this a problem?  I'm not sure...  

If we don't want to include GI nodes then we can possibly
use a node_state(x, N_CPU) check before considering
them, or I guess parse SRAT to extract that info directly.

I tried this and it seems to work, so I can add a patch doing this
to the next version if we think this is the 'right' thing to do.
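
To see from user space which nodes such a check would keep, something
like the sketch below might do.  It is only an illustration, not the
kernel change itself: it assumes the has_cpu node list exposed under
/sys/devices/system/node, and the helper names parse_node_list and
drop_cpuless_initiators are made up for this example.

```python
import os


def parse_node_list(text):
    """Parse a sysfs list string such as "0-2,4" into a set of ints."""
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or stray comma
        if "-" in part:
            lo, hi = part.split("-")
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return nodes


def drop_cpuless_initiators(initiators, cpu_nodes):
    """Keep only initiator node ids that contain at least one CPU."""
    return [n for n in initiators if n in cpu_nodes]


if __name__ == "__main__":
    # Assumed path; it lists the nodes that have one or more CPUs.
    path = "/sys/devices/system/node/has_cpu"
    if os.path.exists(path):
        with open(path) as f:
            cpu_nodes = parse_node_list(f.read())
        # Node 3 is the GI node in the test configuration above.
        print(drop_cpuless_initiators([0, 1, 2, 3], cpu_nodes))
```

With the test configuration described earlier, the GI node (node3) would
not appear in has_cpu and so would be filtered out of the candidates.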

So what do you think 'should' happen? 

Jonathan


Thread overview (21+ messages):
2019-10-04 11:43 [PATCH V5 0/4] ACPI: Support Generic Initiator proximity domains Jonathan Cameron
2019-10-04 11:43 ` [PATCH V5 1/4] ACPI: Support Generic Initiator only domains Jonathan Cameron
2019-10-18 10:18   ` Rafael J. Wysocki
2019-10-18 12:46     ` Jonathan Cameron
2019-11-07 14:54       ` Rafael J. Wysocki
2019-11-12 17:07         ` Jonathan Cameron
2019-11-12 17:55   ` Dan Williams
2019-11-13  9:47     ` Jonathan Cameron
2019-11-13 13:57       ` Tao Xu
2019-11-13 16:52         ` Dan Williams
2019-11-13 17:56           ` Jonathan Cameron
2019-11-13 17:48         ` Jonathan Cameron [this message]
2019-11-13 23:20           ` Dan Williams
2019-11-14 11:26             ` Jonathan Cameron
2019-11-16 20:45               ` Dan Williams
2019-11-18 17:18                 ` Brice Goglin
2019-10-04 11:43 ` [PATCH V5 2/4] arm64: " Jonathan Cameron
2019-10-04 11:43 ` [PATCH V5 3/4] x86: Support Generic Initiator only proximity domains Jonathan Cameron
2019-10-07 14:55   ` Ingo Molnar
2019-10-08 11:17     ` Jonathan Cameron
2019-10-04 11:43 ` [PATCH V5 4/4] ACPI: Let ACPI know we support Generic Initiator Affinity Structures Jonathan Cameron
