linux-kernel.vger.kernel.org archive mirror
* Exporting physical topology information
@ 2004-03-17 21:37 Martin Hicks
  2004-03-18 17:44 ` Jesse Barnes
  2004-03-18 23:21 ` Greg KH
  0 siblings, 2 replies; 10+ messages in thread
From: Martin Hicks @ 2004-03-17 21:37 UTC
  To: linux-kernel; +Cc: greg

[-- Attachment #1: Type: text/plain, Size: 1044 bytes --]


Hi,

I'm trying to figure out what the best way is to export a minimal amount
of physical topology information to userland.  Would it be acceptable to
export this kind of information with sysfs?

I'm not proposing that we build an entire physical topology tree in
sysfs, just that we provide an attribute file.  The two most obvious
examples of where this would be useful are nodes and pci busses.  The
Altix platform is a modular system with CPU bricks and IO bricks.  We
currently have no method for locating where "node0" is, nor do we have a
method for locating pci bus 0000:20, for example.

If we could physically locate a PCI bus, then it would be much easier
to (for example) locate our defective SCSI disk that is target4 on the
SCSI controller that is on pci bus 0000:20.

The attached patch, courtesy of Jesse Barnes, exports a physid attribute
for each node, which indicates the physical location of the node.  It is
Altix specific.
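
For illustration only, here is a minimal userspace sketch of how a tool
might consume such an attribute.  The helper name read_attr() is
hypothetical, and the path in the usage note assumes the node sysdevs
appear under /sys/devices/system/node/ with the proposed physid file
present; neither is confirmed by the patch itself.

```c
#include <stdio.h>
#include <string.h>

/*
 * Read a single-value, sysfs-style attribute file into buf and strip
 * the trailing newline.  Returns 0 on success, -1 on failure.
 * (Hypothetical helper, not part of the attached patch.)
 */
int read_attr(const char *path, char *buf, size_t size)
{
	FILE *f;

	if (size == 0)
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)size, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';	/* drop the newline, if any */
	return 0;
}
```

A caller on a patched Altix kernel might try something like
read_attr("/sys/devices/system/node/node0/physid", buf, sizeof(buf)),
with buf at least NODE_MAX_PHYSID + 1 bytes long.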

thanks
mh

-- 
Martin Hicks                Wild Open Source Inc.
mort@wildopensource.com     613-266-2296

[-- Attachment #2: physid.patch --]
[-- Type: text/plain, Size: 2541 bytes --]

===== arch/ia64/mm/numa.c 1.7 vs edited =====
--- 1.7/arch/ia64/mm/numa.c	Tue Feb  3 21:35:17 2004
+++ edited/arch/ia64/mm/numa.c	Mon Mar 15 11:14:51 2004
@@ -19,6 +19,8 @@
 #include <linux/bootmem.h>
 #include <asm/mmzone.h>
 #include <asm/numa.h>
+#include <asm/sn/nodepda.h>
+#include <asm/sn/module.h>
 
 static struct node *sysfs_nodes;
 static struct cpu *sysfs_cpus;
@@ -48,6 +50,13 @@
 			break;
 
 	return (i < num_node_memblks) ? node_memblk[i].nid : (num_node_memblks ? -1 : 0);
+}
+
+void node_to_physid(int node, char *buf)
+{
+	struct nodepda_s *nodeinfo = NODEPDA(node);
+
+	format_module_id(buf, nodeinfo->module->id, MODULE_FORMAT_BRIEF);
 }
 
 static int __init topology_init(void)
===== drivers/base/node.c 1.18 vs edited =====
--- 1.18/drivers/base/node.c	Thu Feb 12 22:35:40 2004
+++ edited/drivers/base/node.c	Mon Mar 15 11:12:14 2004
@@ -56,6 +56,17 @@
 static SYSDEV_ATTR(meminfo,S_IRUGO,node_read_meminfo,NULL);
 
 
+static ssize_t node_read_physid(struct sys_device * dev, char * buf)
+{
+	struct node *node_dev = to_node(dev);
+	int len;
+
+	len = snprintf(buf, NODE_MAX_PHYSID + 1, "%s\n", node_dev->physid);
+	return len;
+}
+
+static SYSDEV_ATTR(physid,S_IRUGO,node_read_physid,NULL);
+
 /*
  * register_node - Setup a driverfs device for a node.
  * @num - Node number to use when creating the device.
@@ -67,6 +78,7 @@
 	int error;
 
 	node->cpumap = node_to_cpumask(num);
+	node_to_physid(num, node->physid);
 	node->sysdev.id = num;
 	node->sysdev.cls = &node_class;
 	error = sysdev_register(&node->sysdev);
@@ -74,6 +86,7 @@
 	if (!error){
 		sysdev_create_file(&node->sysdev, &attr_cpumap);
 		sysdev_create_file(&node->sysdev, &attr_meminfo);
+		sysdev_create_file(&node->sysdev, &attr_physid);
 	}
 	return error;
 }
===== include/asm-ia64/topology.h 1.10 vs edited =====
--- 1.10/include/asm-ia64/topology.h	Tue Feb  3 21:35:17 2004
+++ edited/include/asm-ia64/topology.h	Mon Mar 15 11:12:15 2004
@@ -45,6 +45,8 @@
 
 void build_cpu_to_node_map(void);
 
+extern void node_to_physid(int node, char *buf);
+
 #endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
===== include/linux/node.h 1.5 vs edited =====
--- 1.5/include/linux/node.h	Mon Aug 18 19:46:23 2003
+++ edited/include/linux/node.h	Mon Mar 15 11:12:17 2004
@@ -22,8 +22,11 @@
 #include <linux/sysdev.h>
 #include <linux/cpumask.h>
 
+#define NODE_MAX_PHYSID 80
+
 struct node {
 	cpumask_t cpumap;	/* Bitmap of CPUs on the Node */
+	char physid[NODE_MAX_PHYSID]; /* Physical ID of node */
 	struct sys_device	sysdev;
 };
 


* Re: Exporting physical topology information
From: Jesse Barnes @ 2004-03-18 17:44 UTC
  To: linux-kernel; +Cc: Martin Hicks, greg

On Wednesday 17 March 2004 1:37 pm, Martin Hicks wrote:
> I'm not proposing that we build an entire physical topology tree in
> sysfs, just that we provide an attribute file.  The two most obvious
> examples of where this would be useful are nodes and pci busses.  The
> Altix platform is a modular system with CPU bricks and IO bricks.  We
> currently have no method for locating where "node0" is, nor do we have a
> method for locating pci bus 0000:20, for example.

I'm curious how other arches deal with this too.  Like on ppc64 when
you want to remove a CPU or set of CPUs, you have to bring it (or all
of the cores on a given module) down via software, then go into the
lab and find the module to pull it out.  Is there a mapping somewhere
that the user is expected to use?  A hypervisor call of some sort to
make some lights blink?

> If we could physically locate a PCI bus, then it would be much easier
> to (for example) locate our defective SCSI disk that is target4 on the
> SCSI controller that is on pci bus 0000:20.

This seems like one of the main uses--find components that went bad.
Physically locating a CPU, DIMM, PCI board, or disk would all be
easier if we provided some sort of physical identifier and
logical->physical mapping information.  On IRIX, we actually expose
the whole physical hierarchy of the system in /hw.  One of the
problems with that approach is that every time a new system
configuration is released the kernel has to be updated to know about
it, resulting in /hw paths that change over time, and from system to
system...

Jesse



* Re: Exporting physical topology information
From: Greg KH @ 2004-03-18 23:21 UTC
  To: Martin Hicks; +Cc: linux-kernel

On Wed, Mar 17, 2004 at 04:37:14PM -0500, Martin Hicks wrote:
> 
> Hi,
> 
> I'm trying to figure out what the best way is to export a minimal amount
> of physical topology information to userland.  Would it be acceptable to
> export this kind of information with sysfs?
> 
> I'm not proposing that we build an entire physical topology tree in
> sysfs, just that we provide an attribute file.  The two most obvious
> examples of where this would be useful are nodes and pci busses.  The
> Altix platform is a modular system with CPU bricks and IO bricks.  We
> currently have no method for locating where "node0" is, nor do we have a
> method for locating pci bus 0000:20, for example.
> 
> If we could physically locate a PCI bus, then it would be much easier
> to (for example) locate our defective SCSI disk that is target4 on the
> SCSI controller that is on pci bus 0000:20.

Um, what's wrong with the current /sys/class/pci_bus/*/cpuaffinity files
for determining this topology information?  That is why it was added.

thanks,

greg k-h


* Re: Exporting physical topology information
From: Martin Hicks @ 2004-03-19 17:48 UTC
  To: Greg KH; +Cc: linux-kernel


On Thu, Mar 18, 2004 at 03:21:39PM -0800, Greg KH wrote:
> On Wed, Mar 17, 2004 at 04:37:14PM -0500, Martin Hicks wrote:
> > 
> > Hi,
> > 
> > If we could physically locate a PCI bus, then it would be much easier
> > to (for example) locate our defective SCSI disk that is target4 on the
> > SCSI controller that is on pci bus 0000:20.
> 
> Um, what's wrong with the current /sys/class/pci_bus/*/cpuaffinity files
> for determining this topology information?  That is why it was added.

This gives us more logical topology information.  It still doesn't tell
us where in the room the specific piece of equipment is.

mh

-- 
Martin Hicks                Wild Open Source Inc.
mort@wildopensource.com     613-266-2296


* Re: Exporting physical topology information
From: Jesse Barnes @ 2004-03-19 17:51 UTC
  To: linux-kernel, greg

On Thursday 18 March 2004 3:21 pm, Greg KH wrote:
> > If we could physically locate a PCI bus, then it would be much easier
> > to (for example) locate our defective SCSI disk that is target4 on the
> > SCSI controller that is on pci bus 0000:20.
> 
> Um, what's wrong with the current /sys/class/pci_bus/*/cpuaffinity files
> for determining this topology information?  That is why it was added.

Nothing, except that it only provides logical information.  In a large
system, it's really useful to be able to physically locate a component
somehow.  That was the idea behind adding 'physid'.  For example:

[jbarnes@spamtin pci0000:02]$ pwd
/sys/devices/pci0000:02
[jbarnes@spamtin pci0000:02]$ cat physid
rack: 5
module: 12
slot: 3

or for nodes:

[jbarnes@spamtin node2]$ cat physid
rack: 1
module: 3
slot: 1

Then you could walk into the lab and know exactly which device to
kick.  Obviously, these values would be platform specific, though on
ia64 and some x86 platforms, we could probably use the ACPI namespace
to access some of the info, and on ppc the OF namespace might have it.
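
As a rough sketch of what a userspace consumer of the brainstormed
multi-line format above would have to do, the following parses the
"rack:", "module:" and "slot:" lines into a struct.  The struct and
function names are hypothetical, and the format itself is only the
example from this mail, not an existing kernel interface.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the three example fields. */
struct phys_loc {
	int rack;
	int module;
	int slot;
};

/*
 * Parse text of the form "rack: 5\nmodule: 12\nslot: 3\n".
 * Returns 0 if all three fields were found, -1 otherwise.
 */
int parse_phys_loc(const char *text, struct phys_loc *loc)
{
	int have_rack = 0, have_module = 0, have_slot = 0;
	const char *line = text;

	while (line && *line) {
		int val;

		if (sscanf(line, "rack: %d", &val) == 1) {
			loc->rack = val;
			have_rack = 1;
		} else if (sscanf(line, "module: %d", &val) == 1) {
			loc->module = val;
			have_module = 1;
		} else if (sscanf(line, "slot: %d", &val) == 1) {
			loc->slot = val;
			have_slot = 1;
		}
		/* advance to the character after the next newline */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return (have_rack && have_module && have_slot) ? 0 : -1;
}
```

Every consumer has to carry a parser like this for a multi-field file,
and the parser has to change whenever a field is added.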

Thanks,
Jesse



* Re: Exporting physical topology information
From: Greg KH @ 2004-03-19 17:57 UTC
  To: Martin Hicks; +Cc: linux-kernel

On Fri, Mar 19, 2004 at 12:48:26PM -0500, Martin Hicks wrote:
> 
> On Thu, Mar 18, 2004 at 03:21:39PM -0800, Greg KH wrote:
> > On Wed, Mar 17, 2004 at 04:37:14PM -0500, Martin Hicks wrote:
> > > 
> > > Hi,
> > > 
> > > If we could physically locate a PCI bus, then it would be much easier
> > > to (for example) locate our defective SCSI disk that is target4 on the
> > > SCSI controller that is on pci bus 0000:20.
> > 
> > Um, what's wrong with the current /sys/class/pci_bus/*/cpuaffinity files
> > for determining this topology information?  That is why it was added.
> 
> This gives us more logical topology information.  It still doesn't tell
> us where in the room the specific piece of equipment is.

True, but isn't that what labels on your CPU nodes are for?

:)

greg k-h


* Re: Exporting physical topology information
From: Greg KH @ 2004-03-19 17:59 UTC
  To: Jesse Barnes; +Cc: linux-kernel

On Fri, Mar 19, 2004 at 09:51:52AM -0800, Jesse Barnes wrote:
> On Thursday 18 March 2004 3:21 pm, Greg KH wrote:
> > > If we could physically locate a PCI bus, then it would be much easier
> > > to (for example) locate our defective SCSI disk that is target4 on the
> > > SCSI controller that is on pci bus 0000:20.
> > 
> > Um, what's wrong with the current /sys/class/pci_bus/*/cpuaffinity files
> > for determining this topology information?  That is why it was added.
> 
> Nothing, except that it only provides logical information.  In a large
> system, it's really useful to be able to physically locate a component
> somehow.  That was the idea behind adding 'physid'.  For example:
> 
> [jbarnes@spamtin pci0000:02]$ pwd
> /sys/devices/pci0000:02
> [jbarnes@spamtin pci0000:02]$ cat physid
> rack: 5
> module: 12
> slot: 3

Hm, that looks to violate the "one value per file" mandate of sysfs,
right?  Right now PCI Hotplug slots have a LED on them that you can
flash from userspace to help locate the physical slot that you want to
change.  I also know of large PCI drawers that have LEDs that flash to
locate them.

Also, this is _very_ hardware/platform specific.  If you want to try to
implement this, I'd be interested in what the patch would look like.

thanks,

greg k-h


* Re: Exporting physical topology information
From: Jesse Barnes @ 2004-03-19 18:53 UTC
  To: linux-kernel, greg

On Friday 19 March 2004 9:59 am, Greg KH wrote:
> Hm, that looks to violate the "one value per file" mandate of sysfs,
> right?  Right now PCI Hotplug slots have a LED on them that you can

Yeah... my original patch to implement node physids used the Altix
module id, which looks like rrrtss, where rrr is a rack id, t is a
brick type, and ss is the rack slot, e.g. 001c12, so it was one value.
The example above was just brainstorming, I'm sure there are better
ways to do it.
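
To make the rrrtss layout concrete, here is a small userspace sketch
that splits such a string into its parts.  It assumes exactly three
decimal rack digits, one brick-type letter and two decimal slot digits,
as in the 001c12 example; the real format_module_id() output may differ,
and the struct and function names are hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical decoded form of an "rrrtss" module id. */
struct module_loc {
	int rack;	/* rrr: rack id */
	char type;	/* t: brick type letter */
	int slot;	/* ss: slot within the rack */
};

/* Parse "rrrtss" (e.g. "001c12").  Returns 0 on success, -1 on bad input. */
int parse_module_id(const char *id, struct module_loc *loc)
{
	int i;

	if (!id || !loc || strlen(id) != 6)
		return -1;
	for (i = 0; i < 6; i++) {
		if (i == 3)
			continue;	/* brick type, checked below */
		if (!isdigit((unsigned char)id[i]))
			return -1;
	}
	if (!isalpha((unsigned char)id[3]))
		return -1;

	loc->rack = (id[0] - '0') * 100 + (id[1] - '0') * 10 + (id[2] - '0');
	loc->type = id[3];
	loc->slot = (id[4] - '0') * 10 + (id[5] - '0');
	return 0;
}
```

With these assumptions, "001c12" decodes to rack 1, brick type 'c',
slot 12.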

> flash from userspace to help locate the physical slot that you want to
> change.  I also know of large PCI drawers that have LEDs that flash to
> locate them.

Yeah, that makes things easy, but it would be nice to cover CPUs and
memory banks too, so you can go remove the DIMM with a persistent
single or double bit error, or a CPU with a bad cache or whatever.  I
imagine some hardware has blinking lights for that too.

> Also, this is _very_ hardware/platform specific.  If you want to try to
> implement this, I'd be interested in what the patch would look like.

Here's the (very platform specific) patch I did for Altix, just to see
what it would look like, and to solicit comments.  There's some other
per-node stuff that would be nice to have available to userspace too,
mostly for administrative purposes, like the chipset revision and
type, firmware revision, and other hardware specific details.  One way
to export that sort of thing is with some sort of arbitrary data blob,
but like you said, that violates the sysfs one file, one value
principle.

Thanks,
Jesse


===== arch/ia64/mm/numa.c 1.6 vs edited =====
--- 1.6/arch/ia64/mm/numa.c	Sun Jan 11 22:54:38 2004
+++ edited/arch/ia64/mm/numa.c	Fri Jan 23 11:56:48 2004
@@ -20,6 +20,8 @@
 #include <linux/bootmem.h>
 #include <asm/mmzone.h>
 #include <asm/numa.h>
+#include <asm/sn/nodepda.h>
+#include <asm/sn/module.h>
 
 static struct memblk *sysfs_memblks;
 static struct node *sysfs_nodes;
@@ -50,6 +52,13 @@
 			break;
 
 	return (i < num_memblks) ? node_memblk[i].nid : (num_memblks ? -1 : 0);
+}
+
+void node_to_physid(int node, char *buf)
+{
+	struct nodepda_s *nodeinfo = NODEPDA(node);
+
+	format_module_id(buf, nodeinfo->module->id, MODULE_FORMAT_BRIEF);
 }
 
 static int __init topology_init(void)
===== drivers/base/node.c 1.16 vs edited =====
--- 1.16/drivers/base/node.c	Mon Dec 29 13:37:47 2003
+++ edited/drivers/base/node.c	Fri Jan 23 12:25:44 2004
@@ -56,6 +56,17 @@
 static SYSDEV_ATTR(meminfo,S_IRUGO,node_read_meminfo,NULL);
 
 
+static ssize_t node_read_physid(struct sys_device * dev, char * buf)
+{
+	struct node *node_dev = to_node(dev);
+	int len;
+
+	len = snprintf(buf, NODE_MAX_PHYSID + 1, "%s\n", node_dev->physid);
+	return len;
+}
+
+static SYSDEV_ATTR(physid,S_IRUGO,node_read_physid,NULL);
+
 /*
  * register_node - Setup a driverfs device for a node.
  * @num - Node number to use when creating the device.
@@ -67,6 +78,7 @@
 	int error;
 
 	node->cpumap = node_to_cpumask(num);
+	node_to_physid(num, node->physid);
 	node->sysdev.id = num;
 	node->sysdev.cls = &node_class;
 	error = sys_device_register(&node->sysdev);
@@ -74,6 +86,7 @@
 	if (!error){
 		sysdev_create_file(&node->sysdev, &attr_cpumap);
 		sysdev_create_file(&node->sysdev, &attr_meminfo);
+		sysdev_create_file(&node->sysdev, &attr_physid);
 	}
 	return error;
 }
===== include/asm-ia64/topology.h 1.9 vs edited =====
--- 1.9/include/asm-ia64/topology.h	Wed Jun 18 18:38:50 2003
+++ edited/include/asm-ia64/topology.h	Fri Jan 23 11:40:51 2004
@@ -60,6 +60,8 @@
 
 void build_cpu_to_node_map(void);
 
+extern void node_to_physid(int node, char *buf);
+
 #endif /* CONFIG_NUMA */
 
 #include <asm-generic/topology.h>
===== include/asm-ia64/sn/module.h 1.10 vs edited =====
--- 1.10/include/asm-ia64/sn/module.h	Sun Jan 18 22:36:15 2004
+++ edited/include/asm-ia64/sn/module.h	Fri Jan 23 11:38:34 2004
@@ -14,6 +14,7 @@
 
 
 #include <linux/config.h>
+#include <asm/sn/sgi.h>
 #include <asm/sn/klconfig.h>
 #include <asm/sn/ksys/elsc.h>
 
===== include/linux/node.h 1.5 vs edited =====
--- 1.5/include/linux/node.h	Mon Aug 18 19:46:23 2003
+++ edited/include/linux/node.h	Fri Jan 23 11:32:44 2004
@@ -22,8 +22,11 @@
 #include <linux/sysdev.h>
 #include <linux/cpumask.h>
 
+#define NODE_MAX_PHYSID 80
+
 struct node {
 	cpumask_t cpumap;	/* Bitmap of CPUs on the Node */
+	char physid[NODE_MAX_PHYSID]; /* Physical ID of node */
 	struct sys_device	sysdev;
 };
 



* Re: Exporting physical topology information
From: Jesse Barnes @ 2004-05-07 20:13 UTC
  To: linux-kernel; +Cc: greg

On Friday, March 19, 2004 10:53 am, Jesse Barnes wrote:
> Here's the (very platform specific) patch I did for Altix, just to see
> what it would look like, and to solicit comments.  There's some other
> per-node stuff that would be nice to have available to userspace too,
> mostly for administrative purposes, like the chipset revision and
> type, firmware revision, and other hardware specific details.  One way
> to export that sort of thing is with some sort of arbitrary data blob,
> but like you said, that violates the sysfs one file, one value
> principle.

Greg, any comments on this?  (Sorry I was gone for much of last month and lost 
track of this thread.)  Maybe a whole subdirectory containing physical tag, 
version files and such would make more sense?
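
One way to picture the subdirectory idea from the userspace side is a
directory holding one small file per value.  The sketch below reads a
single integer from such a file.  The file names in the usage note
(rack, module, slot) and the helper name are purely hypothetical;
nothing like this exists in the posted patch.

```c
#include <stdio.h>

/*
 * Read one decimal integer from <dir>/<name>.
 * Returns 0 on success, -1 on failure.
 */
int read_int_attr(const char *dir, const char *name, int *out)
{
	char path[256];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "%s/%s", dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;	/* path did not fit */
	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}
```

A tool could then call read_int_attr(dir, "rack", &rack) and likewise
for "module" and "slot", with no per-file parsing beyond a single
number, which keeps each file at one value.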

Jesse


* Re: Exporting physical topology information
From: Greg KH @ 2004-05-07 22:38 UTC
  To: Jesse Barnes; +Cc: linux-kernel

On Fri, May 07, 2004 at 01:13:43PM -0700, Jesse Barnes wrote:
> On Friday, March 19, 2004 10:53 am, Jesse Barnes wrote:
> > Here's the (very platform specific) patch I did for Altix, just to see
> > what it would look like, and to solicit comments.  There's some other
> > per-node stuff that would be nice to have available to userspace too,
> > mostly for administrative purposes, like the chipset revision and
> > type, firmware revision, and other hardware specific details.  One way
> > to export that sort of thing is with some sort of arbitrary data blob,
> > but like you said, that violates the sysfs one file, one value
> > principle.
> 
> Greg, any comments on this?  (Sorry I was gone for much of last month and lost 
> track of this thread.)  Maybe a whole subdirectory containing physical tag, 
> version files and such would make more sense?

I think my main comment was that you should work with the other numa people
so that you all agree on one common format to export this data in sysfs.
If you all agree that this is the way to do it, I'll accept the patch :)

thanks,

greg k-h
