From: Ingo Molnar <mingo@elte.hu>
To: Peter Zijlstra <peterz@infradead.org>, Greg KH <greg@kroah.com>
Cc: Lin Ming <ming.m.lin@intel.com>,
Corey Ashford <cjashfor@linux.vnet.ibm.com>,
Frederic Weisbecker <fweisbec@gmail.com>,
Paul Mundt <lethal@linux-sh.org>,
"eranian@gmail.com" <eranian@gmail.com>,
"Gary.Mohr@Bull.com" <Gary.Mohr@bull.com>,
"arjan@linux.intel.com" <arjan@linux.intel.com>,
"Zhang, Yanmin" <yanmin_zhang@linux.intel.com>,
Paul Mackerras <paulus@samba.org>,
"David S. Miller" <davem@davemloft.net>,
Russell King <rmk+kernel@arm.linux.org.uk>,
Arnaldo Carvalho de Melo <acme@redhat.com>,
Will Deacon <will.deacon@arm.com>,
Maynard Johnson <mpjohn@us.ibm.com>, Carl Love <carll@us.ibm.com>,
Kay Sievers <kay.sievers@vrfy.org>,
lkml <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>
Subject: [rfc] Describe events in a structured way via sysfs
Date: Fri, 21 May 2010 11:40:53 +0200
Message-ID: <20100521094053.GA4658@elte.hu>
In-Reply-To: <1274429038.1674.1684.camel@laptop>
* Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, 2010-05-20 at 16:12 -0700, Greg KH wrote:
> > How deep in the device tree are you really going to be
> > caring about? It sounds like the large majority of
> > events are only going to be coming from the "system"
> > type objects (cpu, nodes, memory, etc.) and very few
> > would be from things that we consider a 'struct
> > device' today (like a pci, usb, scsi, or input, etc.)
>
> The general noise I hear from the hardware people is
> that we'll see more and more device-level stuff - bus
> bridges/controller and actual devices (GPUs, NICs etc.)
> will be wanting to export performance metrics.
There's (much) more:
- laptops want to provide power level/usage metrics,
- we could express a lot of special, lower level
(transport specific) disk IO stats via events as well -
without having to push those stats to a higher level
(where it might not make sense). Currently such
stats/metrics are exposed in a very device/subsystem
specific way, if they are provided at all.
Also, we already have quite a few per device tracepoints
upstream. Here are a few examples:
- GPU tracepoints (trace_i915_gem_request_submit(), etc.)
- WIFI tracepoints (trace_iwlwifi_dev_ioread32(), etc.)
- block tracepoints (trace_block_bio_complete())
So these would be attached to:
# GEM events of drm/card0:
/sys/devices/pci0000:00/0000:00:02.0/drm/card0/events/i915_gem_request_submit/
# Wifi-ioread events of wlan0:
/sys/devices/pci0000:00/0000:00:1c.1/0000:03:00.0/net/wlan0/events/iwlwifi_dev_ioread32/
# whole sdb disk events:
/sys/block/sdb/events/block_bio_complete/
# sdb1 partition events:
/sys/block/sdb/sdb1/events/block_bio_complete/
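As a rough illustration of how a tool could consume such a layout, here is
a minimal sketch that walks a directory tree and reports every node that
carries an events/ subdirectory. The events/ directories are the layout
proposed in this mail, not something an existing kernel is known to
provide, and find_event_nodes() is a hypothetical helper name.

```python
import os


def find_event_nodes(root):
    """Yield (node_path, event_names) for each directory under root
    that contains an 'events' subdirectory (the proposed layout)."""
    for dirpath, dirnames, _ in os.walk(root):
        if "events" not in dirnames:
            continue
        events_dir = os.path.join(dirpath, "events")
        names = sorted(
            entry for entry in os.listdir(events_dir)
            if os.path.isdir(os.path.join(events_dir, entry))
        )
        # Prune events/ so the walk does not descend into it.
        dirnames.remove("events")
        yield dirpath, names


if __name__ == "__main__":
    # Assumption: run against a tree that uses the proposed layout.
    for node, events in find_event_nodes("/sys/block"):
        print(node, events)
```

Note that os.walk() does not follow symlinks by default, and sysfs uses
symlinks heavily, so a real tool would have to decide how to handle them.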
And we also have 'software nodes' in /sys that have events
upstream here and today. For example for SLAB we already
have kmalloc/kfree tracepoints (trace_kmalloc() and
trace_kfree()):
# all kmalloc events:
/sys/kernel/slab/events/
# kmalloc events for sighand_cache:
/sys/kernel/slab/sighand_cache/events/kmalloc/
# kfree events for sighand_cache:
/sys/kernel/slab/sighand_cache/events/kfree/
In general the set of events we have upstream is growing
along an exponential curve (there's over a hundred now,
via tracepoints).
They are either logically attached to the hardware
topology of the system (as in the first set of examples
above), or are attached to the software/subsystem object
topology of the kernel (some examples of which are
described in the second set of examples above).
Sometimes there are aliasing/filtering relationships
between events, which are expressed very well via the
hierarchy and granularity of sysfs.
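One way to read that relationship: an event raised at a narrow node (say
the sdb1 partition) is also an event of each broader node above it (the
whole sdb disk), so the narrower path acts as a filter on the broader one.
The sketch below only computes paths. Whether every ancestor would really
carry an events/ directory is an assumption, and visible_at() is a
hypothetical helper name.

```python
import posixpath


def visible_at(origin, event, sysfs_root="/sys"):
    """Return the events/<event> paths from origin up to, but not
    including, sysfs_root, narrowest node first."""
    root = posixpath.normpath(sysfs_root)
    node = posixpath.normpath(origin)
    paths = []
    # Climb one directory level at a time until we reach the root.
    while node.startswith(root + "/"):
        paths.append(posixpath.join(node, "events", event))
        node = posixpath.dirname(node)
    return paths


if __name__ == "__main__":
    for path in visible_at("/sys/block/sdb/sdb1", "block_bio_complete"):
        print(path)
```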
New events would go into that topology there in a natural
way.
For example general hugepage tracepoints (should we
introduce any) would go into the existing hugepage node:
/sys/kernel/mm/hugepages/events/...
All in all, these existing and future events, both of
hardware and software type, are literally begging to be
attached to nodes in /sys :-)
If we created a separate eventfs for it we'd have to start
with duplicating all the topology/hierarchy/structure that
is present in sysfs already. (and diluting /sys's
utility)
That would be a bad thing, so it would be nice if we found
a workable solution here. We could split up the record
format some more:
/sys/kernel/sched/events/sched_wakeup/format/
/sys/kernel/sched/events/sched_wakeup/format/common_type/
/sys/kernel/sched/events/sched_wakeup/format/common_flags/
/sys/kernel/sched/events/sched_wakeup/format/common_preempt_count/
/sys/kernel/sched/events/sched_wakeup/format/common_pid/
/sys/kernel/sched/events/sched_wakeup/format/common_lock_depth/
/sys/kernel/sched/events/sched_wakeup/format/comm/
/sys/kernel/sched/events/sched_wakeup/format/pid/
/sys/kernel/sched/events/sched_wakeup/format/prio/
/sys/kernel/sched/events/sched_wakeup/format/success/
/sys/kernel/sched/events/sched_wakeup/format/target_cpu/
Into single-value files. But this would add significant
parsing overhead (plus significant allocation overhead),
for no tangible benefit.
The problem with /proc was always the lack of standard
structure and the lack of performance - while the format
file is about _more_ structure.
Increasing structure parsing overhead does not look like
the right answer to that problem.
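To make the cost comparison concrete, here is a minimal sketch of parsing
a single format file in one pass. It assumes the "field:...; offset:...;
size:...;" line layout used by the existing ftrace format files. The
offsets and sizes in SAMPLE are illustrative values, not copied from a
real kernel, and parse_format() is a hypothetical helper name.

```python
import re

# One match per "field:<decl>; offset:<n>; size:<n>;" entry.
FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*size:(?P<size>\d+);"
)

# Illustrative content only; the numbers are made up for the example.
SAMPLE = (
    "name: sched_wakeup\n"
    "format:\n"
    "\tfield:unsigned short common_type;\toffset:0;\tsize:2;\n"
    "\tfield:int common_pid;\toffset:4;\tsize:4;\n"
    "\tfield:char comm[16];\toffset:8;\tsize:16;\n"
    "\tfield:pid_t pid;\toffset:24;\tsize:4;\n"
)


def parse_format(text):
    """Map each field name to its C type, offset and size."""
    fields = {}
    for match in FIELD_RE.finditer(text):
        decl = match.group("decl").strip()
        ctype, _, name = decl.rpartition(" ")
        name = name.split("[")[0]  # drop an array suffix such as [16]
        fields[name] = {
            "type": ctype,
            "offset": int(match.group("offset")),
            "size": int(match.group("size")),
        }
    return fields


if __name__ == "__main__":
    for name, info in parse_format(SAMPLE).items():
        print(name, info)
```

With the split layout, the same record description would need one
directory lookup plus at least one open/read/close per field instead of a
single read of one file.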
Hm?
Ingo