linux-kernel.vger.kernel.org archive mirror
From: Tom Zanussi <zanussi@us.ibm.com>
To: "Perez-Gonzalez, Inaky" <inaky.perez-gonzalez@intel.com>
Cc: "'karim@opersys.com'" <karim@opersys.com>,
	"'Martin Hicks'" <mort@wildopensource.com>,
	"'Daniel Stekloff'" <dsteklof@us.ibm.com>,
	"'Patrick Mochel'" <mochel@osdl.org>,
	"'Randy.Dunlap'" <rddunlap@osdl.org>,
	"'hpa@zytor.com'" <hpa@zytor.com>,
	"'pavel@ucw.cz'" <pavel@ucw.cz>,
	"'jes@wildopensource.com'" <jes@wildopensource.com>,
	"'linux-kernel@vger.kernel.org'" <linux-kernel@vger.kernel.org>,
	"'wildos@sgi.com'" <wildos@sgi.com>,
	"'Tom Zanussi'" <zanussi@us.ibm.com>
Subject: RE: [patch] printk subsystems
Date: Fri, 18 Apr 2003 02:21:06 -0500
Message-ID: <16031.42850.146902.895382@lepton.softprops.com>
In-Reply-To: <A46BBDB345A7D5118EC90002A5072C780C2630D5@orsmsx116.jf.intel.com>

Perez-Gonzalez, Inaky writes:
 > 
 > 
 > 
 > Well, the total overhead for queuing an event is strictly O(1),
 > bar the acquisition of the queue's semaphore in the middle [I
 > still hadn't time to finish this and post it, btw]. I think it
 > is pretty scalable assuming you don't have the whole system 
 > delivering to a single queue.
 > 
 > Total is four lines if I unfold __kue_queue(), and the list_add_tail()
 > is not that complex. That's versus relay_write(), that I think is the
 > equivalent function [bar the extra goodies] is more complex
 > [disclaimer: this is just looking over the 030317 patch's shoulder,
 > I am in kind of a rush - feel free to correct me here].
 > 

It seems to me that when comparing apples to apples, namely
considering the complete lifecycle of an event, kue and relayfs are
very similar wrt performance and memory usage.  Whether kue is
scalable or not I couldn't say, but we've previously published
benchmarks for LTT on this list showing that the relayfs logging code
(the same as that used by LTT) scales very well to logging millions
upon millions of events with low overhead.

While kue_send_event() in itself is very simple and efficient, it's
only part of the story.  The other parts are the copy_to_user() that
must be done to get each event to user space and the subsequent
bookkeeping necessary to remove it from the queue and make destructor
calls.  Only if we include all of the above is relayfs' relay_write()
equivalent: once relay_write() returns, that's the end of the story
as far as that event is concerned.  At that point the data is directly
available to a client that has the buffer mmapped, and nothing more
remains to be done.  So yes, relay_write() is more complex code-wise
because it's doing more.

As far as algorithmic complexity goes, the time to log an event via
relay_write() is also pretty much constant.  The only variables are
that it may take more than one iteration to reserve a slot in case of
a reserve collision with another writer, which should happen fairly
rarely, and that if a given event is the last event in a buffer, the
end-of-buffer slow path is triggered, which is also a relatively rare
occurrence.  Actually, the time it takes to memcpy the event into the
relayfs buffer should also be factored in, as it depends on the size
of the event.  While kue can avoid this kernel-side copy, it's not
possible for it to avoid the copy_to_user() since its design precludes
mmapping the kernel data.

Again, six of one, half a dozen of the other.  kue looks like a nice
elegant way of logging small bits of data and I'm sure it has its
advantages, though I think the same thing could be accomplished in a
slightly different way with a relayfs channel.
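
To make the "more than one iteration to reserve a slot" point
concrete, here is a rough user-space sketch of a retry-on-collision
reserve followed by a memcpy.  This is not the relayfs code: struct
channel, chan_reserve(), chan_write() and BUF_SIZE are hypothetical
names, C11 atomics stand in for whatever primitive the kernel code
uses, and the end-of-buffer slow path is reduced to an error return.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64                 /* hypothetical buffer size */
#define RESERVE_FAILED ((size_t)-1) /* stands in for the slow path */

struct channel {
	_Atomic size_t offset;      /* next free byte in data[] */
	char data[BUF_SIZE];
};

/*
 * Reserve len bytes.  Normally one pass through the loop; a second
 * pass happens only if another writer moved offset between our load
 * and our compare-and-swap (a "reserve collision").
 */
static size_t chan_reserve(struct channel *ch, size_t len)
{
	size_t old = atomic_load(&ch->offset);

	for (;;) {
		if (len > BUF_SIZE - old)
			return RESERVE_FAILED;  /* buffer full */
		/* On failure, old is refreshed with the current offset. */
		if (atomic_compare_exchange_weak(&ch->offset, &old, old + len))
			return old;
	}
}

/* Reserve a slot, then copy the event in; cost grows with len. */
static int chan_write(struct channel *ch, const void *ev, size_t len)
{
	size_t start = chan_reserve(ch, len);

	if (start == RESERVE_FAILED)
		return -1;
	memcpy(ch->data + start, ev, len);
	return 0;
}
```

Once chan_write() returns 0 the bytes are already in the shared
buffer, which is the property being argued for above: a reader with
the buffer mapped needs no further per-event work from the writer.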

Anyway, to address the original topic, I'm working on a drop-in
replacement of printk that replaces the static printk buffer with a
dynamically resizable relayfs channel (a new relayfs capability that
will be available to all relayfs clients).  In addition to being
resizable manually (probably via commands to the syslog system call),
it will also have an 'auto-resize' capability that allows the printk
channel to adapt to printk traffic levels: increase as necessary when
an overflow condition is detected, and fall back to a more reasonable
level when the excess capacity is no longer needed.  Init-time printks
will still use the static printk buffer, but because the static buffer
is marked as __initdata, it can be made large enough to handle lots of
init-time data, all of which is atomically copied over to the
dynamic relayfs channel before init data is discarded.  Once klogd has
logged all the init data then present in the temporarily enlarged
relay channel, the channel would then resize itself to a normal
working size.  Hopefully this will solve the problem of lost printks
both at boot-time and during normal operation, rather than being just
a stopgap measure.
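
As a rough illustration of the auto-resize idea only, the policy
could look something like the sketch below.  Everything in it is an
assumption made for the example: the function name chan_next_size(),
the two size constants, doubling on overflow, and the "half the
normal size" threshold for shrinking are not taken from any patch.

```c
#include <stddef.h>

#define CHAN_NORMAL_SIZE (64u * 1024u)    /* hypothetical working size */
#define CHAN_MAX_SIZE    (1024u * 1024u)  /* hypothetical upper bound */

/*
 * Pick the next channel size from the current size, the number of
 * bytes still waiting to be read, and whether an overflow was seen.
 */
static size_t chan_next_size(size_t cur, size_t used, int overflowed)
{
	if (overflowed) {
		/* Grow on overflow, but never past the upper bound. */
		size_t grown = cur * 2;

		return grown > CHAN_MAX_SIZE ? CHAN_MAX_SIZE : grown;
	}
	/* Fall back once the backlog fits comfortably in the normal size. */
	if (cur > CHAN_NORMAL_SIZE && used <= CHAN_NORMAL_SIZE / 2)
		return CHAN_NORMAL_SIZE;
	return cur;
}
```

The same fall-back rule would cover the boot case: the channel starts
out enlarged to hold the init-time data and shrinks once klogd has
drained it.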

-- 
Regards,

Tom Zanussi <zanussi@us.ibm.com>
IBM Linux Technology Center/RAS

