linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* (no subject)
@ 2002-09-21  5:32 Greg KH
  2002-09-23 18:35 ` your mail Patrick Mochel
  0 siblings, 1 reply; 2+ messages in thread
From: Greg KH @ 2002-09-21  5:32 UTC (permalink / raw)
  To: Rhoads, Rob, linux-kernel, hardeneddrivers-discuss, cgl_discussion

cgl_discussion@lists.osdl.org, hardeneddrivers-discuss@lists.sourceforge.net
Cc: 
Bcc: 
Subject: 
Reply-To: 

hardeneddrivers-discuss@lists.sourceforge.net,
cgl_discussion@lists.osdl.org
Bcc: 
Subject: my review of the Device Driver Hardening Design Spec
Reply-To: 
In-Reply-To: <20020921014054.GA25665@kroah.com>

On Fri, Sep 20, 2002 at 06:40:54PM -0700, Greg KH wrote:
> Hi,
> 
> I've just started to read over the published spec, and will reserve
> comment on it, and the example code you've created after I'm done
> reading it.

Ok, here's some comments on the 0.5h release of the Device Driver
Hardening Design Specification:

(I'll skip the intro, and feel good sections and get into the details
that you lay out, starting in section 2)

Section 2:
2.1:
	- do NOT use /proc for driver info.  Use driverfs.
	- If you are using a kernel version that does not have driverfs,
	  put all /proc driver info under /proc/drivers, which is where
	  it belongs.
	- Only have 1 value per file, and no binary data in the files.
	- Do not put the "kernel version for which the driver was
	  compiled", as that _always_ much match the kernel version that
	  is running, so is redundant.

2.2:
	- do NOT use typedef

2.5.5:
	- you do not have to always check data returned from functions,
	  if you wrote the functions in the first place.  Redundant
	  checking of all data within the kernel, slows things down.
	  Sure, some checking is good, but do not say that it is a
	  requirement, or no one will want to use your driver.

The majority of section 2 is very nice, it's a good list of things that
drivers should do.


Section 3:

Wow, where to start...

The Common Statistic Manager:
	- why does this have to live in the kernel?  It should be in
	  userspace, grabbing all of the data from the /proc files you
	  just specified in section 2.1.
	  
POSIX event logging:
	- wow, not much I can say here, that hasn't already been said
	  before :(

Diagnostics:
	- now these are a good idea.  A common subsystem that drivers
	  can register what kind of diagnostics they can run on their
	  hardware, nice.

3.1.1:
	- UUIDs!!!???  You have got to be kidding.  Here, for the
	  benefit of those who have not read this, I'll quote:
	  	"Each subsystem, and each resource contained within each
		subsystem, needs to be uniquely identified.  In order to
		do this a hardened driver developer shall pre-assign a
		Universally Unique Identifier (UUID) as the Subsystem ID
		for each subsystem, and shall provide a means to assign
		a unique Resource ID string for each resource within a
		subsystem."
	
	 So for every resource, a string shall be associated with it.
	 But that means for most resources, the string will take up more
	 memory than the resource itself does.  Does that make sense?

	 It's also up to the driver to create these resource ids at
	 runtime and guarantee their uniqueness over the lifetime of the
	 kernel.  How in the world can you expect every driver author to
	 do this?  Any example code out there?

	 And what are these UUIDs going to be used for, ah, event
	 logging.  Enough said.

3.2 Statistics:
	You actually want every driver to support SNMP compliant
	statistics groups within themselves?  Why?  What a bloat of a
	kernel.

	All of this should be done (if at all) from userspace.
	

3.2.5.2:
(I'm not condoning ANY of these functions or code, just trying to point out how
you should, if they were to be in the kernel, done properly.)
	- do not use typedef
	- struct stat_info does not need *unit, as that is already
	  specified in the scale field, right?
	- the stat_value_t union is just a horrible abomination, don't
	  do that.

3.3 Diagnostics:
	- not a bad idea, but some work could be done on the
	  implementation.  Would fit in nicely with the device driver
	  model in 2.5.  For 2.4, it would be another subsystem a driver
	  would register with.

3.3.3.2:
	- no typedefs
	- run() is horrible, you are trying to fit all kinds of possible
	  diagnosis into one function callback.  Not a good idea.
	  Break the different kinds of callbacks out into different
	  functions.  That ensures type safety, right now you are just
	  creating another ioctl() type mess.

3.4 Event logging:
	- I'm not even going to touch this, sorry.

4: High Availability
	- are you all working with the existing HA group?

4.1:
	- um, what are you trying to say here.  This section is
	  pointless.  Yes we all think Hot Swap is a good idea, that's
	  why Linux currently supports it.

4.2:
	- RAID and ethernet bonding is nice. Again, Linux already has
	  projects and support for these things.  Why mention them?


The rest of this section is fine, and I welcome any test harnesses that
are created to do this kind of fault injection for driver testing.

5:
	- Here you back-pedal on everything you said up till now.  Let
	  me summarize what is said in these 3 paragraphs in 1 sentence:
	  	"Yes, all these things are well and good, but don't let
		them effect the currently great performance Linux has
		today."
	  Sorry, but you can't have it both ways.

5.1:
	- do NOT use #ifdef in the .c files.  Only in .h files.
	- why is CONFIG_DRIVER_HOTSWAP an option.  What does it do that
	  CONFIG_HOTPLUG does not do today?
	- actually, what do any of these CONFIG_ options do, and why
	  would someone not want the CONFIG_DRIVER_ROBUST to be always
	  enabled?


In summary, I think that a lot of people have spent a lot of time in
creating this document, and the surrounding code that matches this
document.  I really wish that a tiny bit of that effort had gone into
contacting the Linux kernel development community, and asking to work
with them on a project like this.  Due to that not happening, and by
looking at the resultant spec and code, I'm really afraid the majority
of that time and effort will have been wasted.

What do I think can be salvaged?  Diagnostics are a good idea, and I
think they fit into the driver model in 2.5 pretty well.  A lot of
kernel janitoring work could be done by the CG team to clean up, and
harden (by applying the things in section 2) the existing kernel
drivers.  That effort alone would go a long way in helping the stability
of Linux, and also introduce the CG developers into the kernel community
as active, helping developers.  It would allow the CG developers to
learn from the existing developers, as we must be doing something right
for Linux to be working as well as it does :)

Also, open specs for the hardware the CG members produce, to allow
existing kernel drivers to be enhanced (instead of having to be reverse
engineered), and new kernel drivers to be created, would also go a long
way in helping out both the CG's members and the entire Linux
community's cause of having a robust, stable kernel be achived easier.
Closed specs, and closed drivers do not help anyone.


thanks for reading this far,

greg k-h

^ permalink raw reply	[flat|nested] 2+ messages in thread

* Re: your mail
  2002-09-21  5:32 Greg KH
@ 2002-09-23 18:35 ` Patrick Mochel
  0 siblings, 0 replies; 2+ messages in thread
From: Patrick Mochel @ 2002-09-23 18:35 UTC (permalink / raw)
  To: Rhoads, Rob
  Cc: Greg KH, linux-kernel, hardeneddrivers-discuss, cgl_discussion


In general, I agree completely with what Greg says (as usual), but I do 
have a few additional comments.

> (I'll skip the intro, and feel good sections and get into the details
> that you lay out, starting in section 2)
> 
> Section 2:
> 2.1:
> 	- do NOT use /proc for driver info.  Use driverfs.
> 	- If you are using a kernel version that does not have driverfs,
> 	  put all /proc driver info under /proc/drivers, which is where
> 	  it belongs.

Actually, they mention using driverfs in Section 3: Instrumentation. I 
can't tell if this was around before, or this was just added. The date is 
the same (16 Aug), but there is no changelog information about the spec. 

I would suggest not using procfs at all, even if driverfs is not avaiable.  
If you're using 2.4, backport driverfs, or clone it for your own
filesystem. It's not dependent on the driver model at all, and has been
done at least once before (Greg's pcihpfs).

> Section 3:

> The Common Statistic Manager:

Please drop the term 'Manager' from your nomenclature. It is ambiguous, 
because of the context in which its generally used in. Windows uses the 
term for any collection of kernel or device data and/or kernel policy. 
It's not a bad term, but it fails to make a clear distinction between 
kernel space and user space, which we insist on. 

Only the mechanism for setting the policy should exist in the kernel, and
itself my be very intelligent. But, the policy itself should exist outside
of the kernel.


> 3.2.5.2:
> (I'm not condoning ANY of these functions or code, just trying to point out how
> you should, if they were to be in the kernel, done properly.)
> 	- do not use typedef
> 	- struct stat_info does not need *unit, as that is already
> 	  specified in the scale field, right?
> 	- the stat_value_t union is just a horrible abomination, don't
> 	  do that.

Please do not pass void *. You should only pass type-safe structures. If 
you cannot get that information, you should redesign the API. 

> 3.4 Event logging:
> 	- I'm not even going to touch this, sorry.

There are a lot of topics in this spec, most of which are irrelevant to 
actually hardening drivers. They may be features dependent on your APIs, 
but they are completely optional and may hinder acceptance of your primary 
objectives. 

Event logging is definitely one of them, esp. with a function like

evl_log_event_string(  
	ME_EVENT_BUCKET_EMPTY, 
	LOG_WARNING, 
	"Leaky bucket exception (bucket empty):\ 
	Bucket_Level <= Observed_Value - Last_Value\ 
	|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s|%s=%s\ 
	|%s=%u|%s=%u|%s=%u|%s=%u|%s=%u|%s=%u|", 
	RMGT_FacilityIDAttrStr,         RMUUID, 
	RMGT_SubsystemIDAttrStr,    SUBSYSTEM_UUID, 
	RMGT_SubsystemNameAttrStr,  subsystem_name, 
	RMGT_ResourceIDAttrStr,         resource_id, 
	RMGT_ResourceNameAttrStr,   resource_name, 
	ME_MonitorIDAttrStr,        monitor_uuid,  
	ME_StatisticIDAttrStr,         statistic_id, 
	ME_StatisticNameAttrStr,    statistic_name,  
	ME_BucketSizeAttrStr,       bucketsz,  
	ME_FillValueAttrStr,            fillval, 
	ME_FillIntervalAttrStr,         fillint, 
	ME_BucketLevelAttrStr,      bucketlvl, 
	ME_ObservedValueAttrStr,    obsval, 
	ME_LastValueAttrStr,        lastval); 


> In summary, I think that a lot of people have spent a lot of time in
> creating this document, and the surrounding code that matches this
> document.  I really wish that a tiny bit of that effort had gone into
> contacting the Linux kernel development community, and asking to work
> with them on a project like this.  Due to that not happening, and by
> looking at the resultant spec and code, I'm really afraid the majority
> of that time and effort will have been wasted.

I completely agree. There is definitely good intention in some aspects of 
the spec, and definitely in the effort put forth to support this type of 
work. But, in order to gain the support of kernel developers, or even the 
blessing of a few, you should be working with them on the design from the 
beginnging.

Designing APIs is hard. Doing it well is very hard. I'm not claiming I've 
done a stellar job, but I have at least learned that. I've made a lot of 
poor design decisions, many of which are also evident in your code 
descriptions and examples. I can't tell you how many times I've rewritten 
things over and over and over because someone hated them (usually Linus or 
Greg).

There are people that are willing to help, as we are trying to do. But,
it's much easier if you do things gradually and get that help from the
beginning.

> What do I think can be salvaged?  Diagnostics are a good idea, and I
> think they fit into the driver model in 2.5 pretty well.  A lot of
> kernel janitoring work could be done by the CG team to clean up, and
> harden (by applying the things in section 2) the existing kernel
> drivers.  That effort alone would go a long way in helping the stability
> of Linux, and also introduce the CG developers into the kernel community
> as active, helping developers.  It would allow the CG developers to
> learn from the existing developers, as we must be doing something right
> for Linux to be working as well as it does :)

Which kernel are you targeting? I didn't see it in the spec, though I 
could have easily missed it. CGL is based on 2.4, so I would assume that. 
But, I would think the ideal choice would be to start in 2.5 and backport 
it to 2.4. 

If that's the case, how do you intend to work with the driver model? 
There will be quite a bit of code and interface duplication between
your code and the driver model. I can see ways to support many of the
things you want in a relatively easy manner, and not punish the common
user or developer; but the margin is to small to write the answer... ;)

Also, there are many projects in areas similar to what your doing: 
diganostics, HA, etc, etc. It would be nice to see some collaboration 
between the developers of those projects instead of having many disparate 
projects with similar goals. 


	-pat







^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2002-09-23 18:29 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2002-09-21  5:32 Greg KH
2002-09-23 18:35 ` your mail Patrick Mochel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).