linux-kernel.vger.kernel.org archive mirror
* Linux Kernel bug report (includes fix)
@ 2004-08-07 12:51 Joerg Schilling
  2004-08-07 13:26 ` Måns Rullgård
  2004-08-08  1:18 ` Horst von Brand
  0 siblings, 2 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-07 12:51 UTC (permalink / raw)
  To: axboe, schilling; +Cc: linux-kernel


-	Linux Kernel include files (starting with Linux-2.5) are buggy and 
	prevent compilation. Many files may be affected but let me name
	the most important files for me:

	-	/usr/src/linux/include/scsi/scsi.h depends on a nonexistent
		type "u8". The correct way to fix this would be to replace
		any "u8" by "uint8_t". A quick and dirty fix is to call:

			"change u8 __u8 /usr/src/linux/include/scsi/scsi.h"

		ftp://ftp.berlios.de/pub/change/

	-	/usr/src/linux/include/scsi/sg.h includes "extra text" "__user"
		in some structure definitions. This may be fixed by adding
		#include <linux/compiler.h> somewhere at the beginning of
		/usr/src/linux/include/scsi/sg.h

	This bug has been reported several times (starting with Linux-2.5).

	Time to fix: 5 minutes.
	
I did spend far too much time with the discussion on LKML..... so I need a cue
whether it makes sense to continue this discussion.

You now again have the bug report _and_ the fix in a single short mail.

If the bug mentioned above is not fixed in Linux-2.6.8, I will assume that it 
makes no sense to spend further time in discussions with LKML.

Best regards

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
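
The two workarounds named in the report can be sketched from the user-space
side. This is only an illustrative shim, not the kernel patch: the u8 typedef
and the empty __user definition are assumptions about what a program would
have to supply itself, and the structures are copied by hand (sg_iovec_demo
and scsi_lun_size are hypothetical names) rather than included from
/usr/src/linux.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: the program supplies the names the header expects. */
typedef uint8_t u8;   /* kernel-internal spelling of an 8-bit unsigned type */
#define __user        /* kernel address-space annotation; empty outside the kernel */

/* Hand-copied stand-ins for the declarations quoted in this thread. */
struct scsi_lun {
    u8 scsi_lun[8];
};

struct sg_iovec_demo {        /* hypothetical name, mirrors sg_iovec */
    void __user *iov_base;    /* starting address */
    size_t iov_len;           /* length in bytes */
};

/* Size of the LUN structure; 8 because it holds eight one-byte elements. */
size_t scsi_lun_size(void)
{
    return sizeof(struct scsi_lun);
}
```

With both names defined, declarations written in the kernel's style compile
in an ordinary program. Whether a program should be reading these headers at
all is a separate question.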

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: Linux Kernel bug report (includes fix)
  2004-08-07 12:51 Linux Kernel bug report (includes fix) Joerg Schilling
@ 2004-08-07 13:26 ` Måns Rullgård
  2004-08-07 19:32   ` Bernd Schubert
  2004-08-08  1:18 ` Horst von Brand
  1 sibling, 1 reply; 87+ messages in thread
From: Måns Rullgård @ 2004-08-07 13:26 UTC (permalink / raw)
  To: linux-kernel

Joerg Schilling <schilling@fokus.fraunhofer.de> writes:

> You now again have the bug report _and_ the fix in a single short mail.

I could see no patch contained in your mail.

-- 
Måns Rullgård
mru@kth.se



* Re: Linux Kernel bug report (includes fix)
  2004-08-07 13:26 ` Måns Rullgård
@ 2004-08-07 19:32   ` Bernd Schubert
  0 siblings, 0 replies; 87+ messages in thread
From: Bernd Schubert @ 2004-08-07 19:32 UTC (permalink / raw)
  To: linux-kernel


On Saturday 07 August 2004 15:26, Måns Rullgård wrote:
> Joerg Schilling <schilling@fokus.fraunhofer.de> writes:
> > You now again have the bug report _and_ the fix in a single short mail.
>
> I could see no patch contained in your mail.

I guess that's what Joerg wants to have.

Cheers,
	Bernd

[-- Attachment #2: scsi_head_patch.out --]
[-- Type: text/x-csrc, Size: 1934 bytes --]

diff -ru linux-2.6.8-rc2-mm2.bak/include/scsi/scsi.h linux-2.6.8-rc2-mm2/include/scsi/scsi.h
--- linux-2.6.8-rc2-mm2.bak/include/scsi/scsi.h	2004-06-16 07:20:25.000000000 +0200
+++ linux-2.6.8-rc2-mm2/include/scsi/scsi.h	2004-08-07 20:51:27.000000000 +0200
@@ -214,25 +214,25 @@
  */
 
 struct ccs_modesel_head {
-	u8 _r1;			/* reserved */
-	u8 medium;		/* device-specific medium type */
-	u8 _r2;			/* reserved */
-	u8 block_desc_length;	/* block descriptor length */
-	u8 density;		/* device-specific density code */
-	u8 number_blocks_hi;	/* number of blocks in this block desc */
-	u8 number_blocks_med;
-	u8 number_blocks_lo;
-	u8 _r3;
-	u8 block_length_hi;	/* block length for blocks in this desc */
-	u8 block_length_med;
-	u8 block_length_lo;
+	uint8_t _r1;			/* reserved */
+	uint8_t medium;		/* device-specific medium type */
+	uint8_t _r2;			/* reserved */
+	uint8_t block_desc_length;	/* block descriptor length */
+	uint8_t density;		/* device-specific density code */
+	uint8_t number_blocks_hi;	/* number of blocks in this block desc */
+	uint8_t number_blocks_med;
+	uint8_t number_blocks_lo;
+	uint8_t _r3;
+	uint8_t block_length_hi;	/* block length for blocks in this desc */
+	uint8_t block_length_med;
+	uint8_t block_length_lo;
 };
 
 /*
  * ScsiLun: 8 byte LUN.
  */
 struct scsi_lun {
-	u8 scsi_lun[8];
+	uint8_t scsi_lun[8];
 };
 
 /*
diff -ru linux-2.6.8-rc2-mm2.bak/include/scsi/sg.h linux-2.6.8-rc2-mm2/include/scsi/sg.h
--- linux-2.6.8-rc2-mm2.bak/include/scsi/sg.h	2004-08-03 13:22:50.000000000 +0200
+++ linux-2.6.8-rc2-mm2/include/scsi/sg.h	2004-08-07 20:53:41.000000000 +0200
@@ -89,6 +89,8 @@
 
 /* New interface introduced in the 3.x SG drivers follows */
 
+#include <linux/compiler.h>
+
 typedef struct sg_iovec /* same structure as used by readv() Linux system */
 {                       /* call. It defines one scatter-gather element. */
     void __user *iov_base;      /* Starting address  */


* Re: Linux Kernel bug report (includes fix)
  2004-08-07 12:51 Linux Kernel bug report (includes fix) Joerg Schilling
  2004-08-07 13:26 ` Måns Rullgård
@ 2004-08-08  1:18 ` Horst von Brand
  2004-08-08  5:22   ` Alexander E. Patrakov
  1 sibling, 1 reply; 87+ messages in thread
From: Horst von Brand @ 2004-08-08  1:18 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: axboe, linux-kernel

Joerg Schilling <schilling@fokus.fraunhofer.de> said:
> -	Linux Kernel include files (starting with Linux-2.5) are buggy and 
> 	prevent compilation.

They do not, the kernel compiles just fine. They are _not_ to be used for
random userspace programs.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: Linux Kernel bug report (includes fix)
  2004-08-08  1:18 ` Horst von Brand
@ 2004-08-08  5:22   ` Alexander E. Patrakov
  0 siblings, 0 replies; 87+ messages in thread
From: Alexander E. Patrakov @ 2004-08-08  5:22 UTC (permalink / raw)
  To: linux-kernel

Horst von Brand wrote:
> Joerg Schilling <schilling@fokus.fraunhofer.de> said:
> 
>>-	Linux Kernel include files (starting with Linux-2.5) are buggy and 
>>	prevent compilation.
> 
> 
> They do not, the kernel compiles just fine. They are _not_ to be used for
> random userspace programs.

You are supposed to either bring the needed "sanitized" kernel headers 
with your program, or have those provided by the linux-libc-headers 
(http://ep09.pld-linux.org/~mmazur/linux-libc-headers/) in /usr/include. 
Adding /usr/src/linux/include to the gcc search path is a bug in 
userspace programs.

-- 
Alexander E. Patrakov



* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
       [not found]                 ` <2vDtS-bq-19@gated-at.bofh.it>
@ 2004-08-21 15:01                   ` Pascal Schmidt
  2004-08-21 15:57                     ` Joerg Schilling
  0 siblings, 1 reply; 87+ messages in thread
From: Pascal Schmidt @ 2004-08-21 15:01 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel

On Sat, 21 Aug 2004 14:50:08 +0200, you wrote in linux.kernel:

> If the owners and permissions of the filesystem have been set up correctly,
> then there is no security problem. 

The previous Linux implementation allowed users with *read* access
to the device to send arbitrary SG_IO commands. Giving read permission
to normal users is quite common, to allow them to run isosize or play
their freshly burned SVCDs with mplayer.

It violated the principle of least surprise that a user can screw
the device without even having write permission.

Yes, it breaks user-space programs, and yes, the kernel is to blame
for its previous behavior, not user-space. However, now we need to
get on, and going back to the previous behavior, which because of
the discussion is now a well-known security hole, is not an option.

-- 
Ciao,
Pascal


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-21 15:01                   ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Pascal Schmidt
@ 2004-08-21 15:57                     ` Joerg Schilling
  2004-08-21 21:42                       ` Pascal Schmidt
  2004-08-22 11:56                       ` Joerg Schilling
  0 siblings, 2 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-21 15:57 UTC (permalink / raw)
  To: schilling, der.eremit; +Cc: linux-kernel

Pascal Schmidt <der.eremit@email.de> wrote:

> On Sat, 21 Aug 2004 14:50:08 +0200, you wrote in linux.kernel:
>
> > If the owners and permissions of the filesystem have been set up correctly,
> > then there is no security problem. 
>
> The previous Linux implementation allowed users with *read* access
> to the device to send arbitrary SG_IO commands. Giving read permission

This is of course a kernel bug - but it could be easily fixed.
My scg driver for SunOS requires write permissions since it has been
created in August 1986.


> to normal users is quite common, to allow them to run isosize or play
> their freshly burned SVCDs with mplayer.

So changing the kernel to require write permissions would be a simple fix that
would help without breaking cdrtools as libscg of course opens the devices with 
O_RDWR.

I am not against a long term change that would require euid root too, but this 
should be announced early enough to allow prominent users of the interface to 
keep track of the interface changes.

BTW: the currently used errno EACCES applies to file permissions while EPERM
applies to process permissions. So EPERM would be a more appropriate errno 
value.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
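
The errno distinction drawn above can be made concrete with a small sketch.
The enum and the function name classify_denial are hypothetical and only
illustrate how a caller might tell the two cases apart. The sketch does not
claim which value any particular kernel version actually returns.

```c
#include <errno.h>

/* Hypothetical classification of a refused SG_IO request. */
enum denial_kind {
    DENIED_BY_FILE_MODE,   /* EACCES: file permissions or open mode */
    DENIED_BY_PRIVILEGE,   /* EPERM: the calling process lacks a privilege */
    DENIED_OTHER           /* any other error value */
};

enum denial_kind classify_denial(int err)
{
    switch (err) {
    case EACCES:
        return DENIED_BY_FILE_MODE;
    case EPERM:
        return DENIED_BY_PRIVILEGE;
    default:
        return DENIED_OTHER;
    }
}
```

A program could use such a split to print "reopen the device read/write" in
one case and "run with more privileges" in the other.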


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-21 15:57                     ` Joerg Schilling
@ 2004-08-21 21:42                       ` Pascal Schmidt
  2004-08-22 11:56                       ` Joerg Schilling
  1 sibling, 0 replies; 87+ messages in thread
From: Pascal Schmidt @ 2004-08-21 21:42 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel

On Sat, 21 Aug 2004, Joerg Schilling wrote:

> So changing the kernel to require write permissions would be a simple
> fix that would help without breaking cdrtools as libscg of course opens
> the devices with O_RDWR.

I agree, but Linus obviously thought otherwise. Reverting that and
doing the above fix instead would create three different behaviours
for different 2.6.x kernel versions, which is also undesirable.

> I am not against a long term change that would require euid root too,
> but this should be announced early enough to allow prominent users of
> the interface to keep track of the interface changes.

Too late for that now, no matter whether we like it or not... however,
at least the discussion now has shown that changes to this interface
need to be considered carefully, so maybe the future will be
bright. ;)

-- 
Ciao,
Pascal


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-21 15:57                     ` Joerg Schilling
  2004-08-21 21:42                       ` Pascal Schmidt
@ 2004-08-22 11:56                       ` Joerg Schilling
  2004-08-22 12:14                         ` Joerg Schilling
  2004-08-22 13:13                         ` Pascal Schmidt
  1 sibling, 2 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 11:56 UTC (permalink / raw)
  To: schilling, der.eremit; +Cc: linux-kernel

Let me give some additional remarks to clear up things:

Joerg Schilling <schilling@fokus.fraunhofer.de> wrote:

> Pascal Schmidt <der.eremit@email.de> wrote:

> > The previous Linux implementation allowed users with *read* access
> > to the device to send arbitrary SG_IO commands. Giving read permission
>
> This is of course a kernel bug - but it could be easily fixed.
> My scg driver for SunOS requires write permissions since it has been
> created in August 1986.

Not checking for Write access permissions at this place is a typical mistake
made by novice programmers, so I never thought this could be in Linux.....


> > to normal users is quite common, to allow them to run isosize or play
> > their freshly burned SVCDs with mplayer.
>
> So changing the kernel to require write permissions would be a simple fix that
> would help without breaking cdrtools as libscg of course opens the devices with 
> O_RDWR.

If Linux still does not check for write permissions, I would consider there is 
still a bug.

If there is a list of "apparently safe" SCSI commands that are allowed to be 
executed, then there is another bug in Linux. The only SCSI command that could 
be called safe is Test Unit Ready, and even this only if not sent more than once 
every few seconds.

There are several SCSI commands that look safe but would result in coasters
if issued while a CD or DVD is written.

Conclusion: It makes no sense to start implementing a fine grained security 
model before basic security has been done correctly.

The best immediate fix for the problem is to just check for read & write 
permissions on the file descriptor and otherwise revert to how it has been
before 2.6.8.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily
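
The check proposed above (read and write access on the file descriptor) has a
user-space analogue that may help to picture it. This is a sketch using
fcntl(2), not the kernel-side test on file->f_mode, and fd_opened_rdwr is a
hypothetical helper name.

```c
#include <fcntl.h>

/*
 * Returns 1 if fd was opened O_RDWR, 0 if it was opened with another
 * access mode, and -1 if fcntl() fails (for example on a bad descriptor).
 */
int fd_opened_rdwr(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return (flags & O_ACCMODE) == O_RDWR;
}
```

A descriptor opened with O_RDONLY, as a player or isosize would open it,
fails this test, while a descriptor opened with O_RDWR, as libscg is said to
open it, passes.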


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 11:56                       ` Joerg Schilling
@ 2004-08-22 12:14                         ` Joerg Schilling
  2004-08-22 12:52                           ` Patrick McFarland
  2004-08-22 15:11                           ` Horst von Brand
  2004-08-22 13:13                         ` Pascal Schmidt
  1 sibling, 2 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 12:14 UTC (permalink / raw)
  To: schilling, der.eremit; +Cc: linux-kernel

> > Pascal Schmidt <der.eremit@email.de> wrote:

>> I am not against a long term change that would require euid root too,
>> but this should be announced early enough to allow prominent users of
>> the interface to keep track of the interface changes.

>Too late for that now, no matter whether we like it or not... however,
>at least the discussion now has shown that changes to this interface
>need to be considered carefully, so maybe the future will be
>bright. ;)

Everybody makes mistakes. Not being able to admit that and persisting to 
continue to go in a wrong direction is the real problem.

There is no problem to do what I did propose.

And the wrong decision could even have been avoided if people did contact me
before they did act....


Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 12:14                         ` Joerg Schilling
@ 2004-08-22 12:52                           ` Patrick McFarland
  2004-08-22 13:05                             ` Joerg Schilling
  2004-08-22 15:11                           ` Horst von Brand
  1 sibling, 1 reply; 87+ messages in thread
From: Patrick McFarland @ 2004-08-22 12:52 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: der.eremit, linux-kernel

On Sun, 22 Aug 2004 14:14:08 +0200, Joerg Schilling
<schilling@fokus.fraunhofer.de> wrote:
> Everybody makes mistakes. Not being able to admit that and persisting to
> continue to go in a wrong direction is the real problem.
 
Yes, everyone does. Yours was flaming kernel developers over the lkml
about bugs in your own program; yet, you do not admit to this, and
continue to piss everyone off.

-- 
Patrick "Diablo-D3" McFarland || diablod3@gmail.com
"Computer games don't affect kids; I mean if Pac-Man affected us as kids, we'd 
all be running around in darkened rooms, munching magic pills and listening to
repetitive electronic music." -- Kristian Wilson, Nintendo, Inc, 1989


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 12:52                           ` Patrick McFarland
@ 2004-08-22 13:05                             ` Joerg Schilling
  2004-08-22 16:38                               ` Horst von Brand
  0 siblings, 1 reply; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 13:05 UTC (permalink / raw)
  To: schilling, diablod3; +Cc: linux-kernel, der.eremit

Patrick McFarland <diablod3@gmail.com> wrote:

> On Sun, 22 Aug 2004 14:14:08 +0200, Joerg Schilling
> <schilling@fokus.fraunhofer.de> wrote:
> > Everybody makes mistakes. Not being able to admit that and persisting to
> > continue to go in a wrong direction is the real problem.
>  
> Yes, everyone does. Yours was flaming kernel developers over the lkml
> about bugs in your own program; yet, you do not admit to this, and
> continue to piss everyone off.

You seem to be unable to distinguish between cause and effect.

	Some people at Linux kernel ML did start to flame me while I was
	trying to do my best to give technical based explanations.

As it has been proven that there _are_ reasonable people in LKML, it would 
help LKML to regain credibility if they could try to do some self cleaning
and find a way to calm down the non-serious people.


You also seem to be unable to judge where bugs are located while looking at 
problems.

	It seems that we just agreed with the reasonable members of LKML
	that there was and still is a security related bug in Linux.
	The "fix" that was used in the hope of removing the security problems did just
	create new problems instead of removing old ones.

If you have nothing useful to say, please stay quiet.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 11:56                       ` Joerg Schilling
  2004-08-22 12:14                         ` Joerg Schilling
@ 2004-08-22 13:13                         ` Pascal Schmidt
  2004-08-22 16:00                           ` Christer Weinigel
  2004-08-22 21:27                           ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Julien Oster
  1 sibling, 2 replies; 87+ messages in thread
From: Pascal Schmidt @ 2004-08-22 13:13 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel, Jens Axboe

On Sun, 22 Aug 2004, Joerg Schilling wrote:

> Not checking for Write access permissions at this place is a typical
> mistake made by novice programmers, so I never thought this could be in
> Linux.....

People will find this kind of language inflammatory. ;) However, it is
exactly because this is such a bad mistake that Linus put out what he
deemed a correct fix *immediately*.

> If Linux still does not check for write permissions, I would consider
> there is still a bug.

The open question is whether write permission really is meaningful
enough to allow arbitrary SCSI commands. I personally think "being
able to wipe the drive firmware" is too much, and since filtering
of vendor commands is generally impossible to do right, sending SG_IO
should require CAP_SYS_RAWIO capability.

> If there is a list of "apparently safe" SCSI commands that are allowed to
> be executed, then there is another bug in Linux. The only SCSI command
> that could be called safe is Test Unit Ready, and even this only if not
> sent more than once every few seconds.

Currently (2.6.8.1), there is a list in the kernel. I agree that it
doesn't make sense. I would think read permission means to be able
to read from the device, write means you can write. I would even go
as far as *not* to have that mean "you can also read/write via SG_IO",
because for normal uses of the device, read(2) and write(2) should be
enough.

BTW, there are a number of people on the kernel list who believe a
filter list is bad and generally unmaintainable.

> There are several SCSI commands that look safe but would result in coasters
> if issued while a CD or DVD is written.

Good point.

> The best immediate fix for the problem is to just check for read & write
> permissions on the file descriptor and otherwise revert to how it has been
> before 2.6.8.

I don't think that's going to happen. You already said you'd be okay
with euid==0 being required for burning, if only the transition
period were longer. So if people complain to you that cdrecord is
broken with 2.6.8, you will have to tell them burning requires root
for the moment. Then in your next release, change your startup
code not to drop the CAP_SYS_RAWIO capability when you drop root
privileges.

Alternatively, provide a patch that changes the current code to just
require write permission or CAP_SYS_RAWIO to be able to send
arbitrary commands. Then, after a transition period, submit a patch
that changes it to just CAP_SYS_RAWIO. The patch would look like the
one below (untested).

Jens, since this seems to be your code, what do you think?


--- scsi_ioctl.c	2004-08-14 18:26:17.000000000 +0200
+++ scsi_ioctl.c.new	2004-08-22 15:08:36.000000000 +0200
@@ -105,70 +105,12 @@ static int sg_emulated_host(request_queu
 	return put_user(1, p);
 }

-#define CMD_READ_SAFE	0x01
-#define CMD_WRITE_SAFE	0x02
-#define safe_for_read(cmd)	[cmd] = CMD_READ_SAFE
-#define safe_for_write(cmd)	[cmd] = CMD_WRITE_SAFE
-
-static int verify_command(struct file *file, unsigned char *cmd)
+static int verify_command(struct file *file)
 {
-	static const unsigned char cmd_type[256] = {
-
-		/* Basic read-only commands */
-		safe_for_read(TEST_UNIT_READY),
-		safe_for_read(REQUEST_SENSE),
-		safe_for_read(READ_6),
-		safe_for_read(READ_10),
-		safe_for_read(READ_12),
-		safe_for_read(READ_16),
-		safe_for_read(READ_BUFFER),
-		safe_for_read(READ_LONG),
-		safe_for_read(INQUIRY),
-		safe_for_read(MODE_SENSE),
-		safe_for_read(MODE_SENSE_10),
-		safe_for_read(START_STOP),
-
-		/* Audio CD commands */
-		safe_for_read(GPCMD_PLAY_CD),
-		safe_for_read(GPCMD_PLAY_AUDIO_10),
-		safe_for_read(GPCMD_PLAY_AUDIO_MSF),
-		safe_for_read(GPCMD_PLAY_AUDIO_TI),
-
-		/* CD/DVD data reading */
-		safe_for_read(GPCMD_READ_CD),
-		safe_for_read(GPCMD_READ_CD_MSF),
-		safe_for_read(GPCMD_READ_DISC_INFO),
-		safe_for_read(GPCMD_READ_CDVD_CAPACITY),
-		safe_for_read(GPCMD_READ_DVD_STRUCTURE),
-		safe_for_read(GPCMD_READ_HEADER),
-		safe_for_read(GPCMD_READ_TRACK_RZONE_INFO),
-		safe_for_read(GPCMD_READ_SUBCHANNEL),
-		safe_for_read(GPCMD_READ_TOC_PMA_ATIP),
-		safe_for_read(GPCMD_REPORT_KEY),
-		safe_for_read(GPCMD_SCAN),
-
-		/* Basic writing commands */
-		safe_for_write(WRITE_6),
-		safe_for_write(WRITE_10),
-		safe_for_write(WRITE_VERIFY),
-		safe_for_write(WRITE_12),
-		safe_for_write(WRITE_VERIFY_12),
-		safe_for_write(WRITE_16),
-		safe_for_write(WRITE_BUFFER),
-		safe_for_write(WRITE_LONG),
-	};
-	unsigned char type = cmd_type[cmd[0]];
-
-	/* Anybody who can open the device can do a read-safe command */
-	if (type & CMD_READ_SAFE)
+	/* write access means being able to send any command (for now) */
+	if (file->f_mode & FMODE_WRITE)
 		return 0;

-	/* Write-safe commands just require a writable open.. */
-	if (type & CMD_WRITE_SAFE) {
-		if (file->f_mode & FMODE_WRITE)
-			return 0;
-	}
-
 	/* And root can do any command.. */
 	if (capable(CAP_SYS_RAWIO))
 		return 0;
@@ -181,7 +123,7 @@ static int sg_io(struct file *file, requ
 		struct gendisk *bd_disk, struct sg_io_hdr *hdr)
 {
 	unsigned long start_time;
-	int reading, writing;
+	int reading, writing, res;
 	struct request *rq;
 	struct bio *bio;
 	char sense[SCSI_SENSE_BUFFERSIZE];
@@ -193,8 +135,8 @@ static int sg_io(struct file *file, requ
 		return -EINVAL;
 	if (copy_from_user(cmd, hdr->cmdp, hdr->cmd_len))
 		return -EFAULT;
-	if (verify_command(file, cmd))
-		return -EPERM;
+	if (res = verify_command(file))
+		return res;

 	/*
 	 * we'll do that later

-- 
Ciao,
Pascal
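
The two policies that the patch above switches between can be compared in a
small user-space model. This is a sketch only: the table is cut down to a few
standard SCSI opcodes, the function names are hypothetical, the open mode and
the capability are passed in as plain flags instead of being read from
struct file and capable(), and the -EPERM failure value is an assumption
taken from the sg_io() hunk.

```c
#include <errno.h>

#define CMD_READ_SAFE   0x01
#define CMD_WRITE_SAFE  0x02

/* A handful of standard SCSI opcodes, chosen for illustration. */
#define OP_TEST_UNIT_READY  0x00
#define OP_FORMAT_UNIT      0x04
#define OP_INQUIRY          0x12
#define OP_READ_10          0x28
#define OP_WRITE_10         0x2a

/* Shortened stand-in for the 256-entry table removed by the patch. */
static const unsigned char cmd_type[256] = {
    [OP_TEST_UNIT_READY] = CMD_READ_SAFE,
    [OP_INQUIRY]         = CMD_READ_SAFE,
    [OP_READ_10]         = CMD_READ_SAFE,
    [OP_WRITE_10]        = CMD_WRITE_SAFE,
};

/* Filter policy: the opcode decides what an unprivileged opener may send. */
int verify_filtered(int opened_for_write, int has_rawio, unsigned char opcode)
{
    unsigned char type = cmd_type[opcode];

    if (type & CMD_READ_SAFE)
        return 0;
    if ((type & CMD_WRITE_SAFE) && opened_for_write)
        return 0;
    if (has_rawio)
        return 0;
    return -EPERM;
}

/* Patched policy: a writable open (or CAP_SYS_RAWIO) may send any command. */
int verify_write_means_all(int opened_for_write, int has_rawio)
{
    if (opened_for_write || has_rawio)
        return 0;
    return -EPERM;
}
```

In this model an opcode that is missing from the table, such as FORMAT UNIT,
is refused to a writable but unprivileged opener under the filter policy and
allowed under the patched one.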


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 12:14                         ` Joerg Schilling
  2004-08-22 12:52                           ` Patrick McFarland
@ 2004-08-22 15:11                           ` Horst von Brand
  2004-08-22 18:09                             ` Matthias Andree
  1 sibling, 1 reply; 87+ messages in thread
From: Horst von Brand @ 2004-08-22 15:11 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: der.eremit, linux-kernel

Joerg Schilling <schilling@fokus.fraunhofer.de> said:

[...]

> And the wrong decision could even have been avoided if people did contact me
> before they did act....

Exactly! They should also contact me and ask politely each time they
consider a change if I'd allow it. Really. The nerve these guys have.
Unbelievable.

In the end, I'd only say that I've been on LKML for a long, long time
(since it started, more or less). And each single time the head hackers
agreed on something, and there was a single dissenter, the dissenter was in
the wrong. Sure, this time could be different, but I have seen absolutely
no (yes, _no_) evidence here to the contrary.

The kernel changed, badly conceived interfaces were (somewhat, perhaps
broken in another way) fixed. Some applications that depended on the
brokenness don't work now. Tough luck, fix the applications and
(optionally) ask _politely_, with _detailed discussion_, perhaps propose a
better fix for the kernel. Just whining that the application broke during
its "code freeze" won't get you anywhere (you just can't expect to hold the
kernel hostage to your random, mostly unrelated, program's development
schedule; that model just won't get anywhere real fast). 

Treating everybody as ignorant morons isn't exactly the best way to be
heard. And in this case there is ample evidence on hand that they are very
smart people who usually are right in regards to the technical matters they
have in their hands.

I.e., you are making a fool of yourself here.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 13:13                         ` Pascal Schmidt
@ 2004-08-22 16:00                           ` Christer Weinigel
  2004-08-22 16:32                             ` Joerg Schilling
                                               ` (3 more replies)
  2004-08-22 21:27                           ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Julien Oster
  1 sibling, 4 replies; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 16:00 UTC (permalink / raw)
  To: Pascal Schmidt; +Cc: Joerg Schilling, linux-kernel, Jens Axboe

Pascal Schmidt <der.eremit@email.de> writes:

> I would even go as far as *not* to have that mean "you can also
> read/write via SG_IO", because for normal uses of the device,
> read(2) and write(2) should be enough.

Ripping a CD is in my opinion a normal use of a CD.

> On Sun, 22 Aug 2004, Joerg Schilling wrote:
> > There are several SCSI commands that look safe but would result in coasters
> > if issued while a CD or DVD is written.
> 
> Good point.

Not really, if I have write permission to a CD burner, being able to
burn a coaster by issuing strange commands is something I expect.
Being able to destroy the firmware of the drive is not something I
expect a normal user to be able to do.

There are at least three conflicting goals here:

1. Only someone with CAP_SYS_RAWIO (i.e. root) should be able to do
   possible destructive things to a device, and only root should be
   able to bypass the normal security checks in the kernel (e.g. get
   access to /dev/mem since access to it means that you can read and
   modify internal kernel structures).

2. A Linux system should have as few suid root binaries as possible.

3. A normal user should be able to perform most tasks without needing
   root.

As you said, since the old kernel behaviour is a gaping security hole,
Linus had no other choice than to add a CAP_SYS_RAWIO check to the
SG_IO call.  This fulfills goal 1.  Unfortunately it breaks just about
every application that expects to be able to send raw SCSI commands
without being root.

There are a couple of ways of fulfilling goal 3 and allow normal users
to burn a CDR:

One is to make cdrecord suid root and then make it drop all
capabilities except for CAP_SYS_RAWIO.  But even if cdrecord is
audited, there are a lot of other applications that need to be able to
send raw SCSI commands such as mt (to change the compression or tape
format of a streamer).  And this violates goal 2, every security guide
I've seen lately recommends minimizing the amount of suid binaries,
not adding more.

Another way is to add specialized ioctls in the kernel for everything,
such as the CDROMPLAYTRKIND to play a track.  Unfortunately, this gets
a bit unmaintainable with all the different devices out there.  It
would be akin to putting the whole of cdrecord into the kernel.

Yet another way is to try to filter the raw SCSI commands and only
allow through "known safe" commands, which is what some other people
have been trying to do.  

I think Joerg is being much too harsh, adding a check for
CAP_SYS_RAWIO fixes a bloody large security hole.  It broke a few
applications, but tough shit, that is what happens every now and then
when plugging security holes.  It would be much worse to leave the
hole open.  The timing may coincide badly with the release cycle of
cdrecord, but that's life.  For now users will have to run cdrecord as
root to be able to burn a CDR.

In the future, add a patch to cdrecord so that it can be run as suid
root and not drop CAP_SYS_RAWIO which will make most users happy.
It's still a violation of goal 2 but one has to do tradeoffs every now
and then.

For the future, well, I'm not sure, but personally I think that the
filter idea is a pretty good one.  It is a coarse sieve, but by
listing some "known safe" commands most applications should work, and
if somebody needs to send a command that isn't considered as safe yet,
he can just run the application as root instead.

In my opinion, the best way forward would be to only have a
CAP_SYS_RAWIO check in the kernel and an installable command filter
that can be configured from userspace.  So when the next version of
snazzycdwriter(tm) is released it can have a line in the README file
saying:

    If you want to be able to run snazzycdwriter(tm) as a normal user,
    add the following command to your rc.local file:

        /sbin/install-scsi-filter /dev/hdc snazzycdwriter.filter

And if you have a tape drive, it could have another list of safe
commands.
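
A minimal sketch of such a per-device command filter, assuming the
simplest possible design: one allow bit per SCSI opcode, with root
(standing in for a CAP_SYS_RAWIO check) bypassing the filter.  The
names scsi_filter, filter_init, filter_allow and filter_permits are
made up for illustration; nothing like them exists in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical per-device filter: one bit for each of the 256 opcodes. */
struct scsi_filter {
    uint8_t allowed[256 / 8];
};

/* Start from "deny everything"; userspace then adds known-safe opcodes. */
void filter_init(struct scsi_filter *f)
{
    assert(f != NULL);
    memset(f->allowed, 0, sizeof(f->allowed));
}

/* Mark one opcode (first byte of the command block) as safe. */
void filter_allow(struct scsi_filter *f, uint8_t opcode)
{
    assert(f != NULL);
    f->allowed[opcode / 8] |= (uint8_t)(1u << (opcode % 8));
}

/*
 * Decide whether a command descriptor block may be sent.
 * `privileged` models a caller that holds CAP_SYS_RAWIO: such a
 * caller is never filtered.  Everyone else needs the opcode bit set.
 */
bool filter_permits(const struct scsi_filter *f, const uint8_t *cdb,
                    size_t cdb_len, bool privileged)
{
    if (f == NULL || cdb == NULL || cdb_len == 0)
        return false;
    if (privileged)
        return true;
    return ((f->allowed[cdb[0] / 8] >> (cdb[0] % 8)) & 1u) != 0;
}
```

A filter file for a CD writer would then boil down to a list of
opcodes fed to filter_allow(), and a tape drive would get a different
list.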

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:33                             ` Christer Weinigel
@ 2004-08-22 16:19                               ` Alan Cox
  2004-08-22 17:31                                 ` Christer Weinigel
  2004-08-23 12:22                                 ` Adam Sampson
  0 siblings, 2 replies; 87+ messages in thread
From: Alan Cox @ 2004-08-22 16:19 UTC (permalink / raw)
  To: Christer Weinigel; +Cc: Pascal Schmidt, Linux Kernel Mailing List, Jens Axboe

On Sul, 2004-08-22 at 17:33, Christer Weinigel wrote:
> /me keeping to the bad habit of following up to myself
> 
> Regarding the current 2.6.8 kernel, wouldn't it be a better idea to
> move the CAP_SYS_RAWIO check to open time instead of when the ioctl is
> called?  This would require a new flag somewhere in the file structure
> I suppose, e.g. file->f_mode & FMODE_RAWIO.  

This leads to all sorts of bugs where descriptors owned by one process
are given to another less privileged one. In the networking world
similar logic led to holes because rsh for example gave root-opened fds
to users.

Alan


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:00                           ` Christer Weinigel
@ 2004-08-22 16:32                             ` Joerg Schilling
  2004-08-22 17:18                               ` Christer Weinigel
                                                 ` (3 more replies)
  2004-08-22 16:33                             ` Christer Weinigel
                                               ` (2 subsequent siblings)
  3 siblings, 4 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 16:32 UTC (permalink / raw)
  To: der.eremit, christer; +Cc: schilling, linux-kernel, axboe

Christer Weinigel <christer@weinigel.se> wrote:

> Pascal Schmidt <der.eremit@email.de> writes:
>
> > I would even go as far as *not* to have that mean "you can also
> > read/write via SG_IO", because for normal uses of the device,
> > read(2) and write(2) should be enough.
>
> Ripping a CD is in my opinion a normal use of a CD.

But in order to rip an audio CD, you need to use e.g. MODE SELECT.
If you start to distinguish safe SCSI commands from possibly unsafe ones,
then MODE SELECT could not be in the list of safe ones.

> > On Sun, 22 Aug 2004, Joerg Schilling wrote:
> > > There are several SCSI commands that look safe but would result in coasters
> > > if issued while a CD or DVD is written.
> > 
> > Good point.
>
> Not really, if I have write permission to a CD burner, being able to
> burn a coaster by issuing strange commands is something I expect.
> Being able to destroy the firmware of the drive is not something I
> expect a normal user to be able to do.

At SCSI level, there is no real difference.


> There are at least three conflicting goals here:
>
> 1. Only someone with CAP_SYS_RAWIO (i.e. root) should be able to do
>    possible destructive things to a device, and only root should be
>    able to bypass the normal security checks in the kernel (e.g. get
>    access to /dev/mem since access to it means that you can read and
>    modify internal kernel structures).

A powerful CD/DVD recording program needs to sometimes issue "secret"
and vendor unique SCSI commands in order to give nice features.

On a Plextor drive, you need to be able to issue a vendor unique SCSI command
to know the recommended write speed for a specific medium. A SCSI command
from the same list of vendor unique commands allows you to tell the drive to
read any medium at 52x. This could destroy the medium _and_ the drive.

As you see: you cannot have the needed knowledge inside the kernel.


> 2. A Linux system should have as few suid root binaries as possible.

If you want this completely, you would need to implement something like RBAC
and getppriv(2)/setppriv(2) as on Solaris. If you have this, you have zero suid
root binaries on a 'Trusted OS' and one suid binary (/usr/bin/pfexec) on a
non-Trusted system.

> 3. A normal user should be able to perform most tasks without needing
>    root.

Doable if my remark on 2) has been implemented.

> As you said, since the old kernel behaviour is a gaping security hole,
> Linus had no other choice than to add a CAP_SYS_RAWIO check to the
> SG_IO call.  This fulfills goal 1.  Unfortunately it breaks just about

Not true: a simple check like in my scg driver:

        /* 
         * Must have read/write access to /dev/scgxx 
         * to send commands over SCSI Bus. 
         */ 
        if ((flag&(FREAD|FWRITE)) != (FREAD|FWRITE)) 
                return (EACCES); 

was sufficient.
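
A rough userspace analogue of the check above, given only as a sketch:
FREAD and FWRITE are flags seen by the driver, so this version asks
fcntl(2) for the access mode of an already open descriptor instead.
The function name require_read_write is an assumption made for this
example and is not part of the scg driver.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Return 0 if fd was opened for both reading and writing,
 * -EACCES if it was opened read-only or write-only,
 * or a negated errno value if fcntl() itself fails.
 */
int require_read_write(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -errno;
    if ((flags & O_ACCMODE) != O_RDWR)
        return -EACCES;
    return 0;
}
```

With this rule a descriptor opened O_RDONLY is refused, so a program
that only opens the device for reading can no longer send commands.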

> every application that expects to be able to send raw SCSI commands
> without being root.
>
> There are a couple of ways of fulfilling goal 3 and allowing normal
> users to burn a CDR:
>
> One is to make cdrecord suid root and then make it drop all
> capabilities except for CAP_SYS_RAWIO.  But even if cdrecord is
> audited, there are a lot of other applications that need to be able to
> send raw SCSI commands such as mt (to change the compression or tape
> format of a streamer).  And this violates goal 2: every security guide
> I've seen lately recommends minimizing the number of suid binaries,
> not adding more.

A better way is to have services like this in /usr/bin/pfexec that
do the security-related parts before calling the other binaries.

BTW: 'mt' should not need to send SCSI commands. This should be handled via
specialized ioctls.


> I think Joerg is being much too harsh: adding a check for
> CAP_SYS_RAWIO fixes a bloody large security hole.  It broke a few
> applications, but tough shit, that is what happens every now and then

With checking for ((flag&(FREAD|FWRITE)) != (FREAD|FWRITE)) fewer applications
would break.

> when plugging security holes.  It would be much worse to leave the
> hole open.  The timing may coincide badly with the release cycle of
> cdrecord, but that's life.  For now users will have to run cdrecord as
> root to be able to burn a CDR.

The result will be that users "find solutions" that are less secure than
if only a check for ((flag&(FREAD|FWRITE)) != (FREAD|FWRITE)) had been
introduced, with root being required _later_ (in agreement with prominent
applications) when issuing SCSI commands.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:00                           ` Christer Weinigel
  2004-08-22 16:32                             ` Joerg Schilling
@ 2004-08-22 16:33                             ` Christer Weinigel
  2004-08-22 16:19                               ` Alan Cox
  2004-08-22 19:26                             ` Tonnerre
  2004-08-31 22:22                             ` (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices) John Myers
  3 siblings, 1 reply; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 16:33 UTC (permalink / raw)
  To: Christer Weinigel
  Cc: Pascal Schmidt, Joerg Schilling, linux-kernel, Jens Axboe

/me keeping to the bad habit of following up to myself

Regarding the current 2.6.8 kernel, wouldn't it be a better idea to
move the CAP_SYS_RAWIO check to open time instead of when the ioctl is
called?  This would require a new flag somewhere in the file structure
I suppose, e.g. file->f_mode & FMODE_RAWIO.  

That would allow a suid root application to open the cdrom and then
drop all capabilities including RAWIO and would probably fit better
into how cdrecord expects things to work.
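
A toy model of the two policies, written as ordinary userspace C and
not as kernel code.  Every name in it (process, open_file, model_open,
sg_io_allowed_call_time, sg_io_allowed_open_time) is hypothetical; the
rawio_at_open field only plays the role of the suggested FMODE_RAWIO
bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A process, reduced to the one capability that matters here. */
struct process {
    bool cap_sys_rawio;
};

/* An open file description, reduced to the proposed open-time flag. */
struct open_file {
    bool rawio_at_open;
};

/* Open-time policy: remember whether the opener held the capability. */
struct open_file model_open(const struct process *opener)
{
    assert(opener != NULL);
    struct open_file f = { .rawio_at_open = opener->cap_sys_rawio };
    return f;
}

/* Call-time policy: whoever issues the ioctl must hold the capability. */
bool sg_io_allowed_call_time(const struct process *caller)
{
    return caller != NULL && caller->cap_sys_rawio;
}

/* Open-time policy: only the flag stored in the file is consulted. */
bool sg_io_allowed_open_time(const struct open_file *f)
{
    return f != NULL && f->rawio_at_open;
}
```

In this model a program that opens the device as root and then drops
the capability keeps working under the open-time policy, and so does
any other process that later ends up holding the same descriptor.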

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 13:05                             ` Joerg Schilling
@ 2004-08-22 16:38                               ` Horst von Brand
  0 siblings, 0 replies; 87+ messages in thread
From: Horst von Brand @ 2004-08-22 16:38 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: diablod3, linux-kernel, der.eremit

Joerg Schilling <schilling@fokus.fraunhofer.de> said:
> Patrick McFarland <diablod3@gmail.com> wrote:
> > On Sun, 22 Aug 2004 14:14:08 +0200, Joerg Schilling
> > <schilling@fokus.fraunhofer.de> wrote:
> > > Everybody makes mistakes. Not being able to admit that and persisting in
> > > going in the wrong direction is the real problem.
> >  
> > Yes, everyone does. Yours was flaming kernel developers over the lkml
> > about bugs in your own program; yet, you do not admit to this, and
> > continue to piss everyone off.
> 
> You seem to be unable to distinguish between cause and effect.

Exactly right.

> 	Some people on the Linux kernel ML did start to flame me while I was
> 	trying to do my best to give technically based explanations.
> 
> As it has been proven that there _are_ reasonable people on LKML, it would 
> help LKML to regain credibility if they could try to do some self cleaning
> and find a way to calm down the non-serious people.

Banning you from the list would go a long way, true; but I oppose such
measures as a matter of principle. Better to try to convince the people
who are out of line to do their own soul searching. Hasn't worked so far,
sadly.

> You also seem to be unable to judge where bugs are located while looking at 
> problems.
> 
> 	It seems that we just agreed with the reasonable members of LKML
> 	that there was and still is a security related bug in Linux.
> 	The "fix" that was used in the hope of removing the security problems
> 	just created new problems instead of removing old ones.
> 
> If you have nothing useful to say, please stay quiet.

There was a security problem, I think all agree on that. LKML says any
security problem has to be fixed ASAP, especially if it is well known and
easy to exploit. You say backward compatibility is more important. The
people in charge of the kernel are the ones who decide what to do, in this
case they overwhelmingly decided against you. Tough luck.

[Yes, I fully expect you to tell me this is not useful. Perhaps it isn't.
 But continuing to pour gas on the flames (as you are so fond of doing)
 doesn't help either.]
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:32                             ` Joerg Schilling
@ 2004-08-22 17:18                               ` Christer Weinigel
  2004-08-22 19:22                                 ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
  2004-08-22 20:27                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Giuseppe Bilotta
                                                 ` (2 subsequent siblings)
  3 siblings, 1 reply; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 17:18 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: der.eremit, christer, linux-kernel, axboe

Joerg Schilling <schilling@fokus.fraunhofer.de> writes:
> But in order to rip an audio CD, you need to use e.g. MODE SELECT.
> If you start to distinguish safe SCSI commands from possibly unsafe ones, then 
> MODE SELECT could not be in the list of safe ones.

Yes, I'm quite aware of that.  

So a filter would have to be smarter than just checking the command
codes.  There would have to be a special case for the mode page
commands which filters on accessible mode pages.

Are there any other commands that would need filtering at a finer
grain than the command level?

Additionally, another thing that is really needed is to match
the different variants of hdr->dxfer_direction against the direction
of the commands, otherwise one could ask for a REQUEST_SENSE but with
a direction of SG_DXFER_TO_DEV.  This isn't a security problem in the
sense that it can destroy the drive itself, but it might hang the
IDE state machine in the kernel, motherboard or drive.  
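
A minimal sketch of such a direction check.  The DIR_* values mirror
the SG_DXFER_* constants from <scsi/sg.h> (-1 none, -2 to device,
-3 from device); DIR_NOT_LISTED, expected_direction and
direction_matches are names invented for this example, and the opcode
table is deliberately tiny and only illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    DIR_NOT_LISTED = 0,   /* opcode unknown to this table (made up)  */
    DIR_NONE       = -1,  /* same value as SG_DXFER_NONE             */
    DIR_TO_DEV     = -2,  /* same value as SG_DXFER_TO_DEV           */
    DIR_FROM_DEV   = -3   /* same value as SG_DXFER_FROM_DEV         */
};

/* Data direction implied by the command itself. */
int expected_direction(uint8_t opcode)
{
    switch (opcode) {
    case 0x00: return DIR_NONE;      /* TEST UNIT READY */
    case 0x03: return DIR_FROM_DEV;  /* REQUEST SENSE   */
    case 0x12: return DIR_FROM_DEV;  /* INQUIRY         */
    case 0x15: return DIR_TO_DEV;    /* MODE SELECT(6)  */
    case 0x1a: return DIR_FROM_DEV;  /* MODE SENSE(6)   */
    case 0x28: return DIR_FROM_DEV;  /* READ(10)        */
    case 0x2a: return DIR_TO_DEV;    /* WRITE(10)       */
    default:   return DIR_NOT_LISTED;
    }
}

/*
 * Accept a request only if the opcode is in the table and the
 * direction claimed by the caller agrees with the command.
 */
bool direction_matches(uint8_t opcode, int dxfer_direction)
{
    int want = expected_direction(opcode);

    return want != DIR_NOT_LISTED && want == dxfer_direction;
}
```

A REQUEST SENSE (0x03) submitted with a to-device direction is then
rejected before it reaches the drive.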

> > Not really, if I have write permission to a CD burner, being able to
> > burn a coaster by issuing strange commands is something I expect.
> > Being able to destroy the firmware of the drive is not something I
> > expect a normal user to be able to do.
> 
> At SCSI level, there is no real difference.

SG_IO does not have to work at the SCSI level, it can filter the
commands at a higher level.

> A powerful CD/DVD recording program needs to sometimes issue "secret"
> and vendor unique SCSI commands in order to give nice features.
> 
> On a Plextor drive, you need to be able to issue a vendor unique SCSI command
> to know the recommended write speed for a specific medium. A SCSI command
> from the same list of vendor unique commands allows you to tell the drive to read
> any medium at 52x. This could destroy the medium _and_ the drive.
> 
> As you see: you cannot have the needed knowledge inside the kernel.

So guess why I suggested that the kernel should contain the mechanics
to filter commands (and yes, I was aware of the mode page problems but
didn't want to make a long mail even longer), and that the list of
commands would be uploaded to the kernel from userspace.  It was at
the end of the mail you replied to...

That way an application, such as cdrecord, could keep a list of safe
commands for each device and use the appropriate list for each kind of
device.  If the device is a tape, allow access to the mode page that
can control BPI and compression settings.  If it's a cdrom, allow
access to the mode page with CDDA settings.
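
A sketch of what filtering at the mode page level could look like for
MODE SELECT(6), under these assumptions: the parameter list starts
with a 4-byte header whose last byte is the block descriptor length,
and each page that follows carries its page code in bits 0-5 of byte 0
and its length in byte 1.  The function name and the idea of a
caller-supplied allow list are made up for this example; which page
codes are actually safe for a given device is exactly the knowledge
that would have to come from userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Walk a MODE SELECT(6) parameter list and return true only if every
 * mode page in it appears in allowed[0..n_allowed-1].  Malformed or
 * truncated lists are rejected.
 */
bool mode_select6_pages_allowed(const uint8_t *data, size_t len,
                                const uint8_t *allowed, size_t n_allowed)
{
    if (data == NULL || len < 4)
        return false;

    size_t pos = 4 + (size_t)data[3];   /* skip header + block descriptors */
    if (pos > len)
        return false;

    while (pos < len) {
        if (len - pos < 2)
            return false;               /* truncated page header */
        if (data[pos] & 0x40)
            return false;               /* sub-page format: not handled here */

        uint8_t code = data[pos] & 0x3f;
        size_t page_len = 2 + (size_t)data[pos + 1];
        if (page_len > len - pos)
            return false;               /* page runs past the buffer */

        bool ok = false;
        for (size_t i = 0; i < n_allowed; i++)
            if (allowed != NULL && allowed[i] == code)
                ok = true;
        if (!ok)
            return false;

        pos += page_len;
    }
    return true;
}
```

This only looks at page codes.  Controlling individual bits inside a
page would need a mask per page on top of it.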

Of course, if it isn't possible to do this at the mode page level, the
access controls might have to work at the level of individual bits in
a mode page, and then it gets trickier.  It might or might not be
feasible to implement such a filter.  I don't know which is the case;
I'm just trying to look at ways of solving the problem.

> > 2. A Linux system should have as few suid root binaries as possible.
> 
> If you want this completely, you would need to implement something like RBAC
> and getppriv(2)/setppriv(2) as on Solaris. If you have this, you have zero suid
> root binaries on a 'Trusted OS' and one suid binary (/usr/bin/pfexec) on a
> non-Trusted system.

Which is not all that different from suid binaries.  Instead of
trusting an application, you're trusting a user or a role.  This isn't
much different from giving "trusted" users access to /dev/scd0.  

> > As you said, since the old kernel behaviour is a gaping security hole,
> > Linus had no other choice than to add a CAP_SYS_RAWIO check to the
> > SG_IO call.  This fulfills goal 1.  Unfortunately it breaks just about
> 
> Not true: a simple check like in my scg driver:
> 
>         /* 
>          * Must have read/write access to /dev/scgxx 
>          * to send commands over SCSI Bus. 
>          */ 
>         if ((flag&(FREAD|FWRITE)) != (FREAD|FWRITE)) 
>                 return (EACCES); 
>
> was sufficient.

No.  Sure, you can redefine read access to a SCSI device to mean "may
only use normal read" and write access to "may use read, write and
send raw SCSI commands", but that is a rather bad fit to how
read/write normally are used.

What do you do if you want to allow users with read access to
read a SCSI tape (and to be able to select the BPI)?  With your
suggestion, the user will need write access too, but I may just want
to give the user read access.

> BTW: 'mt' should not need to send SCSI commands. This should be handled via
> specialized ioctls.

So why can't cdrecord use specialized ioctls then?

> With checking for ((flag&(FREAD|FWRITE)) != (FREAD|FWRITE)) fewer applications
> would break.

cdrecord probably wouldn't break.  Other applications that open
/dev/scd0 as readonly would break.  cdrecord isn't the only
application in the world you know.

The Linux philosophy is "do it right".  And when Linus has been
changing interfaces he has said that he prefers something to break
noisily (not compile) rather than to get compile fixes that leave the
bugs still in there.

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:19                               ` Alan Cox
@ 2004-08-22 17:31                                 ` Christer Weinigel
  2004-08-22 20:47                                   ` Alan Cox
  2004-08-23 12:22                                 ` Adam Sampson
  1 sibling, 1 reply; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 17:31 UTC (permalink / raw)
  To: Alan Cox
  Cc: Christer Weinigel, Pascal Schmidt, Linux Kernel Mailing List, Jens Axboe

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> On Sul, 2004-08-22 at 17:33, Christer Weinigel wrote:
> > Regarding the current 2.6.8 kernel, wouldn't it be a better idea to
> > move the CAP_SYS_RAWIO check to open time instead of when the ioctl is
> > called?  This would require a new flag somewhere in the file structure
> > I suppose, e.g. file->f_mode & FMODE_RAWIO.  
> 
> This leads to all sorts of bugs where descriptors owned by one process
> are given to another less privileged one. In the networking world
> similar logic led to holes because rsh for example gave root-opened fds
> to users.

On the other hand a bug in my favourite cd burner application could
give away CAP_SYS_RAWIO instead, and I think that is even worse.

Besides, checking CAP_SYS_RAWIO at open time is the way /dev/mem
works.  OTOH applications don't normally hand over /dev/mem to other
applications I suppose.

I'm just tossing ideas around, please ignore me if they are stupid :-)

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 15:11                           ` Horst von Brand
@ 2004-08-22 18:09                             ` Matthias Andree
  0 siblings, 0 replies; 87+ messages in thread
From: Matthias Andree @ 2004-08-22 18:09 UTC (permalink / raw)
  To: Horst von Brand; +Cc: Linux-Kernel mailing list

On Sun, 22 Aug 2004, Horst von Brand wrote:

> In the end, I'd only say that I've been on LKML for a long, long time
> (since it started, more or less). And each single time the head hackers
> agreed on something, and there was a single dissenter, the dissenter was in
> the wrong. Sure, this time could be different, but I have seen absolutely
> no (yes, _no_) evidence here to the contrary.

There _are_ cases where a kernel patch sneaked to a subsystem maintainer
has made it even when some of the "heads" said it was impossible.

The key is convincing a subsystem maintainer that the patch helps and
doesn't hurt. And that doesn't work with a rant and can sometimes take a
kernel patch to show how it works.  A decent patch with an even more
decent description works wonders - usually.

-- 
Matthias Andree

NOTE YOU WILL NOT RECEIVE MY MAIL IF YOU'RE USING SPF!
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95 (PGP/MIME preferred)

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 17:18                               ` Christer Weinigel
@ 2004-08-22 19:22                                 ` Joerg Schilling
  0 siblings, 0 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 19:22 UTC (permalink / raw)
  To: linux-kernel

Julien Oster wrote:

>> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
>> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
>> and Linux."

>That article is way too hypey.

The article is way too pessimistic compared to the real possibilities that
DTrace offers.


>The same applies to that article, I couldn't even read it completely,
>it was just too much.

If you did not read it completely, how can you judge it?

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:00                           ` Christer Weinigel
  2004-08-22 16:32                             ` Joerg Schilling
  2004-08-22 16:33                             ` Christer Weinigel
@ 2004-08-22 19:26                             ` Tonnerre
  2004-08-22 20:14                               ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
  2004-08-23 20:25                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Bill Davidsen
  2004-08-31 22:22                             ` (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices) John Myers
  3 siblings, 2 replies; 87+ messages in thread
From: Tonnerre @ 2004-08-22 19:26 UTC (permalink / raw)
  To: Christer Weinigel
  Cc: Pascal Schmidt, Joerg Schilling, linux-kernel, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 498 bytes --]

Salut,

On Sun, Aug 22, 2004 at 06:00:01PM +0200, Christer Weinigel wrote:
>     If you want to be able to run snazzycdwriter(tm) as a normal user,
>     add the following command to your rc.local file:
> 
>         /sbin/install-scsi-filter /dev/hdc snazzycdwriter.filter

Well, for that it might be  a nice feature to register and delete such
filters  online, using  a  register/remove_scsi_filter interface,  but
well, otoh that might be undesirable security-wise.

			    Tonnerre

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 19:26                             ` Tonnerre
@ 2004-08-22 20:14                               ` Joerg Schilling
  2004-08-22 20:33                                 ` Tonnerre
  2004-08-23 17:40                                 ` Horst von Brand
  2004-08-23 20:25                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Bill Davidsen
  1 sibling, 2 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 20:14 UTC (permalink / raw)
  To: linux-kernel

Alan Cox wrote:

>> In Solaris DTrace is enabled in _normal production_ kernel and you can 
>> hang any probe or probe set without restarting the system or any running
>> application, even one which was compiled without debug info.
>
>Solaris only runs on large computers. You don't want kprobes randomly on
>your phone, pda, wireless router. Solaris deals with an extremely narrow
>market segment of "big computers for people with lots of money".
...
>> http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
>> Bryan blog is also yet another Dtrace knowledge source ..
>
>Coo I thought only the Sun CEO spent his life making inappropriate
>comments 8)

It seems that Alan does not like to let a single day pass without degrading
his credibility :-(

A fact based discussion looks different...

-	What is a "large computer"?

-	What is an "extremely narrow market segment"?
	What is the evidence of this statement compared to Linux?

-	What are the minimum requirements for a machine to run Linux?

-	What are the minimum requirements for a machine to run Solaris?

People who cannot answer these questions should not try to start mad
speculations on derived conclusions.

The size of the loadable dtrace module is ~ 100 kB, this is nothing bad even 
for small appliances these days.

Guess what Bryan Cantrill is running on his notebook?

Guess what machine Bryan is using to run dtrace demos at shows?

And hey, Bryan is even able to make a 4 hour demo within a single hour on this
machine ;-)

Dtrace is a powerful idea that gives unbelievable new opportunities to
developers, sysadmins and users.


Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:32                             ` Joerg Schilling
  2004-08-22 17:18                               ` Christer Weinigel
@ 2004-08-22 20:27                               ` Giuseppe Bilotta
  2004-08-22 21:29                               ` Julien Oster
  2004-08-23 18:16                               ` Kai Makisara
  3 siblings, 0 replies; 87+ messages in thread
From: Giuseppe Bilotta @ 2004-08-22 20:27 UTC (permalink / raw)
  To: linux-kernel

Joerg Schilling wrote:
> A powerful CD/DVD recording program needs to sometimes issue "secret"
> and vendor unique SCSI commands in order to give nice features.
> 
> On a Plextor drive, you need to be able to issue a vendor unique SCSI command
> to know the recommended write speed for a specific medium. A SCSI command
> from same list of vendor unique commands allows you to tell the drive to read
> any medium at 52x. This could destroy the medium _and_ the drive.
> 
> As you see: you cannot have the needed knowledge inside the kernel.

Actually I was wondering about this exactly: why shouldn't this 
knowledge be built into the kernel? IMO it should be. Isn't the 
kernel purpose to do that, among other things? HAL?

-- 
Giuseppe "Oblomov" Bilotta

Can't you see
It all makes perfect sense
Expressed in dollar and cents
Pounds shillings and pence
                  (Roger Waters)


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 20:14                               ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
@ 2004-08-22 20:33                                 ` Tonnerre
  2004-08-22 20:38                                   ` Alan Cox
  2004-08-22 20:43                                   ` Joerg Schilling
  2004-08-23 17:40                                 ` Horst von Brand
  1 sibling, 2 replies; 87+ messages in thread
From: Tonnerre @ 2004-08-22 20:33 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1320 bytes --]

Salut,

On Sun, Aug 22, 2004 at 10:14:12PM +0200, Joerg Schilling wrote:
> -	What is a "large computer"?

I'd refer to a computer as a large computer when its calculation power
is several times  larger than the one home computers  of the same time
have (or if  it's a really large machine, such  as the Honeywell DDPs,
but that's another dimension of "large").

> -	What is an "extremely narrow market segment"?

A market  segment not  reaching vast  portions  of all customers  in a
market. Think of catfood.

> 	What is the evidence of this statement compared to Linux?

Linux is  actually widely employed  in the home computer  *and* server
market,  plus embedded  devices.  I  mean, did  you  ever see  someone
running Solaris on their video decoder?

> -	What are the minimum requirements for a machine to run Linux?

Intel 8086  processor with  a few KB  of RAM,  with a floppy  drive, a
monitor and a floppy, I think. If you take only the normal kernel into
account that will be an 80386 processor.

> -	What are the minimum requirements for a machine to run Solaris?

At least more RAM and a more capable processor.

> And hey, Bryan is even able to make a 4 hour demo within a single hour on this 
> machine ;-)

Greeeat. I can do that too on my Powerbook G5.

				Tonnerre

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 20:33                                 ` Tonnerre
@ 2004-08-22 20:38                                   ` Alan Cox
  2004-08-22 20:43                                   ` Joerg Schilling
  1 sibling, 0 replies; 87+ messages in thread
From: Alan Cox @ 2004-08-22 20:38 UTC (permalink / raw)
  To: Tonnerre; +Cc: Joerg Schilling, Linux Kernel Mailing List

On Sul, 2004-08-22 at 21:33, Tonnerre wrote:
> > -	What are the minimum requirements for a machine to run Linux?
> 
> Intel 8086  processor with  a few KB  of RAM,  with a floppy  drive, a
> monitor and a floppy, I think. If you take only the normal kernel into
> account that will be an 80386 processor.

Minimum for an x86 kernel is about 2Mb and 386 CPU. The 8086 subset
kernel isn't really "Linux", it's more an escaped insanity. For non x86
you need a bottom end mmuless 32bit processor and a couple of Mb.

There are folks driving the size down (the -tiny patches) because
2Mb for the entire system is still too large for some users.

Alan


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 20:33                                 ` Tonnerre
  2004-08-22 20:38                                   ` Alan Cox
@ 2004-08-22 20:43                                   ` Joerg Schilling
  2004-08-22 21:37                                     ` Christer Weinigel
  1 sibling, 1 reply; 87+ messages in thread
From: Joerg Schilling @ 2004-08-22 20:43 UTC (permalink / raw)
  To: tonnerre, schilling; +Cc: linux-kernel

Tonnerre <tonnerre@thundrix.ch> wrote:

> > -	What are the minimum requirements for a machine to run Linux?
>
> Intel 8086  processor with  a few KB  of RAM,  with a floppy  drive, a
> monitor and a floppy, I think. If you take only the normal kernel into
> account that will be an 80386 processor.

A few k ?????

> > -	What are the minimum requirements for a machine to run Solaris?
>
> At least more RAM and a more capable processor.

Looks like speculation.

> > And hey, Bryan is even able to make a 4 hour demo within a single hour on this 
> > machine ;-)
>
> Greeeat. I can do that too on my Powerbook G5.

Can you do it by typing in _all_ commands and dtrace programs in real time?

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 17:31                                 ` Christer Weinigel
@ 2004-08-22 20:47                                   ` Alan Cox
  2004-08-22 22:17                                     ` Christer Weinigel
  0 siblings, 1 reply; 87+ messages in thread
From: Alan Cox @ 2004-08-22 20:47 UTC (permalink / raw)
  To: Christer Weinigel; +Cc: Pascal Schmidt, Linux Kernel Mailing List, Jens Axboe

On Sun, 2004-08-22 at 18:31, Christer Weinigel wrote:
> On the other hand a bug in my favourite cd burner application could
> give away CAP_SYS_RAWIO instead, and I think that is even worse.

It's not an easy trade-off; I don't know if there is a right answer.
Despite the UI problems in both cdrecord and its author, the internal
code is actually quite rigorous, so it's something I'd be more comfortable
giving limited rawio access to than quite a few other apps that touch
external public data.

Alan



* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 13:13                         ` Pascal Schmidt
  2004-08-22 16:00                           ` Christer Weinigel
@ 2004-08-22 21:27                           ` Julien Oster
  1 sibling, 0 replies; 87+ messages in thread
From: Julien Oster @ 2004-08-22 21:27 UTC (permalink / raw)
  To: Pascal Schmidt; +Cc: Joerg Schilling, linux-kernel, Jens Axboe

Pascal Schmidt <der.eremit@email.de> writes:

Hello Pascal,

> The open question is whether write permission really is meaningful
> enough to allow arbitrary SCSI commands. I personally think "being
> able to wipe the drive firmware" is too much, and since filtering
> of vendor commands is generally impossible to do right, sending SG_IO
> should require CAP_SYS_RAWIO capability.

But what about the following (the first 3 points are already
familiar):

1. require read permission to do read()
2. require write permission to do write()
3. require CAP_SYS_RAWIO to do SG_IO
4. insert an initially blank (i.e. "drop everything") userspace
   controllable filter which allows the administrator to specify
   allowed SG_IO commands to the kernel at any time

That way there is no security problem, CD burning as root or generally
with CAP_SYS_RAWIO is always possible *and* admins are able to submit a
list of allowed commands to the kernel, so that CD burning as user is
possible again. This list might be specific to the CD writer hardware,
as we learned that some drives require vendor specific commands.

Prewritten filter lists for specific hardware can be published on
the internet or even be submitted by cdrecord or other burning software,
e.g. with a switch "--install-filter" run as root.

The filters should be separate for each SCSI device, so that you won't
enable dangerous commands on harddisk partitions when you just wanted
to enable CD burning.

If nobody else volunteers, I'll see if I can prepare a patch. I guess
sysfs is the right place for the userspace interface to the filters?
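
A minimal sketch of the filter logic described in the four points above,
written in Python for readability rather than as kernel code. The class
name SgioFilter, its method names and the device paths are hypothetical
and appear nowhere in this thread; the opcode values (0x2A WRITE(10),
0x3B WRITE BUFFER) are the standard SCSI ones. The filter starts empty,
is kept per device, and is bypassed entirely by CAP_SYS_RAWIO:

```python
class SgioFilter:
    """Per-device allow list of SCSI opcodes (hypothetical sketch).

    An empty filter drops every command, matching the proposed
    "initially blank" state at boot time.
    """

    def __init__(self):
        # device path -> set of allowed opcodes (first CDB byte)
        self._allowed = {}

    def allow(self, device, opcode):
        """Administrator action: open up one opcode for one device."""
        self._allowed.setdefault(device, set()).add(opcode)

    def permits(self, device, cdb, has_rawio=False):
        """Decide whether an SG_IO request with this CDB may pass."""
        if has_rawio:
            # CAP_SYS_RAWIO enables everything, as in point 3.
            return True
        if not cdb:
            return False
        return cdb[0] in self._allowed.get(device, set())


WRITE_10 = 0x2A       # standard SCSI WRITE(10) opcode
WRITE_BUFFER = 0x3B   # standard SCSI WRITE BUFFER opcode (firmware download)

sgio_filter = SgioFilter()
sgio_filter.allow("/dev/hdc", WRITE_10)
print(sgio_filter.permits("/dev/hdc", bytes([WRITE_10])))
print(sgio_filter.permits("/dev/hdc", bytes([WRITE_BUFFER])))
```

Keeping the table keyed by device is what stops a rule meant for the CD
writer from also enabling the same opcode on a hard disk.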

Regards,
Julien


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:32                             ` Joerg Schilling
  2004-08-22 17:18                               ` Christer Weinigel
  2004-08-22 20:27                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Giuseppe Bilotta
@ 2004-08-22 21:29                               ` Julien Oster
  2004-08-23 11:40                                 ` Joerg Schilling
  2004-08-23 18:16                               ` Kai Makisara
  3 siblings, 1 reply; 87+ messages in thread
From: Julien Oster @ 2004-08-22 21:29 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: der.eremit, christer, linux-kernel, axboe

Joerg Schilling <schilling@fokus.fraunhofer.de> writes:

> But in order to rip an audio CD, you need to use e.g. MODE SELECT.
> If you start to distinguish safe SCSI commands from possibly unsafe ones, then 
> MODE SELECT could not be in the list of safe ones.

That is why I'm proposing an empty filter at boot time, which allows
no SG_IO except when having CAP_SYS_RAWIO (which enables everything)
and the possibility to open up certain commands from userspace later.

Julien


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 20:43                                   ` Joerg Schilling
@ 2004-08-22 21:37                                     ` Christer Weinigel
  2004-08-23 11:44                                       ` Joerg Schilling
  0 siblings, 1 reply; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 21:37 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: tonnerre, linux-kernel

Joerg Schilling <schilling@fokus.fraunhofer.de> writes:

> Tonnerre <tonnerre@thundrix.ch> wrote:
> 
> > > -	What are the minimum requirements for a machine to run Linux?
> >
> > Intel 8086  processor with  a few ko  of RAM,  with a floppy  drive, a
> > monitor and a floppy, I think. If you take only the normal kernel into
> > account that will be an 80386 processor.
> 
> A few k ?????

It depends on your definition of "a few k" :-)

    http://elks.sourceforge.net/

It will run fine on an 8086 with 512 kBytes of RAM, but it's possible
to get by with as little as 200 kBytes of RAM.

I work with embedded Linux systems and the standard configuration for
the stuff I do is with a small embedded processor such as the Motorola
MPC860 or the Axis Etrax 100 (about as fast as an i486) and 8MByte of
RAM and 4MByte of flash.  It's really no problem running in 2MByte of
RAM and 2MByte of flash but then the system really just does one thing
such as initializing a routing table and then routing data back and
forth.  To be able to get OpenSSL running in there and so on I really
need 8MByte of RAM.

> > > -	What are the minimum requirements for a machine to run Solaris?
> >
> > At least more RAM and a more capable processor.
> 
> Looks like a speculation. 

Well, I think Solaris is still supported on my SPARCclassic, but I
really really wouldn't like to try it with only 8MByte of RAM.  

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 20:47                                   ` Alan Cox
@ 2004-08-22 22:17                                     ` Christer Weinigel
  0 siblings, 0 replies; 87+ messages in thread
From: Christer Weinigel @ 2004-08-22 22:17 UTC (permalink / raw)
  To: Alan Cox
  Cc: Christer Weinigel, Pascal Schmidt, Linux Kernel Mailing List, Jens Axboe

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

> It's not an easy trade-off; I don't know if there is a right answer.
> Despite the UI problems in both cdrecord and its author, the internal
> code is actually quite rigorous, so it's something I'd be more comfortable
> giving limited rawio access to than quite a few other apps that touch
> external public data.

Another way would be to add a scsi ioctl such as ENABLE_SG_IO or an
open flag, e.g. open("/dev/hdc", ... | O_RAWIO) which needs
CAP_SYS_RAWIO.  That way it is much less likely that the RAWIO
permission is given away by mistake, but I must admit that it feels
kind of ugly.

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 21:29                               ` Julien Oster
@ 2004-08-23 11:40                                 ` Joerg Schilling
  2004-08-23 13:15                                   ` Matthias Andree
  0 siblings, 1 reply; 87+ messages in thread
From: Joerg Schilling @ 2004-08-23 11:40 UTC (permalink / raw)
  To: schilling, lkml-7994; +Cc: linux-kernel, der.eremit, christer, axboe

Julien Oster <lkml-7994@mc.frodoid.org> wrote:

> Joerg Schilling <schilling@fokus.fraunhofer.de> writes:
>
> > But in order to rip an audio CD, you need to use e.g. MODE SELECT.
> > If you start to distinguish safe SCSI commands from possibly unsafe ones, then 
> > MODE SELECT could not be in the list of safe ones.
>
> That is why I'm proposing an empty filter at boot time, which allows
> no SG_IO except when having CAP_SYS_RAWIO (which enables everything)
> and the possibility to open up certain commands from userspace later.

If the related /dev/* nodes are owned by root and set up rw-r--r-- (or
stricter for others), and write access is required to send SCSI commands,
then you get the same kind of authentication, but cdrecord would continue
to work.

Only if someone chowned the related /dev/* nodes to a user different from
root would there be a difference.

P.S.: UNIX philosophy is to allow the administrator to set up bad/wrong permissions.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 21:37                                     ` Christer Weinigel
@ 2004-08-23 11:44                                       ` Joerg Schilling
  0 siblings, 0 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-23 11:44 UTC (permalink / raw)
  To: schilling, christer; +Cc: tonnerre, linux-kernel

Christer Weinigel <christer@weinigel.se> wrote:

> It depends on your definition of "a few k" :-)
>
>     http://elks.sourceforge.net/
>
> It will run fine on an 8086 with 512 kBytes of RAM, but it's possible
> to get by with as little as 200 kBytes of RAM.

But this would not be a UNIX system... (see my other mail).

> I work with embedded Linux systems and the standard configuration for
> the stuff I do is with a small embedded processor such as the Motorola
> MPC860 or the Axis Etrax 100 (about as fast as an i486) and 8MByte of
> RAM and 4MByte of flash.  It's really no problem running in 2MByte of
> RAM and 2MByte of flash but then the system really just does one thing
> such as initializing a routing table and then routing data back and
> forth.  To be able to get OpenSSL running in there and so on I really
> need 8MByte of RAM.

If you don't try to run fancy stuff (like a GUI), I am sure that Solaris
will run with a machine that has something between 2 and 4 MB of RAM.

Note that if you design new embedded hardware, you typically think in
units of 16 MB.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:19                               ` Alan Cox
  2004-08-22 17:31                                 ` Christer Weinigel
@ 2004-08-23 12:22                                 ` Adam Sampson
  1 sibling, 0 replies; 87+ messages in thread
From: Adam Sampson @ 2004-08-23 12:22 UTC (permalink / raw)
  To: Alan Cox
  Cc: Christer Weinigel, Pascal Schmidt, Linux Kernel Mailing List, Jens Axboe

Alan Cox <alan@lxorguk.ukuu.org.uk> writes:

>> Regarding the current 2.6.8 kernel, wouldn't it be a better idea to
>> move the CAP_SYS_RAWIO check to open time instead of when the ioctl is
>> called?
> This leads to all sorts of bugs where descriptors owned by one process
> are given to another less privileged one.

Yes, but that's a class of bugs that are pretty well understood these
days; handing privileged FDs around is a moderately common and
pleasantly fine-grained way of doing things. Closing an FD is at least
as easy as dropping a capability, which is what you'd have to do
with the current scheme upon entering unprivileged code.

Besides, setuid CD-recording tools already have to worry about closing
unsafe FDs when they drop privileges, so this doesn't seem to add any
new security holes...

Thanks,

-- 
Adam Sampson <azz@us-lot.org>                        <http://offog.org/>


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 11:40                                 ` Joerg Schilling
@ 2004-08-23 13:15                                   ` Matthias Andree
  0 siblings, 0 replies; 87+ messages in thread
From: Matthias Andree @ 2004-08-23 13:15 UTC (permalink / raw)
  To: Joerg Schilling, Linux-Kernel mailing list

Joerg Schilling wrote on 2004-08-23:

> Only if someone would chown the related /dev/* nodes to a user different from 
> root there would be a difference.

...which actually happens a lot, with the devperm PAM junk that some
distros, particularly desktop/end-user oriented ones, do; SuSE Linux,
for instance, twists device permissions. It is awful for shared
computers in a network.

-- 
Matthias Andree

NOTE YOU WILL NOT RECEIVE MY MAIL IF YOU'RE USING SPF!
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95 (PGP/MIME preferred)


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 20:14                               ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
  2004-08-22 20:33                                 ` Tonnerre
@ 2004-08-23 17:40                                 ` Horst von Brand
  1 sibling, 0 replies; 87+ messages in thread
From: Horst von Brand @ 2004-08-23 17:40 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel

Joerg Schilling <schilling@fokus.fraunhofer.de> said:
> Alan Cox wrote:
> 
> >> In Solaris DTrace is enabled in _normal production_ kernel and you can 
> >> hang any probe or probes set without restarting system or any running
> >> application which was compiled without debug info.
> >
> >Solaris only runs on large computers. You don't want kprobes randomly on
> >your phone, pda, wireless router. Solaris deals with an extremely narrow
> >market segment of "big computers for people with lots of money".
> ...
> >> http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
> >> Bryan blog is also yet another Dtrace knowledge source ..
> >
> >Coo I thought only the Sun CEO spent his life making inappropriate
> >comments 8)
> 
> It seems that Alan does not like to miss a single day to degrade his 
> credibility :-(

Strangely, it doesn't seem to affect his credibility at all. Yours, OTOH...

> A fact based discussion looks different...
> 
> -	What is a "large computer"?

Current Sun Enterprise. Typically several CPUs, several GiB RAM, connected
via fiber to TiB storage array.

> -	What is an "extremely narrow market segment"?

The one for the above machines. Duh...

> 	What is the evidence of this statement compared to Linux?

Millions of machines vs a few tens of thousands?

> -	What are the minimum requirements for a machine to run Linux?

Palm Pilot V or thereabouts.

> -	What are the minimum requirements for a machine to run Solaris?

Out of my league. My Sparc Ultra 1 can't. It is running Linux (Aurora)
quite happily, BTW.

> People who cannot answer these questions should not try to start mad
> speculations on derived conclusions.

Great! Does that mean you will /finally/ shut up?

PS: I do know for a fact that Alan did/does meddle with this kind of hardware.

> The size of the loadable dtrace module is ~ 100 kB, this is nothing bad even 
> for small appliances these days.

Add that, and a lot of other similarly small random junk, and we are soon
talking serious MiBs... Larry McVoy has made it very clear here that
Slowlaris got that bloated way one tiny, unnoticeable, not too relevant
feature at a time.
-- 
Dr. Horst H. von Brand                   User #22616 counter.li.org
Departamento de Informatica                     Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria              +56 32 654239
Casilla 110-V, Valparaiso, Chile                Fax:  +56 32 797513


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 16:32                             ` Joerg Schilling
                                                 ` (2 preceding siblings ...)
  2004-08-22 21:29                               ` Julien Oster
@ 2004-08-23 18:16                               ` Kai Makisara
  2004-08-24 10:22                                 ` Christer Weinigel
  2004-08-24 15:34                                 ` Joerg Schilling
  3 siblings, 2 replies; 87+ messages in thread
From: Kai Makisara @ 2004-08-23 18:16 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: der.eremit, christer, linux-kernel, axboe

On Sun, 22 Aug 2004, Joerg Schilling wrote:

> Christer Weinigel <christer@weinigel.se> wrote:
> 
...
> > One is to make cdrecord suid root and then make it drop all
> > capabilities except for CAP_SYS_RAWIO.  But even if cdrecord is
> > audited, there are a lot of other applications that need to be able to
> > send raw SCSI commands such as mt (to change the compression or tape
> > format of a streamer).  And this violates goal 2, every security guide
> > I've seen lately recommends minimizing the amount of suid binaries,
> > not adding more.
> 
> A better way is to have services like this in /usr/bin/pfexec that 
> do the security related parts before calling the other binaries.
> 
> BTW: 'mt' should not need to send SCSI commands. This should be handled via
> specialized ioctls.
> 
There are already ioctls for changing the tape parameters. Christer, there 
is no need to introduce tapes into this discussion.

-- 
Kai


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-22 19:26                             ` Tonnerre
  2004-08-22 20:14                               ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
@ 2004-08-23 20:25                               ` Bill Davidsen
  2004-08-23 21:01                                 ` Doug Maxey
  2004-08-24  2:22                                 ` Nuno Silva
  1 sibling, 2 replies; 87+ messages in thread
From: Bill Davidsen @ 2004-08-23 20:25 UTC (permalink / raw)
  To: linux-kernel

Tonnerre wrote:

> Well, for that it might be  a nice feature to register and delete such
> filters  online, using  a  register/remove_scsi_filter interface,  but
> well, otoh that might be undesirable security-wise.

Let me throw out two ideas to see if anyone finds them useful.

1 - loadable command filters in the kernel.

Each device could have a filter set, which could be empty to require 
RAWIO capability, or set to a kernel default. Access could be made to 
modify a filter via proc, sysfs, or ioctl. The set method is not 
relevant to the idea.

2 - a filter program.

This one can be done right now, no kernel mod needed. A program with 
appropriate permissions can be started, and will create a command/status 
fifo pair with permissions which allow only programs with group 
permission to open. This allows the admin to put in any filter desired, 
know about vendor commands, etc. It also allows various security setups, 
the group can be on the user (trusted users) or on a setgid program 
(which limits the security issues).

Note that the permissions on individual devices need not be the same; I 
can have one group for disk, another for CD/DVD. You could even be anal 
and have the filter time sensitive, etc.

A "standard" place for the fifos helps portability, /var/sgio/dev/hda 
might be a directory, with fifos command and status.
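
A minimal sketch of how such a filter program might set up its
command/status fifo pair, written in Python for brevity. The function
name create_fifo_pair and the optional gid parameter are hypothetical;
only the "command" and "status" fifo names and the idea of group-only
access come from the text above. The filter process itself (reading
requests from "command", vetting them, and answering on "status") is
left out:

```python
import os


def create_fifo_pair(directory, gid=None):
    """Create 'command' and 'status' fifos readable and writable only by
    the owner and one group (mode 0660). Hypothetical helper."""
    os.makedirs(directory, mode=0o750, exist_ok=True)
    paths = {}
    for name in ("command", "status"):
        path = os.path.join(directory, name)
        os.mkfifo(path)
        # mkfifo() honours the umask, so set the final mode explicitly.
        os.chmod(path, 0o660)
        if gid is not None:
            # Hand the fifo to the trusted group; -1 keeps the owner.
            os.chown(path, -1, gid)
        paths[name] = path
    return paths
```

A privileged daemon would call this once per device directory, for
example with /var/sgio/dev/hda as suggested above, and pass a different
gid for disks than for CD/DVD drives.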


Okay, did I miss something, or can this be solved without any additional 
kernel hacks?

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 20:25                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Bill Davidsen
@ 2004-08-23 21:01                                 ` Doug Maxey
  2004-08-25 18:29                                   ` Bill Davidsen
  2004-08-24  2:22                                 ` Nuno Silva
  1 sibling, 1 reply; 87+ messages in thread
From: Doug Maxey @ 2004-08-23 21:01 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel


On Mon, 23 Aug 2004 16:25:17 EDT, Bill Davidsen wrote:
>permission to open. This allows the admin to put in any filter desired,
> know about vendor commands, etc. It also allows various security
>setups,  the group can be on the user (trusted users) or on a setgid
>program  (which limits the security issues).

  Down such path lies madness :)   This list would have to be maintained for
  most every model, of every drive, for every manufacturer.  The list could
  conceivably change weekly, if not sooner.  This could change, of course, if
  the use of linux would become as ubiquitous as the dominant redmond product, 
  and the manufacturers would supply the "mini-port" driver bits, as it were.

  The theory is wonderful.  Until there is enough "clout" to change the 
  manufacturers participation, it is probably futile. :-/

++doug


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 20:25                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Bill Davidsen
  2004-08-23 21:01                                 ` Doug Maxey
@ 2004-08-24  2:22                                 ` Nuno Silva
  1 sibling, 0 replies; 87+ messages in thread
From: Nuno Silva @ 2004-08-24  2:22 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: linux-kernel

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi!

Bill Davidsen wrote:
| Tonnerre wrote:
|
|> Well, for that it might be  a nice feature to register and delete such
|> filters  online, using  a  register/remove_scsi_filter interface,  but
|> well, otoh that might be undesirable security-wise.
|
|
| Let me throw out two ideas to see if anyone finds them useful.
|
| 1 - loadable command filters in the kernel.
|
| Each device could have a filter set, which could be empty to require
| RAWIO capability, or set to a kernel default. Access could be made to
| modify a filter via proc, sysfs, or ioctl. The set method is not
| relevant to the idea.
|
| 2 - a filter program.
|
| This one can be done right now, no kernel mod needed. A program with
| appropriate permissions can be started, and will create a command/status
| fifo pair with permissions which allow only programs with group
| permission to open. This allows the admin to put in any filter desired,
| know about vendor commands, etc. It also allows various security setups,
| the group can be on the user (trusted users) or on a setgid program
| (which limits the security issues).
|
| Note that the permissions on individual devices need not be the same; I
| can have one group for disk, another for CD/DVD. You could even be anal
| and have the filter time sensitive, etc.
|
| A "standard" place for the fifos helps portability, /var/sgio/dev/hda
| might be a directory, with fifos command and status.
|
|
| Okay, did I miss something, or can this be solved without any additional
| kernel hacks?

Sorry for jumping in this (hot) thread, but I just want to say something:

This is, IMHO, the way to go. Keeping static white-lists in the kernel
is bad and goes against the 2.6 motto: "do it in userspace".

Anyway, I can imagine that the distros are thinking about the problem
very hard. They can't just delete the cd-burn feature as non-root :-)

Also, many things can be affected by this, right? Scanners, jukeboxes,
ip-over-scsi, etc... A programmable kernel interface or a userspace
helper is the only way. To keep things _fast_, I'd be happy with a simple
# echo 1 > /sys/block/hdd/rawio/enable_rawio_if_user_can_write
brw-rw----  1 root disk 22, 64 Mar 14  2002 /dev/hdd
Now every member of @disk can trash your data and your cdrom's firmware.

If the admin sets this flag it's his responsibility[*].
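
A minimal sketch of the decision such a flag would imply, in Python
rather than kernel code. The function name and parameters are
hypothetical, and gid 6 for the "disk" group is only an example value;
the rule itself (CAP_SYS_RAWIO always passes, otherwise the flag must be
set and the caller must have ordinary write permission on the node)
follows the text above:

```python
import stat


def may_send_sg_io(has_rawio, flag_enabled, mode, owner_uid, group_gid,
                   uid, gids):
    """Return True if a caller may issue SG_IO on a device node.

    mode, owner_uid and group_gid describe the /dev node; uid and gids
    describe the calling process. Hypothetical sketch.
    """
    if has_rawio:
        return True
    if not flag_enabled:
        return False
    # Classic UNIX check: owner bits, then group bits, then others.
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if group_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# brw-rw---- root disk, as in the listing above (gid 6 is an assumed value)
print(may_send_sg_io(False, True, 0o660, 0, 6, uid=1000, gids={6}))
print(may_send_sg_io(False, False, 0o660, 0, 6, uid=1000, gids={6}))
```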

Peace,
Nuno Silva

[*] Once you start refusing to let root shoot himself in the foot
there's no way back. You must "fix" 60% of Linux! :-)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBKqZVOPig54MP17wRAgE0AJ9LjIKpK+S1nqBYYbOZywVontBdggCdGbF6
Uf2Ok3aFvCbXp6k4Wq7Pn2A=
=cEo2
-----END PGP SIGNATURE-----


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 18:16                               ` Kai Makisara
@ 2004-08-24 10:22                                 ` Christer Weinigel
  2004-08-24 15:34                                 ` Joerg Schilling
  1 sibling, 0 replies; 87+ messages in thread
From: Christer Weinigel @ 2004-08-24 10:22 UTC (permalink / raw)
  To: Kai Makisara; +Cc: Joerg Schilling, der.eremit, christer, linux-kernel, axboe

Kai Makisara <Kai.Makisara@kolumbus.fi> writes:

> On Sun, 22 Aug 2004, Joerg Schilling wrote:
> 
> > Christer Weinigel <christer@weinigel.se> wrote:
> > > BTW: 'mt' should not need to send SCSI commands. This should be handled via
> > > specialized ioctls.
> > 
> There are already ioctls for changing the tape parameters. Christer, there 
> is no need to introduce tapes into this discussion.

It was an example of another application that needs to modify the mode
pages, and it's interesting to look at how we have solved similar
problems before.

So if we want to be consistent we ought to introduce specialized
ioctls for everything cdrecord wants to do.  Otoh, tape drives don't
seem to be such a fast moving target as CD and DVD burners.

  /Christer

-- 
"Just how much can I get away with and still go to heaven?"

Freelance consultant specializing in device driver programming for Linux 
Christer Weinigel <christer@weinigel.se>  http://www.weinigel.se


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 18:16                               ` Kai Makisara
  2004-08-24 10:22                                 ` Christer Weinigel
@ 2004-08-24 15:34                                 ` Joerg Schilling
  1 sibling, 0 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-24 15:34 UTC (permalink / raw)
  To: schilling, Kai.Makisara; +Cc: linux-kernel, der.eremit, christer, axboe

Kai Makisara <Kai.Makisara@kolumbus.fi> wrote:

> > BTW: 'mt' should not need to send SCSI commands. This should be handled via
> > specialized ioctls.
> > 
> There are already ioctls for changing the tape parameters. Christer, there 
> is no need to introduce tapes into this discussion.

These are my words....


Tape drives have had a well-known, simple and standardized interface for
many years (> 40). There exist ioctl()s to do anything you like.


CD/DVD writing is still constantly evolving, so you cannot have it in the 
kernel.

BTW: I am strongly against any list of "safe commands" as this would only make 
things more complicated. Things that control security should be kept simple.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices
  2004-08-23 21:01                                 ` Doug Maxey
@ 2004-08-25 18:29                                   ` Bill Davidsen
  0 siblings, 0 replies; 87+ messages in thread
From: Bill Davidsen @ 2004-08-25 18:29 UTC (permalink / raw)
  To: linux-kernel

Doug Maxey wrote:
> On Mon, 23 Aug 2004 16:25:17 EDT, Bill Davidsen wrote:
> 
>>permission to open. This allows the admin to put in any filter desired,
>>know about vendor commands, etc. It also allows various security
>>setups,  the group can be on the user (trusted users) or on a setgid
>>program  (which limits the security issues).
> 
> 
>   Down such path lies madness :)   This list would have to be maintained for
>   most every model, of every drive, for every manufacturer.  The list could
>   conceivably change weekly, if not sooner.  This could change, of course, if
>   the use of linux would become as ubiquitous as the dominant redmond product, 
>   and the manufacturers would supply the "mini-port" driver bits, as it were.
> 
>   The theory is wonderful.  Until there is enough "clout" to change the 
>   manufacturers participation, it is probably futile. :-/

But you don't need magic vendor commands to read and write disk (or 
tape), you can do it with the base commands defined in SCSI-II. You only 
need filter lists for special cases where (a) you really do want vendor 
commands and (b) there's some reason to allow this to normal users.

I doubt that you need magic for any of the other obvious devices like 
SCSI scanners, ZIP and LS120 drives using ATA access rather than 
ide-floppy or ide-scsi, etc. I could be wrong on scanners, the setup 
commands may be more dangerous than I think.

Writing a CD, unfortunately, does seem to take more than I want the
average user to do.

-- 
    -bill davidsen (davidsen@tmr.com)
"The secret to procrastination is to put things off until the
  last possible moment - but no longer"  -me


* (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices)
  2004-08-22 16:00                           ` Christer Weinigel
                                               ` (2 preceding siblings ...)
  2004-08-22 19:26                             ` Tonnerre
@ 2004-08-31 22:22                             ` John Myers
  2004-09-02  9:44                               ` Joerg Schilling
  3 siblings, 1 reply; 87+ messages in thread
From: John Myers @ 2004-08-31 22:22 UTC (permalink / raw)
  To: Christer Weinigel
  Cc: Pascal Schmidt, Joerg Schilling, linux-kernel, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 2124 bytes --]

Christer Weinigel wrote:

> Pascal Schmidt <der.eremit@email.de> writes:

> [...] if I have write permission to a CD burner, being able to
> burn a coaster by issuing strange commands is something I expect.
> Being able to destroy the firmware of the drive is not something I
> expect a normal user to be able to do.
> 
> There are at least three conflicting goals here:
> 
> 1. Only someone with CAP_SYS_RAWIO (i.e. root) should be able to do
>    possible destructive things to a device, and only root should be
>    able to bypass the normal security checks in the kernel (e.g. get
>    access to /dev/mem since access to it means that you can read and
>    modify internal kernel structures).
> 
> 2. A Linux system should have as few suid root binaries as possible.
> 
> 3. A normal user should be able to perform most tasks without needing
>    root.
> 

I hope this is not a stupid idea:

I propose a finer-grained approach to suid-root binaries. Perhaps, 
instead of having a single flag giving the binary all the rights and 
responsibilities of its owner, there could be a table/list/something of 
capabilities which we want to grant to the binary. This, of course, 
would be a privileged operation (perhaps a new capability?).

For example, we might want to grant cdrecord CAP_SYS_RAWIO. This way, we 
don't have to worry about cdrecord running as root and not dropping all 
the capabilities it doesn't need, by accident or by malice.

Further, and I realize that this would probably require major 
restructuring, perhaps there could be another field: for each capability 
we want to grant, a method to specify _where_ the binary can use that 
capability.

To extend the previous example: we might want to give cdrecord 
CAP_SYS_RAWIO just on, say, /dev/burner0 and /dev/burner1, but not 
/dev/hda. That way, some typo won't have us trying to burn cds with our 
hard disks.
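
A minimal sketch of what such a table could look like, modelled in userspace
Python purely for illustration. The table layout, the GRANTS name, the
may_use() helper and the /usr/bin/cdrecord path are all made up for this
sketch; it is not an existing kernel interface. /dev/burner0 and
/dev/burner1 are just the example paths from above.

```python
# Hypothetical per-binary capability table (illustration only):
# binary path -> capability name -> set of device paths on which that
# capability may be exercised.
GRANTS = {
    "/usr/bin/cdrecord": {
        "CAP_SYS_RAWIO": {"/dev/burner0", "/dev/burner1"},
    },
}


def may_use(binary, capability, device):
    """Return True if `binary` was granted `capability` on `device`."""
    allowed_devices = GRANTS.get(binary, {}).get(capability, set())
    return device in allowed_devices


if __name__ == "__main__":
    for dev in ("/dev/burner0", "/dev/hda"):
        print(dev, may_use("/usr/bin/cdrecord", "CAP_SYS_RAWIO", dev))
```

The point of keying on the device as well as the capability is that a typo
such as /dev/hda simply fails the lookup instead of reaching the drive.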

Again, I hope it's not a stupid idea. I don't have a working 
implementation, and I'm not sure if it's even possible, but it's a 
thought.
-- 
electronerd (jonathan s myers)
code poet and recycle bin monitor
programmer, monolith3d.com

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 254 bytes --]


* Re: (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices)
  2004-08-31 22:22                             ` (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices) John Myers
@ 2004-09-02  9:44                               ` Joerg Schilling
  2004-09-02 13:49                                 ` John Myers
  0 siblings, 1 reply; 87+ messages in thread
From: Joerg Schilling @ 2004-09-02  9:44 UTC (permalink / raw)
  To: electronerd, christer; +Cc: schilling, linux-kernel, der.eremit, axboe

John Myers <electronerd@monolith3d.com> wrote:

> I hope this is not a stupid idea:
>
> I propose a finer-grained approach to suid-root binaries. Perhaps, 
> instead of having a single flag giving the binary all the rights and 
> responsibilities of its owner, there could be a table/list/something of 
> capabilities which we want to grant to the binary. This, of course, 
> would be a privileged operation (perhaps a new capability?).
>
> For example, we might want to grant cdrecord CAP_SYS_RAWIO. This way, we 
> don't have to worry about cdrecord running as root and not dropping all 
> the capabilities it doesn't need, by accident or by malice.

cdrecord does not fail to drop its privileges, neither by accident nor by
malice. What I do see, however, is that a completely unneeded, incompatible
interface change has been applied to a _stable_ kernel.

On a cleanly designed OS with fine-grained permissions, a program like
cdrecord does not need to worry about permissions, as the execution
environment grants it exactly the permissions it needs.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices)
  2004-09-02  9:44                               ` Joerg Schilling
@ 2004-09-02 13:49                                 ` John Myers
  2004-09-02 15:40                                   ` Joerg Schilling
  0 siblings, 1 reply; 87+ messages in thread
From: John Myers @ 2004-09-02 13:49 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: christer, linux-kernel, der.eremit, axboe

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Joerg Schilling wrote:
| John Myers <electronerd@monolith3d.com> wrote:
|
|
|>I hope this is not a stupid idea:
|>
|>I propose a finer-grained approach to suid-root binaries. Perhaps,
|>instead of having a single flag giving the binary all the rights and
|>responsibilities of its owner, there could be a table/list/something of
|>capabilities which we want to grant to the binary. This, of course,
|>would be a privileged operation (perhaps a new capability?).
|>
|>For example, we might want to grant cdrecord CAP_SYS_RAWIO. This way, we
|>don't have to worry about cdrecord running as root and not dropping all
|>the capabilities it doesn't need, by accident or by malice.
|
|
| cdrecord neither does drop the privileges by accident nor by malice.

I wasn't trying to insult cdrecord, or even suggest it might have the
inkling of a possibility of this type of issue, and I am sorry if I made
it sound that way. I was merely trying to illustrate a use of my
proposal. I admit, I should have invented a name, like
cd-burning-fire-toaster-program to illustrate the separation of my
example from any actual existing implementation.

| What I however see is that a completely unneeded incompatible interface
| change has been applied to a _stable_ Kernel.

I really wasn't talking about that. I was, however, trying to offer a
solution that would, perhaps, allow both this change, and cdrecord, to
co-exist peacefully, without running cdrecord as root.

|
| On a cleanly designed OS with fine grained permissions, a program like
| cdrecord does not need to worry about the permissions as it gets exactly
| the needed permissions granted by the execution environment.
|
| Jörg
|

Which is exactly what I proposed...


So... could anyone comment on my proposal, rather than just flame my
examples?

- --
electronerd (jonathan s myers)
code poet and recycle bin monitor
programmer, monolith3d.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFBNyUBNh5QaxZowccRAtGYAJ4gLta/cmcRpDQoDf3u1bdEdx8vKwCgikzM
xVI2EyH2pwRbUI/KgLGP7YQ=
=Sxlq
-----END PGP SIGNATURE-----



* Re: (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices)
  2004-09-02 13:49                                 ` John Myers
@ 2004-09-02 15:40                                   ` Joerg Schilling
  0 siblings, 0 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-09-02 15:40 UTC (permalink / raw)
  To: schilling, electronerd; +Cc: linux-kernel, der.eremit, christer, axboe

John Myers <electronerd@monolith3d.com> wrote:

> | cdrecord neither does drop the privileges by accident nor by malice.
>
> I wasn't trying to insult cdrecord, or even suggest it might have the
> inkling of a possibility of this type of issue, and I am sorry if I made
> it sound that way. I was merely trying to illustrate a use of my
> proposal. I admit, I should have invented a name, like
> cd-burning-fire-toaster-program to illustrate the separation of my
> example from any actual existing implementation

It was not you; other people wrote that cdrecord is broken, although it was
only the kernel that changed in an incompatible way.

> | On a cleanly designed OS with fine grained permissions, a program like
> | cdrecord does not need to worry about the permissions as it gets exactly
> | the needed permissions granted by the execution environment.
> |
> | Jörg
> |
>
> Which is exactly what I proposed...
>
>
> So... could anyone comment on my proposal, rather than just flame my
> examples?

I did not flame your examples, but if you had the same things in mind, your
explanation may not have been clear enough.

On Solaris, this is done by /usr/bin/pfexec (the only suid root binary) that 
calls /usr/bin/ppriv -e which executes a process with the privileges that are 
in the privileges database.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 23:23 ` Julien Oster
                     ` (2 preceding siblings ...)
  2004-08-21  6:03   ` Tomasz Kłoczko
@ 2004-08-31 20:16   ` Timothy Miller
  3 siblings, 0 replies; 87+ messages in thread
From: Timothy Miller @ 2004-08-31 20:16 UTC (permalink / raw)
  To: Julien Oster; +Cc: Miles Lane, linux-kernel



Julien Oster wrote:
> Miles Lane <miles.lane@comcast.net> writes:
> 
> 
>>http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
>>"Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
>>and Linux."
> 
> 
> That article is way too hypey.
> 
> It sounds like one of those strange american commercials you see
> sometimes at night, where two overenthusiastic persons are telling you
> how much that strange fruit juice machine has changed their lives,
> by making them lose 200 pounds in 6 days and improving their
> performance at beach volleyball a lot due to subneutronic antigravity
> manipulation. You usually can't watch those commercials for longer
> than 5 minutes.
> 
> The same applies to that article, I couldn't even read it completely,
> it was just too much.
> 
> And is it just me or did that article really take that long to
> mention what dtrace actually IS?
> 
> Come on, it's profiling. As presented by that article, it is even more
> micro optimization than one would think. What with tweaking the disk
> I/O improvements and all... If my harddisk accesses were a microsecond
> more immediate or my filesystem giving a quantum more transfer rate,
> it would be nice, but I certainly wouldn't get enthusiastic and I bet
> nobody would even notice.
> 
> Maybe, without that article, I would recognize it as a fine thing (and
> by "fine" I don't mean "the best thing since sliced bread"), but that
> piece of text was just too ridiculous to take anything seriously.
> 
> I sure hope that article is meant sarcastically. By the way, did I
> miss something or is profiling suddenly a new thing again?
> 

[I have 4000 emails from lkml to read, so please forgive me if this 
discussion is dead.]

DTrace was exactly what we needed here to figure out what was making our 
E450 server perform so badly.  We managed to find and eliminate all 
sorts of bottlenecks, and now, all of our NFS activity is CPU bound on 
the server.

Perhaps Linux never suffers from these sorts of problems that require 
tuning things such as inode cache sizes, etc???



* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-29 10:45               ` Tomasz Kłoczko
@ 2004-08-29 17:46                 ` David S. Miller
  0 siblings, 0 replies; 87+ messages in thread
From: David S. Miller @ 2004-08-29 17:46 UTC (permalink / raw)
  To: Tomasz Kłoczko
  Cc: alan, milek, usenet-20040502, miles.lane, linux-kernel

On Sun, 29 Aug 2004 12:45:12 +0200 (CEST)
Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:

> Even on two-way systems Solaris (10) still handles threads
> *much better* ..

Back this up with facts.


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-29  5:30             ` David S. Miller
  2004-08-29 10:45               ` Tomasz Kłoczko
@ 2004-08-29 10:53               ` Robert Milkowski
  1 sibling, 0 replies; 87+ messages in thread
From: Robert Milkowski @ 2004-08-29 10:53 UTC (permalink / raw)
  To: David S. Miller
  Cc: Tomasz Kłoczko, alan, usenet-20040502, miles.lane, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2580 bytes --]

On Sat, 28 Aug 2004, David S. Miller wrote:

> On Sun, 29 Aug 2004 02:14:03 +0200 (CEST)
> Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:
>
>> In fact Solaris works quite well on a usual desktop-size computer.
>
> Check out the Solaris driver selection on x86 these days,
> it still stinks.  It is unlikely they'll ever have the coverage
> Linux does any time soon.

You are right about that. However, it's getting better.
On the other hand, Solaris works on a really wide spectrum of x86 servers 
(Dell, HP, IBM). I've installed it on x86 servers from 1-way to 8-way 
Compaqs. I've installed it on several desktop systems too.
But you are right - Linux has more drivers on x86 for some 'exotic' 
hardware and home use. And it has many more drivers for PCI RAID cards.


> Frankly, if the only specific technical feature Sun has to brag
> about in Solaris 10 is DTrace, that's pretty sad.  Even more so,
> most of the bugs I see being fixed in Solaris kernel patches
> are performance regressions against Linux.  This, given how things
> were 6 or 7 years ago and the things the Solaris folks used to
> flame us for, I find particularly amusing.

1. Why do you think that DTrace is the only 'cool' feature in Solaris 10?
    Please stop the FUD.

    Frankly, if the only specific technical feature Linux has to brag about
    in Linux 2.6 is KProbes, that's pretty sad.

    :P


2. "most of the bugs I see being fixed in Solaris kernel patches"

     1. there are no patches to Solaris 10 kernel so far, so you can't
        see them

     2. you are probably talking about some patches for Solaris 9 kernel
        coming from project ATLAS (performance improvements on x86 - to be
        as fast or faster than other OSes on the same hardware)

     3. and you definitely overstated by saying these are most of the
        patches; in fact I would be surprised if there are more than 10-20
        such patches (and hundreds of others). Maybe you see this 'coz you
        are looking for the word 'Linux' in patches? :)))


3. "Solaris folks used to flame us for,"

     You know - there're trolls in every community.


And this thread was about DTrace and Linux tracing technologies (and not 
trolls, PDAs, other features)... and I know, Linux can run on PDAs.
Ok, so when it comes to profiling and debugging:

 	1. Solaris has DTrace, ptools and others
 	2. Linux runs on PDAs

:)))))))

ps. sorry for that... on the other hand a little bit of humour is ok :)


-- 
 						Robert Milkowski
 						milek@rudy.mif.pg.gda.pl


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-29  5:30             ` David S. Miller
@ 2004-08-29 10:45               ` Tomasz Kłoczko
  2004-08-29 17:46                 ` David S. Miller
  2004-08-29 10:53               ` Robert Milkowski
  1 sibling, 1 reply; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-29 10:45 UTC (permalink / raw)
  To: David S. Miller; +Cc: alan, milek, usenet-20040502, miles.lane, linux-kernel

[-- Attachment #1: Type: TEXT/PLAIN, Size: 2331 bytes --]


<disclaimer>
I'm not trying to advocate for Solaris or for Linux. I'm interested in
incorporating a DTrace-like solution into Linux.
</disclaimer>

On Sat, 28 Aug 2004, David S. Miller wrote:

> On Sun, 29 Aug 2004 02:14:03 +0200 (CEST)
> Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:
>
>> In fact Solaris works quite well on a usual desktop-size computer.
>
> Check out the Solaris driver selection on x86 these days,
> it still stinks.  It is unlikely they'll ever have the coverage
> Linux does any time soon.
>
> Frankly, if the only specific technical feature Sun has to brag
> about in Solaris 10 is DTrace, that's pretty sad.  Even more so,
> most of the bugs I see being fixed in Solaris kernel patches
> are performance regressions against Linux.  This, given how things
> were 6 or 7 years ago and the things the Solaris folks used to
> flame us for, I find particularly amusing.

What about zoning (which seems to be much more than simple jailing)? What
about ZFS, which will probably be the next step forward comparable to
DTrace? (It will probably come with the next Express build.) On big
computers Solaris also scales much better than Linux (I often smile when I
see questions like "Is Linux enterprise ready?" ;). On small servers Linux
is a good alternative: in many cases it is comparable (and the choice of OS
can be decided by other, not strictly technical things) or a slightly
better solution (for example, it is now much easier to find well-skilled
admins for Linux than for Solaris). But on medium or large computers
(workgroup and higher solutions) it is still in many cases a worse or much
worse solution, for example in hardware utilization and needed
functionality (I'm talking about Linux vs. Solaris on sparc and also on
the x86 platform). Even on two-way systems Solaris (10) still handles
threads *much better* ..

For me it isn't so important how much hardware one OS or another handles
correctly. If the chosen OS handles my hardware correctly I'm quite happy
(in my case Linux still can't handle the SunSwift T1000 NIC on my E250 ;o)

kloczek
-- 
-----------------------------------------------------------
*People don't have problems, they just create them for themselves*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-28 19:16         ` Alan Cox
  2004-08-29  0:14           ` Tomasz Kłoczko
@ 2004-08-29 10:29           ` Robert Milkowski
  1 sibling, 0 replies; 87+ messages in thread
From: Robert Milkowski @ 2004-08-29 10:29 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tomasz Kłoczko, Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sat, 28 Aug 2004, Alan Cox wrote:

> On Llu, 2004-08-23 at 20:48, Robert Milkowski wrote:
>> Solaris runs on x86 platform, and runs quite well.
>> And guess what - DTrace runs on x86 like a charm.
>
> Larger x86 boxes. I can't seem to find PDA's with Solaris or phones
> with Solaris or $70 wireless routers with Solaris.

Yeah, I can agree with you that tools like DTrace aren't very useful on a
PDA or on a phone. :)

>> I must admit I don't know OProfile.
>> But can you profile an already running application without interruption
>
> Yes
>
>> What about getting structure contents, function arguments and returns,
>> etc... all on the fly.
>
> ptrace. Actually there are folks who want to take ptrace a bit further
> for some things - at least one vendor posted some proposals which when
> recast into ptrace extensions look good.
>
>> I think you missed the point.
>
> Nope
>
>> Sure, you can make your own module on Linux, load it and trace whatever
>> you want. But:
>
> Why do that, why not use the existing functionality that the kernel
> provides built on the stuff Intel AMD and friends stuck in the CPU. I'm
> not claiming our debugging tools are as good as dtrace but most of it
> (especially with kprobes patches installed) is essentially a UI design
> issue.

OK, so maybe a real example.

Let's say you want to aggregate by user stack whenever a given thread of a
specified process is taken off the CPU by the scheduler while in the SLEEP
state and stays off the CPU for more than one second. You want output every
10s. With DTrace it's really simple. Usually, if I want a quick answer, I
write something like this:

bash-2.05b# dtrace -n sched:::off-cpu'/curlwpsinfo->pr_state == SSLEEP &&
                       pid == 18819 && tid == 3/{self->t1 = timestamp;}' \
                    -n sched:::on-cpu'/self->t1 && (timestamp - self->t1)
                       > 1000000000/{@[ustack()]=count();self->t1=0;}' \
                    -n tick-10s'{printa(@);}'

But this is not very readable, so let's put it in a more readable form (a
script):

#!/usr/sbin/dtrace -s

sched:::off-cpu
/curlwpsinfo->pr_state == SSLEEP &&  pid == 18819 && tid == 3/
{
   self->t1 = timestamp;
}

sched:::on-cpu
/self->t1 && (timestamp - self->t1) > 1000000000/
{
   @[ustack()]=count();
   self->t1 = 0;
}

tick-10s
{
   printa(@);
}


Here is what you get:

  25  34812                        :tick-10s

               libc.so.1`__pollsys+0x4
               libc.so.1`poll+0x88
               wpserver`wpio_loop_pool+0x9c
               libc.so.1`_lwp_start
                26


Which means the same stack was seen 26 times.
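
For readers who don't know D, the logic of the script above can be modelled
in a few lines of Python. This is only an illustration of what the two sched
probes and the aggregation do; the event tuple layout and the function name
are assumptions of this sketch, and it works on a recorded list of events
rather than on a live kernel.

```python
from collections import Counter

ONE_SECOND_NS = 1_000_000_000


def count_long_sleeps(events, pid, tid, threshold_ns=ONE_SECOND_NS):
    """Count, per user stack, sleeps of (pid, tid) longer than threshold_ns.

    `events` is an iterable of (timestamp_ns, kind, pid, tid, state, stack)
    tuples, where kind is "off-cpu" or "on-cpu". This layout is assumed for
    the sketch; it is not DTrace's real data format.
    """
    counts = Counter()       # plays the role of the @[ustack()] aggregation
    went_off_at = None       # plays the role of self->t1
    for ts, kind, ev_pid, ev_tid, state, stack in events:
        if ev_pid != pid or ev_tid != tid:
            continue
        if kind == "off-cpu" and state == "SSLEEP":
            went_off_at = ts
        elif (kind == "on-cpu" and went_off_at is not None
              and ts - went_off_at > threshold_ns):
            counts[stack] += 1
            # As in the D script, t1 is cleared only when a sleep is counted.
            went_off_at = None
    return counts
```

The tick-10s clause would correspond to printing `counts` every ten seconds;
that part is left out to keep the sketch short.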



Or maybe another example, which shows how one can easily learn something
about an application's behaviour and about the system.

Let's say you see a lot of fspgins in vmstat and want to know which
applications are causing them. And to complicate things, you have some
applications running from an inetd-like daemon (so fork() + exec() on every
request) and of course you want to aggregate each such application as a
whole.

So, with DTrace:

bash-2.05b# dtrace -n vminfo:::fspgin'{@[execname]=sum(arg0);}'
dtrace: description 'vminfo:::fspgin' matched 1 probe
^C

   Application-A                                                   282
   Application-B                                                   304
   Application-C                                                   335
   zsched                                                         1200
bash-2.05b#
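
The aggregation in this one-liner, @[execname]=sum(arg0), can be pictured as
a running sum keyed by executable name. The following Python sketch models
just that; the (pid, execname, pages) tuple layout, the function name and
the sample values in the usage note are assumptions made for illustration.

```python
from collections import defaultdict


def sum_pageins_by_execname(probe_firings):
    """Sum page-in counts per executable name, like @[execname]=sum(arg0).

    `probe_firings` is an iterable of (pid, execname, pages) tuples; the
    tuple layout is assumed for this sketch.
    """
    totals = defaultdict(int)
    for _pid, execname, pages in probe_firings:
        # The pid is deliberately ignored: the key is the name only.
        totals[execname] += pages
    return dict(totals)
```

Because the key is the name and not the pid, every short-lived
fork() + exec() instance of an inetd-style service ends up in the same row,
e.g. firings (101, "Application-A", 2) and (102, "Application-A", 3) are
reported together as Application-A.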

Well, now you can go further and look at Application-C in more detail.
But instead let's say you don't know why zsched is causing fspgins.
So let's learn why.

bash-2.05b# dtrace -n vminfo:::fspgin'/execname == "zsched"/{@[stack()]=count();}'
dtrace: description 'vminfo:::fspgin' matched 1 probe
^C


               genunix`pageio_setup+0x1f8
               nfs`nfs3_readahead+0xc0
               nfs`nfs_async_start+0x2c8
               unix`thread_start+0x4
               405
bash-2.05b#


Well, it's doing nfs3 read-aheads.
You can disable read-aheads for nfs3 and see if it disappears - and it 
does. And all in production, without stopping anything.

Simple, easy and safe.

Now, how much work and knowledge would be needed to get the same results 
with KProbes, oprofile, ... and how much time would it take?

And how easy would it be to panic the kernel with KProbes?
I've just run all of the examples above on a production server.

DTrace lets you correlate data from kernel level and application level, 
which is really useful. It gives sysadmins the power to see what is really
happening in the system and why.

And these are simple cases which do not show all DTrace features.
Real fun starts with more complicated examples :)

Of course you can do SOME of the things with DProbes or Oprofile that you
could do with DTrace, but usually it will take MUCH more time with them
than with DTrace. And there are things you can do with DTrace which you
can't do with DProbes and others in a reasonable period of time.

btw: about UI - it's really important. DTrace aggregations, for example,
      save a lot of time in analyzing data. Especially since you usually
      get a huge amount of data to analyze from production systems. With
      the 'D' language you get most of the data aggregation done while
      collecting the data. I know there's Perl, Awk, etc. but it takes time.

btw2: you mention ptrace; AFAIK it would have more performance impact than
       DTrace/KProbes technologies. Second, I'm not sure if it's still the
       case, but there were (are?) some problems using ptrace on threaded
       applications. And still all you see is application level - no
       correlation between kernel and app (for example you want to see
       if a given code path in the application is causing fspgins or
       xcalls or something else...)

btw3: and Oprofile, Kprobes, ptrace are separate tools, so you then have
       to correlate the data, which will often not be possible.


-- 
 						Robert Milkowski
 						milek@rudy.mif.pg.gda.pl



* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-29  0:14           ` Tomasz Kłoczko
@ 2004-08-29  5:30             ` David S. Miller
  2004-08-29 10:45               ` Tomasz Kłoczko
  2004-08-29 10:53               ` Robert Milkowski
  0 siblings, 2 replies; 87+ messages in thread
From: David S. Miller @ 2004-08-29  5:30 UTC (permalink / raw)
  To: Tomasz Kłoczko
  Cc: alan, milek, usenet-20040502, miles.lane, linux-kernel

On Sun, 29 Aug 2004 02:14:03 +0200 (CEST)
Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:

> In fact Solaris works quite well on a usual desktop-size computer.

Check out the Solaris driver selection on x86 these days,
it still stinks.  It is unlikely they'll ever have the coverage
Linux does any time soon.

Frankly, if the only specific technical feature Sun has to brag
about in Solaris 10 is DTrace, that's pretty sad.  Even more so,
most of the bugs I see being fixed in Solaris kernel patches
are performance regressions against Linux.  This, given how things
were 6 or 7 years ago and the things the Solaris folks used to
flame us for, I find particularly amusing.


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-28 19:16         ` Alan Cox
@ 2004-08-29  0:14           ` Tomasz Kłoczko
  2004-08-29  5:30             ` David S. Miller
  2004-08-29 10:29           ` Robert Milkowski
  1 sibling, 1 reply; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-29  0:14 UTC (permalink / raw)
  To: Alan Cox
  Cc: Robert Milkowski, Julien Oster, Miles Lane, Linux Kernel Mailing List

[-- Attachment #1: Type: TEXT/PLAIN, Size: 997 bytes --]

On Sat, 28 Aug 2004, Alan Cox wrote:

> On Llu, 2004-08-23 at 20:48, Robert Milkowski wrote:
>> Solaris runs on x86 platform, and runs quite well.
>> And guess what - DTrace runs on x86 like a charm.
>
> Larger x86 boxes. I can't seem to find PDA's with Solaris or phones
> with Solaris or $70 wireless routers with Solaris.

I don't think calling anything larger than a PDA or a phone a "large
computer" is correct/acceptable :o)
In fact Solaris works quite well on a usual desktop-size computer.

Probably, once Solaris is fully ported to x86, using it on embedded
systems will definitely be more than just a potential solution.
Even if Solaris still isn't usable on embedded-like systems after this,
that doesn't matter for the DTrace subject :)

kloczek
-- 
-----------------------------------------------------------
*People don't have problems, they just create them for themselves*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*


* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-23 19:48       ` Robert Milkowski
  2004-08-24  0:39         ` David S. Miller
@ 2004-08-28 19:16         ` Alan Cox
  2004-08-29  0:14           ` Tomasz Kłoczko
  2004-08-29 10:29           ` Robert Milkowski
  1 sibling, 2 replies; 87+ messages in thread
From: Alan Cox @ 2004-08-28 19:16 UTC (permalink / raw)
  To: Robert Milkowski
  Cc: Tomasz Kłoczko, Julien Oster, Miles Lane, Linux Kernel Mailing List

On Llu, 2004-08-23 at 20:48, Robert Milkowski wrote:
> Solaris runs on x86 platform, and runs quite well.
> And guess what - DTrace runs on x86 like a charm.

Larger x86 boxes. I can't seem to find PDA's with Solaris or phones
with Solaris or $70 wireless routers with Solaris.

> I must admit I don't know OProfile.
> But can you profile an already running application without interruption

Yes

> What about getting structure contents, function arguments and returns, 
> etc... all on the fly.

ptrace. Actually there are folks who want to take ptrace a bit further
for some things - at least one vendor posted some proposals which when
recast into ptrace extensions look good.

> I think you missed the point.

Nope

> Sure, you can make your own module on Linux, load it and trace whatever 
> you want. But:

Why do that, why not use the existing functionality that the kernel
provides built on the stuff Intel AMD and friends stuck in the CPU. I'm
not claiming our debugging tools are as good as dtrace but most of it
(especially with kprobes patches installed) is essentially a UI design
issue.

Alan



* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-24  4:14 Joerg Schilling
@ 2004-08-28 19:15 ` Alan Cox
  0 siblings, 0 replies; 87+ messages in thread
From: Alan Cox @ 2004-08-28 19:15 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: Linux Kernel Mailing List

On Maw, 2004-08-24 at 05:14, Joerg Schilling wrote:
> Do you know of any other system where you can say:
> 
> 	Print me a stack trace with symbols for all processes on this
> 	computer (even stripped ones) that call gettimeofday() within the
> 	next few seconds.
> 
> Note that you do not need a special kernel, no reboot, no restart of 
> applications.

Linux, BSD since 1990 or so.... For that matter I can do the same for
dynamically linked applications at library level. The differences are that
I don't have a happy point-and-click UI for it, so I have to go write a
little bit of code, and the efficiency level. The SuSE proposed patch for
syscall restriction conveniently offers a way to remove the overhead.




* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-24 13:04 ` DTrace-like analysis possible with future Linux kernels? Pascal Schmidt
@ 2004-08-24 13:07   ` Joerg Schilling
  0 siblings, 0 replies; 87+ messages in thread
From: Joerg Schilling @ 2004-08-24 13:07 UTC (permalink / raw)
  To: schilling, der.eremit; +Cc: linux-kernel

Pascal Schmidt <der.eremit@email.de> wrote:

> On Tue, 24 Aug 2004 06:20:06 +0200, you wrote in linux.kernel:
>
> > Do you know of any other system where you can say:
> >
> > 	Print me a stack trace with symbols for all processes on this
> > 	computer (even stripped ones) that call gettimeofday() within the
> > 	next few seconds.
>
> Well, this is by far off-topic here now, but how does this solve
> the general problem of knowing that gettimeofday() might be a
> problem in the given situation? But yeah, once you know that,
> the functionality is useful, no doubt.

If you did not get it yet, it's an example to show what may be done.
There are many more features. Just fetch the

	"Solaris Dynamic Tracing Guide" 

manual as PDF for more information - it's 300 pages.

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily


* Re: DTrace-like analysis possible with future Linux kernels?
       [not found] <2wAWW-12a-11@gated-at.bofh.it>
@ 2004-08-24 13:04 ` Pascal Schmidt
  2004-08-24 13:07   ` Joerg Schilling
  0 siblings, 1 reply; 87+ messages in thread
From: Pascal Schmidt @ 2004-08-24 13:04 UTC (permalink / raw)
  To: Joerg Schilling; +Cc: linux-kernel

On Tue, 24 Aug 2004 06:20:06 +0200, you wrote in linux.kernel:

> Do you know of any other system where you can say:
>
> 	Print me a stack trace with symbols for all processes on this
> 	computer (even stripped ones) that call gettimeofday() within the
> 	next few seconds.

Well, this is by far off-topic here now, but how does this solve
the general problem of knowing that gettimeofday() might be a
problem in the given situation? But yeah, once you know that,
the functionality is useful, no doubt.

-- 
Ciao,
Pascal


* Re: DTrace-like analysis possible with future Linux kernels?
@ 2004-08-24  4:14 Joerg Schilling
  2004-08-28 19:15 ` Alan Cox
  0 siblings, 1 reply; 87+ messages in thread
From: Joerg Schilling @ 2004-08-24  4:14 UTC (permalink / raw)
  To: linux-kernel

Christoph Halder wrote:

>True, to europeans this sounds far too overenthusiastic - almost like a 
>commercial - and will most certainly lead to the impression, that the 
>article is not very serious.
>
>Europeans try to write serious articles VERY neutral - any personal 
>opinion(s) will only be a short statement at the very end of the article.

No, it is definitely not overstated.

If anything the opposite is true, and you will find this out if you
try to use dtrace or attend a demo.

Do you know of any other system where you can say:

	Print me a stack trace with symbols for all processes on this
	computer (even stripped ones) that call gettimeofday() within the
	next few seconds.

Note that you do not need a special kernel, no reboot, no restart of 
applications.

There are a lot more possibilities including tracing kernel routines on a 
production kernel but it would take too long to describe them....

Jörg

-- 
 EMail:joerg@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
       js@cs.tu-berlin.de		(uni)  If you don't have iso-8859-1
       schilling@fokus.fraunhofer.de	(work) chars I am J"org Schilling
 URL:  http://www.fokus.fraunhofer.de/usr/schilling ftp://ftp.berlios.de/pub/schily

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-23 19:48       ` Robert Milkowski
@ 2004-08-24  0:39         ` David S. Miller
  2004-08-28 19:16         ` Alan Cox
  1 sibling, 0 replies; 87+ messages in thread
From: David S. Miller @ 2004-08-24  0:39 UTC (permalink / raw)
  To: Robert Milkowski; +Cc: alan, kloczek, usenet-20040502, miles.lane, linux-kernel

On Mon, 23 Aug 2004 21:48:57 +0200 (CEST)
Robert Milkowski <milek@rudy.mif.pg.gda.pl> wrote:

> >> [1] Remember: if you want profile some part of code you mast _first_
> >> (re)compile them with profiling enabled. If you wand debug some code
> >
> > OProfile doesn't require this.
> 
> I must admit I don't know OProfile.
> But can you profile already running application without interuption (not 
> to mention stopping it) to it?

Yes, this is exactly what oprofile allows you to do.
Same with things like valgrind.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21 21:49       ` Bryan Cantrill
@ 2004-08-23 23:08         ` Christoph Halder
  0 siblings, 0 replies; 87+ messages in thread
From: Christoph Halder @ 2004-08-23 23:08 UTC (permalink / raw)
  To: Bryan Cantrill
  Cc: Julien Oster, kloczek, usenet-20040502, miles.lane, linux-kernel, bmc

Bryan Cantrill wrote:

>>Sorry, but that goes a little too far. No, I didn't try out dtrace
>>and, right after reading the article (and that's the important thing!)
>>I didn't seek for further information about it, I'm not a Solaris
>>System Administrator right now (I was, some years ago). And all I was
>>saying is that this *article* was just ridiculous.

I believe that the main problem in this dispute is just a 
misunderstanding based on cultural differences between Europe and 
America in the way articles are written and perceived.

>>But in that article, I was just missing the objectiveness. A quick
>>note about the fact that Sun's been introducing dtrace for Solaris 10
>>and what it is, what it does, would have been much better instead of
>>talking about a "Cantrill explosion", how "DTrace has completely
>>changed the way I do business" (actual quotes).

True, to Europeans this sounds far too enthusiastic - almost like a 
commercial - and will most certainly give the impression that the 
article is not very serious.

Europeans try to write serious articles in a VERY neutral tone - any 
personal opinion will only be a short statement at the very end of the 
article.

It's just what we are used to!

> You don't like customer quotes?  It seems to me that quotes like that
> one (or like the other customer quotes that appear in the article)
> give weight to the claims.  Don't you like to hear from people who have
> actually _used_ a technology?  I know that I do -- those who have used
> a technology are likely to have a much more balanced view on its
> strengths and weaknesses than those who have just read about it.  (Indeed,
> this is true of pretty much anything -- experience matters.)

Nobody likes them here, really. A product should stand out and prove 
its quality on its own; we don't like to be told about it by people we 
don't personally know (even if they may have a good reputation).

Another example of what Europeans will regard as "unserious": when we 
watch American commercials, which are sometimes broadcast on TV here, 
we very often find them very odd. They are simply very different from 
how this is done in Europe.

"Customer quotes" have a very, VERY bad reputation; they smell like 
manipulation (whether they are or not).

>>Bryan Cantrill, I can understand that you have to defend DTrace. But
>>please, PLEASE stop saying that I am a clueless moron if I wasn't even
>>ranting about you, ranting about DTrace, but just about *that single
>>article* and it's presentation of DTrace to me.

Personal insults will not lead to anything, and IMHO should not be made 
at all. A response sticking to the facts might have been more useful.

> You were attacking more than just the article; you ended with:
> 
>   I sure hope that article is meant sarcastically. By the way, did I
>   miss something or is profiling suddenly a new thing again?

Maybe a lapse made by Julien.
But the main topic still seems to be the style the article was written in.

> You asked if you were missing something, and I replied that you were
> missing plenty.  Presumably you now feel informed (if a little embarrassed),
> and I think that those that you misinformed also now realize that what
> you provided them was misinformation.  So as far as I'm concerned, that's
> the end of that.

Embarrassed?
Yes, he most certainly will be, but what information did he provide to 
others? I don't see much information about DTrace itself.

I hope nobody feels offended by my statements.

Christoph Halder

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 11:35     ` Alan Cox
  2004-08-22 18:27       ` Tomasz Kłoczko
@ 2004-08-23 19:48       ` Robert Milkowski
  2004-08-24  0:39         ` David S. Miller
  2004-08-28 19:16         ` Alan Cox
  1 sibling, 2 replies; 87+ messages in thread
From: Robert Milkowski @ 2004-08-23 19:48 UTC (permalink / raw)
  To: Alan Cox
  Cc: Tomasz Kłoczko, Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sun, 22 Aug 2004, Alan Cox wrote:
> On Sad, 2004-08-21 at 07:03, Tomasz Kłoczko wrote:
>> In Solaris DTrace is enabled in _normal production_ kernel and you can
>> hang any probe or probes set without restarting system or any runed
>> application which was compiled withoud debug info.
>
> Solaris only runs on large computers. You don't want kprobes randomly on
> your phone, pda, wireless router. Solaris deals with an extremely narrow
> market segment of "big computers for people with lots of money".

That is an overstatement.
Solaris runs on the x86 platform, and runs quite well.
And guess what - DTrace runs on x86 like a charm.

>> [1] Remember: if you want profile some part of code you mast _first_
>> (re)compile them with profiling enabled. If you wand debug some code
>
> OProfile doesn't require this.

I must admit I don't know OProfile.
But can you profile an already running application without interrupting 
it (not to mention stopping it)? Sure, DTrace introduces some overhead, 
but if that's not acceptable just stop DTrace and the application again 
runs at its original speed.
What about getting structure contents, function arguments and return 
values, etc... all on the fly? And then there is the D language, which 
saves a lot of time with its aggregations, thread-local variables and 
speculations. Sure, you can use Perl, AWK, and so on - but it takes 
time - a lot of time.
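
For readers who have not seen an aggregation, here is a rough sketch of
the idea in Python rather than D: counting events per key and bucketing
values by power of two, which is roughly what count() and quantize()
give you in one line of D. The sample events and the helper names
(count_by, bucket, quantize) are made up for illustration, and nothing
here models how DTrace collects the data in the kernel.

```python
from collections import Counter, defaultdict


def count_by(events, key):
    """Count events per key, similar in spirit to a count() aggregation."""
    return Counter(key(e) for e in events)


def bucket(value):
    """Return the lower bound of the power-of-two bucket holding value."""
    if value < 1:
        return 0
    return 1 << (int(value).bit_length() - 1)


def quantize(events, key, value):
    """Per key, build a histogram of values in power-of-two buckets."""
    hist = defaultdict(Counter)
    for e in events:
        hist[key(e)][bucket(value(e))] += 1
    return hist


# Hypothetical samples: (program name, syscall, latency in microseconds).
events = [
    ("httpd", "read", 3), ("httpd", "read", 5), ("httpd", "write", 70),
    ("mysqld", "read", 130), ("mysqld", "read", 9), ("httpd", "read", 4),
]

calls = count_by(events, key=lambda e: (e[0], e[1]))
latency = quantize(events, key=lambda e: e[0], value=lambda e: e[2])

for (prog, syscall), n in sorted(calls.items()):
    print(prog, syscall, n)
for prog in sorted(latency):
    for low in sorted(latency[prog]):
        print(prog, ">=", low, "us:", latency[prog][low])
```

Doing the same with Perl or AWK means first getting the raw events out
of the system and then post-processing them, which is where the time
goes.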


>> Some enterprise systems have limited summary time to few hours per year
>> and restart cycle can take houre or more (checking and initialize hardware
>> components). If you will try say for admin this kind system "restat your
>
> Enterprise users generally get kernels from vendors who have done the
> analysis of needs for them (and hopefully got it right). They generally
> don't ftp 2.6.8.1 type make config and try it out.


I think you missed the point.
DTrace is not only about the kernel, and it's definitely not a tool for 
kernel developers ONLY. It's a great tool for userland developers and 
for sysadmins. And it's really easy (well...) to correlate data from the 
kernel and the application. All in production, and it just works.

I think Tomasz was writing not only about kernel/system uptimes but also 
about application uptimes. And DTrace can help in both cases.

Coming back to the kernel - if you have, for example, some kind of 
memory leak in the kernel, then the support guys can provide you with a 
DTrace script, see what's going on and get the problem solved without 
unnecessary system restarts, then provide you with a new kernel (patch). 
So even if an enterprise user does not want to recompile his own kernel, 
DTrace can save him some downtime if there's a kernel problem.


> Actually I generally
> -	Glance across the load meters
> -	Ask ps where everything is waiting
> -	Potentially turn on oprofile
> -	Potentially use netfilter to see who is causing all my traffic
> -	Maybe strace a few apps to see what is up
> -	Educate the user concerned (if needed)
>
> I've already got the symbols (and they are in the debuginfo package from
> the rpm build too). I could insmod kgdb but that level of
> debugging is generally inappropriate.
>
> DTrace value is ease of use value.


Sure, it's one of its values.
I would add the safety of running DTrace in production. In fact it 
should be the number one feature. DTrace does a great job of preventing 
you from crashing the system or applications.
I would add that it gives you an easy (at least a lot easier than 
without DTrace) way to understand what's going on in the system, and 
which application is causing it and why. For example, Bryan's example 
with xcalls. And all of that on production systems.

Sure, you can make your own module on Linux, load it and trace whatever 
you want. But:

   1. it's not easy
   2. it requires quite some knowledge of the kernel
   3. it could easily crash your kernel
   4. it takes time
   5. it works only at kernel level - what about applications and
      correlation between apps and kernel?
   ...
   ...


My point is that DTrace is a really awesome tool.
Sure, you can do many things without DTrace, but it will take much more 
time. And there are a lot of things you can do with DTrace which are 
really hard to do in production without it.

It just saves you time and gives you answers to questions you would not 
even have asked before DTrace because of the unacceptable time it would 
take to answer them. It's really 'fun' to see what's going on in the 
system and/or applications with DTrace.

My post sounds almost like marketing crap - but it is really what I find 
in DTrace. I must admit that I did not really appreciate this tool until 
I started using it.


-- 
 						Robert Milkowski
 						milek@rudy.mif.pg.gda.pl

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
       [not found]         ` <2w9Dq-65C-13@gated-at.bofh.it>
@ 2004-08-23 18:19           ` Andi Kleen
  0 siblings, 0 replies; 87+ messages in thread
From: Andi Kleen @ 2004-08-23 18:19 UTC (permalink / raw)
  To: John Levon; +Cc: linux-kernel

John Levon <levon@movementarian.org> writes:

> On Sun, Aug 22, 2004 at 08:27:38PM +0200, Tomasz Kłoczko wrote:
>
>> As same as KProbe/DTrace. Can you use OProfile for something other tnan 
>> profiling ? Probably yes and this answer opens: probably it will be good 
>> prepare some common code for KProbe and Oprofile.
>
> I don't see an overlap here, except maybe the possibility of delivering
> sample events into the kprobes framework

Not sure what you mean by "delivering into the kprobes framework".
kprobes currently only uses printk, which really isn't up to the 
task of any significant data delivery. The IBM people have relayfs
to solve this problem; eventually this should probably be
merged too. Without something like relayfs I don't see any way
to compete with dtrace.
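
To illustrate the data delivery problem, here is a minimal Python sketch
of the general idea behind a buffer-based channel: the probe side
appends records cheaply and never blocks, a reader drains them in bulk,
and overflow is counted instead of stalling. This is not relayfs's
actual interface; the EventBuffer class and its method names are
hypothetical.

```python
from collections import deque


class EventBuffer:
    """Fixed-capacity event buffer: cheap append, bulk drain, counted drops."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.records = deque()
        self.dropped = 0

    def write(self, record):
        """Producer side (the probe): never blocks, drops when full."""
        if len(self.records) >= self.capacity:
            self.dropped += 1
            return False
        self.records.append(record)
        return True

    def drain(self):
        """Consumer side (the reader): take everything buffered so far."""
        batch = list(self.records)
        self.records.clear()
        return batch


buf = EventBuffer(capacity=4)
for i in range(6):
    buf.write(("probe_hit", i))   # hypothetical event records
batch = buf.drain()
print(len(batch), "delivered,", buf.dropped, "dropped")
```

The point of the drop counter is that the reader at least knows data was
lost, which printk gives you no good way to find out.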

-Andi


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 18:46         ` Alan Cox
@ 2004-08-23 17:34           ` Tomasz Kłoczko
  0 siblings, 0 replies; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-23 17:34 UTC (permalink / raw)
  To: Alan Cox; +Cc: Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sun, 22 Aug 2004, Alan Cox wrote:

> On Sul, 2004-08-22 at 19:27, Tomasz Kłoczko wrote:
>> Using yor thing path: KProbe/Dtrace is for development and yes it must
>> depend on DEBUG_KERNEL.
>> ptrace() is also for tracing and ver offen used by developers but it is
>> enabled by default and it is not only for developers. So .. ptrace() must
>> also depend on DEBUG_KERNEL.
>
> ptrace is for debugging user space, as for example is oprofile. kprobes
> is for debugging including kernel internal goings on

Yes. It *is* for debugging/tracing in kernel space but, like DTrace, it 
is *also* for user space.

>> compilation stage). In Solaris kernel exist few thousands avalaible probes
>> and IIRC only very small subset is "near zero effect" (uses nop
>> instructions).
>
> Sounds like a kprobes clone 8).

Look at the time stamps of both projects.
It seems KProbes is a clone of DTrace ..

>>> OProfile doesn't require this.
>>
>> As same as KProbe/DTrace. Can you use OProfile for something other tnan
>> profiling ? Probably yes and this answer opens: probably it will be good
>> prepare some common code for KProbe and Oprofile.
>
> Oprofile lets you work on stuff like cache affinity, tuning array walks
> and prefetches. Short of running the app under cachegrind its one of the
> most detailed ways of getting all the profile register data from the x86
> processors.

The same goes for KProbes, but you can combine it with small programs 
attached to the probes that fire. Again: DTrace/KProbes is much more 
than the traditional profiling/tracing/measuring tools.

>> So it will be good stop disscuss about "yes or no ?" and start about
>> "how and when in Linux ?" ..
>
> When you send patches ?

KProbes patches were announced on lkml a few times.

kloczek
-- 
-----------------------------------------------------------
*People don't have problems, they just create them for themselves*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 18:27       ` Tomasz Kłoczko
  2004-08-22 18:46         ` Alan Cox
@ 2004-08-22 23:03         ` John Levon
  1 sibling, 0 replies; 87+ messages in thread
From: John Levon @ 2004-08-22 23:03 UTC (permalink / raw)
  To: Tomasz Kłoczko
  Cc: Alan Cox, Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sun, Aug 22, 2004 at 08:27:38PM +0200, Tomasz Kłoczko wrote:

> As same as KProbe/DTrace. Can you use OProfile for something other tnan 
> profiling ? Probably yes and this answer opens: probably it will be good 
> prepare some common code for KProbe and Oprofile.

I don't see an overlap here, except maybe the possibility of delivering
sample events into the kprobes framework

> It is not only extenging entropy kernel tree. IMO KProbe can bring some
> functionalities wich can be common also for OProfile and probably in 
> future IMO OProfile can be droped.

This seems extremely unlikely to say the least, compare the methods
used.

regards
john

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 18:27       ` Tomasz Kłoczko
@ 2004-08-22 18:46         ` Alan Cox
  2004-08-23 17:34           ` Tomasz Kłoczko
  2004-08-22 23:03         ` John Levon
  1 sibling, 1 reply; 87+ messages in thread
From: Alan Cox @ 2004-08-22 18:46 UTC (permalink / raw)
  To: Tomasz Kłoczko; +Cc: Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sul, 2004-08-22 at 19:27, Tomasz Kłoczko wrote:
> Using yor thing path: KProbe/Dtrace is for development and yes it must 
> depend on DEBUG_KERNEL.
> ptrace() is also for tracing and ver offen used by developers but it is 
> enabled by default and it is not only for developers. So .. ptrace() must 
> also depend on DEBUG_KERNEL.

ptrace is for debugging user space, as is oprofile for example. kprobes
is for debugging that includes kernel internal goings on.

> compilation stage). In Solaris kernel exist few thousands avalaible probes 
> and IIRC only very small subset is "near zero effect" (uses nop 
> instructions).

Sounds like a kprobes clone 8). 

> > OProfile doesn't require this.
> 
> As same as KProbe/DTrace. Can you use OProfile for something other tnan 
> profiling ? Probably yes and this answer opens: probably it will be good 
> prepare some common code for KProbe and Oprofile.

Oprofile lets you work on stuff like cache affinity, tuning array walks
and prefetches. Short of running the app under cachegrind it's one of the
most detailed ways of getting all the profile register data from the x86
processors.

> So it will be good stop disscuss about "yes or no ?" and start about
> "how and when in Linux ?" ..

When you send patches ?

> > We have crash dumps - at least all the enterprise vendors do. Linus
> > doesn't seem to like that stuff so much.
> It need some more advanced additional tools for analize and report data
> from CD.

Standard debugging tools. The system dumps across the network to a
server and then you can analyse it offline

> OProfile it will be good integrate ASAP also things like KProbes and CD.
> It is not only extenging entropy kernel tree. IMO KProbe can bring some
> functionalities wich can be common also for OProfile and probably in 
> future IMO OProfile can be droped.

You clearly haven't understood what Oprofile does. It's a parallel
technology that has more in common with, say, Intel's vtune.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-22 11:35     ` Alan Cox
@ 2004-08-22 18:27       ` Tomasz Kłoczko
  2004-08-22 18:46         ` Alan Cox
  2004-08-22 23:03         ` John Levon
  2004-08-23 19:48       ` Robert Milkowski
  1 sibling, 2 replies; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-22 18:27 UTC (permalink / raw)
  To: Alan Cox; +Cc: Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sun, 22 Aug 2004, Alan Cox wrote:

> On Sad, 2004-08-21 at 07:03, Tomasz Kłoczko wrote:
>> Again: DTrace is *ALSO* admininstration tool and this is why I can't
>> understand why in IBM KProbe/DProbe patch it is marked as "depends on
>> DEBUG_KERNEL" which is IMO bigest mistake on thinking level about this :>
>
> Because its a debugging feature

KProbe is, in its basic idea, very similar to DTrace -> DTrace isn't
only a debugging tool -> KProbe can be used for more than debugging.

Yes, it *is* a debugging feature, but it is much more than that and can 
*also* be used for many other things. So marking it as dependent on 
DEBUG_KERNEL is wrong.

>> In Solaris DTrace is enabled in _normal production_ kernel and you can
>> hang any probe or probes set without restarting system or any runed
>> application which was compiled withoud debug info.
>
> Solaris only runs on large computers. You don't want kprobes randomly on
> your phone, pda, wireless router. Solaris deals with an extremely narrow
> market segment of "big computers for people with lots of money".

No, Alan. You are wrong. DTrace isn't only for big computers .. it is 
not even about computers as such but about *finding the source of bugs 
or other behaviors* (not only bugs or bad behaviors :) regardless of 
system size and/or price and/or importance, and of whether it runs on a 
developer's computer or not.

Once more: DTrace isn't a classic tracing/debugging/measuring tool. It is
much, much more, and some of DTrace's additional abilities are used not
only by developers but also by administrators to find the source of
various problems which strictly speaking *aren't bugs* at the application
or system code level.
And this is why DTrace is enabled in the distribution/production kernel.
You can have a perfect system and perfect applications, but because the
system and the applications coexist in one environment there will still
be a non-empty set of bad cases.
Following your line of thinking: KProbe/DTrace is for development, so 
yes, it must depend on DEBUG_KERNEL.
ptrace() is also for tracing and is very often used by developers, but 
it is enabled by default and is not only for developers. So .. ptrace() 
must also depend on DEBUG_KERNEL.
This is of course wrong .. why? Because strace, just like DTrace/KProbe, 
isn't only a development tool and/or for developers.

IIRC all DTrace probes can be divided into two classes: zero effect and 
near zero effect. The first means: if you don't use the probe, it does 
not degrade system speed (it uses binary code modified on the fly at run 
time, without reserving space with nop instructions for a call entry at 
compile time). The Solaris kernel has a few thousand available probes 
and IIRC only a very small subset is "near zero effect" (uses nop 
instructions).
All others *do not degrade system speed*, and this is *why* it is 
enabled in the production kernel: the price is ~nothing !!
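
As a toy model of the dynamic patching idea above (an illustration only,
not how Solaris or KProbes is actually implemented): an enabled probe
overwrites one instruction with a trap and remembers the original, and a
disabled probe leaves the code exactly as it was. The ProbeTable class
and the instruction names are hypothetical.

```python
TRAP = "trap"  # stand-in for a breakpoint/trap instruction


class ProbeTable:
    """Toy model of probes patched into a running instruction stream."""

    def __init__(self, text):
        self.text = list(text)   # simulated instruction stream
        self.saved = {}          # address -> displaced original instruction
        self.handlers = {}       # address -> callback to run on a hit

    def enable(self, addr, handler):
        if addr in self.saved:
            raise ValueError("probe already enabled at %d" % addr)
        self.saved[addr] = self.text[addr]
        self.handlers[addr] = handler
        self.text[addr] = TRAP

    def disable(self, addr):
        self.text[addr] = self.saved.pop(addr)
        del self.handlers[addr]

    def run(self):
        """Execute the stream; return the instructions actually executed."""
        executed = []
        for addr, insn in enumerate(self.text):
            if insn == TRAP:
                self.handlers[addr](addr)
                insn = self.saved[addr]   # then run the displaced instruction
            executed.append(insn)
        return executed


original = ["push", "mov", "call", "ret"]
table = ProbeTable(original)
hits = []

untouched = table.text == original   # no probe enabled: code is unmodified
table.enable(2, hits.append)
patched = list(table.text)           # the "call" slot now holds the trap
executed = table.run()
table.disable(2)
restored = table.text == original    # disabling puts the original back

print(untouched, patched, hits, executed == original, restored)
```

While no probe is enabled the instruction stream is byte for byte the
original one, which is the sense in which the cost of an unused probe is
zero in this model.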

Solaris has very well-maintained binary compatibility, even at the 
kernel level. This often means you can take binary modules from older 
kernels and use them to good effect in the latest Solaris. The latest 
Solaris (SX) has DTrace.
1 + 1 = with DTrace you can trace some old code, in a somewhat limited 
way (not the same as with code prepared for a kernel with DTrace), using 
"zero effect" probes if you know some entry points in that code.
The same goes for user space applications.

>> [1] Remember: if you want profile some part of code you mast _first_
>> (re)compile them with profiling enabled. If you wand debug some code
>
> OProfile doesn't require this.

The same as KProbe/DTrace. Can you use OProfile for something other than 
profiling? Probably yes, and that answer suggests it would probably be 
good to prepare some common code for KProbe and OProfile.

>> Some enterprise systems have limited summary time to few hours per year
>> and restart cycle can take houre or more (checking and initialize hardware
>> components). If you will try say for admin this kind system "restat your
>
> Enterprise users generally get kernels from vendors who have done the
> analysis of needs for them (and hopefully got it right). They generally
> don't ftp 2.6.8.1 type make config and try it out.

But using things like KProbe (which is similar to DTrace) will allow
many Linux developers to find bugs in 2.6.8.1 more easily :)
Are they enterprise users? Is this subject really tied _only_ to 
"enterprise users" or "vendors"? IMO no.
And again: DTrace *isn't only for the kernel* (current KProbe too) and
it is much more than a tracing tool, so it is important for developers 
and non-developers alike.
So it would be good to stop discussing "yes or no ?" and start
discussing "how and when in Linux ?" ..

>> For bring few levels up kernel *development* speed on finding some bugs
>> source and measure some consequences adding/modifing some part of code
>> it will be good have two very well prepared on Solaris things:
>> - crash dumps,
>> - DTrace.
>
> We have crash dumps - at least all the enterprise vendors do. Linus
> doesn't seem to like that stuff so much.

Linux crash dumps (CD) are a very long way behind Solaris crash dumps ..
Yes, they work, but compared to the state of things on Solaris, as my 
friend says, "ca~ it only works".
They need some more advanced additional tools for analyzing and
reporting data from the dump.

To Linus: things like CD or KProbe/DProbe allow a problem to be caught 
by an unskilled person and analyzed by someone else in a different 
location much more easily than now. If OProfile is now integrated in the 
source tree, it would be good to integrate things like KProbes and CD 
ASAP as well.
It does not just add entropy to the kernel tree. IMO KProbe can bring
some functionality which can also be shared with OProfile, and probably 
in future IMO OProfile can be dropped.

kloczek
-- 
-----------------------------------------------------------
*People don't have problems, they just create them for themselves*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21  6:03   ` Tomasz Kłoczko
  2004-08-21  6:12     ` David S. Miller
  2004-08-21 12:12     ` Julien Oster
@ 2004-08-22 11:35     ` Alan Cox
  2004-08-22 18:27       ` Tomasz Kłoczko
  2004-08-23 19:48       ` Robert Milkowski
  2 siblings, 2 replies; 87+ messages in thread
From: Alan Cox @ 2004-08-22 11:35 UTC (permalink / raw)
  To: Tomasz Kłoczko; +Cc: Julien Oster, Miles Lane, Linux Kernel Mailing List

On Sad, 2004-08-21 at 07:03, Tomasz Kłoczko wrote:
> Again: DTrace is *ALSO* admininstration tool and this is why I can't
> understand why in IBM KProbe/DProbe patch it is marked as "depends on
> DEBUG_KERNEL" which is IMO bigest mistake on thinking level about this :>

Because it's a debugging feature

> In Solaris DTrace is enabled in _normal production_ kernel and you can 
> hang any probe or probes set without restarting system or any runed
> application which was compiled withoud debug info.

Solaris only runs on large computers. You don't want kprobes randomly on
your phone, pda, wireless router. Solaris deals with an extremely narrow
market segment of "big computers for people with lots of money".

> [1] Remember: if you want profile some part of code you mast _first_
> (re)compile them with profiling enabled. If you wand debug some code

OProfile doesn't require this. 

> Some enterprise systems have limited summary time to few hours per year 
> and restart cycle can take houre or more (checking and initialize hardware
> components). If you will try say for admin this kind system "restat your

Enterprise users generally get kernels from vendors who have done the
analysis of needs for them (and hopefully got it right). They generally
don't ftp 2.6.8.1 type make config and try it out.

> For bring few levels up kernel *development* speed on finding some bugs 
> source and measure some consequences adding/modifing some part of code
> it will be good have two very well prepared on Solaris things:
> - crash dumps,
> - DTrace.

We have crash dumps - at least all the enterprise vendors do. Linus
doesn't seem to like that stuff so much.

> When you see some strange behavior without system destabization 
> current/cassic Linux kernel development look now like:
> 1) if you have good kernel knowledge you can imagine which part of them
>     is source of same observed strange behavior,
> 2) after looking on kernel source you can cut of area to part where bug
>     exist,
> 3) after few recompilations you can say in which area bug exist and after
>     few other iteration stage 3) you can say what maust be fixed.

Actually I generally
-	Glance across the load meters
-	Ask ps where everything is waiting
-	Potentially turn on oprofile
-	Potentially use netfilter to see who is causing all my traffic
-	Maybe strace a few apps to see what is up
-	Educate the user concerned (if needed)
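
As a rough illustration of the "ask ps" step, here is a Python sketch
that tallies process states from /proc/<pid>/stat on Linux; many
processes in state D usually point at I/O waits. It assumes the
documented stat layout (pid, then the command name in parentheses, then
the state) and covers only a small part of what ps reports. The function
names are made up for this sketch.

```python
import os
from collections import Counter


def parse_stat(line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split at the last ')' rather than on whitespace.
    left, right = line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid), comm, state


def process_states(proc="/proc"):
    """Tally process states (R, S, D, Z, ...) for all readable processes."""
    states = Counter()
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        path = os.path.join(proc, entry, "stat")
        try:
            with open(path, errors="replace") as f:
                _, _, state = parse_stat(f.read())
        except OSError:
            continue   # process exited while we were scanning
        states[state] += 1
    return states


if __name__ == "__main__":
    if os.path.isdir("/proc"):
        print(dict(process_states()))
```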

I've already got the symbols (and they are in the debuginfo package from
the rpm build too). I could insmod kgdb but that level of
debugging is generally inappropriate. 

DTrace's value is its ease of use.

> http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
> Bryan blog is also yet another Dtrace knowledge source ..

Coo I thought only the Sun CEO spent his life making inappropriate
comments 8)


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21 12:12     ` Julien Oster
  2004-08-21 13:27       ` Tomasz Kłoczko
@ 2004-08-21 21:49       ` Bryan Cantrill
  2004-08-23 23:08         ` Christoph Halder
  1 sibling, 1 reply; 87+ messages in thread
From: Bryan Cantrill @ 2004-08-21 21:49 UTC (permalink / raw)
  To: Julien Oster; +Cc: kloczek, usenet-20040502, miles.lane, linux-kernel, bmc


> > PS. Very interesting commens about this thread is on Bryan Cantrill
> > (DTrace developer) blog:
> > http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
> > Bryan blog is also yet another Dtrace knowledge source ..
> 
> Oh, yeah, great. A whole blog entry dedicated to me. Now I am a moron,
> absolutely clueless and I am "looking to confirm preconceived notions
> rather than understand new technology".
> 
> Sorry, but that goes a little too far. No, I didn't try out dtrace
> and, right after reading the article (and that's the important thing!)
> I didn't seek for further information about it, I'm not a Solaris
> System Administrator right now (I was, some years ago). And all I was
> saying is that this *article* was just ridiculous.

As a reminder, you _didn't_ read the entire article, by your own admission:

  That article is way too hypey...  I couldn't even read it completely,
  it was just too much.

And that was actually my point:  you spent more time denigrating the article
for lack of supporting detail than it would have taken you to _finish_ the
article and discover those supporting details.  

> But in that article, I was just missing the objectiveness. A quick
> note about the fact that Sun's been introducing dtrace for Solaris 10
> and what it is, what it does, would have been much better instead of
> talking about a "Cantrill explosion", how "DTrace has completely
> changed the way I do business" (actual quotes).

You don't like customer quotes?  It seems to me that quotes like that
one (or like the other customer quotes that appear in the article)
give weight to the claims.  Don't you like to hear from people who have
actually _used_ a technology?  I know that I do -- those who have used
a technology are likely to have a much more balanced view on its
strengths and weaknesses than those who have just read about it.  (Indeed,
this is true of pretty much anything -- experience matters.)

> Florian and Alan told me in a quick and objective manner why dtrace is
> a good thing, and I am glad for that information. I never stated that
> DTrace was a bad thing. 

Oh really?  You wrote:

  Come on, it's profiling. As presented by that article, it is even more
  micro optimization than one would think. What with tweaking the disk
  I/O improvements and all... If my harddisk accesses were a microsecond
  more immediate or my filesystem giving a quantum more transfer rate,
  it would be nice, but I certainly wouldn't get enthusiastic and I bet
  nobody would even notice.

So come on, yourself:  it's _not_ just "profiling" and DTrace finds
problems that are _much_ larger than "micro-optimizations" (indeed, that's
the whole damn point), and finishing the article would have told you that.

> Bryan Cantrill, I can understand that you have to defend DTrace. But
> please, PLEASE stop saying that I am a clueless moron if I wasn't even
> ranting about you, ranting about DTrace, but just about *that single
> article* and it's presentation of DTrace to me.

You were attacking more than just the article; you ended with:

  I sure hope that article is meant sarcastically. By the way, did I
  miss something or is profiling suddenly a new thing again?

You asked if you were missing something, and I replied that you were
missing plenty.  Presumably you now feel informed (if a little embarrassed),
and I think that those that you misinformed also now realize that what
you provided them was misinformation.  So as far as I'm concerned, that's
the end of that.

	- Bryan

--------------------------------------------------------------------------
Bryan Cantrill, Solaris Kernel Development.       http://blogs.sun.com/bmc

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21 12:12     ` Julien Oster
@ 2004-08-21 13:27       ` Tomasz Kłoczko
  2004-08-21 21:49       ` Bryan Cantrill
  1 sibling, 0 replies; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-21 13:27 UTC (permalink / raw)
  To: Julien Oster; +Cc: Julien Oster, Miles Lane, linux-kernel, Bryan Cantrill

On Sat, 21 Aug 2004, Julien Oster wrote:
[..]
>> PS. Very interesting comments about this thread are on Bryan Cantrill's
>> (DTrace developer) blog:
>> http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
>> Bryan's blog is also yet another DTrace knowledge source ..
>
> Oh, yeah, great. A whole blog entry dedicated to me. Now I am a moron,
> absolutely clueless and I am "looking to confirm preconceived notions
> rather than understand new technology".
>
> Sorry, but that goes a little too far. No, I didn't try out dtrace
> and, right after reading the article (and that's the important thing!)
> I didn't seek for further information about it, I'm not a Solaris
> System Administrator right now (I was, some years ago). And all I was
> saying is that this *article* was just ridiculous.

s/DTrace/<something_else>/ .. and yes, the same holds in any other case: if
you have never used this <something_else> and try to say publicly what it
is, you may not be a moron, but your comment will ~100% read _like_ a
moron's comment :_)
Why does your comment read like that in this case? Because DTrace is the
result of many hundreds of hours of work by many, many people (probably not
only from Sun and not only developers) .. it is probably the biggest
innovation in the operating system world in the last few years.

It is very hard to describe in a short article what DTrace is and what it
is not .. and I can understand why people like you, after reading some short
text, will see in it *only* a tracing tool or *only* a profiling tool (*only*
the tools they already know) .. simply because a tool like DTrace partially
creates a new class of tools.
You may know a debugger, a profiler, some other (static) tracing tool and
maybe some tools for measuring interesting parameters, but DTrace isn't a
simple combination of the above, because it has programming abilities. For
example, you can add an expression that decides when, and which of some
bigger set of parameters/points, must be traced or not .. all depending on
the current program/kernel state.

DTrace's power isn't in the ability to hook atomic probes, but in combining
this with a very small but smart/powerful programmable VM, in collecting
data in a few useful forms (tables [1], hashes ..) and in reporting all this
in readable form. This is why the current Linux KProbe/DProbe isn't as
useful as the current Solaris DTrace.

[1] simple tables, or tables indexed by probe results or even by the
current stack path or another vector of variables.
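
To make the two ideas above concrete (an expression that decides per probe
firing whether to record it, and a table indexed by a vector of values), here
is a rough Python sketch. It only illustrates the concept and is not DTrace
syntax or a DTrace API; the event format, the probe names and the helper
functions are all invented for the example.

```python
from collections import Counter

# Hypothetical probe firings: (probe name, process name, stack path, bytes).
# The event format and every name below are invented for illustration.
EVENTS = [
    ("io:start", "mysqld", ("sys_write", "vfs_write"), 4096),
    ("io:start", "cron", ("sys_read", "vfs_read"), 512),
    ("io:start", "mysqld", ("sys_write", "vfs_write"), 8192),
    ("syscall:entry", "mysqld", ("sys_getpid",), 0),
]


def aggregate(events, predicate, key):
    """Count the events that pass `predicate`, grouped by `key(event)`."""
    table = Counter()
    for event in events:
        if predicate(event):  # decide per firing whether to record it
            table[key(event)] += 1
    return table


def big_io(event):
    """Predicate: only I/O probes that moved at least 1024 bytes."""
    probe, _proc, _stack, nbytes = event
    return probe == "io:start" and nbytes >= 1024


def by_proc_and_stack(event):
    """Table index: the vector (process name, stack path)."""
    _probe, proc, stack, _nbytes = event
    return (proc, stack)


if __name__ == "__main__":
    table = aggregate(EVENTS, big_io, by_proc_and_stack)
    for (proc, stack), count in table.items():
        print(proc, " <- ".join(reversed(stack)), count)
```

Changing the question means changing only `big_io` or `by_proc_and_stack`;
the instrumented code itself stays untouched.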

kloczek
-- 
-----------------------------------------------------------
*Ludzie nie mają problemów, tylko sobie sami je stwarzają*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21  6:03   ` Tomasz Kłoczko
  2004-08-21  6:12     ` David S. Miller
@ 2004-08-21 12:12     ` Julien Oster
  2004-08-21 13:27       ` Tomasz Kłoczko
  2004-08-21 21:49       ` Bryan Cantrill
  2004-08-22 11:35     ` Alan Cox
  2 siblings, 2 replies; 87+ messages in thread
From: Julien Oster @ 2004-08-21 12:12 UTC (permalink / raw)
  To: Tomasz Kłoczko
  Cc: Julien Oster, Miles Lane, linux-kernel, Bryan Cantrill

Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> writes:

Hello Tomasz,

> You probably tried DTrace for even less than 5 minutes :->

No, I didn't. I clearly referred to that article, not to dtrace itself.

> PS. Very interesting comments about this thread are on Bryan Cantrill's
> (DTrace developer) blog:
> http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
> Bryan's blog is also yet another DTrace knowledge source ..

Oh, yeah, great. A whole blog entry dedicated to me. Now I am a moron,
absolutely clueless and I am "looking to confirm preconceived notions
rather than understand new technology".

Sorry, but that goes a little too far. No, I didn't try out dtrace
and, right after reading the article (and that's the important thing!)
I didn't seek for further information about it, I'm not a Solaris
System Administrator right now (I was, some years ago). And all I was
saying is that this *article* was just ridiculous.

Please read this paragraph of my response to it again:

| Maybe, without that article, I would recognize it as a fine thing
| (and by "fine" I don't mean "the best thing since sliced bread"),
| but that piece of text was just too ridiculous to take anything
| seriously.

That should make it obvious, shouldn't it? I would have written the
same thing if I had read a similar article about, for example, vmware, UML
or valgrind - and I really think those are great inventions.

But in that article, I was just missing the objectiveness. A quick
note about the fact that Sun's been introducing dtrace for Solaris 10
and what it is, what it does, would have been much better instead of
talking about a "Cantrill explosion", how "DTrace has completely
changed the way I do business" (actual quotes).

Florian and Alan told me in a quick and objective manner why dtrace is
a good thing, and I am glad for that information. I never stated that
DTrace was a bad thing. I repeat - if I had any use for it, and maybe I
will in the future - it looks like I would consider DTrace a
very nice thing to have. From the (non-insulting) replies I got, I
understood that DTrace actually is one.

Bryan Cantrill, I can understand that you have to defend DTrace. But
please, PLEASE stop saying that I am a clueless moron if I wasn't even
ranting about you, ranting about DTrace, but just about *that single
article* and its presentation of DTrace to me.

And then all those comments about Linux users and developers being
very defensive about DTrace... heck, can't I even criticize the
quality of an ARTICLE without being accused of being a Linux maniac
who fights against Solaris? I am using Solaris myself. Some years
ago, I was a System Administrator for - guess what - Solaris
machines. The last thing I want to step into is a religious war
between Solaris and Linux. Does that mean I am not allowed to express
my opinion about the public press anymore?

I have read about dtrace now, and I think it's a good invention. If the two
of us ever met, I might say to you "Bryan, I
like DTrace very much". If I were administering Solaris systems right
now, I would probably even say: "Bryan, DTrace has helped me a
lot". But I would STILL say that that article was CRAP, because that
is just what that article is from my point of view.

Now, I really don't know how to make this any more clear.

Julien

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21  6:12     ` David S. Miller
@ 2004-08-21  6:22       ` Tomasz Kłoczko
  0 siblings, 0 replies; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-21  6:22 UTC (permalink / raw)
  To: David S. Miller; +Cc: usenet-20040502, miles.lane, linux-kernel

On Fri, 20 Aug 2004, David S. Miller wrote:

> On Sat, 21 Aug 2004 08:03:10 +0200 (CEST)
> Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:
>
>> [1] Remember: if you want to profile some part of the code you must _first_
>> (re)compile it with profiling enabled.
>
> If you use oprofile or valgrind, no you don't.

Of course .. but oprofile can't be used in the areas where DTrace can be :)

kloczek
-- 
-----------------------------------------------------------
*Ludzie nie mają problemów, tylko sobie sami je stwarzają*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-21  6:03   ` Tomasz Kłoczko
@ 2004-08-21  6:12     ` David S. Miller
  2004-08-21  6:22       ` Tomasz Kłoczko
  2004-08-21 12:12     ` Julien Oster
  2004-08-22 11:35     ` Alan Cox
  2 siblings, 1 reply; 87+ messages in thread
From: David S. Miller @ 2004-08-21  6:12 UTC (permalink / raw)
  To: Tomasz Kłoczko; +Cc: usenet-20040502, miles.lane, linux-kernel

On Sat, 21 Aug 2004 08:03:10 +0200 (CEST)
Tomasz Kłoczko <kloczek@rudy.mif.pg.gda.pl> wrote:

> [1] Remember: if you want to profile some part of the code you must _first_
> (re)compile it with profiling enabled.

If you use oprofile or valgrind, no you don't.


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 23:23 ` Julien Oster
  2004-08-19 22:33   ` Alan Cox
  2004-08-20  0:23   ` Florian Weimer
@ 2004-08-21  6:03   ` Tomasz Kłoczko
  2004-08-21  6:12     ` David S. Miller
                       ` (2 more replies)
  2004-08-31 20:16   ` Timothy Miller
  3 siblings, 3 replies; 87+ messages in thread
From: Tomasz Kłoczko @ 2004-08-21  6:03 UTC (permalink / raw)
  To: Julien Oster; +Cc: Miles Lane, linux-kernel

On Fri, 20 Aug 2004, Julien Oster wrote:

> Miles Lane <miles.lane@comcast.net> writes:
>
>> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
>> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix
>> and Linux."
>
> That article is way too hypey.
>
> It sounds like one of those strange American commercials you see
> sometimes at night, where two overenthusiastic people are telling you
> how much that strange fruit juice machine has changed their lives,
> by making them lose 200 pounds in 6 days and improving their
> performance at beach volleyball a lot due to subneutronic antigravity
> manipulation. You usually can't watch those commercials for longer
> than 5 minutes.

You probably tried DTrace for even less than 5 minutes :->

> The same applies to that article, I couldn't even read it completely,
> it was just too much.
>
> And is it just me or did that article really take that long to
> mention what dtrace actually IS?
>
> Come on, it's profiling.

Bullsh* :-|

DTrace is a development tool for kernel and *application* developers
because it has some tracing functionality .. but that is *ALSO*, not mainly.
It can *ALSO* be used as a profiling tool, but this functionality is a small
piece of it, and the most exciting part of DTrace lies outside both of the
above areas. DTrace can *ALSO* bring classic profiling/tracing tasks to a
new, much more effective level (see [1] below).
In many cases where I have tried DTrace, and in cases I have heard about
from my friends, it was used for normal *administration* tasks (see, for
example, the SysAdmin DTrace article, which opens with an example of how
DTrace can be used to find the source of a system misconfiguration), where
previously the various {k,vm,prt,net,io,nfs,lock,...}stat tools were used.
In many cases it can answer the same administrator's questions where those
tools can't be used in a simple way, or where answering a question would
take so much time that many people would say "huh .. _maybe_ I will try
this later".

Again: DTrace is *ALSO* an administration tool, and this is why I can't
understand why the IBM KProbe/DProbe patch is marked as "depends on
DEBUG_KERNEL", which is IMO the biggest mistake in the thinking about this :>

In Solaris, DTrace is enabled in the _normal production_ kernel, and you can
attach any probe or set of probes without restarting the system or any
running application, even one compiled without debug info.

[1] Remember: if you want to profile some part of the code you must _first_
(re)compile it with profiling enabled. If you want to debug some code, you
must first (re)compile it with debug info generation enabled. All this takes
time .. and in all these cases you must restart the system or the traced
application (which also takes time).
In many cases (even non-trivial ones), using DTrace you can perform
tracing/profiling/measuring _without recompiling and also without_
_restarting_ the running code (which is *very valuable*) .. all this in a
few seconds.
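
As a loose analogy only: the difference between building instrumentation
into the code and attaching it from outside can be shown in Python with the
standard `sys.setprofile` hook. The sketch below attaches a probe to a
function that contains no tracing code, observes one call and detaches
again. It says nothing about how DTrace itself is implemented; `work`,
`probe` and `run_with_probe` are names made up for the example.

```python
import sys
from collections import Counter

calls = Counter()


def probe(frame, event, arg):
    """Profile hook: count entries into Python functions by name."""
    if event == "call":
        calls[frame.f_code.co_name] += 1


def work(n):
    """Stand-in for already-deployed code; it contains no tracing hooks."""
    return sum(range(n))


def run_with_probe(func, *args):
    """Attach the probe, run func, then detach the probe again."""
    sys.setprofile(probe)
    try:
        return func(*args)
    finally:
        sys.setprofile(None)


if __name__ == "__main__":
    work(10)                  # not observed: no probe attached yet
    run_with_probe(work, 10)  # observed
    work(10)                  # not observed: probe detached again
    print(calls["work"])
```

The function under observation is never edited or reloaded; only the hook
is switched on and off around it.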

Some enterprise systems are limited to a few hours of total downtime per
year, and a restart cycle can take an hour or more (checking and
initializing hardware components). If you tell the admin of such a system
"restart your system using this kernel image, which has some additional
printk() calls enabled, and show me the syslog output", you will probably
never see this admin again.
Also, many bugs can be observed _only_ on highly loaded systems, where
adding ptrace()-based tracing can even kill the system (tracing with DTrace
takes fewer resources from the system than ptrace()).

To speed up kernel *development* by several levels when finding the source
of bugs and measuring the consequences of adding/modifying some part of the
code, it would be good to have two things that are very well done on Solaris:
- crash dumps,
- DTrace.
On Solaris you can also combine the above (you can generate a crash dump
using a signal from a DTrace program, and you can review DTrace data from a
system crash dump image).

When you see some strange behavior that does not destabilize the system,
current/classic Linux kernel development now looks like this:
1) if you have good kernel knowledge, you can guess which part of it
    is the source of the observed strange behavior,
2) after looking at the kernel source, you can narrow the area down to the
    part where the bug exists,
3) after a few recompilations you can say in which area the bug exists, and
    after a few more iterations of stage 3) you can say what must be fixed.
This is the classic way.

The new way, using DTrace, can look like this:
1) by running DTrace *programs* that you modify a few times (in many cases
    written as one-line command-line parameters), you can locate which part
    of the running code (for example, by what is on the stack) is the source
    of the observed behavior,
2) after locating the bad code area, you can find the source code associated
    with it,
3) and now, from this limited area, you can start thinking about "what
    must I know about the kernel" to find the source of the problem.

The new way is a kind of analytical development. In many more cases it will
allow the source of a bug to be found, and even a better or worse fix to be
prepared, by _not only_ highly experienced kernel developers.
Also, stages 1) and 2) can be performed by someone who is *not a real
developer* (?!).

In the classic way, if you are not skilled you can't even attempt stage 1) :>
Passing stage 3) of the classic variant requires an installed development
environment.

kloczek
PS. Very interesting comments about this thread are on Bryan Cantrill's
(DTrace developer) blog:
http://blogs.sun.com/roller/page/bmc/20040820#dtrace_on_lkml
Bryan's blog is also yet another DTrace knowledge source ..
-- 
-----------------------------------------------------------
*Ludzie nie mają problemów, tylko sobie sami je stwarzają*
-----------------------------------------------------------
Tomasz Kłoczko, sys adm @zie.pg.gda.pl|*e-mail: kloczek@rudy.mif.pg.gda.pl*

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-20 13:46       ` Florian Weimer
@ 2004-08-20 16:46         ` David S. Miller
  0 siblings, 0 replies; 87+ messages in thread
From: David S. Miller @ 2004-08-20 16:46 UTC (permalink / raw)
  To: Florian Weimer; +Cc: alexn, miles.lane, linux-kernel

On Fri, 20 Aug 2004 15:46:33 +0200
Florian Weimer <fw@deneb.enyo.de> wrote:

> > One could quite easily hack up a tool to monitor I/O per process or
> > does it need to be much more precise?
> 
> It would be nice to obtain file names, too.

I bet you could even implement this quite simply as
a new special oprofile event.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-20 13:34     ` Alexander Nyberg
@ 2004-08-20 13:46       ` Florian Weimer
  2004-08-20 16:46         ` David S. Miller
  0 siblings, 1 reply; 87+ messages in thread
From: Florian Weimer @ 2004-08-20 13:46 UTC (permalink / raw)
  To: Alexander Nyberg; +Cc: Miles Lane, linux-kernel

* Alexander Nyberg:

>> Most other system resources can be tracked quite easily: disk space,
>> CPU time, committed address space, even network I/O (with tcpdump and
>> netstat -p).  But there's no such thing for disk I/O.
>
> Why can't this be done by looking at the major faults a process causes?

Because only paging results in major faults, normal I/O with
read()/write() (or the p*() variants) does not.

> One could quite easily hack up a tool to monitor I/O per process or
> does it need to be much more precise?

It would be nice to obtain file names, too.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-20  0:23   ` Florian Weimer
@ 2004-08-20 13:34     ` Alexander Nyberg
  2004-08-20 13:46       ` Florian Weimer
  0 siblings, 1 reply; 87+ messages in thread
From: Alexander Nyberg @ 2004-08-20 13:34 UTC (permalink / raw)
  To: Florian Weimer; +Cc: Miles Lane, linux-kernel

On Fri, 2004-08-20 at 02:23, Florian Weimer wrote:
> * Julien Oster:
> 
> > Miles Lane <miles.lane@comcast.net> writes:
> >
> >> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
> >> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
> >> and Linux."
> >
> > That article is way too hypey.
> 
> Maybe, but DTrace seems to solve one really pressing problem: tracking
> disk I/O to the processes causing it.  Unexplained high I/O
> utilization is a *very* common problem, and there aren't any tools to
> diagnose it.
>
> Most other system resources can be tracked quite easily: disk space,
> CPU time, committed address space, even network I/O (with tcpdump and
> netstat -p).  But there's no such thing for disk I/O.

Why can't this be done by looking at the major faults a process causes?

Not very slick indeed, but it can be done.

I wrote a small silly /proc/pid/stat parser at
http://serkiaden.mine.nu/procextract.c 

One could quite easily hack up a tool to monitor I/O per process or
does it need to be much more precise?
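
For readers who cannot fetch the program linked above, here is a minimal
Python sketch of the same idea. It is not procextract.c; it only assumes
the documented layout of /proc/<pid>/stat, where minflt is field 10 and
majflt is field 12, and the function names are invented for the example.
As pointed out elsewhere in this thread, major faults count paging only,
so plain read()/write() I/O does not show up in these numbers.

```python
def parse_faults(stat_line):
    """Return (minflt, majflt) from one line of /proc/<pid>/stat.

    The command name (field 2) is wrapped in parentheses and may itself
    contain spaces or ')', so split on the last ')' before splitting fields.
    """
    _, rest = stat_line.rsplit(")", 1)
    fields = rest.split()
    # fields[0] is field 3 (state), so field N sits at index N - 3.
    minflt = int(fields[10 - 3])  # field 10: minor faults
    majflt = int(fields[12 - 3])  # field 12: major faults
    return minflt, majflt


def read_faults(pid="self"):
    """Read the fault counters of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as handle:
        return parse_faults(handle.read())


if __name__ == "__main__":
    # Invented sample line so the parser can be tried on any system.
    sample = ("1234 (my (odd) name) S 1 1234 1234 0 -1 4194304 "
              "250 0 7 0 12 3 0 0 20 0 1 0 100 1000 50")
    print(parse_faults(sample))
```

Sampling `read_faults(pid)` twice and subtracting gives the faults a
process caused in the interval, which is the kind of per-process monitor
suggested above.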


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-20 10:08     ` Alex Bennee
@ 2004-08-20 11:21       ` Robert Schwebel
  0 siblings, 0 replies; 87+ messages in thread
From: Robert Schwebel @ 2004-08-20 11:21 UTC (permalink / raw)
  To: Linux Kernel Mailing List

On Fri, Aug 20, 2004 at 11:08:21AM +0100, Alex Bennee wrote:
> Well profiling for user space developers. Certainly for embedded "soft
> realtime" work I've found LTT really useful in understanding where the
> contentions were in my user-space code. And also why the old pthread
> mutex didn't work well with SCHED_RT priorities :-(
> 
> If it was my choice I'd like to see LTT merged, but of course it's not
> all about me much as I wish it was ;-)

Same here - LTT turned out to be a useful tool for debugging performance
issues in several USB gadget drivers. Sure, it would have been possible
to instrument the boxes with the scope and trace things in hardware, but
with LTT it was much easier to handle. 

Robert
-- 
 Dipl.-Ing. Robert Schwebel | http://www.pengutronix.de
 Pengutronix - Linux Solutions for Science and Industry
   Handelsregister:  Amtsgericht Hildesheim, HRA 2686
     Hornemannstraße 12,  31137 Hildesheim, Germany
    Phone: +49-5121-28619-0 |  Fax: +49-5121-28619-4

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 22:33   ` Alan Cox
@ 2004-08-20 10:08     ` Alex Bennee
  2004-08-20 11:21       ` Robert Schwebel
  0 siblings, 1 reply; 87+ messages in thread
From: Alex Bennee @ 2004-08-20 10:08 UTC (permalink / raw)
  To: Alan Cox; +Cc: Julien Oster, Miles Lane, Linux Kernel Mailing List

On Thu, 2004-08-19 at 23:33, Alan Cox wrote:
> On Gwe, 2004-08-20 at 00:23, Julien Oster wrote:
> > Come on, it's profiling. As presented by that article, it is even more
> <snip>
> "Profiling for the people" as it were.. (as opposed to the current
> fad of 'profiling the people')

Well profiling for user space developers. Certainly for embedded "soft
realtime" work I've found LTT really useful in understanding where the
contentions were in my user-space code. And also why the old pthread
mutex didn't work well with SCHED_RT priorities :-(

If it was my choice I'd like to see LTT merged, but of course it's not
all about me much as I wish it was ;-)

-- 
Alex, Kernel Hacker: http://www.bennee.com/~alex/

Assembly language experience is [important] for the maturity
and understanding of how computers work that it provides.
		-- D. Gries


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 23:23 ` Julien Oster
  2004-08-19 22:33   ` Alan Cox
@ 2004-08-20  0:23   ` Florian Weimer
  2004-08-20 13:34     ` Alexander Nyberg
  2004-08-21  6:03   ` Tomasz Kłoczko
  2004-08-31 20:16   ` Timothy Miller
  3 siblings, 1 reply; 87+ messages in thread
From: Florian Weimer @ 2004-08-20  0:23 UTC (permalink / raw)
  To: Miles Lane; +Cc: linux-kernel

* Julien Oster:

> Miles Lane <miles.lane@comcast.net> writes:
>
>> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
>> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
>> and Linux."
>
> That article is way too hypey.

Maybe, but DTrace seems to solve one really pressing problem: tracking
disk I/O to the processes causing it.  Unexplained high I/O
utilization is a *very* common problem, and there aren't any tools to
diagnose it.

Most other system resources can be tracked quite easily: disk space,
CPU time, committed address space, even network I/O (with tcpdump and
netstat -p).  But there's no such thing for disk I/O.

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 22:22 Miles Lane
  2004-08-19 23:01 ` Karim Yaghmour
@ 2004-08-19 23:23 ` Julien Oster
  2004-08-19 22:33   ` Alan Cox
                     ` (3 more replies)
  1 sibling, 4 replies; 87+ messages in thread
From: Julien Oster @ 2004-08-19 23:23 UTC (permalink / raw)
  To: Miles Lane; +Cc: linux-kernel

Miles Lane <miles.lane@comcast.net> writes:

> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
> and Linux."

That article is way too hypey.

It sounds like one of those strange American commercials you see
sometimes at night, where two overenthusiastic people are telling you
how much that strange fruit juice machine has changed their lives,
by making them lose 200 pounds in 6 days and improving their
performance at beach volleyball a lot due to subneutronic antigravity
manipulation. You usually can't watch those commercials for longer
than 5 minutes.

The same applies to that article, I couldn't even read it completely,
it was just too much.

And is it just me or did that article really take that long to
mention what dtrace actually IS?

Come on, it's profiling. As presented by that article, it is even more
micro optimization than one would think. What with tweaking the disk
I/O improvements and all... If my harddisk accesses were a microsecond
more immediate or my filesystem giving a quantum more transfer rate,
it would be nice, but I certainly wouldn't get enthusiastic and I bet
nobody would even notice.

Maybe, without that article, I would recognize it as a fine thing (and
by "fine" I don't mean "the best thing since sliced bread"), but that
piece of text was just too ridiculous to take anything seriously.

I sure hope that article is meant sarcastically. By the way, did I
miss something or is profiling suddenly a new thing again?

Regards,
Julien

^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 22:22 Miles Lane
@ 2004-08-19 23:01 ` Karim Yaghmour
  2004-08-19 23:23 ` Julien Oster
  1 sibling, 0 replies; 87+ messages in thread
From: Karim Yaghmour @ 2004-08-19 23:01 UTC (permalink / raw)
  To: Miles Lane
  Cc: linux-kernel, Thomas Zanussi, Richard J Moore, Robert Wisniewski,
	Michel Dagenais


Miles Lane wrote:
> http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:
> 
> "Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
> and Linux."

We've been pushing for the inclusion of the Linux Trace Toolkit in the kernel
for the past 5 years. As of late, it seems that the pending argument against
its inclusion is: How is this useful to end users? In answer to that, I had
already posted the same pointer as above:
http://marc.theaimsgroup.com/?l=linux-kernel&m=108938594031379&w=2

Since then, I've had the chance to discuss this matter at the Kernel Summit,
and again I was told that this was a sales problem (i.e. it must be
demonstrated that this is actually useful to users.) So, as the developers
of the Linux Trace Toolkit, it would help us a lot if you could explain to
this list why the sort of functionality provided by DTrace is something you
would personally find useful.

Thanks,

Karim
-- 
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || karim@opersys.com || 1-866-677-4546


^ permalink raw reply	[flat|nested] 87+ messages in thread

* Re: DTrace-like analysis possible with future Linux kernels?
  2004-08-19 23:23 ` Julien Oster
@ 2004-08-19 22:33   ` Alan Cox
  2004-08-20 10:08     ` Alex Bennee
  2004-08-20  0:23   ` Florian Weimer
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 87+ messages in thread
From: Alan Cox @ 2004-08-19 22:33 UTC (permalink / raw)
  To: Julien Oster; +Cc: Miles Lane, Linux Kernel Mailing List

On Gwe, 2004-08-20 at 00:23, Julien Oster wrote:
> Come on, it's profiling. As presented by that article, it is even more
> micro optimization than one would think. What with tweaking the disk
> I/O improvements and all... If my harddisk accesses were a microsecond
> more immediate or my filesystem giving a quantum more transfer rate,
> it would be nice, but I certainly wouldn't get enthusiastic and I bet
> nobody would even notice.

Yes and no. LTT is just profiling too. Both of them are making people
notice because they allow that system profiling work to be done by 
two-banana grade operations staff not by the company wizard.

Neither are perfect - users want a "why is it going slow button"
with a "make it work" and "beat the crap out of the user who caused it"
option set.

"Profiling for the people" as it were.. (as opposed to the current
fad of 'profiling the people')

Alan


^ permalink raw reply	[flat|nested] 87+ messages in thread

* DTrace-like analysis possible with future Linux kernels?
@ 2004-08-19 22:22 Miles Lane
  2004-08-19 23:01 ` Karim Yaghmour
  2004-08-19 23:23 ` Julien Oster
  0 siblings, 2 replies; 87+ messages in thread
From: Miles Lane @ 2004-08-19 22:22 UTC (permalink / raw)
  To: linux-kernel

http://www.theregister.co.uk/2004/07/08/dtrace_user_take/:

"Sun sees DTrace as a big advantage for Solaris over other versions of Unix 
and Linux."


^ permalink raw reply	[flat|nested] 87+ messages in thread

end of thread, other threads:[~2004-09-02 15:42 UTC | newest]

Thread overview: 87+ messages
     [not found] <2ptdY-42Y-55@gated-at.bofh.it>
     [not found] ` <2uPdM-380-11@gated-at.bofh.it>
     [not found]   ` <2uUwL-6VP-11@gated-at.bofh.it>
     [not found]     ` <2uWfh-8jo-29@gated-at.bofh.it>
     [not found]       ` <2uXl0-Gt-27@gated-at.bofh.it>
     [not found]         ` <2vge2-63k-15@gated-at.bofh.it>
     [not found]           ` <2vgQF-6Ai-39@gated-at.bofh.it>
     [not found]             ` <2vipq-7O8-15@gated-at.bofh.it>
     [not found]               ` <2vj2b-8md-9@gated-at.bofh.it>
     [not found]                 ` <2vDtS-bq-19@gated-at.bofh.it>
2004-08-21 15:01                   ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Pascal Schmidt
2004-08-21 15:57                     ` Joerg Schilling
2004-08-21 21:42                       ` Pascal Schmidt
2004-08-22 11:56                       ` Joerg Schilling
2004-08-22 12:14                         ` Joerg Schilling
2004-08-22 12:52                           ` Patrick McFarland
2004-08-22 13:05                             ` Joerg Schilling
2004-08-22 16:38                               ` Horst von Brand
2004-08-22 15:11                           ` Horst von Brand
2004-08-22 18:09                             ` Matthias Andree
2004-08-22 13:13                         ` Pascal Schmidt
2004-08-22 16:00                           ` Christer Weinigel
2004-08-22 16:32                             ` Joerg Schilling
2004-08-22 17:18                               ` Christer Weinigel
2004-08-22 19:22                                 ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
2004-08-22 20:27                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Giuseppe Bilotta
2004-08-22 21:29                               ` Julien Oster
2004-08-23 11:40                                 ` Joerg Schilling
2004-08-23 13:15                                   ` Matthias Andree
2004-08-23 18:16                               ` Kai Makisara
2004-08-24 10:22                                 ` Christer Weinigel
2004-08-24 15:34                                 ` Joerg Schilling
2004-08-22 16:33                             ` Christer Weinigel
2004-08-22 16:19                               ` Alan Cox
2004-08-22 17:31                                 ` Christer Weinigel
2004-08-22 20:47                                   ` Alan Cox
2004-08-22 22:17                                     ` Christer Weinigel
2004-08-23 12:22                                 ` Adam Sampson
2004-08-22 19:26                             ` Tonnerre
2004-08-22 20:14                               ` DTrace-like analysis possible with future Linux kernels? Joerg Schilling
2004-08-22 20:33                                 ` Tonnerre
2004-08-22 20:38                                   ` Alan Cox
2004-08-22 20:43                                   ` Joerg Schilling
2004-08-22 21:37                                     ` Christer Weinigel
2004-08-23 11:44                                       ` Joerg Schilling
2004-08-23 17:40                                 ` Horst von Brand
2004-08-23 20:25                               ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Bill Davidsen
2004-08-23 21:01                                 ` Doug Maxey
2004-08-25 18:29                                   ` Bill Davidsen
2004-08-24  2:22                                 ` Nuno Silva
2004-08-31 22:22                             ` (was: Re: PATCH: cdrecord: avoiding scsi device numbering for ide devices) John Myers
2004-09-02  9:44                               ` Joerg Schilling
2004-09-02 13:49                                 ` John Myers
2004-09-02 15:40                                   ` Joerg Schilling
2004-08-22 21:27                           ` PATCH: cdrecord: avoiding scsi device numbering for ide devices Julien Oster
     [not found] <2wAWW-12a-11@gated-at.bofh.it>
2004-08-24 13:04 ` DTrace-like analysis possible with future Linux kernels? Pascal Schmidt
2004-08-24 13:07   ` Joerg Schilling
2004-08-24  4:14 Joerg Schilling
2004-08-28 19:15 ` Alan Cox
     [not found] <2v3Ad-5tc-29@gated-at.bofh.it>
     [not found] ` <2v4w9-6aQ-5@gated-at.bofh.it>
     [not found]   ` <2vxeJ-4kg-3@gated-at.bofh.it>
     [not found]     ` <2vZNN-7AT-33@gated-at.bofh.it>
     [not found]       ` <2w5q4-34M-1@gated-at.bofh.it>
     [not found]         ` <2w9Dq-65C-13@gated-at.bofh.it>
2004-08-23 18:19           ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2004-08-19 22:22 Miles Lane
2004-08-19 23:01 ` Karim Yaghmour
2004-08-19 23:23 ` Julien Oster
2004-08-19 22:33   ` Alan Cox
2004-08-20 10:08     ` Alex Bennee
2004-08-20 11:21       ` Robert Schwebel
2004-08-20  0:23   ` Florian Weimer
2004-08-20 13:34     ` Alexander Nyberg
2004-08-20 13:46       ` Florian Weimer
2004-08-20 16:46         ` David S. Miller
2004-08-21  6:03   ` Tomasz Kłoczko
2004-08-21  6:12     ` David S. Miller
2004-08-21  6:22       ` Tomasz Kłoczko
2004-08-21 12:12     ` Julien Oster
2004-08-21 13:27       ` Tomasz Kłoczko
2004-08-21 21:49       ` Bryan Cantrill
2004-08-23 23:08         ` Christoph Halder
2004-08-22 11:35     ` Alan Cox
2004-08-22 18:27       ` Tomasz Kłoczko
2004-08-22 18:46         ` Alan Cox
2004-08-23 17:34           ` Tomasz Kłoczko
2004-08-22 23:03         ` John Levon
2004-08-23 19:48       ` Robert Milkowski
2004-08-24  0:39         ` David S. Miller
2004-08-28 19:16         ` Alan Cox
2004-08-29  0:14           ` Tomasz Kłoczko
2004-08-29  5:30             ` David S. Miller
2004-08-29 10:45               ` Tomasz Kłoczko
2004-08-29 17:46                 ` David S. Miller
2004-08-29 10:53               ` Robert Milkowski
2004-08-29 10:29           ` Robert Milkowski
2004-08-31 20:16   ` Timothy Miller
2004-08-07 12:51 Linux Kernel bug report (includes fix) Joerg Schilling
2004-08-07 13:26 ` Måns Rullgård
2004-08-07 19:32   ` Bernd Schubert
2004-08-08  1:18 ` Horst von Brand
2004-08-08  5:22   ` Alexander E. Patrakov
