linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Bad ext3/nfs DoS bug
@ 2006-07-17 13:01 James
  2006-07-17 13:06 ` Marcel Holtmann
                   ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: James @ 2006-07-17 13:01 UTC (permalink / raw)
  To: linux-kernel

I've tried contacting the relevant maintainers directly,
and it's even in the kernel bugzilla, but nothing's happened
and it's been over a month now. No-one seems to be doing anyting 
about this. Is one meant to post this to bugtraq or what?

Here's the bug: http://bugzilla.kernel.org/show_bug.cgi?id=6828
(exploit code follows)

> We found this rather surprising behaviour when debugging a
> network card for one of our embedded systems. There was a
> bus problem that occasionally caused the network card to
> place random data in the outgoing packets. We were using
> NFS root, as we hadn't written drivers for the block
> devices yet, and discovered our Linux NFS servers getting
> ext3 errors. It turned out that the 3com cards we have in
> the servers lie about checking UDP checksums, and passed
> the rubbish to knfsd where it was causing the problem. 
> 
> Here's an example one of our widgets (dcm503) is talking
> to an NFS server (dufftown)
> 
> 17:28:38.535011 dcm503.guralp.local.984095109 > dufftown.guralp.local.nfs: 116 
> lookup fh Unknown/1 "" (DF) (ttl 64, id 0, len 144)
>                          4500 0090 0000 4000 4011 3d45 0a52 01fa
>                          c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985
>                          0000 0000 0000 0002 0001 86a3 0000 0002
>                          0000 0004 0000 0001 0000 001c 028f 5b0c
>                          0000 0006 6463 6d35 3033 0000 0000 0000
>                          0000 0000 0000 0000 0000 0000 0000 0000
>                          0100 0001 0021 0003 3d26 3d00 4a2f ffff
>                          3d00 2c08 c923 0000 0000 0000 0000 0000
>                          0000 0000 000a 6d6f 756e 7470 6f69 6e74
> 
> so what's happened here is 4a2f ffff should have been 4a2f
> xxxx but the network card has missed the clock on the bus
> and gotten ffff instead
> 
> nfsd_dispatch: vers 2 proc 4
> nfsd: LOOKUP   32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9
> nfsd: nfsd_lookup(fh 32: 01000001 03002100 003d263d ffff2f4a 082c003d 
> 000023c9, )
> nfsd: fh_verify(32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9)
> 
> so here the client does a V2 lookup with a DH which has
> gotten screwed up by my clients network card, this is
> received by my server, gets past the UDP checksum code
> (thank you 3com) and ends up at knfsd.
> 
> knfsd passes this to fh_verify which decodes it to be hde3
> and inode 4294913866 (0xffff2f4a)
> 
> that then gets passed to ext3 which then panics.
> 
> EXT3-fs error (device hde3): ext3_get_inode_block: bad inode number: 
> 4294913866
> 
> marks the file system as containing an error, and remounts
> the system read only.
> 
> Obviously this is sub optimal, and a fairly horrid DoS
> since anyone can craft a UDP packet, with a bogus FH in
> it. Whilst this is for V2_LOOKUP it works for all of the
> V2 procedures we tried.
> 

exploit code is available at

http://www.madingley.org/uploaded/crash-nfs.tar.gz

James.





^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-17 13:01 Bad ext3/nfs DoS bug James
@ 2006-07-17 13:06 ` Marcel Holtmann
  2006-07-17 18:43 ` Marcel Holtmann
  2006-07-18  7:55 ` Marcel Holtmann
  2 siblings, 0 replies; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-17 13:06 UTC (permalink / raw)
  To: James; +Cc: linux-kernel

Hi James,

> I've tried contacting the relevant maintainers directly,
> and it's even in the kernel bugzilla, but nothing's happened
> and it's been over a month now. No-one seems to be doing anyting 
> about this. Is one meant to post this to bugtraq or what?

good contact points are security@kernel.org and vendor-sec@lst.de if
this problem has a security implication.

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-17 13:01 Bad ext3/nfs DoS bug James
  2006-07-17 13:06 ` Marcel Holtmann
@ 2006-07-17 18:43 ` Marcel Holtmann
  2006-07-18  7:55 ` Marcel Holtmann
  2 siblings, 0 replies; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-17 18:43 UTC (permalink / raw)
  To: James; +Cc: linux-kernel

Hi James,

> I've tried contacting the relevant maintainers directly,
> and it's even in the kernel bugzilla, but nothing's happened
> and it's been over a month now. No-one seems to be doing anyting 
> about this. Is one meant to post this to bugtraq or what?

please use CVE-2006-3468 for any further reference of this issue.

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-17 13:01 Bad ext3/nfs DoS bug James
  2006-07-17 13:06 ` Marcel Holtmann
  2006-07-17 18:43 ` Marcel Holtmann
@ 2006-07-18  7:55 ` Marcel Holtmann
  2006-07-18 14:56   ` James
  2 siblings, 1 reply; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-18  7:55 UTC (permalink / raw)
  To: James; +Cc: linux-kernel

Hi James,

> I've tried contacting the relevant maintainers directly,
> and it's even in the kernel bugzilla, but nothing's happened
> and it's been over a month now. No-one seems to be doing anyting 
> about this. Is one meant to post this to bugtraq or what?
> 
> Here's the bug: http://bugzilla.kernel.org/show_bug.cgi?id=6828
> (exploit code follows)
> 
> > We found this rather surprising behaviour when debugging a
> > network card for one of our embedded systems. There was a
> > bus problem that occasionally caused the network card to
> > place random data in the outgoing packets. We were using
> > NFS root, as we hadn't written drivers for the block
> > devices yet, and discovered our Linux NFS servers getting
> > ext3 errors. It turned out that the 3com cards we have in
> > the servers lie about checking UDP checksums, and passed
> > the rubbish to knfsd where it was causing the problem. 
> > 
> > Here's an example one of our widgets (dcm503) is talking
> > to an NFS server (dufftown)
> > 
> > 17:28:38.535011 dcm503.guralp.local.984095109 > dufftown.guralp.local.nfs: 116 
> > lookup fh Unknown/1 "" (DF) (ttl 64, id 0, len 144)
> >                          4500 0090 0000 4000 4011 3d45 0a52 01fa
> >                          c0a8 3024 03ff 0801 007c 8e9c 3aa8 1985
> >                          0000 0000 0000 0002 0001 86a3 0000 0002
> >                          0000 0004 0000 0001 0000 001c 028f 5b0c
> >                          0000 0006 6463 6d35 3033 0000 0000 0000
> >                          0000 0000 0000 0000 0000 0000 0000 0000
> >                          0100 0001 0021 0003 3d26 3d00 4a2f ffff
> >                          3d00 2c08 c923 0000 0000 0000 0000 0000
> >                          0000 0000 000a 6d6f 756e 7470 6f69 6e74
> > 
> > so what's happened here is 4a2f ffff should have been 4a2f
> > xxxx but the network card has missed the clock on the bus
> > and gotten ffff instead
> > 
> > nfsd_dispatch: vers 2 proc 4
> > nfsd: LOOKUP   32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9
> > nfsd: nfsd_lookup(fh 32: 01000001 03002100 003d263d ffff2f4a 082c003d 
> > 000023c9, )
> > nfsd: fh_verify(32: 01000001 03002100 003d263d ffff2f4a 082c003d 000023c9)
> > 
> > so here the client does a V2 lookup with a DH which has
> > gotten screwed up by my clients network card, this is
> > received by my server, gets past the UDP checksum code
> > (thank you 3com) and ends up at knfsd.
> > 
> > knfsd passes this to fh_verify which decodes it to be hde3
> > and inode 4294913866 (0xffff2f4a)
> > 
> > that then gets passed to ext3 which then panics.
> > 
> > EXT3-fs error (device hde3): ext3_get_inode_block: bad inode number: 
> > 4294913866
> > 
> > marks the file system as containing an error, and remounts
> > the system read only.
> > 
> > Obviously this is sub optimal, and a fairly horrid DoS
> > since anyone can craft a UDP packet, with a bogus FH in
> > it. Whilst this is for V2_LOOKUP it works for all of the
> > V2 procedures we tried.
> > 
> 
> exploit code is available at
> 
> http://www.madingley.org/uploaded/crash-nfs.tar.gz

so I used your exploit and I could reproduce it on every 2.6 kernel, I
tried so far. However with a 2.4 kernel I see the error messages, but it
doesn't get remounted read-only. Did you run tests with 2.4 kernels?

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-18  7:55 ` Marcel Holtmann
@ 2006-07-18 14:56   ` James
  2006-07-18 15:22     ` Marcel Holtmann
  0 siblings, 1 reply; 32+ messages in thread
From: James @ 2006-07-18 14:56 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: James, linux-kernel

> so I used your exploit and I could reproduce it on every 2.6 kernel, I
> tried so far. 

That must have been a lot of fscks.

> However with a 2.4 kernel I see the error messages, but it
> doesn't get remounted read-only. Did you run tests with 2.4 kernels?

no, I don't have any to hand, but someone is preparing one
now. Is NFS subtree checking on by default in 2.4?

James.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-18 14:56   ` James
@ 2006-07-18 15:22     ` Marcel Holtmann
  2006-07-18 15:23       ` James
  0 siblings, 1 reply; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-18 15:22 UTC (permalink / raw)
  To: James; +Cc: linux-kernel

Hi James,

> > so I used your exploit and I could reproduce it on every 2.6 kernel, I
> > tried so far. 
> 
> That must have been a lot of fscks.

it wasn't that many. For some obvious reasons I only tested the RHEL and
Fedora kernels and vanilla plus stable series.

> > However with a 2.4 kernel I see the error messages, but it
> > doesn't get remounted read-only. Did you run tests with 2.4 kernels?
> 
> no, I don't have any to hand, but someone is preparing one
> now. Is NFS subtree checking on by default in 2.4?

I haven't checked within the code, but the manual page exports(5) states
subtree checking as being on by default. And it doesn't mention any
difference to 2.4 kernels.

What is the reason behind your question? Does disabling subtree checking
changes something?

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-18 15:22     ` Marcel Holtmann
@ 2006-07-18 15:23       ` James
  2006-07-18 20:18         ` Marcel Holtmann
  0 siblings, 1 reply; 32+ messages in thread
From: James @ 2006-07-18 15:23 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: James, linux-kernel

> What is the reason behind your question? Does disabling subtree checking
> changes something?

just that the iget() call which causes the problem in 2.6
is happening in the subtree checking code, not based on
analysis of the flow.

James.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-18 15:23       ` James
@ 2006-07-18 20:18         ` Marcel Holtmann
  2006-07-19  9:28           ` James
  0 siblings, 1 reply; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-18 20:18 UTC (permalink / raw)
  To: James; +Cc: linux-kernel

Hi James,

> > What is the reason behind your question? Does disabling subtree checking
> > changes something?
> 
> just that the iget() call which causes the problem in 2.6
> is happening in the subtree checking code, not based on
> analysis of the flow.

just did a quick test with the RHEL4 kernel (2.6.9 based) and
subtree_check and no_subtree_check export option. No difference and in
both cases it gets remounted read-only.

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-18 20:18         ` Marcel Holtmann
@ 2006-07-19  9:28           ` James
  2006-07-19 15:55             ` Jan Kara
  0 siblings, 1 reply; 32+ messages in thread
From: James @ 2006-07-19  9:28 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: James, linux-kernel

So what happens next? Is the ext3 maintainer on sabatical,
or am I expected to submit a patch to fix this?

J.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-19  9:28           ` James
@ 2006-07-19 15:55             ` Jan Kara
  2006-07-20  4:46               ` Neil Brown
  0 siblings, 1 reply; 32+ messages in thread
From: Jan Kara @ 2006-07-19 15:55 UTC (permalink / raw)
  To: James; +Cc: Marcel Holtmann, linux-kernel, akpm, sct

> So what happens next? Is the ext3 maintainer on sabatical,
> or am I expected to submit a patch to fix this?
  I guess people are mostly busy with OLS and such so maybe they missed
the discussion.. Giving CC to relevant people to catch their attention
:)
  Andrew, Stephen: James has come across a nasty bug (potentially remote
DoS). NFS extracts inode number from a filehandle and the inode number
eventually ends in ext3_read_inode(). Now if the inode number is bogus,
ext3_get_inode_block() calls ext3_error() and filesystem is remounted
RO or whatever else is configured. That is quite undesirable in this
case.
  Now the easy "fix" is to change ext3_error() to ext3_warning() but an
attacker flooding your logs with warnings is probably not good either
and in case the inode number comes from ext3 itself we should really do
ext3_error() as there is some corruption in the fs.
  Better fix would be to add a flag to read_inode() saying that the inode
number is from untrusted source (but that means changing a prototype of a
function every fs uses) and change export_iget() to pass this flag. Yet
another solution would be to make ext3 implement its own get_dentry() export
function and pass the flag internally...
  What do you think is the best solution?

								Honza

-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-19 15:55             ` Jan Kara
@ 2006-07-20  4:46               ` Neil Brown
  2006-07-20 16:06                 ` Jan Kara
  2006-07-21  0:42                 ` Marcel Holtmann
  0 siblings, 2 replies; 32+ messages in thread
From: Neil Brown @ 2006-07-20  4:46 UTC (permalink / raw)
  To: Jan Kara; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct

On Wednesday July 19, jack@suse.cz wrote:
> > So what happens next? Is the ext3 maintainer on sabatical,
> > or am I expected to submit a patch to fix this?
>   I guess people are mostly busy with OLS and such so maybe they missed
> the discussion.. Giving CC to relevant people to catch their attention
> :)
>   Andrew, Stephen: James has come across a nasty bug (potentially remote
> DoS). NFS extracts inode number from a filehandle and the inode number
> eventually ends in ext3_read_inode(). Now if the inode number is bogus,
> ext3_get_inode_block() calls ext3_error() and filesystem is remounted
> RO or whatever else is configured. That is quite undesirable in this
> case.
>   Now the easy "fix" is to change ext3_error() to ext3_warning() but an
> attacker flooding your logs with warnings is probably not good either
> and in case the inode number comes from ext3 itself we should really do
> ext3_error() as there is some corruption in the fs.
>   Better fix would be to add a flag to read_inode() saying that the inode
> number is from untrusted source (but that means changing a prototype of a
> function every fs uses) and change export_iget() to pass this flag. Yet
> another solution would be to make ext3 implement its own get_dentry() export
> function and pass the flag internally...
>   What do you think is the best solution?

I think that a good solution (hard to say if it is the best) is to
remove that error message altogether, and put it where inode numbers
are read out of directories.  Something like the following patch -
compile tested only.

NeilBrown

Avoid triggering ext3_error on bad NFS file handle

The inode number out of an NFS file handle gets passed 
eventually to ext3_get_inode_block without any checking.
If ext3_get_inode_block allows it to trigger a error,
then bad filehandles can have unpleasant effect.

So remove the call to ext3_error there and put a matching
check in ext3/namei.c where inode numbers are read of storage.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/ext3/inode.c |   10 ++++++----
 ./fs/ext3/namei.c |   19 +++++++++++++++++--
 2 files changed, 23 insertions(+), 6 deletions(-)

diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c
--- .prev/fs/ext3/inode.c	2006-07-20 14:41:07.000000000 +1000
+++ ./fs/ext3/inode.c	2006-07-20 14:42:04.000000000 +1000
@@ -2405,11 +2405,13 @@ static ext3_fsblk_t ext3_get_inode_block
 
 	if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO &&
 		ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) ||
-		ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) {
-		ext3_error(sb, "ext3_get_inode_block",
-			    "bad inode number: %lu", ino);
+		ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count))
+		/* This error already checked for in namei.c unless we
+		 * are looking at an NFS filehandle, in which case,
+		 * no error reported is needed
+		 */
 		return 0;
-	}
+
 	block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb);
 	if (block_group >= EXT3_SB(sb)->s_groups_count) {
 		ext3_error(sb,"ext3_get_inode_block","group >= groups count");

diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c
--- .prev/fs/ext3/namei.c	2006-07-20 14:39:51.000000000 +1000
+++ ./fs/ext3/namei.c	2006-07-20 14:44:23.000000000 +1000
@@ -1000,7 +1000,14 @@ static struct dentry *ext3_lookup(struct
 	if (bh) {
 		unsigned long ino = le32_to_cpu(de->inode);
 		brelse (bh);
-		inode = iget(dir->i_sb, ino);
+		if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO &&
+		     ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(dir->i_sb)) ||
+		    ino > le32_to_cpu(EXT3_SB(dir->i_sb)->s_es->s_inodes_count)) {
+			ext3_error(dir->i_sb, "ext3_lookup",
+				   "bad inode number: %lu", ino);
+			inode = NULL;
+		} else
+			inode = iget(dir->i_sb, ino);
 
 		if (!inode)
 			return ERR_PTR(-EACCES);
@@ -1028,7 +1035,15 @@ struct dentry *ext3_get_parent(struct de
 		return ERR_PTR(-ENOENT);
 	ino = le32_to_cpu(de->inode);
 	brelse(bh);
-	inode = iget(child->d_inode->i_sb, ino);
+
+	if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO &&
+	     ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(child->d_inode->i_sb)) ||
+	    ino > le32_to_cpu(EXT3_SB(child->d_inode->i_sb)->s_es->s_inodes_count)) {
+		ext3_error(child->d_inode->i_sb, "ext3_get_parent",
+			   "bad inode number: %lu", ino);
+		inode = NULL;
+	} else
+		inode = iget(child->d_inode->i_sb, ino);
 
 	if (!inode)
 		return ERR_PTR(-EACCES);

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-20  4:46               ` Neil Brown
@ 2006-07-20 16:06                 ` Jan Kara
  2006-07-20 20:11                   ` James
  2006-07-21  6:39                   ` Neil Brown
  2006-07-21  0:42                 ` Marcel Holtmann
  1 sibling, 2 replies; 32+ messages in thread
From: Jan Kara @ 2006-07-20 16:06 UTC (permalink / raw)
  To: Neil Brown; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct

> On Wednesday July 19, jack@suse.cz wrote:
> > > So what happens next? Is the ext3 maintainer on sabatical,
> > > or am I expected to submit a patch to fix this?
> >   I guess people are mostly busy with OLS and such so maybe they missed
> > the discussion.. Giving CC to relevant people to catch their attention
> > :)
> >   Andrew, Stephen: James has come across a nasty bug (potentially remote
> > DoS). NFS extracts inode number from a filehandle and the inode number
> > eventually ends in ext3_read_inode(). Now if the inode number is bogus,
> > ext3_get_inode_block() calls ext3_error() and filesystem is remounted
> > RO or whatever else is configured. That is quite undesirable in this
> > case.
> >   Now the easy "fix" is to change ext3_error() to ext3_warning() but an
> > attacker flooding your logs with warnings is probably not good either
> > and in case the inode number comes from ext3 itself we should really do
> > ext3_error() as there is some corruption in the fs.
> >   Better fix would be to add a flag to read_inode() saying that the inode
> > number is from untrusted source (but that means changing a prototype of a
> > function every fs uses) and change export_iget() to pass this flag. Yet
> > another solution would be to make ext3 implement its own get_dentry() export
> > function and pass the flag internally...
> >   What do you think is the best solution?
> 
> I think that a good solution (hard to say if it is the best) is to
> remove that error message altogether, and put it where inode numbers
> are read out of directories.  Something like the following patch -
> compile tested only.
  Yes, that looks fine too. I did not realize that we get the inode
number only in a few places. Maybe we could wrap the checks in a
function (possibly inline) so that the checks are just in one place?

								Honza
-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-20 16:06                 ` Jan Kara
@ 2006-07-20 20:11                   ` James
  2006-07-21  6:44                     ` Neil Brown
  2006-07-21  6:39                   ` Neil Brown
  1 sibling, 1 reply; 32+ messages in thread
From: James @ 2006-07-20 20:11 UTC (permalink / raw)
  To: Jan Kara; +Cc: Neil Brown, James, Marcel Holtmann, linux-kernel, akpm, sct

Excellent - thank you.

Neil - out of interest did you ever get my direct mail
to you details below:

 Date: Tue, 6 Jun 2006 18:15:17 +0100
 Subject: Bogus FH in NFS request crashes file system code
 Message-ID: <20060606171517.GA2601@selene.humbnet.james.local>

J.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-20  4:46               ` Neil Brown
  2006-07-20 16:06                 ` Jan Kara
@ 2006-07-21  0:42                 ` Marcel Holtmann
  2006-07-21 12:29                   ` Andrew Morton
  1 sibling, 1 reply; 32+ messages in thread
From: Marcel Holtmann @ 2006-07-21  0:42 UTC (permalink / raw)
  To: Neil Brown; +Cc: Jan Kara, James, linux-kernel, akpm, sct

Hi Neil,

> > > So what happens next? Is the ext3 maintainer on sabatical,
> > > or am I expected to submit a patch to fix this?
> >   I guess people are mostly busy with OLS and such so maybe they missed
> > the discussion.. Giving CC to relevant people to catch their attention
> > :)
> >   Andrew, Stephen: James has come across a nasty bug (potentially remote
> > DoS). NFS extracts inode number from a filehandle and the inode number
> > eventually ends in ext3_read_inode(). Now if the inode number is bogus,
> > ext3_get_inode_block() calls ext3_error() and filesystem is remounted
> > RO or whatever else is configured. That is quite undesirable in this
> > case.
> >   Now the easy "fix" is to change ext3_error() to ext3_warning() but an
> > attacker flooding your logs with warnings is probably not good either
> > and in case the inode number comes from ext3 itself we should really do
> > ext3_error() as there is some corruption in the fs.
> >   Better fix would be to add a flag to read_inode() saying that the inode
> > number is from untrusted source (but that means changing a prototype of a
> > function every fs uses) and change export_iget() to pass this flag. Yet
> > another solution would be to make ext3 implement its own get_dentry() export
> > function and pass the flag internally...
> >   What do you think is the best solution?
> 
> I think that a good solution (hard to say if it is the best) is to
> remove that error message altogether, and put it where inode numbers
> are read out of directories.  Something like the following patch -
> compile tested only.

I tested your patch and it works for me. So can someone with ext3
knowledge review and then propose it for upstream inclusion.

Regards

Marcel



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-20 16:06                 ` Jan Kara
  2006-07-20 20:11                   ` James
@ 2006-07-21  6:39                   ` Neil Brown
  2006-07-21 14:24                     ` Jan Kara
  2006-07-22  0:06                     ` Andrew Morton
  1 sibling, 2 replies; 32+ messages in thread
From: Neil Brown @ 2006-07-21  6:39 UTC (permalink / raw)
  To: Jan Kara; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct

On Thursday July 20, jack@suse.cz wrote:
>   Yes, that looks fine too. I did not realize that we get the inode
> number only in a few places. Maybe we could wrap the checks in a
> function (possibly inline) so that the checks are just in one place?

Like this?

NeilBrown


Avoid triggering ext3_error on bad NFS file handle

The inode number out of an NFS file handle gets passed 
eventually to ext3_get_inode_block without any checking.
If ext3_get_inode_block allows it to trigger a error,
then bad filehandles can have unpleasant effect.

So remove the call to ext3_error there and put a matching
check in ext3/namei.c where inode numbers are read of storage.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/ext3/inode.c         |   13 ++++++-------
 ./fs/ext3/namei.c         |   15 +++++++++++++--
 ./include/linux/ext3_fs.h |    9 +++++++++
 3 files changed, 28 insertions(+), 9 deletions(-)

diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c
--- .prev/fs/ext3/inode.c	2006-07-20 14:41:07.000000000 +1000
+++ ./fs/ext3/inode.c	2006-07-21 16:36:32.000000000 +1000
@@ -2402,14 +2402,13 @@ static ext3_fsblk_t ext3_get_inode_block
 	struct buffer_head *bh;
 	struct ext3_group_desc * gdp;
 
-
-	if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO &&
-		ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) ||
-		ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) {
-		ext3_error(sb, "ext3_get_inode_block",
-			    "bad inode number: %lu", ino);
+	if (!ext3_valid_inum(sb, ino))
+		/* This error already checked for in namei.c unless we
+		 * are looking at an NFS filehandle, in which case,
+		 * no error reported is needed
+		 */
 		return 0;
-	}
+
 	block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb);
 	if (block_group >= EXT3_SB(sb)->s_groups_count) {
 		ext3_error(sb,"ext3_get_inode_block","group >= groups count");

diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c
--- .prev/fs/ext3/namei.c	2006-07-20 14:39:51.000000000 +1000
+++ ./fs/ext3/namei.c	2006-07-21 16:36:09.000000000 +1000
@@ -1000,7 +1000,12 @@ static struct dentry *ext3_lookup(struct
 	if (bh) {
 		unsigned long ino = le32_to_cpu(de->inode);
 		brelse (bh);
-		inode = iget(dir->i_sb, ino);
+		if (!ext3_valid_inum(dir->i_sb, ino)) {
+			ext3_error(dir->i_sb, "ext3_lookup",
+				   "bad inode number: %lu", ino);
+			inode = NULL;
+		} else
+			inode = iget(dir->i_sb, ino);
 
 		if (!inode)
 			return ERR_PTR(-EACCES);
@@ -1028,7 +1033,13 @@ struct dentry *ext3_get_parent(struct de
 		return ERR_PTR(-ENOENT);
 	ino = le32_to_cpu(de->inode);
 	brelse(bh);
-	inode = iget(child->d_inode->i_sb, ino);
+
+	if (!ext3_valid_inum(child->d_inode->i_sb, ino)) {
+		ext3_error(child->d_inode->i_sb, "ext3_get_parent",
+			   "bad inode number: %lu", ino);
+		inode = NULL;
+	} else
+		inode = iget(child->d_inode->i_sb, ino);
 
 	if (!inode)
 		return ERR_PTR(-EACCES);

diff .prev/include/linux/ext3_fs.h ./include/linux/ext3_fs.h
--- .prev/include/linux/ext3_fs.h	2006-07-21 16:34:01.000000000 +1000
+++ ./include/linux/ext3_fs.h	2006-07-21 16:35:55.000000000 +1000
@@ -492,6 +492,15 @@ static inline struct ext3_inode_info *EX
 {
 	return container_of(inode, struct ext3_inode_info, vfs_inode);
 }
+
+static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
+{
+	return ino == EXT3_ROOT_INO ||
+		ino == EXT3_JOURNAL_INO ||
+		ino == EXT3_RESIZE_INO ||
+		(ino > EXT3_FIRST_INO(sb) &&
+		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
+}
 #else
 /* Assume that user mode programs are passing in an ext3fs superblock, not
  * a kernel struct super_block.  This will allow us to call the feature-test

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-20 20:11                   ` James
@ 2006-07-21  6:44                     ` Neil Brown
  0 siblings, 0 replies; 32+ messages in thread
From: Neil Brown @ 2006-07-21  6:44 UTC (permalink / raw)
  To: James; +Cc: Jan Kara, Marcel Holtmann, linux-kernel, akpm, sct

On Thursday July 20, 20@madingley.org wrote:
> Excellent - thank you.
> 
> Neil - out of interest did you ever get my direct mail
> to you details below:
> 
>  Date: Tue, 6 Jun 2006 18:15:17 +0100
>  Subject: Bogus FH in NFS request crashes file system code
>  Message-ID: <20060606171517.GA2601@selene.humbnet.james.local>
> 
> J.

Hmmm.... cannot find it in my mailbox, and it doesn't ring any bell.
The first I remember of this was early July shortly before going on
vacation....
That doesn't mean it didn't arrive though :-(  My apologies if I did
miss it.

NeilBrown

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-21  0:42                 ` Marcel Holtmann
@ 2006-07-21 12:29                   ` Andrew Morton
  0 siblings, 0 replies; 32+ messages in thread
From: Andrew Morton @ 2006-07-21 12:29 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: neilb, jack, 20, linux-kernel, sct

On Fri, 21 Jul 2006 02:42:13 +0200
Marcel Holtmann <marcel@holtmann.org> wrote:

> I tested your patch and it works for me. So can someone with ext3
> knowledge review and then propose it for upstream inclusion.

Yup, I have it save away for next week's resumption of work.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-21  6:39                   ` Neil Brown
@ 2006-07-21 14:24                     ` Jan Kara
  2006-07-22  0:06                     ` Andrew Morton
  1 sibling, 0 replies; 32+ messages in thread
From: Jan Kara @ 2006-07-21 14:24 UTC (permalink / raw)
  To: Neil Brown; +Cc: James, Marcel Holtmann, linux-kernel, akpm, sct

> On Thursday July 20, jack@suse.cz wrote:
> >   Yes, that looks fine too. I did not realize that we get the inode
> > number only in a few places. Maybe we could wrap the checks in a
> > function (possibly inline) so that the checks are just in one place?
> 
> Like this?
  Yes.

> NeilBrown
> 
> 
> Avoid triggering ext3_error on bad NFS file handle
> 
> The inode number out of an NFS file handle gets passed 
> eventually to ext3_get_inode_block without any checking.
> If ext3_get_inode_block allows it to trigger a error,
> then bad filehandles can have unpleasant effect.
> 
> So remove the call to ext3_error there and put a matching
> check in ext3/namei.c where inode numbers are read of storage.
> 
> Signed-off-by: Neil Brown <neilb@suse.de>
  Signed-off-by: Jan Kara <jack@suse.cz>

							Honza

> ### Diffstat output
>  ./fs/ext3/inode.c         |   13 ++++++-------
>  ./fs/ext3/namei.c         |   15 +++++++++++++--
>  ./include/linux/ext3_fs.h |    9 +++++++++
>  3 files changed, 28 insertions(+), 9 deletions(-)
> 
> diff .prev/fs/ext3/inode.c ./fs/ext3/inode.c
> --- .prev/fs/ext3/inode.c	2006-07-20 14:41:07.000000000 +1000
> +++ ./fs/ext3/inode.c	2006-07-21 16:36:32.000000000 +1000
> @@ -2402,14 +2402,13 @@ static ext3_fsblk_t ext3_get_inode_block
>  	struct buffer_head *bh;
>  	struct ext3_group_desc * gdp;
>  
> -
> -	if ((ino != EXT3_ROOT_INO && ino != EXT3_JOURNAL_INO &&
> -		ino != EXT3_RESIZE_INO && ino < EXT3_FIRST_INO(sb)) ||
> -		ino > le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count)) {
> -		ext3_error(sb, "ext3_get_inode_block",
> -			    "bad inode number: %lu", ino);
> +	if (!ext3_valid_inum(sb, ino))
> +		/* This error already checked for in namei.c unless we
> +		 * are looking at an NFS filehandle, in which case,
> +		 * no error reported is needed
> +		 */
>  		return 0;
> -	}
> +
>  	block_group = (ino - 1) / EXT3_INODES_PER_GROUP(sb);
>  	if (block_group >= EXT3_SB(sb)->s_groups_count) {
>  		ext3_error(sb,"ext3_get_inode_block","group >= groups count");
> 
> diff .prev/fs/ext3/namei.c ./fs/ext3/namei.c
> --- .prev/fs/ext3/namei.c	2006-07-20 14:39:51.000000000 +1000
> +++ ./fs/ext3/namei.c	2006-07-21 16:36:09.000000000 +1000
> @@ -1000,7 +1000,12 @@ static struct dentry *ext3_lookup(struct
>  	if (bh) {
>  		unsigned long ino = le32_to_cpu(de->inode);
>  		brelse (bh);
> -		inode = iget(dir->i_sb, ino);
> +		if (!ext3_valid_inum(dir->i_sb, ino)) {
> +			ext3_error(dir->i_sb, "ext3_lookup",
> +				   "bad inode number: %lu", ino);
> +			inode = NULL;
> +		} else
> +			inode = iget(dir->i_sb, ino);
>  
>  		if (!inode)
>  			return ERR_PTR(-EACCES);
> @@ -1028,7 +1033,13 @@ struct dentry *ext3_get_parent(struct de
>  		return ERR_PTR(-ENOENT);
>  	ino = le32_to_cpu(de->inode);
>  	brelse(bh);
> -	inode = iget(child->d_inode->i_sb, ino);
> +
> +	if (!ext3_valid_inum(child->d_inode->i_sb, ino)) {
> +		ext3_error(child->d_inode->i_sb, "ext3_get_parent",
> +			   "bad inode number: %lu", ino);
> +		inode = NULL;
> +	} else
> +		inode = iget(child->d_inode->i_sb, ino);
>  
>  	if (!inode)
>  		return ERR_PTR(-EACCES);
> 
> diff .prev/include/linux/ext3_fs.h ./include/linux/ext3_fs.h
> --- .prev/include/linux/ext3_fs.h	2006-07-21 16:34:01.000000000 +1000
> +++ ./include/linux/ext3_fs.h	2006-07-21 16:35:55.000000000 +1000
> @@ -492,6 +492,15 @@ static inline struct ext3_inode_info *EX
>  {
>  	return container_of(inode, struct ext3_inode_info, vfs_inode);
>  }
> +
> +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
> +{
> +	return ino == EXT3_ROOT_INO ||
> +		ino == EXT3_JOURNAL_INO ||
> +		ino == EXT3_RESIZE_INO ||
> +		(ino > EXT3_FIRST_INO(sb) &&
> +		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
> +}
>  #else
>  /* Assume that user mode programs are passing in an ext3fs superblock, not
>   * a kernel struct super_block.  This will allow us to call the feature-test
-- 
Jan Kara <jack@suse.cz>
SuSE CR Labs

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-21  6:39                   ` Neil Brown
  2006-07-21 14:24                     ` Jan Kara
@ 2006-07-22  0:06                     ` Andrew Morton
  2006-07-22 13:17                       ` Theodore Tso
  1 sibling, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2006-07-22  0:06 UTC (permalink / raw)
  To: Neil Brown
  Cc: jack, 20, marcel, linux-kernel, sct, Theodore Ts'o, Andreas Dilger

On Fri, 21 Jul 2006 16:39:32 +1000
Neil Brown <neilb@suse.de> wrote:

> Avoid triggering ext3_error on bad NFS file handle
> 
> The inode number out of an NFS file handle gets passed 
> eventually to ext3_get_inode_block without any checking.
> If ext3_get_inode_block allows it to trigger a error,
> then bad filehandles can have unpleasant effect.
> 
> So remove the call to ext3_error there and put a matching
> check in ext3/namei.c where inode numbers are read of storage.
> 

There are strange things happening in here.

> +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
> +{
> +	return ino == EXT3_ROOT_INO ||
> +		ino == EXT3_JOURNAL_INO ||
> +		ino == EXT3_RESIZE_INO ||
> +		(ino > EXT3_FIRST_INO(sb) &&
> +		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
> +}

One would expect the inode validity test to be

	(ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count))

but even this assumes that s_inodes_count is misnamed and really should be
called s_last_inode_plus_one.  If it is not misnamed then the validity test
should be

	(ino >= EXT3_FIRST_INO(sb)) &&
		(ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count))

Look through the filesystem for other uses of EXT3_FIRST_INO().  It's all
rather fishily inconsistent.

Ted, Andreas: do you think we could come up with statements describing
exactly what the values in EXT3_FIRST_INO() and in ->s_inodes_count
represent?  Thanks.


Also Neil, I wonder whether this patch of yours still permits NFS clients
to access the journal and resize inodes in undesirable ways?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-22  0:06                     ` Andrew Morton
@ 2006-07-22 13:17                       ` Theodore Tso
  2006-07-25  1:56                         ` Andrew Morton
  0 siblings, 1 reply; 32+ messages in thread
From: Theodore Tso @ 2006-07-22 13:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Neil Brown, jack, 20, marcel, linux-kernel, sct, Andreas Dilger

Sorry, OLS and some work-related emergencies have been hitting hard
this week, so I've been deferred doing a full review of Jan's patch.
Hopefully after OLS I'll be able to comment more fully.

On Fri, Jul 21, 2006 at 05:06:27PM -0700, Andrew Morton wrote:
> There are strange things happening in here.
> 
> > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
> > +{
> > +	return ino == EXT3_ROOT_INO ||
> > +		ino == EXT3_JOURNAL_INO ||
> > +		ino == EXT3_RESIZE_INO ||
> > +		(ino > EXT3_FIRST_INO(sb) &&
> > +		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
> > +}
> 
> One would expect the inode validity test to be
> 
> 	(ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count))
>
> but even this assumes that s_inodes_count is misnamed and really should be
> called s_last_inode_plus_one.  If it is not misnamed then the validity test
> should be
> 
> 	(ino >= EXT3_FIRST_INO(sb)) &&
> 		(ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count))
> 
> Look through the filesystem for other uses of EXT3_FIRST_INO().  It's all
> rather fishily inconsistent.

I don't think there's anything in consistent.  Basically, inodes are 1
based (inode number 0 in a directory entry means that the file is
deleted and the directory entry is freed).  Hence valid inode numbers
are between 1 and s_inodes_count, inclusive.

Inodes between 1 and EXT3_FIRST_INO-1 (inclusive) are reserved, are
should always be marked as in use in the inode allocation bitmap.
EXT3_FIRST_INO (which is 11 on all ext3 filesystems to date) is
generally the lost+found directory, but that's just because it is the
first file/directory which is allocated by mke2fs.  So EXT3_FIRST_INO
representes the first inode which can be allocated by userspace.  (The
root inode doesn't fall in this category because it can never be
deleted or reallocated after the filesystem is created, and as a nod
to Unix filesystem history, it has inode #2).

The net of all of this is the inode validity test should be:

 	(ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count))

However, I would suggest that we *not* allow remote NFS users to get
access to the journal inode or the resize inode, please?  That's only
going to cause mischief of the DoS attack kind.....  

						- Ted


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-22 13:17                       ` Theodore Tso
@ 2006-07-25  1:56                         ` Andrew Morton
  2006-07-25  2:21                           ` Neil Brown
  2006-07-25  2:36                           ` Neil Brown
  0 siblings, 2 replies; 32+ messages in thread
From: Andrew Morton @ 2006-07-25  1:56 UTC (permalink / raw)
  To: Theodore Tso; +Cc: neilb, jack, 20, marcel, linux-kernel, sct, adilger

On Sat, 22 Jul 2006 09:17:59 -0400
Theodore Tso <tytso@mit.edu> wrote:

> Sorry, OLS and some work-related emergencies have been hitting hard
> this week, so I've been deferred doing a full review of Jan's patch.
> Hopefully after OLS I'll be able to comment more fully.
> 
> On Fri, Jul 21, 2006 at 05:06:27PM -0700, Andrew Morton wrote:
> > There are strange things happening in here.
> > 
> > > +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
> > > +{
> > > +	return ino == EXT3_ROOT_INO ||
> > > +		ino == EXT3_JOURNAL_INO ||
> > > +		ino == EXT3_RESIZE_INO ||
> > > +		(ino > EXT3_FIRST_INO(sb) &&
> > > +		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
> > > +}
> > 
> > One would expect the inode validity test to be
> > 
> > 	(ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count))
> >
> > but even this assumes that s_inodes_count is misnamed and really should be
> > called s_last_inode_plus_one.  If it is not misnamed then the validity test
> > should be
> > 
> > 	(ino >= EXT3_FIRST_INO(sb)) &&
> > 		(ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count))
> > 
> > Look through the filesystem for other uses of EXT3_FIRST_INO().  It's all
> > rather fishily inconsistent.
> 
> I don't think there's anything in consistent.  Basically, inodes are 1
> based (inode number 0 in a directory entry means that the file is
> deleted and the directory entry is freed).  Hence valid inode numbers
> are between 1 and s_inodes_count, inclusive.
> 
> Inodes between 1 and EXT3_FIRST_INO-1 (inclusive) are reserved, are
> should always be marked as in use in the inode allocation bitmap.
> EXT3_FIRST_INO (which is 11 on all ext3 filesystems to date) is
> generally the lost+found directory, but that's just because it is the
> first file/directory which is allocated by mke2fs.  So EXT3_FIRST_INO
> representes the first inode which can be allocated by userspace.  (The
> root inode doesn't fall in this category because it can never be
> deleted or reallocated after the filesystem is created, and as a nod
> to Unix filesystem history, it has inode #2).
> 
> The net of all of this is the inode validity test should be:
> 
>  	(ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count))

I agree; I made that change.

> However, I would suggest that we *not* allow remote NFS users to get
> access to the journal inode or the resize inode, please?  That's only
> going to cause mischief of the DoS attack kind.....  

<looks at Neil>

<then looks at ext2>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-25  1:56                         ` Andrew Morton
@ 2006-07-25  2:21                           ` Neil Brown
  2006-07-26 17:12                             ` Eric Sandeen
  2006-07-25  2:36                           ` Neil Brown
  1 sibling, 1 reply; 32+ messages in thread
From: Neil Brown @ 2006-07-25  2:21 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger

On Monday July 24, akpm@osdl.org wrote:
> On Sat, 22 Jul 2006 09:17:59 -0400
> Theodore Tso <tytso@mit.edu> wrote:
> > The net of all of this is the inode validity test should be:
> > 
> >  	(ino >= EXT3_FIRST_INO(sb)) && (ino <= ...->s_inodes_count))
> 
> I agree; I made that change.
> 

Yeh, my bad.  I double checked the second comparison but not the first
:-(

> > However, I would suggest that we *not* allow remote NFS users to get
> > access to the journal inode or the resize inode, please?  That's only
> > going to cause mischief of the DoS attack kind.....  
> 
> <looks at Neil>

See, I had a funny feeling that someone was watching me ....

I doubt there is much room for a real problem here.  
To get to the point of IO on any of these files, you would need root
access to the whole filesystem anyway.

The follow patch should do what is suggested though.

> 
> <then looks at ext2>

Hmmm. are two looks allowed in the same email?

NeilBrown

----------------------------
Make ext3 reject filehandles referring to special files.

Inodes earlier than the 'first' inode (e.g. journal,
resize) should be rejected early - except the root inode.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/exportfs/expfs.c |    4 +++-
 ./fs/ext3/super.c     |   15 +++++++++++++++
 ./include/linux/fs.h  |    2 ++
 3 files changed, 20 insertions(+), 1 deletion(-)

diff .prev/fs/exportfs/expfs.c ./fs/exportfs/expfs.c
--- .prev/fs/exportfs/expfs.c	2006-07-25 12:19:21.000000000 +1000
+++ ./fs/exportfs/expfs.c	2006-07-25 12:16:12.000000000 +1000
@@ -392,7 +392,8 @@ out:
 }
 
 
-static struct dentry *export_iget(struct super_block *sb, unsigned long ino, __u32 generation)
+struct dentry *export_iget(struct super_block *sb, unsigned long ino,
+			   __u32 generation)
 {
 
 	/* iget isn't really right if the inode is currently unallocated!!
@@ -434,6 +435,7 @@ static struct dentry *export_iget(struct
 	}
 	return result;
 }
+EXPORT_SYMBOL_GPL(export_iget);
 
 
 static struct dentry *get_object(struct super_block *sb, void *vobjp)

diff .prev/fs/ext3/super.c ./fs/ext3/super.c
--- .prev/fs/ext3/super.c	2006-07-25 12:19:21.000000000 +1000
+++ ./fs/ext3/super.c	2006-07-25 12:19:43.000000000 +1000
@@ -554,6 +554,20 @@ static int ext3_show_options(struct seq_
 	return 0;
 }
 
+
+static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp)
+{
+	__u32 *objp = vobjp;
+	unsigned long ino = objp[0];
+	__u32 generation = objp[1];
+
+	if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb))
+		return ERR_PTR(-ESTALE);
+
+	return export_iget(sb, ino, generation);
+}
+
+
 #ifdef CONFIG_QUOTA
 #define QTYPE2NAME(t) ((t)==USRQUOTA?"user":"group")
 #define QTYPE2MOPT(on, t) ((t)==USRQUOTA?((on)##USRJQUOTA):((on)##GRPJQUOTA))
@@ -622,6 +636,7 @@ static struct super_operations ext3_sops
 
 static struct export_operations ext3_export_ops = {
 	.get_parent = ext3_get_parent,
+	.get_dentry = ext3_get_dentry,
 };
 
 enum {

diff .prev/include/linux/fs.h ./include/linux/fs.h
--- .prev/include/linux/fs.h	2006-07-25 12:19:21.000000000 +1000
+++ ./include/linux/fs.h	2006-07-25 12:15:32.000000000 +1000
@@ -1381,6 +1381,8 @@ extern struct dentry *
 find_exported_dentry(struct super_block *sb, void *obj, void *parent,
 		     int (*acceptable)(void *context, struct dentry *de),
 		     void *context);
+struct dentry *export_iget(struct super_block *sb, unsigned long ino,
+			   __u32 generation);
 
 struct file_system_type {
 	const char *name;

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-25  1:56                         ` Andrew Morton
  2006-07-25  2:21                           ` Neil Brown
@ 2006-07-25  2:36                           ` Neil Brown
  2006-07-25 18:27                             ` Eric Sandeen
  1 sibling, 1 reply; 32+ messages in thread
From: Neil Brown @ 2006-07-25  2:36 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger

On Monday July 24, akpm@osdl.org wrote:
> 
> <then looks at ext2>

Yeh.... this patch is somewhat different to the others, yet similar...

I didn't bother changing ext2_get_inode at all, but just defined a
get_dentry export_op so that bad inode numbers never get to
ext2_get_inode via nfsd.

So in this case I had to test for the upper bound as well as the lower
bound.

Putting it another way,
 ext3_get_dentry reject certain inums that are known to be a problem.
 ext2_get_dentry allows only those inums that could possibly be ok.

So if you (anyone) prefer one approach over the other, making the
change so they both fs take the same approach would be trivial.

Compile tested only

NeilBrown


------------------------------------
Have ext2 reject file handles with bad inode numbers early.

This prevents bad inode numbers from triggering errors in
ext2_get_inode.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./fs/ext2/super.c |   15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff .prev/fs/ext2/super.c ./fs/ext2/super.c
--- .prev/fs/ext2/super.c	2006-07-25 12:29:10.000000000 +1000
+++ ./fs/ext2/super.c	2006-07-25 12:31:04.000000000 +1000
@@ -251,6 +251,20 @@ static struct super_operations ext2_sops
 #endif
 };
 
+static struct dentry *ext2_get_dentry(struct super_block *sb, void *vobjp)
+{
+	__u32 *objp = vobjp;
+	unsigned long ino = objp[0];
+	__u32 generation = objp[1];
+
+	if (ino != EXT2_ROOT_INO && ino < EXT2_FIRST_INO(sb))
+		return ERR_PTR(-ESTALE);
+	if (ino > le32_to_cpu(EXT2_SB(sb)->s_es->s_inodes_count))
+		return ERR_PTR(-ESTALE);
+
+	return export_iget(sb, ino, generation);
+}
+
 /* Yes, most of these are left as NULL!!
  * A NULL value implies the default, which works with ext2-like file
  * systems, but can be improved upon.
@@ -258,6 +272,7 @@ static struct super_operations ext2_sops
  */
 static struct export_operations ext2_export_ops = {
 	.get_parent = ext2_get_parent,
+	.get_dentry = ext2_get_dentry,
 };
 
 static unsigned long get_sb_block(void **data)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-25  2:36                           ` Neil Brown
@ 2006-07-25 18:27                             ` Eric Sandeen
  0 siblings, 0 replies; 32+ messages in thread
From: Eric Sandeen @ 2006-07-25 18:27 UTC (permalink / raw)
  To: Neil Brown
  Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct,
	adilger

Neil Brown wrote:

> Putting it another way,
>  ext3_get_dentry reject certain inums that are known to be a problem.
>  ext2_get_dentry allows only those inums that could possibly be ok.
> 
> So if you (anyone) prefer one approach over the other, making the
> change so they both fs take the same approach would be trivial.

I like the 2nd approach - seems simpler, takes care of everything in 
->get_dentry, right?.  But I think your original patch is all that will 
work for 2.4 kernels...

-Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-25  2:21                           ` Neil Brown
@ 2006-07-26 17:12                             ` Eric Sandeen
  2006-07-26 23:53                               ` Neil Brown
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Sandeen @ 2006-07-26 17:12 UTC (permalink / raw)
  To: Neil Brown
  Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct,
	adilger

> +EXPORT_SYMBOL_GPL(export_iget);
...
> +static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp)
> +{
> +	__u32 *objp = vobjp;
> +	unsigned long ino = objp[0];
> +	__u32 generation = objp[1];
> +
> +	if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb))
> +		return ERR_PTR(-ESTALE);
> +
> +	return export_iget(sb, ino, generation);
> +}

Hm, with this, ext3.ko has a new dependency on exportfs.ko.  Is that 
desirable/acceptable?
-Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-26 17:12                             ` Eric Sandeen
@ 2006-07-26 23:53                               ` Neil Brown
  2006-07-27 18:32                                 ` Eric Sandeen
  0 siblings, 1 reply; 32+ messages in thread
From: Neil Brown @ 2006-07-26 23:53 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct,
	adilger

On Wednesday July 26, sandeen@sandeen.net wrote:
> > +EXPORT_SYMBOL_GPL(export_iget);
> ...
> > +static struct dentry *ext3_get_dentry(struct super_block *sb, void *vobjp)
> > +{
> > +	__u32 *objp = vobjp;
> > +	unsigned long ino = objp[0];
> > +	__u32 generation = objp[1];
> > +
> > +	if (ino != EXT3_ROOT_INO && ino < EXT3_FIRST_INO(sb))
> > +		return ERR_PTR(-ESTALE);
> > +
> > +	return export_iget(sb, ino, generation);
> > +}
> 
> Hm, with this, ext3.ko has a new dependency on exportfs.ko.  Is that 
> desirable/acceptable?

Drat, you're right.
No, I don't think that is what we want.
I'll do it differently in a day or so.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-26 23:53                               ` Neil Brown
@ 2006-07-27 18:32                                 ` Eric Sandeen
  2006-07-27 19:12                                   ` Christoph Hellwig
  0 siblings, 1 reply; 32+ messages in thread
From: Eric Sandeen @ 2006-07-27 18:32 UTC (permalink / raw)
  To: Neil Brown
  Cc: Andrew Morton, Theodore Tso, jack, 20, marcel, linux-kernel, sct,
	adilger

Neil Brown wrote:
> On Wednesday July 26, sandeen@sandeen.net wrote:

>> Hm, with this, ext3.ko has a new dependency on exportfs.ko.  Is that 
>> desirable/acceptable?
> 
> Drat, you're right.
> No, I don't think that is what we want.
> I'll do it differently in a day or so.

Would moving export_iget into fs/inode.c & exporting it from there be a 
reasonable way to go?  At least ext2 & ext3 both have this need (prevent 
nfs access to special inodes) so putting the bulk of what they need for 
get_dentry (i.e. export_iget) somewhere common seems like a decent 
option to me.

-Eric

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-27 18:32                                 ` Eric Sandeen
@ 2006-07-27 19:12                                   ` Christoph Hellwig
  2006-07-28  0:34                                     ` Neil Brown
  2006-07-28 13:27                                     ` Peter Staubach
  0 siblings, 2 replies; 32+ messages in thread
From: Christoph Hellwig @ 2006-07-27 19:12 UTC (permalink / raw)
  To: Eric Sandeen
  Cc: Neil Brown, Andrew Morton, Theodore Tso, jack, 20, marcel,
	linux-kernel, sct, adilger

On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote:
> Neil Brown wrote:
> >On Wednesday July 26, sandeen@sandeen.net wrote:
> 
> >>Hm, with this, ext3.ko has a new dependency on exportfs.ko.  Is that 
> >>desirable/acceptable?
> >
> >Drat, you're right.
> >No, I don't think that is what we want.
> >I'll do it differently in a day or so.
> 
> Would moving export_iget into fs/inode.c & exporting it from there be a 
> reasonable way to go?  At least ext2 & ext3 both have this need (prevent 
> nfs access to special inodes) so putting the bulk of what they need for 
> get_dentry (i.e. export_iget) somewhere common seems like a decent 
> option to me.

Nope.  The right fix is to not make ext2/3 rely on export_iget at all. Please
implement the proper export_operations instead, similar to e.g. xfs. 

export_iget is a horrible layering violation that needs to go away long-term,
not move into core code.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-27 19:12                                   ` Christoph Hellwig
@ 2006-07-28  0:34                                     ` Neil Brown
  2006-07-28 13:27                                     ` Peter Staubach
  1 sibling, 0 replies; 32+ messages in thread
From: Neil Brown @ 2006-07-28  0:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Eric Sandeen, Andrew Morton, Theodore Tso, jack, 20, marcel,
	linux-kernel, sct, adilger

On Thursday July 27, hch@infradead.org wrote:
> On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote:
> > Neil Brown wrote:
> > >I'll do it differently in a day or so.
> > 
> > Would moving export_iget into fs/inode.c & exporting it from there be a 
> > reasonable way to go?  At least ext2 & ext3 both have this need (prevent 
> > nfs access to special inodes) so putting the bulk of what they need for 
> > get_dentry (i.e. export_iget) somewhere common seems like a decent 
> > option to me.
> 
> Nope.  The right fix is to not make ext2/3 rely on export_iget at all. Please
> implement the proper export_operations instead, similar to e.g. xfs. 
> 
> export_iget is a horrible layering violation that needs to go away long-term,
> not move into core code.

Agreed.
I've just submitted revised patches.  They effectively open-code
export_iget in a local implemention of the get_dentry
export_operation.

This should give the problem with no unpleasant exports or layering
issues. 

NeilBrown

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-27 19:12                                   ` Christoph Hellwig
  2006-07-28  0:34                                     ` Neil Brown
@ 2006-07-28 13:27                                     ` Peter Staubach
  2006-07-28 13:30                                       ` Christoph Hellwig
  1 sibling, 1 reply; 32+ messages in thread
From: Peter Staubach @ 2006-07-28 13:27 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Eric Sandeen, Neil Brown, Andrew Morton, Theodore Tso, jack, 20,
	marcel, linux-kernel, sct, adilger

Christoph Hellwig wrote:

>On Thu, Jul 27, 2006 at 01:32:43PM -0500, Eric Sandeen wrote:
>  
>
>>Neil Brown wrote:
>>    
>>
>>>On Wednesday July 26, sandeen@sandeen.net wrote:
>>>      
>>>
>>>>Hm, with this, ext3.ko has a new dependency on exportfs.ko.  Is that 
>>>>desirable/acceptable?
>>>>        
>>>>
>>>Drat, you're right.
>>>No, I don't think that is what we want.
>>>I'll do it differently in a day or so.
>>>      
>>>
>>Would moving export_iget into fs/inode.c & exporting it from there be a 
>>reasonable way to go?  At least ext2 & ext3 both have this need (prevent 
>>nfs access to special inodes) so putting the bulk of what they need for 
>>get_dentry (i.e. export_iget) somewhere common seems like a decent 
>>option to me.
>>    
>>
>
>Nope.  The right fix is to not make ext2/3 rely on export_iget at all. Please
>implement the proper export_operations instead, similar to e.g. xfs. 
>
>export_iget is a horrible layering violation that needs to go away long-term,
>not move into core code.
>

Since export_iget() doesn't actually involve any code which has anything to
do with the NFS server exports data structures, what exactly is the 
objection?
Is it truly better to duplicate code than to use a common routine which
can be documented?

    Thanx...

       ps

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
  2006-07-28 13:27                                     ` Peter Staubach
@ 2006-07-28 13:30                                       ` Christoph Hellwig
  0 siblings, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2006-07-28 13:30 UTC (permalink / raw)
  To: Peter Staubach
  Cc: Christoph Hellwig, Eric Sandeen, Neil Brown, Andrew Morton,
	Theodore Tso, jack, 20, marcel, linux-kernel, sct, adilger

On Fri, Jul 28, 2006 at 09:27:01AM -0400, Peter Staubach wrote:
> Since export_iget() doesn't actually involve any code which has anything to
> do with the NFS server exports data structures, what exactly is the 
> objection?
> Is it truly better to duplicate code than to use a common routine which
> can be documented?

export_iget calls iget() which assumes a lot about how a filesystem works.
Generally no one should call iget outside of filesystem code (export_iget
is the only such occurance) and should be replaced by opencoding iget_locked
& co on filesystems where it helps or a simple_iget that takes a callback
similar to the current read_inode method.  By moving export_iget to core
code you encourage people to use it, and that's the last thing we want.

Btw, you folks might want to ping Al Viro, he had patches to fix various
nfsd vs icache issues a while ago.

> 
>    Thanx...
> 
>       ps
---end quoted text---

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: Bad ext3/nfs DoS bug
@ 2006-07-22  3:38 linux
  0 siblings, 0 replies; 32+ messages in thread
From: linux @ 2006-07-22  3:38 UTC (permalink / raw)
  To: akpm, linux-kernel

>> +static inline int ext3_valid_inum(struct super_block *sb, unsigned long ino)
>> +{
>> +	return ino == EXT3_ROOT_INO ||
>> +		ino == EXT3_JOURNAL_INO ||
>> +		ino == EXT3_RESIZE_INO ||
>> +		(ino > EXT3_FIRST_INO(sb) &&
>> +		 ino <= le32_to_cpu(EXT3_SB(sb)->s_es->s_inodes_count));
>> +}
> 
> One would expect the inode validity test to be
> 
> 	(ino >= EXT3_FIRST_INO(sb)) && (ino < ...->s_inodes_count))
> 
> but even this assumes that s_inodes_count is misnamed and really should be
> called s_last_inode_plus_one.  If it is not misnamed then the validity test
> should be
> 
> 	(ino >= EXT3_FIRST_INO(sb)) &&
> 		(ino < EXT3_FIRST_INO(sb) + ...->s_inodes_count))
> 
> Look through the filesystem for other uses of EXT3_FIRST_INO().  It's all
> rather fishily inconsistent.

Er... I'm not an authoritative speaker, but it seems very simple to me.

Inodes are indexed starting from 1; the index 0 is reserved, and the first
inode on disk is number 1.

Thus, potentially valid inode numbers are 1 through s_inodes_count,
inclusive. Thus the <=.  If this were a standard 0-based index, it would
be <, but it's not.

Further, a range of low inode numbers are reserved for special purposes.
Only a few inode numbers in this range are valid.  However, these
numbers are still assigned space in the inode tables.

The only confusing term is EXT3_FIRST_INO, which is actually
more like EXT3_RESERVED_INODES.  The same 1-based indexing explains
the use of > rather than >= there.

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2006-07-28 13:30 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-17 13:01 Bad ext3/nfs DoS bug James
2006-07-17 13:06 ` Marcel Holtmann
2006-07-17 18:43 ` Marcel Holtmann
2006-07-18  7:55 ` Marcel Holtmann
2006-07-18 14:56   ` James
2006-07-18 15:22     ` Marcel Holtmann
2006-07-18 15:23       ` James
2006-07-18 20:18         ` Marcel Holtmann
2006-07-19  9:28           ` James
2006-07-19 15:55             ` Jan Kara
2006-07-20  4:46               ` Neil Brown
2006-07-20 16:06                 ` Jan Kara
2006-07-20 20:11                   ` James
2006-07-21  6:44                     ` Neil Brown
2006-07-21  6:39                   ` Neil Brown
2006-07-21 14:24                     ` Jan Kara
2006-07-22  0:06                     ` Andrew Morton
2006-07-22 13:17                       ` Theodore Tso
2006-07-25  1:56                         ` Andrew Morton
2006-07-25  2:21                           ` Neil Brown
2006-07-26 17:12                             ` Eric Sandeen
2006-07-26 23:53                               ` Neil Brown
2006-07-27 18:32                                 ` Eric Sandeen
2006-07-27 19:12                                   ` Christoph Hellwig
2006-07-28  0:34                                     ` Neil Brown
2006-07-28 13:27                                     ` Peter Staubach
2006-07-28 13:30                                       ` Christoph Hellwig
2006-07-25  2:36                           ` Neil Brown
2006-07-25 18:27                             ` Eric Sandeen
2006-07-21  0:42                 ` Marcel Holtmann
2006-07-21 12:29                   ` Andrew Morton
2006-07-22  3:38 linux

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).