linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Verify the items that we read from blocks
@ 2021-07-02 15:05 Shreyansh Chouhan
  2021-07-05 11:33 ` Jan Kara
  0 siblings, 1 reply; 4+ messages in thread
From: Shreyansh Chouhan @ 2021-07-02 15:05 UTC (permalink / raw)
  To: jack, rkovhaev; +Cc: reiserfs-devel, linux-kernel

Hi,

I was trying to work on this[1] bug. After a lot of reading the code and
running it under gdb, I found out that the error happens because
syzkaller creates a segment with raw binary data in the reproducer[2],
that has the wrong deh_location for the `..` directory item. (The value
is 0x5d (93), where as it should have been 0x20 (32).)

I think that the solution would involve checking the items that we read,
and verify that they are actually valid. But this check could actually
happen in two places:

- First idea would be to check as soon as we read a
  block, and one way of doing that would be adding a wrapper around
  ll_rw_block that validates the leaf node blocks that we read. The
  benifits to this would be that since we're solving the problem at it's
  root, very few functions would have to be changed. But I don't know
  how much of a performance hit would it be.

- Second idea would be to do these validation checks lazily. This should
  be faster than the first idea, but this would involve changing the
  code at more places than in the first idea.

For how the validation happens, the first idea that comes to mind is
reading the item headers from the block that we read and verifying if
the header is valid, and if the items themselves are valid according to
the header.

It's very likely that better approaches to this problem exist, that I
wasn't able to think of. I wanted to discuss about this before pursuing
the solution any further. Would such a change be accepted?

If there are better approaches, or if I am looking at this bug from an
incorrect perspective, please let me know.

Thank you,
Shreyansh Chouhan

--

[1] https://syzkaller.appspot.com/bug?id=d8c00bae1644df59696f2d74d1955fd286691234
[2] https://syzkaller.appspot.com/text?tag=ReproC&x=13f9f338d00000

(PS: In the reproducer, the segment partition with data at 0x20011100 in
the execute_once function has the faulty directory item.)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Verify the items that we read from blocks
  2021-07-02 15:05 Verify the items that we read from blocks Shreyansh Chouhan
@ 2021-07-05 11:33 ` Jan Kara
  2021-07-12  3:44   ` Shreyansh Chouhan
  0 siblings, 1 reply; 4+ messages in thread
From: Jan Kara @ 2021-07-05 11:33 UTC (permalink / raw)
  To: Shreyansh Chouhan; +Cc: jack, rkovhaev, reiserfs-devel, linux-kernel

Hello!

On Fri 02-07-21 20:35:41, Shreyansh Chouhan wrote:
> I was trying to work on this[1] bug. After a lot of reading the code and
> running it under gdb, I found out that the error happens because
> syzkaller creates a segment with raw binary data in the reproducer[2],
> that has the wrong deh_location for the `..` directory item. (The value
> is 0x5d (93), where as it should have been 0x20 (32).)

First, I'd like to note that reiserfs is a legacy filesystem which gets
little maintenance and I think distributions are close to disabling it in
their default kernels if they didn't do it already. So I'm not sure how
much is it worth it to do any larger fixes to it. But if you have a
personal passion for reiserfs feel free to go ahead and try to fix these
issues.

> I think that the solution would involve checking the items that we read,
> and verify that they are actually valid. But this check could actually
> happen in two places:
> 
> - First idea would be to check as soon as we read a
>   block, and one way of doing that would be adding a wrapper around
>   ll_rw_block that validates the leaf node blocks that we read. The
>   benifits to this would be that since we're solving the problem at it's
>   root, very few functions would have to be changed. But I don't know
>   how much of a performance hit would it be.

It depends on how heavy the checks are going to be but generally checking
when loading from the disk is the way how most filesystems handle this.

> - Second idea would be to do these validation checks lazily. This should
>   be faster than the first idea, but this would involve changing the
>   code at more places than in the first idea.
> 
> For how the validation happens, the first idea that comes to mind is
> reading the item headers from the block that we read and verifying if
> the header is valid, and if the items themselves are valid according to
> the header.

Looks sound.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Verify the items that we read from blocks
  2021-07-05 11:33 ` Jan Kara
@ 2021-07-12  3:44   ` Shreyansh Chouhan
  0 siblings, 0 replies; 4+ messages in thread
From: Shreyansh Chouhan @ 2021-07-12  3:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: rkovhaev, reiserfs-devel, linux-kernel

Hi,

I thought that my last email wasn't an appropriate response, since to me
it looked as if I hadn't read your suggestions before sending a
response. (Couldn't quote anything because I wasn't able to find the email on
mutt (messed up filters,) and had to write a quick email with the
in-reply-to option.) So I thought I'd resend the response after I've
fixed my inbox.

On Mon, Jul 05, 2021 at 01:33:29PM +0200, Jan Kara wrote:
> Hello!
> 
> On Fri 02-07-21 20:35:41, Shreyansh Chouhan wrote:
> > I was trying to work on this[1] bug. After a lot of reading the code and
> > running it under gdb, I found out that the error happens because
> > syzkaller creates a segment with raw binary data in the reproducer[2],
> > that has the wrong deh_location for the `..` directory item. (The value
> > is 0x5d (93), where as it should have been 0x20 (32).)
> 
> First, I'd like to note that reiserfs is a legacy filesystem which gets
> little maintenance and I think distributions are close to disabling it in
> their default kernels if they didn't do it already. So I'm not sure how
> much is it worth it to do any larger fixes to it. But if you have a
> personal passion for reiserfs feel free to go ahead and try to fix these
> issues.
> 

I had already spent a considerable amount of time on the debugging
portion, (to find an obvious mistake, now that I look back at it in
hindsight,) so I thought I'd just send in a patch.

> > I think that the solution would involve checking the items that we read,
> > and verify that they are actually valid. But this check could actually
> > happen in two places:
> > 
> > - First idea would be to check as soon as we read a
> >   block, and one way of doing that would be adding a wrapper around
> >   ll_rw_block that validates the leaf node blocks that we read. The
> >   benifits to this would be that since we're solving the problem at it's
> >   root, very few functions would have to be changed. But I don't know
> >   how much of a performance hit would it be.
> 
> It depends on how heavy the checks are going to be but generally checking
> when loading from the disk is the way how most filesystems handle this.
> 

The checks would be an O(n) traversal of directory headers, which
themselves check if the deh_location is greater than item length. The
item header checks were already present in the `is_leaf`(?) function.

> > - Second idea would be to do these validation checks lazily. This should
> >   be faster than the first idea, but this would involve changing the
> >   code at more places than in the first idea.
> > 
> > For how the validation happens, the first idea that comes to mind is
> > reading the item headers from the block that we read and verifying if
> > the header is valid, and if the items themselves are valid according to
> > the header.
> 
> Looks sound.
> 

I have added the implementation for the above idea to the `is_leaf`
function. Thanks a lot for your suggestions.

> 								Honza
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

Thanks,
Shreyansh Chouhan

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Verify the items that we read from blocks
@ 2021-07-09 15:34 Shreyansh Chouhan
  0 siblings, 0 replies; 4+ messages in thread
From: Shreyansh Chouhan @ 2021-07-09 15:34 UTC (permalink / raw)
  To: jack; +Cc: rkovhaev, reiserfs-devel, linux-kernel

Hi,

Thanks a lot for your input. After investing this much time on the bug, I
thought I'd just send a patch. Hope that won't be a problem.

Thank you,
Shreyansh Chouhan

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2021-07-12  3:44 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-02 15:05 Verify the items that we read from blocks Shreyansh Chouhan
2021-07-05 11:33 ` Jan Kara
2021-07-12  3:44   ` Shreyansh Chouhan
2021-07-09 15:34 Shreyansh Chouhan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).