linux-kernel.vger.kernel.org archive mirror
* New reliability technique
@ 2006-02-25 13:21 Victor Porton
  2006-02-25 13:26 ` Jesper Juhl
  0 siblings, 1 reply; 9+ messages in thread
From: Victor Porton @ 2006-02-25 13:21 UTC (permalink / raw)
  To: linux-kernel

A minute ago I invented a new reliability-enhancing technique.

In idle cycles (or periodically, at the expense of some performance) Linux can
calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
sums with previously calculated sums.
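
A minimal user-space sketch of the free-memory check described above, written
in Python rather than kernel C. It is only an illustration under stated
assumptions: the FreePageScanner class, its method names, and the 4096-byte
page size are hypothetical and not part of any kernel interface, and CRC32
(via zlib.crc32) stands in for whichever sum would actually be chosen.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for this simulation


class FreePageScanner:
    """Hypothetical helper: remembers a CRC32 for every page marked free."""

    def __init__(self, memory):
        self.memory = memory  # bytearray standing in for physical memory
        self.sums = {}        # page index -> CRC32 recorded when freed

    def _crc(self, page):
        start = page * PAGE_SIZE
        return zlib.crc32(self.memory[start:start + PAGE_SIZE])

    def mark_free(self, page):
        # Record the sum at the moment the page becomes unused.
        self.sums[page] = self._crc(page)

    def mark_allocated(self, page):
        # An allocated page may change legitimately, so stop tracking it.
        self.sums.pop(page, None)

    def scan(self):
        # Return the free pages whose contents no longer match their sum.
        return sorted(p for p, s in self.sums.items() if self._crc(p) != s)
```

Calling scan() from an idle loop would return the indices of free pages that
changed since they were freed; an empty list means no mismatch was found.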

Additionally, this can be done for allocated memory if it is write-protected
before the first actual write. Moreover, any memory may be made
write-protected if it has not been written for, e.g., more than a second.
(When it is written, the kernel would unlock it and allow the write, by a
technique similar to how swap works.) If a checksum shows that
write-protected memory has been modified, this likewise indicates a bug.
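
The write-protection variant can be sketched in the same spirit. This is a
Python simulation, not kernel code: GuardedPage, its methods, and the explicit
"now" timestamps are hypothetical, and the page fault that would unlock a page
in a real kernel is modelled simply as the write() method. Only the
one-second idle threshold comes from the text above.

```python
import zlib

IDLE_SECONDS = 1.0  # idle threshold from the proposal ("more than a second")


class GuardedPage:
    """Hypothetical page that is 'write-protected' once it has been idle."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.last_write = 0.0
        self.protected_sum = None  # CRC32 while protected, None while unlocked

    def write(self, offset, value, now):
        # Legitimate write path: stands in for the fault that unlocks the page.
        self.protected_sum = None
        self.data[offset] = value
        self.last_write = now

    def maybe_protect(self, now):
        # Protect the page and record its sum once it has been idle long enough.
        if self.protected_sum is None and now - self.last_write > IDLE_SECONDS:
            self.protected_sum = zlib.crc32(self.data)

    def is_corrupted(self):
        # A protected page must still match the sum taken when it was locked.
        return (self.protected_sum is not None
                and zlib.crc32(self.data) != self.protected_sum)
```

In this sketch, a change made to page.data directly (bypassing write()) while
the page is protected makes is_corrupted() return True. A fuller version would
probably also verify the sum before unlocking, so that a stray write is not
lost when a legitimate one follows it.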

If a sum does not match, this would reveal a bug in the kernel or in the hardware.

I suggest adding a "Check free memory control sums" option to the config.

-- 
Victor Porton (porton@ex-code.com) - http://porton.ex-code.com

* Re: New reliability technique
  2006-02-25 13:21 New reliability technique Victor Porton
@ 2006-02-25 13:26 ` Jesper Juhl
  2006-02-25 13:27   ` Jesper Juhl
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Juhl @ 2006-02-25 13:26 UTC (permalink / raw)
  To: Victor Porton; +Cc: linux-kernel

On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
> A minute ago I invented a new reliability enhancing technique.
>
> In idle cycles (or periodically in expense of some performance) Linux can
> calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
> sums with previously calculated sums.
>
> Additionally it can be done for allocated memory, if it will be write
> protected before the first actual write. Moreover, all memory may be made
> write-protected if it is not written e.g. more than a second. (When it
> is written kernel would unlock it and allow to write, by a techniqie like
> to how swap works.) If write-protected memory appears to be modified by
> a check sum, this likewise indicates a bug.
>
> If a sum is inequal, it would notice a bug in kernel or in hardware.
>
> I suggest to add "Check free memory control sums" in config.
>

Implement it then and send a patch.

--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

* Re: New reliability technique
  2006-02-25 13:26 ` Jesper Juhl
@ 2006-02-25 13:27   ` Jesper Juhl
  2006-02-25 19:52     ` Avi Kivity
  0 siblings, 1 reply; 9+ messages in thread
From: Jesper Juhl @ 2006-02-25 13:27 UTC (permalink / raw)
  To: Victor Porton; +Cc: linux-kernel

On 2/25/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
> > A minute ago I invented a new reliability enhancing technique.
> >
> > In idle cycles (or periodically in expense of some performance) Linux can
> > calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
> > sums with previously calculated sums.
> >
> > Additionally it can be done for allocated memory, if it will be write
> > protected before the first actual write. Moreover, all memory may be made
> > write-protected if it is not written e.g. more than a second. (When it
> > is written kernel would unlock it and allow to write, by a techniqie like
> > to how swap works.) If write-protected memory appears to be modified by
> > a check sum, this likewise indicates a bug.
> >
> > If a sum is inequal, it would notice a bug in kernel or in hardware.
> >
> > I suggest to add "Check free memory control sums" in config.
> >
>
> Implement it then and send a patch.
>

But, doesn't slab poisoning and the like already cover this ground somewhat?


--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

* Re: New reliability technique
  2006-02-25 13:27   ` Jesper Juhl
@ 2006-02-25 19:52     ` Avi Kivity
  2006-02-25 19:56       ` Jesper Juhl
  2006-02-26 11:34       ` Victor Porton
  0 siblings, 2 replies; 9+ messages in thread
From: Avi Kivity @ 2006-02-25 19:52 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Victor Porton, linux-kernel

Jesper Juhl wrote:

>On 2/25/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
>>On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
>>>A minute ago I invented a new reliability enhancing technique.
>>>
>>>In idle cycles (or periodically in expense of some performance) Linux can
>>>calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
>>>sums with previously calculated sums.
>>>
>>>Additionally it can be done for allocated memory, if it will be write
>>>protected before the first actual write. Moreover, all memory may be made
>>>write-protected if it is not written e.g. more than a second. (When it
>>>is written kernel would unlock it and allow to write, by a technique like
>>>to how swap works.) If write-protected memory appears to be modified by
>>>a check sum, this likewise indicates a bug.
>>>
>>>If a sum is inequal, it would notice a bug in kernel or in hardware.
>>>
>>>I suggest to add "Check free memory control sums" in config.
>>>
>>Implement it then and send a patch.
>>
>But, doesn't slab poisoning and the like already cover this ground somewhat?
>
No, they don't. They cover only a very small percentage of memory.

On the other hand, ECC memory and caches do this in hardware.

-- 
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


* Re: New reliability technique
  2006-02-25 19:52     ` Avi Kivity
@ 2006-02-25 19:56       ` Jesper Juhl
  2006-02-25 22:44         ` Janos Farkas
  2006-02-26 11:34       ` Victor Porton
  1 sibling, 1 reply; 9+ messages in thread
From: Jesper Juhl @ 2006-02-25 19:56 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Victor Porton, linux-kernel

On 2/25/06, Avi Kivity <avi@argo.co.il> wrote:
> Jesper Juhl wrote:
>
> >On 2/25/06, Jesper Juhl <jesper.juhl@gmail.com> wrote:
> >>On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
> >>>A minute ago I invented a new reliability enhancing technique.
> >>>
> >>>In idle cycles (or periodically in expense of some performance) Linux can
> >>>calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
> >>>sums with previously calculated sums.
> >>>
> >>>Additionally it can be done for allocated memory, if it will be write
> >>>protected before the first actual write. Moreover, all memory may be made
> >>>write-protected if it is not written e.g. more than a second. (When it
> >>>is written kernel would unlock it and allow to write, by a technique like
> >>>to how swap works.) If write-protected memory appears to be modified by
> >>>a check sum, this likewise indicates a bug.
> >>>
> >>>If a sum is inequal, it would notice a bug in kernel or in hardware.
> >>>
> >>>I suggest to add "Check free memory control sums" in config.
> >>>
> >>Implement it then and send a patch.
> >>
> >But, doesn't slab poisoning and the like already cover this ground somewhat?
> >
> No, they don't. They cover only a very small percentage of memory.
>

Ohh, ok, then it makes sense as a debug thing.

Let's see an implementation then.

--
Jesper Juhl <jesper.juhl@gmail.com>
Don't top-post  http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please      http://www.expita.com/nomime.html

* Re: New reliability technique
  2006-02-25 19:56       ` Jesper Juhl
@ 2006-02-25 22:44         ` Janos Farkas
  0 siblings, 0 replies; 9+ messages in thread
From: Janos Farkas @ 2006-02-25 22:44 UTC (permalink / raw)
  To: Jesper Juhl; +Cc: Avi Kivity, Victor Porton, linux-kernel

> > >>On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
> > >>>A minute ago I invented a new reliability enhancing technique.

> > >>>In idle cycles (or periodically in expense of some performance) Linux can
> > >>>calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
> > >>>sums with previously calculated sums.

On 2006-02-25 at 20:56:27, Jesper Juhl wrote:
> Ohh, ok, then it makes sense as a debug thing.
> 
> Let's see an implementation then.

http://www.ussg.iu.edu/hypermail/linux/kernel/9701.1/0058.html

At least a variation of it, maybe not from a minute ago :)

* Re: New reliability technique
  2006-02-25 19:52     ` Avi Kivity
  2006-02-25 19:56       ` Jesper Juhl
@ 2006-02-26 11:34       ` Victor Porton
  2006-02-26 12:25         ` Pekka Enberg
  1 sibling, 1 reply; 9+ messages in thread
From: Victor Porton @ 2006-02-26 11:34 UTC (permalink / raw)
  To: Avi Kivity; +Cc: linux-kernel, Jesper Juhl


On 25-Feb-2006 Avi Kivity wrote:
> Jesper Juhl wrote:
...
>>>On 2/25/06, Victor Porton <porton@ex-code.com> wrote:
>>>    
>>>
>>>>In idle cycles (or periodically in expense of some performance) Linux can
>>>>calculate MD5 or CRC sums of _unused_ (free) memory areas and compare these
>>>>sums with previously calculated sums.
>>>>
>>>>Additionally it can be done for allocated memory, if it will be write
>>>>protected before the first actual write. Moreover, all memory may be made
>>>>write-protected if it is not written e.g. more than a second. (When it
>>>>is written kernel would unlock it and allow to write, by a techniqie like
>>>>to how swap works.) If write-protected memory appears to be modified by
>>>>a check sum, this likewise indicates a bug.
>>>>
>>>>If a sum is inequal, it would notice a bug in kernel or in hardware.
>>>>
>>>>I suggest to add "Check free memory control sums" in config.

> No, they don't. They cover only a very small percentage of memory.
> 
> On the other hand, ECC memory and caches do this in hardware.

Isn't it better to double check (especially after such risky things as
e.g. software suspend)?

We need to check not only for damaged hardware, but also for
kernel/module bugs. For this, ECC and cache reliability are useless.

-- 
Victor Porton (porton@ex-code.com) - http://porton.ex-code.com

* Re: New reliability technique
  2006-02-26 11:34       ` Victor Porton
@ 2006-02-26 12:25         ` Pekka Enberg
  2006-02-26 12:30           ` Nick Piggin
  0 siblings, 1 reply; 9+ messages in thread
From: Pekka Enberg @ 2006-02-26 12:25 UTC (permalink / raw)
  To: Victor Porton; +Cc: Avi Kivity, linux-kernel, Jesper Juhl

On 2/26/06, Victor Porton <porton@ex-code.com> wrote:
> Isn't it better to double check (especially after such risky things as
> e.g. software suspend)?
>
> We need to check not only for damaged hardware, but also for
> kernel/modules bugs. For this ECC and cache reliability is useless.

What kernel bugs do you want to catch with double-checking free
memory? For use-after-free, we already have slab poisoning.
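
For readers unfamiliar with slab poisoning, here is a rough user-space sketch
of the idea in Python, not the kernel's implementation. The byte values 0x6b
and 0xa5 match the kernel's POISON_FREE and POISON_END constants; the
functions poison() and first_corrupt_offset() are hypothetical names chosen
for this example.

```python
POISON_FREE = 0x6B  # fill byte for freed objects (kernel POISON_FREE)
POISON_END = 0xA5   # marker for the last byte of a poisoned object


def _pattern(length):
    # Expected contents of a freed object of the given non-zero length.
    return bytes([POISON_FREE]) * (length - 1) + bytes([POISON_END])


def poison(obj):
    """Overwrite a freed object (a non-empty bytearray) with the pattern."""
    if obj:
        obj[:] = _pattern(len(obj))


def first_corrupt_offset(obj):
    """Return the first offset that differs from the pattern, or None."""
    for i, (got, want) in enumerate(zip(obj, _pattern(len(obj)))):
        if got != want:
            return i
    return None
```

An allocator using this scheme would call poison() on free and check
first_corrupt_offset() on the next allocation; a non-None result points at a
write that happened after the object was freed.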

                                    Pekka

* Re: New reliability technique
  2006-02-26 12:25         ` Pekka Enberg
@ 2006-02-26 12:30           ` Nick Piggin
  0 siblings, 0 replies; 9+ messages in thread
From: Nick Piggin @ 2006-02-26 12:30 UTC (permalink / raw)
  To: Pekka Enberg; +Cc: Victor Porton, Avi Kivity, linux-kernel, Jesper Juhl

Pekka Enberg wrote:
> On 2/26/06, Victor Porton <porton@ex-code.com> wrote:
> 
>>Isn't it better to double check (especially after such risky things as
>>e.g. software suspend)?
>>
>>We need to check not only for damaged hardware, but also for
>>kernel/modules bugs. For this ECC and cache reliability is useless.
> 
> 
> What kernel bugs do you want to catch with double-checking free
> memory? For use-after-free, we already have slab poisoning.
> 

And for !slab, we unmap kernel virtual addresses with page debugging,
which seems like a better solution.

-- 
SUSE Labs, Novell Inc.
