All of lore.kernel.org
 help / color / mirror / Atom feed
* Is it OK to pass non-acquired objects to kfree?
@ 2015-09-08  7:51 Dmitry Vyukov
  2015-09-08 14:13 ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-08  7:51 UTC (permalink / raw)
  To: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm
  Cc: Andrey Konovalov, Alexander Potapenko, Paul McKenney

Hello mm-maintainers,

I have a question about kfree semantics, I can't find answer in docs
and opinions I hear differ.
Namely, is it OK to pass non-acquired objects to
kfree/kmem_cache_free? By non-acquired mean objects unsafely passed
between threads without using proper release/acquire (wmb/rmb) memory
barriers.

The question arose during work on KernelThreadSanitizer, a kernel data
race, and in particular caused by the following existing code:

// kernel/pid.c
         if ((atomic_read(&pid->count) == 1) ||
              atomic_dec_and_test(&pid->count)) {
                 kmem_cache_free(ns->pid_cachep, pid);
                 put_pid_ns(ns);
         }

//drivers/tty/tty_buffer.c
while ((next = buf->head->next) != NULL) {
     tty_buffer_free(port, buf->head);
     buf->head = next;
}
// Here another thread can concurrently append to the buffer list, and
tty_buffer_free eventually calls kfree.

Both these cases don't contain proper memory barrier before handing
off the object to kfree. In my opinion the code should use
smp_load_acquire or READ_ONCE_CTRL ("control-dependnecy-acquire").
Otherwise there can be pending memory accesses to the object in other
threads that can interfere with slab code or the next usage of the
object after reuse.

Paul McKenney suggested that:

"
The maintainers probably want this sort of code to be allowed:
        p->a++;
        if (p->b) {
                kfree(p);
                p = NULL;
        }
And the users even more so.
So if the compiler really is free to reorder any scribbling/checking
by the caller with any scribbling/checking by kfree(), that should
be fixed in kfree() rather than in all the callers.
"

This does not look reasonable to me for 2 reasons:
- this incurs unnecessary cost for all kfree users, kfree would have
to execute a memory barrier always while most callers already have the
object acquired (either single-threaded use, or mutex protected, or
the object was properly handed off to the freeing thread)
- as far as I understand if an object is unsafely passed between a
chain of threads A->B->C->D and then the last does kfree, then kfree
can't acquire visibility over the object by executing a memory
barrier. All threads in the chain must play by the rules to properly
hand off the object to kfree.

As far as I understand, that's why atomic_dec_and_test used for
reference counting contains full memory barrier; and also kfree does
not seem to contain any memory barriers on fast path.

Can you please clarify the rules here?

Thank you



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08  7:51 Is it OK to pass non-acquired objects to kfree? Dmitry Vyukov
@ 2015-09-08 14:13 ` Christoph Lameter
  2015-09-08 14:41   ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-08 14:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, 8 Sep 2015, Dmitry Vyukov wrote:

> The question arose during work on KernelThreadSanitizer, a kernel data
> race, and in particular caused by the following existing code:
>
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }

It frees when there the refcount is one? Should this not be

	if (atomic_read(&pid->count) === 0) || ...

>
> //drivers/tty/tty_buffer.c
> while ((next = buf->head->next) != NULL) {
>      tty_buffer_free(port, buf->head);
>      buf->head = next;
> }
> // Here another thread can concurrently append to the buffer list, and
> tty_buffer_free eventually calls kfree.
>
> Both these cases don't contain proper memory barrier before handing
> off the object to kfree. In my opinion the code should use
> smp_load_acquire or READ_ONCE_CTRL ("control-dependnecy-acquire").
> Otherwise there can be pending memory accesses to the object in other
> threads that can interfere with slab code or the next usage of the
> object after reuse.

There can be pending reads maybe? But a write would require exclusive
acccess to the cachelines.


> Paul McKenney suggested that:
>
> "
> The maintainers probably want this sort of code to be allowed:
>         p->a++;
>         if (p->b) {
>                 kfree(p);
>                 p = NULL;
>         }
> And the users even more so.


Sure. What would be the problem with the above code? The write to the
object (p->a++) results in exclusive access to a cacheline being obtained.
So one cpu holds that cacheline. Then the object is freed and reused
either

1. On the same cpu -> No problem.

2. On another cpu. This means that a hand off of the pointer to the object
occurs in the slab allocators. The hand off involves a spinlock and thus
implicit barriers. The other processor will acquire exclusive access to
the cacheline when it initializes the object. At that point the cacheline
ownership will transfer between the processors.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 14:13 ` Christoph Lameter
@ 2015-09-08 14:41   ` Dmitry Vyukov
  2015-09-08 15:13     ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-08 14:41 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, Sep 8, 2015 at 4:13 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 8 Sep 2015, Dmitry Vyukov wrote:
>
>> The question arose during work on KernelThreadSanitizer, a kernel data
>> race, and in particular caused by the following existing code:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>
> It frees when there the refcount is one? Should this not be
>
>         if (atomic_read(&pid->count) === 0) || ...

The code is meant to do decrement of pid->count, but since
pid->count==1 it figures out that it is the only owner of the object,
so it just skips the "pid->count--" part and proceeds directly to
free.

>> //drivers/tty/tty_buffer.c
>> while ((next = buf->head->next) != NULL) {
>>      tty_buffer_free(port, buf->head);
>>      buf->head = next;
>> }
>> // Here another thread can concurrently append to the buffer list, and
>> tty_buffer_free eventually calls kfree.
>>
>> Both these cases don't contain proper memory barrier before handing
>> off the object to kfree. In my opinion the code should use
>> smp_load_acquire or READ_ONCE_CTRL ("control-dependnecy-acquire").
>> Otherwise there can be pending memory accesses to the object in other
>> threads that can interfere with slab code or the next usage of the
>> object after reuse.
>
> There can be pending reads maybe? But a write would require exclusive
> acccess to the cachelines.
>
>
>> Paul McKenney suggested that:
>>
>> "
>> The maintainers probably want this sort of code to be allowed:
>>         p->a++;
>>         if (p->b) {
>>                 kfree(p);
>>                 p = NULL;
>>         }
>> And the users even more so.
>
>
> Sure. What would be the problem with the above code? The write to the
> object (p->a++) results in exclusive access to a cacheline being obtained.
> So one cpu holds that cacheline. Then the object is freed and reused
> either

I am not sure what cache line states has to do with it...
Anyway, another thread can do p->c++ after this thread does p->a++,
then this thread loses its ownership. Or p->c can be located on a
separate cache line with p->a. And then we still free the object with
a pending write.

> 1. On the same cpu -> No problem.
>
> 2. On another cpu. This means that a hand off of the pointer to the object
> occurs in the slab allocators. The hand off involves a spinlock and thus
> implicit barriers. The other processor will acquire exclusive access to
> the cacheline when it initializes the object. At that point the cacheline
> ownership will transfer between the processors.
>



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 14:41   ` Dmitry Vyukov
@ 2015-09-08 15:13     ` Christoph Lameter
  2015-09-08 15:23       ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-08 15:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, 8 Sep 2015, Dmitry Vyukov wrote:

> >>
> >> // kernel/pid.c
> >>          if ((atomic_read(&pid->count) == 1) ||
> >>               atomic_dec_and_test(&pid->count)) {
> >>                  kmem_cache_free(ns->pid_cachep, pid);
> >>                  put_pid_ns(ns);
> >>          }
> >
> > It frees when there the refcount is one? Should this not be
> >
> >         if (atomic_read(&pid->count) === 0) || ...
>
> The code is meant to do decrement of pid->count, but since
> pid->count==1 it figures out that it is the only owner of the object,
> so it just skips the "pid->count--" part and proceeds directly to
> free.

The atomic_dec_and_test will therefore not be executed for count == 1?
Strange code. The atomic_dec_and_test suggests there are concurrency
concerns. The count test with a simple comparison does not share these
concerns it seems.

> >> The maintainers probably want this sort of code to be allowed:
> >>         p->a++;
> >>         if (p->b) {
> >>                 kfree(p);
> >>                 p = NULL;
> >>         }
> >> And the users even more so.
> >
> >
> > Sure. What would be the problem with the above code? The write to the
> > object (p->a++) results in exclusive access to a cacheline being obtained.
> > So one cpu holds that cacheline. Then the object is freed and reused
> > either
>
> I am not sure what cache line states has to do with it...
> Anyway, another thread can do p->c++ after this thread does p->a++,
> then this thread loses its ownership. Or p->c can be located on a
> separate cache line with p->a. And then we still free the object with
> a pending write.

The subsystem must ensure no other references exist before a call to free.
So this cannot occur. If it does then these are cases of an object being
used after free which can be caught by a number of diagnostic tools in the
kernel.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 15:13     ` Christoph Lameter
@ 2015-09-08 15:23       ` Dmitry Vyukov
  2015-09-08 15:33         ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-08 15:23 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, Sep 8, 2015 at 5:13 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 8 Sep 2015, Dmitry Vyukov wrote:
>
>> >>
>> >> // kernel/pid.c
>> >>          if ((atomic_read(&pid->count) == 1) ||
>> >>               atomic_dec_and_test(&pid->count)) {
>> >>                  kmem_cache_free(ns->pid_cachep, pid);
>> >>                  put_pid_ns(ns);
>> >>          }
>> >
>> > It frees when there the refcount is one? Should this not be
>> >
>> >         if (atomic_read(&pid->count) === 0) || ...
>>
>> The code is meant to do decrement of pid->count, but since
>> pid->count==1 it figures out that it is the only owner of the object,
>> so it just skips the "pid->count--" part and proceeds directly to
>> free.
>
> The atomic_dec_and_test will therefore not be executed for count == 1?
> Strange code. The atomic_dec_and_test suggests there are concurrency
> concerns. The count test with a simple comparison does not share these
> concerns it seems.

Yes, it skips atomic decrement when counter is equal to 1. This is
relatively common optimization for basic-thread-safety reference
counting (when you can acquire a new reference only when you already
have a one). If counter == 1, then the only owner is the current
thread, so other threads cannot change counter concurrently. So there
is no point in doing the atomic decrement (we know that the counter
will go to 0).

>> >> The maintainers probably want this sort of code to be allowed:
>> >>         p->a++;
>> >>         if (p->b) {
>> >>                 kfree(p);
>> >>                 p = NULL;
>> >>         }
>> >> And the users even more so.
>> >
>> >
>> > Sure. What would be the problem with the above code? The write to the
>> > object (p->a++) results in exclusive access to a cacheline being obtained.
>> > So one cpu holds that cacheline. Then the object is freed and reused
>> > either
>>
>> I am not sure what cache line states has to do with it...
>> Anyway, another thread can do p->c++ after this thread does p->a++,
>> then this thread loses its ownership. Or p->c can be located on a
>> separate cache line with p->a. And then we still free the object with
>> a pending write.
>
> The subsystem must ensure no other references exist before a call to free.
> So this cannot occur. If it does then these are cases of an object being
> used after free which can be caught by a number of diagnostic tools in the
> kernel.


Yes, this is a case of use-after-free bug. But the use-after-free can
happen only due to memory access reordering in a multithreaded
environment.
OK, here is a simpler code snippet:

void *p; // = NULL

// thread 1
p = kmalloc(8);

// thread 2
void *r = READ_ONCE(p);
if (r != NULL)
    kfree(r);

I would expect that this is illegal code. Is my understanding correct?


-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 15:23       ` Dmitry Vyukov
@ 2015-09-08 15:33         ` Christoph Lameter
  2015-09-08 15:37           ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-08 15:33 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, 8 Sep 2015, Dmitry Vyukov wrote:

> Yes, this is a case of use-after-free bug. But the use-after-free can
> happen only due to memory access reordering in a multithreaded
> environment.
> OK, here is a simpler code snippet:
>
> void *p; // = NULL
>
> // thread 1
> p = kmalloc(8);
>
> // thread 2
> void *r = READ_ONCE(p);
> if (r != NULL)
>     kfree(r);
>
> I would expect that this is illegal code. Is my understanding correct?

This should work. It could be a problem if thread 1 is touching
the object.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 15:33         ` Christoph Lameter
@ 2015-09-08 15:37           ` Dmitry Vyukov
  2015-09-08 17:09             ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-08 15:37 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, Sep 8, 2015 at 5:33 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 8 Sep 2015, Dmitry Vyukov wrote:
>
>> Yes, this is a case of use-after-free bug. But the use-after-free can
>> happen only due to memory access reordering in a multithreaded
>> environment.
>> OK, here is a simpler code snippet:
>>
>> void *p; // = NULL
>>
>> // thread 1
>> p = kmalloc(8);
>>
>> // thread 2
>> void *r = READ_ONCE(p);
>> if (r != NULL)
>>     kfree(r);
>>
>> I would expect that this is illegal code. Is my understanding correct?
>
> This should work. It could be a problem if thread 1 is touching
> the object.

What does make it work?
There are clearly memory barriers missing when passing the object
between threads. The typical correct pattern is:

// thread 1
smp_store_release(&p, kmalloc(8));

// thread 2
void *r = smp_load_acquire(&p); // or READ_ONCE_CTRL
if (r)
  kfree(r);

Otherwise stores into the object in kmalloc can reach the object when
it is already freed, which is a use-after-free.

What does prevent the use-after-free?



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 15:37           ` Dmitry Vyukov
@ 2015-09-08 17:09             ` Christoph Lameter
  2015-09-08 19:24               ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-08 17:09 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, 8 Sep 2015, Dmitry Vyukov wrote:

> >> I would expect that this is illegal code. Is my understanding correct?
> >
> > This should work. It could be a problem if thread 1 is touching
> > the object.
>
> What does make it work?

The 2nd thread gets the pointer that the first allocated and frees it.
If there is no more processing then fine.

> There are clearly memory barriers missing when passing the object
> between threads. The typical correct pattern is:

Why? If thread 2 gets the pointer it frees it. Thats ok.

> // thread 1
> smp_store_release(&p, kmalloc(8));
>
> // thread 2
> void *r = smp_load_acquire(&p); // or READ_ONCE_CTRL
> if (r)
>   kfree(r);
>
> Otherwise stores into the object in kmalloc can reach the object when
> it is already freed, which is a use-after-free.

Ok so there is more code executing in thread #1. That changes things.
>
> What does prevent the use-after-free?

There is no access to p in the first thread. If there are such accesses
then they are illegal. A user of slab allocators must ensure that there
are no accesses after freeing the object. And since there is a thread
that  at random checks p and frees it when not NULL then no other thread
would be allowed to touch the object.



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 17:09             ` Christoph Lameter
@ 2015-09-08 19:24               ` Dmitry Vyukov
  2015-09-09 14:02                 ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-08 19:24 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, Sep 8, 2015 at 7:09 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 8 Sep 2015, Dmitry Vyukov wrote:
>
>> >> I would expect that this is illegal code. Is my understanding correct?
>> >
>> > This should work. It could be a problem if thread 1 is touching
>> > the object.
>>
>> What does make it work?
>
> The 2nd thread gets the pointer that the first allocated and frees it.
> If there is no more processing then fine.
>
>> There are clearly memory barriers missing when passing the object
>> between threads. The typical correct pattern is:
>
> Why? If thread 2 gets the pointer it frees it. Thats ok.
>
>> // thread 1
>> smp_store_release(&p, kmalloc(8));
>>
>> // thread 2
>> void *r = smp_load_acquire(&p); // or READ_ONCE_CTRL
>> if (r)
>>   kfree(r);
>>
>> Otherwise stores into the object in kmalloc can reach the object when
>> it is already freed, which is a use-after-free.
>
> Ok so there is more code executing in thread #1. That changes things.
>>
>> What does prevent the use-after-free?
>
> There is no access to p in the first thread. If there are such accesses
> then they are illegal. A user of slab allocators must ensure that there
> are no accesses after freeing the object. And since there is a thread
> that  at random checks p and frees it when not NULL then no other thread
> would be allowed to touch the object.


But the memory allocator itself (kmalloc/kfree) generally reads and
writes the object (e.g. storing object size in header before object,
writing redzone in debug mode, reading and checking redzone in debug
mode, building freelist using first word of the object, etc). There is
no different between user accesses and memory allocator accesses just
before returning the object from kmalloc and right after accepting the
object in kfree.


-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-08 19:24               ` Dmitry Vyukov
@ 2015-09-09 14:02                 ` Christoph Lameter
  2015-09-09 14:19                   ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 14:02 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Tue, 8 Sep 2015, Dmitry Vyukov wrote:

> > There is no access to p in the first thread. If there are such accesses
> > then they are illegal. A user of slab allocators must ensure that there
> > are no accesses after freeing the object. And since there is a thread
> > that  at random checks p and frees it when not NULL then no other thread
> > would be allowed to touch the object.
>
>
> But the memory allocator itself (kmalloc/kfree) generally reads and
> writes the object (e.g. storing object size in header before object,
> writing redzone in debug mode, reading and checking redzone in debug
> mode, building freelist using first word of the object, etc). There is
> no different between user accesses and memory allocator accesses just
> before returning the object from kmalloc and right after accepting the
> object in kfree.

There is a difference. The object is not accessible to any code before
kmalloc() returns. And it must not be accessible anymore when kfree() is called.
Thus the object is under exclusive control of the allocators when it is
handled.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 14:02                 ` Christoph Lameter
@ 2015-09-09 14:19                   ` Dmitry Vyukov
  2015-09-09 14:36                     ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-09 14:19 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, Sep 9, 2015 at 4:02 PM, Christoph Lameter <cl@linux.com> wrote:
> On Tue, 8 Sep 2015, Dmitry Vyukov wrote:
>
>> > There is no access to p in the first thread. If there are such accesses
>> > then they are illegal. A user of slab allocators must ensure that there
>> > are no accesses after freeing the object. And since there is a thread
>> > that  at random checks p and frees it when not NULL then no other thread
>> > would be allowed to touch the object.
>>
>>
>> But the memory allocator itself (kmalloc/kfree) generally reads and
>> writes the object (e.g. storing object size in header before object,
>> writing redzone in debug mode, reading and checking redzone in debug
>> mode, building freelist using first word of the object, etc). There is
>> no different between user accesses and memory allocator accesses just
>> before returning the object from kmalloc and right after accepting the
>> object in kfree.
>
> There is a difference. The object is not accessible to any code before
> kmalloc() returns. And it must not be accessible anymore when kfree() is called.
> Thus the object is under exclusive control of the allocators when it is
> handled.


Yes, the object should not be accessible to other threads when kfree
is called. But in all examples above it is accessible.
For example, in the last example it is still being accessed by
kmalloc. Since there are no memory barriers, kmalloc does not
happen-before kfree, it happens concurrently with kfree, thus memory
accesses from kmalloc and kfree can be intermixed.
It would not be the case on a sequentially consistent
machine/language, but most machines and the implementation language do
not give sequential consistency guarantees.



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 14:19                   ` Dmitry Vyukov
@ 2015-09-09 14:36                     ` Christoph Lameter
  2015-09-09 15:30                       ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 14:36 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, 9 Sep 2015, Dmitry Vyukov wrote:

> Yes, the object should not be accessible to other threads when kfree
> is called. But in all examples above it is accessible.

Ok. Then the code is buggy. If such an access is made then our debugging
tools will flag that.

> For example, in the last example it is still being accessed by
> kmalloc. Since there are no memory barriers, kmalloc does not
> happen-before kfree, it happens concurrently with kfree, thus memory

kmalloc cannot happen concurrently with kfree because the pointer to the
object is only available after kfree completes. There is therefore an
ordering implied by the API.

> accesses from kmalloc and kfree can be intermixed.

They cannot be mixed for the same object. kfree cannot run while kmalloc
is still in progress.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 14:36                     ` Christoph Lameter
@ 2015-09-09 15:30                       ` Dmitry Vyukov
  2015-09-09 15:44                         ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-09 15:30 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, Sep 9, 2015 at 4:36 PM, Christoph Lameter <cl@linux.com> wrote:
> On Wed, 9 Sep 2015, Dmitry Vyukov wrote:
>
>> Yes, the object should not be accessible to other threads when kfree
>> is called. But in all examples above it is accessible.
>
> Ok. Then the code is buggy. If such an access is made then our debugging
> tools will flag that.
>
>> For example, in the last example it is still being accessed by
>> kmalloc. Since there are no memory barriers, kmalloc does not
>> happen-before kfree, it happens concurrently with kfree, thus memory
>
> kmalloc cannot happen concurrently with kfree because the pointer to the
> object is only available after kfree completes. There is therefore an
> ordering implied by the API.
>
>> accesses from kmalloc and kfree can be intermixed.
>
> They cannot be mixed for the same object. kfree cannot run while kmalloc
> is still in progress.

Things do not work this way for long time. If you read
Documentation/memory-barriers.txt or ARM/POWER manual and C language
standard, you will see that memory accesses from different threads can
be reordered (as perceived by other threads). So kmalloc still can be
running when the pointer to the newly allocated object is assigned to
a global (thus making it available for other threads, which can, in
particular, call kfree).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 15:30                       ` Dmitry Vyukov
@ 2015-09-09 15:44                         ` Christoph Lameter
  2015-09-09 16:09                           ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 15:44 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, 9 Sep 2015, Dmitry Vyukov wrote:

> Things do not work this way for long time. If you read
> Documentation/memory-barriers.txt or ARM/POWER manual and C language
> standard, you will see that memory accesses from different threads can
> be reordered (as perceived by other threads). So kmalloc still can be
> running when the pointer to the newly allocated object is assigned to
> a global (thus making it available for other threads, which can, in
> particular, call kfree).

Guess this means that cachelines (A) may not have been be written back to
memory when the pointer to the object is written to another cacheline(B)
and that cacheline B arrives at the other processor first which has
outdated cachelines A in its cache? So the other processor uses the
contents of B to get to the pointer to A but then accesses outdated
information since the object contents cachelines (A) have not arrive there
yet?

Ok lets say that is the case then any write attempt to A results in an
exclusive cacheline state and at that point the cacheline is going to
reflect current contents. So if kfree would write to the object then it
will have the current information.

Also what does it matter for kfree since the contents of the object are no
longer in use?

Could you please come up with a concrete example where there is
brokenness that we need to consider.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 15:44                         ` Christoph Lameter
@ 2015-09-09 16:09                           ` Dmitry Vyukov
  2015-09-09 17:56                             ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-09 16:09 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, Sep 9, 2015 at 5:44 PM, Christoph Lameter <cl@linux.com> wrote:
> On Wed, 9 Sep 2015, Dmitry Vyukov wrote:
>
>> Things do not work this way for long time. If you read
>> Documentation/memory-barriers.txt or ARM/POWER manual and C language
>> standard, you will see that memory accesses from different threads can
>> be reordered (as perceived by other threads). So kmalloc still can be
>> running when the pointer to the newly allocated object is assigned to
>> a global (thus making it available for other threads, which can, in
>> particular, call kfree).
>
> Guess this means that cachelines (A) may not have been be written back to
> memory when the pointer to the object is written to another cacheline(B)
> and that cacheline B arrives at the other processor first which has
> outdated cachelines A in its cache? So the other processor uses the
> contents of B to get to the pointer to A but then accesses outdated
> information since the object contents cachelines (A) have not arrive there
> yet?

That's one example.
Another example will be that kfree reads size from the object _before_
the object to the pointer is read. That sounds crazy, but it as
actually possible on Alpha processors.
Another example will be that C compiler lets a store to the object in
kmalloc sink below the store of the pointer to the object into global.


> Ok lets say that is the case then any write attempt to A results in an
> exclusive cacheline state and at that point the cacheline is going to
> reflect current contents. So if kfree would write to the object then it
> will have the current information.

No, because store to the object can still be pending on another CPU.
So kfree can get the object in E state in cache, but then another CPU
will finally issue the store and overwrite the slab freelist.

> Also what does it matter for kfree since the contents of the object are no
> longer in use?

I don't understand. First, it is not "not in use" infinitely, it can
be in use the very next moment. Also, we don't want corruption of slab
freelist as well. And we don't want spurious failure of debug
allocator that checks that there no writes after free.


> Could you please come up with a concrete example where there is
> brokenness that we need to consider.

Well, both examples in the first email are broken according to all of
Documentation/memory-barriers.txt, Alpha processor manual and C
standard (assuming that object passed to kfree must be in "quiescent"
state).
If you want a description of an exact scenario of how it can break:
building of freelist in kfree can be hoisted above check of
atomic_read(&pid->count) == 1 on Alpha processors, then the freelist
can become corrupted.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 16:09                           ` Dmitry Vyukov
@ 2015-09-09 17:56                             ` Christoph Lameter
  2015-09-09 18:44                               ` Paul E. McKenney
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 17:56 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko, Paul McKenney

On Wed, 9 Sep 2015, Dmitry Vyukov wrote:

> > Guess this means that cachelines (A) may not have been be written back to
> > memory when the pointer to the object is written to another cacheline(B)
> > and that cacheline B arrives at the other processor first which has
> > outdated cachelines A in its cache? So the other processor uses the
> > contents of B to get to the pointer to A but then accesses outdated
> > information since the object contents cachelines (A) have not arrive there
> > yet?
>
> That's one example.
> Another example will be that kfree reads size from the object _before_
> the object to the pointer is read. That sounds crazy, but it as
> actually possible on Alpha processors.

The size is encoded in the kmem_cache structure which is not changed. How
can that be relevant?

> Another example will be that C compiler lets a store to the object in
> kmalloc sink below the store of the pointer to the object into global.

Well if the pointer is used nakedly to communicate between threads the
barriers need to be used but what does this have to do with slabs?

> > Ok lets say that is the case then any write attempt to A results in an
> > exclusive cacheline state and at that point the cacheline is going to
> > reflect current contents. So if kfree would write to the object then it
> > will have the current information.
>
> No, because store to the object can still be pending on another CPU.

That would violate the cache coherency protocol as far as I can tell?

> So kfree can get the object in E state in cache, but then another CPU
> will finally issue the store and overwrite the slab freelist.

Sounds like a broken processor design to me. AFAICT the MESI protocol does
not allow this.

> > Also what does it matter for kfree since the contents of the object are no
> > longer in use?
>
> I don't understand. First, it is not "not in use" infinitely, it can
> be in use the very next moment. Also, we don't want corruption of slab
> freelist as well. And we don't want spurious failure of debug
> allocator that checks that there no writes after free.

Slab freelists are protected by locks.

A processor that can randomly defer writes to cachelines in the face of
other processors owning cachelines exclusively does not seem sane to me.
In fact its no longer exclusive.

> > Could you please come up with a concrete example where there is
> > brokenness that we need to consider.
>
> Well, both examples in the first email are broken according to all of
> Documentation/memory-barriers.txt, Alpha processor manual and C
> standard (assuming that object passed to kfree must be in "quiescent"
> state).
> If you want a description of an exact scenario of how it can break:
> building of freelist in kfree can be hoisted above check of
> atomic_read(&pid->count) == 1 on Alpha processors, then the freelist
> can become corrupted.

Sounds like the atomic_read needs more barriers.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 17:56                             ` Christoph Lameter
@ 2015-09-09 18:44                               ` Paul E. McKenney
  2015-09-09 19:01                                 ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-09 18:44 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, Sep 09, 2015 at 12:56:20PM -0500, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Dmitry Vyukov wrote:
> 
> > > Guess this means that cachelines (A) may not have been be written back to
> > > memory when the pointer to the object is written to another cacheline(B)
> > > and that cacheline B arrives at the other processor first which has
> > > outdated cachelines A in its cache? So the other processor uses the
> > > contents of B to get to the pointer to A but then accesses outdated
> > > information since the object contents cachelines (A) have not arrive there
> > > yet?
> >
> > That's one example.
> > Another example will be that kfree reads size from the object _before_
> > the object to the pointer is read. That sounds crazy, but it as
> > actually possible on Alpha processors.
> 
> The size is encoded in the kmem_cache structure which is not changed. How
> can that be relevant?

IIRC, at one point some of the Linux-kernel allocators stored some
state in the object itself.  What is the current state?

> > Another example will be that C compiler lets a store to the object in
> > kmalloc sink below the store of the pointer to the object into global.
> 
> Well if the pointer is used nakedly to communicate between threads the
> barriers need to be used but what does this have to do with slabs?

It certainly is something that the users of slabs need to know.
In particular, what exactly are the synchronization requirements that
the slabs place on their users?  Dmitry needs to know this because he
is constructing a tool that automatically locates race conditions, and
he needs to know who to complain to when he finds a race condition that
involves slabs and their users.

Here are some of my guesses, but you are the maintainer, not me.  ;-)

1.	Do there need to be any compiler or CPU barriers between
	last use and free on a single thread?  Here is an example:

	p = kmalloc(sizeof(*p), GFP_KERNEL);
	if (!p)
		return NULL;
	initialize_me(p);
	if (do_not_really_need_it(p)) {
		kfree(p);
		return NULL;
	}
	return p;

	Suppose that both initialize_me() and do_not_really_need_it()
	are static inline functions, so that all of their loads and
	stores to the structure referenced by p are visible to the
	compiler.  Is the above code correct, or is the user required
	to place something like barrier() before the call to kfree()?

	I would hope that the caller of kfree() need not invoke barrier()
	beforehand, but it is your decision.  If the caller need not
	invoke barrier(), then it might (or might not) need to be supplied
	by the kfree() implementation.	From what I understand, Dmitry's
	tool indicated a barrier() is needed somewhere in this code path.

2.	Is it OK to do a hot handoff from kmalloc() on one thread to
	kfree on another?

	Thread 0:

		gp = kmalloc(sizeof(*gp), GFP_KERNEL);

	Thread 1:

		p = READ_ONCE(gp);
		if (gp)
			kfree(gp);

	I would be strongly tempted to just say "no" to this use case
	on the grounds that it is pointless, but you know your users
	better than do I.

3.	The case that Dmitry pointed out was something like the following:

	Thread 0:

		p = kmalloc(sizeof(*p), GFP_KERNEL);
		if (!p)
			return NULL;
		atomic_set(&p->rc, 1);
		return p;

	Thread 1:

		WARN_ON(!p->rc);  /* Must own ref to take another. */
		atomic_inc(&p->rc);

	Thread 2:

		if (p->rc == 1 ||
		    atomic_dec_and_test(&p->rc))
		    	kfree(p);

	This ends up really being the same as #1 above.

> > > Ok lets say that is the case then any write attempt to A results in an
> > > exclusive cacheline state and at that point the cacheline is going to
> > > reflect current contents. So if kfree would write to the object then it
> > > will have the current information.
> >
> > No, because store to the object can still be pending on another CPU.
> 
> That would violate the cache coherency protocol as far as I can tell?

It would, but there are three cases that neverthess need to be considered:
(1) The pointer is in a different cacheline than is the pointed-to object,
and ordering of accesses to the pointer and object matter, (2) The object
covers more than one cacheline, and the ordering of accesses matters,
and (3) The fields are accessed using non-atomic operations and the
compiler can see into kfree().  I am most worried about #3.

> > So kfree can get the object in E state in cache, but then another CPU
> > will finally issue the store and overwrite the slab freelist.
> 
> Sounds like a broken processor design to me. AFAICT the MESI protocol does
> not allow this.

We really need to focus on specific code sequences.  I suspect that you
guys are talking past each other.

> > > Also what does it matter for kfree since the contents of the object are no
> > > longer in use?
> >
> > I don't understand. First, it is not "not in use" infinitely, it can
> > be in use the very next moment. Also, we don't want corruption of slab
> > freelist as well. And we don't want spurious failure of debug
> > allocator that checks that there no writes after free.
> 
> Slab freelists are protected by locks.

Are these locks acquired on the fastpaths?  I was under the impression
that they are not.  That said, I do believe that these locks fully
protect the case where one CPU does kfree() and some other CPU later
returns that same object from kmalloc().

> A processor that can randomly defer writes to cachelines in the face of
> other processors owning cachelines exclusively does not seem sane to me.
> In fact its no longer exclusive.

Welcome to the wonderful world of store buffers, which are present even
on strongly ordered systems such as x86 and the mainframe.

> > > Could you please come up with a concrete example where there is
> > > brokenness that we need to consider.
> >
> > Well, both examples in the first email are broken according to all of
> > Documentation/memory-barriers.txt, Alpha processor manual and C
> > standard (assuming that object passed to kfree must be in "quiescent"
> > state).
> > If you want a description of an exact scenario of how it can break:
> > building of freelist in kfree can be hoisted above check of
> > atomic_read(&pid->count) == 1 on Alpha processors, then the freelist
> > can become corrupted.
> 
> Sounds like the atomic_read needs more barriers.

We all know that this won't happen.

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 18:44                               ` Paul E. McKenney
@ 2015-09-09 19:01                                 ` Christoph Lameter
  2015-09-09 20:36                                   ` Paul E. McKenney
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 19:01 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, 9 Sep 2015, Paul E. McKenney wrote:

> > The size is encoded in the kmem_cache structure which is not changed. How
> > can that be relevant?
>
> IIRC, at one point some of the Linux-kernel allocators stored some
> state in the object itself.  What is the current state?

SLUB stores the link to the next object most of the time in the first word
of the object. But then this is mostly done for per cpu lists. Another
processor would access another linked list.

> > Well if the pointer is used nakedly to communicate between threads the
> > barriers need to be used but what does this have to do with slabs?
>
> It certainly is something that the users of slabs need to know.

Well no this is general synchronization.

> In particular, what exactly are the synchronization requirements that
> the slabs place on their users?  Dmitry needs to know this because he
> is constructing a tool that automatically locates race conditions, and
> he needs to know who to complain to when he finds a race condition that
> involves slabs and their users.
>
> Here are some of my guesses, but you are the maintainer, not me.  ;-)
>
> 1.	Do there need to be any compiler or CPU barriers between
> 	last use and free on a single thread?  Here is an example:
>
> 	p = kmalloc(sizeof(*p), GFP_KERNEL);
> 	if (!p)
> 		return NULL;
> 	initialize_me(p);
> 	if (do_not_really_need_it(p)) {
> 		kfree(p);
> 		return NULL;
> 	}
> 	return p;
>
> 	Suppose that both initialize_me() and do_not_really_need_it()
> 	are static inline functions, so that all of their loads and
> 	stores to the structure referenced by p are visible to the
> 	compiler.  Is the above code correct, or is the user required
> 	to place something like barrier() before the call to kfree()?
>
> 	I would hope that the caller of kfree() need not invoke barrier()
> 	beforehand, but it is your decision.  If the caller need not
> 	invoke barrier(), then it might (or might not) need to be supplied
> 	by the kfree() implementation.	From what I understand, Dmitry's
> 	tool indicated a barrier() is needed somewhere in this code path.

This would all use per cpu data. As soon as a handoff is required within
the allocators locks are being used. So I would say no.


> 2.	Is it OK to do a hot handoff from kmalloc() on one thread to
> 	kfree on another?
>
> 	Thread 0:
>
> 		gp = kmalloc(sizeof(*gp), GFP_KERNEL);
>
> 	Thread 1:
>
> 		p = READ_ONCE(gp);
> 		if (gp)
> 			kfree(gp);
>
> 	I would be strongly tempted to just say "no" to this use case
> 	on the grounds that it is pointless, but you know your users
> 	better than do I.

Its pointless since you cannot use gp in thread 0 to dereference the
object given the race condition that thread 1 creates. There is always
the potential of a use after free.

> 3.	The case that Dmitry pointed out was something like the following:
>
> 	Thread 0:
>
> 		p = kmalloc(sizeof(*p), GFP_KERNEL);
> 		if (!p)
> 			return NULL;

I guess p is a global. p exists with p->rc having a random value afer the
kmalloc. Since you access p at random times this is a bad(tm) idea. Use
kzalloc at least?

> 		atomic_set(&p->rc, 1);
> 		return p;
>
> 	Thread 1:
>
> 		WARN_ON(!p->rc);  /* Must own ref to take another. */

This definitely can occur given the above code.

> 		atomic_inc(&p->rc);
>
> 	Thread 2:
>
> 		if (p->rc == 1 ||
> 		    atomic_dec_and_test(&p->rc))
> 		    	kfree(p);
>
> 	This ends up really being the same as #1 above.

This is all screwed if p is a global variable. Typically one would expose
p in a safe way by first storing the address in a local variable,
populating the contents of the object and then ensuring that the
cachelines are written back before storing the address into p.


> > > > Ok lets say that is the case then any write attempt to A results in an
> > > > exclusive cacheline state and at that point the cacheline is going to
> > > > reflect current contents. So if kfree would write to the object then it
> > > > will have the current information.
> > >
> > > No, because store to the object can still be pending on another CPU.
> >
> > That would violate the cache coherency protocol as far as I can tell?
>
> It would, but there are three cases that neverthess need to be considered:
> (1) The pointer is in a different cacheline than is the pointed-to object,
> and ordering of accesses to the pointer and object matter, (2) The object
> covers more than one cacheline, and the ordering of accesses matters,
> and (3) The fields are accessed using non-atomic operations and the
> compiler can see into kfree().  I am most worried about #3.

(1) was covered in the discussion and it seems that this is tied to the
way the global pointer is being used.

(2) If the ordering matters then we need to have proper fences etc.

(3) What does "see" mean? The compiler knows the code of kfree and pulls
bits of processing out?


> > > So kfree can get the object in E state in cache, but then another CPU
> > > will finally issue the store and overwrite the slab freelist.
> >
> > Sounds like a broken processor design to me. AFAICT the MESI protocol does
> > not allow this.
>
> We really need to focus on specific code sequences.  I suspect that you
> guys are talking past each other.


Could be. Lets be clear here.

> > > > Also what does it matter for kfree since the contents of the object are no
> > > > longer in use?
> > >
> > > I don't understand. First, it is not "not in use" infinitely, it can
> > > be in use the very next moment. Also, we don't want corruption of slab
> > > freelist as well. And we don't want spurious failure of debug
> > > allocator that checks that there no writes after free.
> >
> > Slab freelists are protected by locks.
>
> Are these locks acquired on the fastpaths?  I was under the impression
> that they are not.  That said, I do believe that these locks fully
> protect the case where one CPU does kfree() and some other CPU later
> returns that same object from kmalloc().

Fastpaths only do per cpu processing and do not cross boundaries. If
objects are handed over to other processors queues or concurrent accesses
occur then locks are used.

> > A processor that can randomly defer writes to cachelines in the face of
> > other processors owning cachelines exclusively does not seem sane to me.
> > In fact its no longer exclusive.
>
> Welcome to the wonderful world of store buffers, which are present even
> on strongly ordered systems such as x86 and the mainframe.

Store buffers hold complete cachelines that have been written to by a
processor. Hmmm... Guess I need to think more about this. Dont know the
detail here on how they interact with cacheline exclusivity and stuff.

> > Sounds like the atomic_read needs more barriers.
>
> We all know that this won't happen.

Well then welcome to the wonderful world of a broken kernel. Still
wondering what this has to do with slab allocators.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 19:01                                 ` Christoph Lameter
@ 2015-09-09 20:36                                   ` Paul E. McKenney
  2015-09-09 23:23                                     ` Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?) Christoph Lameter
  2015-09-09 23:31                                     ` Is it OK to pass non-acquired objects to kfree? Christoph Lameter
  0 siblings, 2 replies; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-09 20:36 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, Sep 09, 2015 at 02:01:40PM -0500, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> 
> > > The size is encoded in the kmem_cache structure which is not changed. How
> > > can that be relevant?
> >
> > IIRC, at one point some of the Linux-kernel allocators stored some
> > state in the object itself.  What is the current state?
> 
> SLUB stores the link to the next object most of the time in the first word
> of the object. But then this is mostly done for per cpu lists. Another
> processor would access another linked list.

OK, good to know.

> > > Well if the pointer is used nakedly to communicate between threads the
> > > barriers need to be used but what does this have to do with slabs?
> >
> > It certainly is something that the users of slabs need to know.
> 
> Well no this is general synchronization.

Either way, Dmitry's tool got a hit on real code using the slab
allocators.  If that hit is a false positive, then clearly Dmitry
needs to fix his tool, however, I am not (yet) convinced that it is a
false positive.  If it is not a false positive, we might well need to
articulate the rules for use of the slab allocators.

> > In particular, what exactly are the synchronization requirements that
> > the slabs place on their users?  Dmitry needs to know this because he
> > is constructing a tool that automatically locates race conditions, and
> > he needs to know who to complain to when he finds a race condition that
> > involves slabs and their users.
> >
> > Here are some of my guesses, but you are the maintainer, not me.  ;-)
> >
> > 1.	Do there need to be any compiler or CPU barriers between
> > 	last use and free on a single thread?  Here is an example:
> >
> > 	p = kmalloc(sizeof(*p), GFP_KERNEL);
> > 	if (!p)
> > 		return NULL;
> > 	initialize_me(p);
> > 	if (do_not_really_need_it(p)) {
> > 		kfree(p);
> > 		return NULL;
> > 	}
> > 	return p;
> >
> > 	Suppose that both initialize_me() and do_not_really_need_it()
> > 	are static inline functions, so that all of their loads and
> > 	stores to the structure referenced by p are visible to the
> > 	compiler.  Is the above code correct, or is the user required
> > 	to place something like barrier() before the call to kfree()?
> >
> > 	I would hope that the caller of kfree() need not invoke barrier()
> > 	beforehand, but it is your decision.  If the caller need not
> > 	invoke barrier(), then it might (or might not) need to be supplied
> > 	by the kfree() implementation.	From what I understand, Dmitry's
> > 	tool indicated a barrier() is needed somewhere in this code path.
> 
> This would all use per cpu data. As soon as a handoff is required within
> the allocators locks are being used. So I would say no.

As in "no, it is not necessary for the caller of kfree() to invoke barrier()
in this example", right?

> > 2.	Is it OK to do a hot handoff from kmalloc() on one thread to
> > 	kfree on another?
> >
> > 	Thread 0:
> >
> > 		gp = kmalloc(sizeof(*gp), GFP_KERNEL);
> >
> > 	Thread 1:
> >
> > 		p = READ_ONCE(gp);
> > 		if (gp)
> > 			kfree(gp);
> >
> > 	I would be strongly tempted to just say "no" to this use case
> > 	on the grounds that it is pointless, but you know your users
> > 	better than do I.
> 
> Its pointless since you cannot use gp in thread 0 to dereference the
> object given the race condition that thread 1 creates. There is always
> the potential of a use after free.

I like that approach!  ;-)

> > 3.	The case that Dmitry pointed out was something like the following:
> >
> > 	Thread 0:
> >
> > 		p = kmalloc(sizeof(*p), GFP_KERNEL);
> > 		if (!p)
> > 			return NULL;
> 
> I guess p is a global. p exists with p->rc having a random value afer the
> kmalloc. Since you access p at random times this is a bad(tm) idea. Use
> kzalloc at least?

Ah, I clearly made this example too minimal.  My intent was that p is
a local, and is returned to the caller.  The caller can hand it off,
but each thread having a reference must have that reference recorded
in p->rc.  Let me try redoing this example below.

> > 		atomic_set(&p->rc, 1);
> > 		return p;
> >
> > 	Thread 1:
> >
> > 		WARN_ON(!p->rc);  /* Must own ref to take another. */
> 
> This definitely can occur given the above code.
> 
> > 		atomic_inc(&p->rc);
> >
> > 	Thread 2:
> >
> > 		if (p->rc == 1 ||
> > 		    atomic_dec_and_test(&p->rc))
> > 		    	kfree(p);
> >
> > 	This ends up really being the same as #1 above.
> 
> This is all screwed if p is a global variable. Typically one would expose
> p in a safe way by first storing the address in a local variable,
> populating the contents of the object and then ensuring that the
> cachelines are written back before storing the address into p.

OK, trying again...

	struct foo *alloc_it(void)
	{
		struct foo *p;

		p = kmalloc(sizeof(*p), GFP_KERNEL);
		if (!p)
			return NULL;
		atomic_set(&p->rc, 1);
		return p;
	}

	void get_ref(struct foo *p)
	{
		WARN_ON(!p->rc);  /* Must own ref to take another. */
		atomic_inc(&p->rc);
	}

	void put_ref(struct foo *p)
	{
		if (p->rc == 1 ||
		    atomic_dec_and_test(&p->rc))
		    	kfree(p);
	}

	Thread 0:
		p = alloc_it();  /* p is a local. */
		get_ref(p);  /* Get reference for thread 1. */
		smp_store_release(&thread1_p, p);  /* only threads 0 and 1. */
		do_something();
		put_ref(p);  /* Thread 1 now has the only reference. */

	Thread 1:
		p = smp_load_acquire(&thread1_p);
		do_something_else(p);
		put_ref(p);

	The second-to-last reference used the fully ordered
	atomic_dec_and_test() to release the reference, so there should
	be no cross-thread references.

	So I believe that this ends up really being the same as #1 above.

> > > > > Ok lets say that is the case then any write attempt to A results in an
> > > > > exclusive cacheline state and at that point the cacheline is going to
> > > > > reflect current contents. So if kfree would write to the object then it
> > > > > will have the current information.
> > > >
> > > > No, because store to the object can still be pending on another CPU.
> > >
> > > That would violate the cache coherency protocol as far as I can tell?
> >
> > It would, but there are three cases that neverthess need to be considered:
> > (1) The pointer is in a different cacheline than is the pointed-to object,
> > and ordering of accesses to the pointer and object matter, (2) The object
> > covers more than one cacheline, and the ordering of accesses matters,
> > and (3) The fields are accessed using non-atomic operations and the
> > compiler can see into kfree().  I am most worried about #3.
> 
> (1) was covered in the discussion and it seems that this is tied to the
> way the global pointer is being used.

I am inclined to agree here.

> (2) If the ordering matters then we need to have proper fences etc.

Including compiler fences when required.

> (3) What does "see" mean? The compiler knows the code of kfree and pulls
> bits of processing out?

Yep, including for example the store adding to the SLUB per-CPU freelist.
Keeping the C aliasing rules in mind.

> > > > So kfree can get the object in E state in cache, but then another CPU
> > > > will finally issue the store and overwrite the slab freelist.
> > >
> > > Sounds like a broken processor design to me. AFAICT the MESI protocol does
> > > not allow this.
> >
> > We really need to focus on specific code sequences.  I suspect that you
> > guys are talking past each other.
> 
> Could be. Lets be clear here.
> 
> > > > > Also what does it matter for kfree since the contents of the object are no
> > > > > longer in use?
> > > >
> > > > I don't understand. First, it is not "not in use" infinitely, it can
> > > > be in use the very next moment. Also, we don't want corruption of slab
> > > > freelist as well. And we don't want spurious failure of debug
> > > > allocator that checks that there no writes after free.
> > >
> > > Slab freelists are protected by locks.
> >
> > Are these locks acquired on the fastpaths?  I was under the impression
> > that they are not.  That said, I do believe that these locks fully
> > protect the case where one CPU does kfree() and some other CPU later
> > returns that same object from kmalloc().
> 
> Fastpaths only do per cpu processing and do not cross boundaries. If
> objects are handed over to other processors queues or concurrent accesses
> occur then locks are used.

Good, then the only issues with the fastpath freelists would involve
compiler optimizations.

> > > A processor that can randomly defer writes to cachelines in the face of
> > > other processors owning cachelines exclusively does not seem sane to me.
> > > In fact its no longer exclusive.
> >
> > Welcome to the wonderful world of store buffers, which are present even
> > on strongly ordered systems such as x86 and the mainframe.
> 
> Store buffers hold complete cachelines that have been written to by a
> processor.

In many cases, partial cachelines.  If the cacheline is not available
locally, the processor cannot know the contents of the rest of the cache
line, only the contents of the portion that it recently stored into.

>            Hmmm... Guess I need to think more about this. Dont know the
> detail here on how they interact with cacheline exclusivity and stuff.

A large number of stores to the same variable can happen concurrently,
and the system can stitch together the order of these stores after
the fact.

> > > Sounds like the atomic_read needs more barriers.
> >
> > We all know that this won't happen.
> 
> Well then welcome to the wonderful world of a broken kernel. Still
> wondering what this has to do with slab allocators.

The concern I have is that the compiler might be able to reorder the
running CPU's last accesses to an object that is to be kfree()ed with
kfree()'s accesses.  The issue being that the compiler is within its
rights to assume pointers to different types don't alias unless one of the
types is char * (or some such, Dmitry can correct me if I am confused).

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-09 20:36                                   ` Paul E. McKenney
@ 2015-09-09 23:23                                     ` Christoph Lameter
  2015-09-10  0:08                                       ` Paul E. McKenney
  2015-09-10  7:22                                       ` Vlastimil Babka
  2015-09-09 23:31                                     ` Is it OK to pass non-acquired objects to kfree? Christoph Lameter
  1 sibling, 2 replies; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 23:23 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, 9 Sep 2015, Paul E. McKenney wrote:

> > > > A processor that can randomly defer writes to cachelines in the face of
> > > > other processors owning cachelines exclusively does not seem sane to me.
> > > > In fact its no longer exclusive.
> > >
> > > Welcome to the wonderful world of store buffers, which are present even
> > > on strongly ordered systems such as x86 and the mainframe.
> >
> > Store buffers hold complete cachelines that have been written to by a
> > processor.
>
> In many cases, partial cachelines.  If the cacheline is not available
> locally, the processor cannot know the contents of the rest of the cache
> line, only the contents of the portion that it recently stored into.

For a partial cacheline it would have to read the rest of the cacheline
before updating. And I would expect the processor to have exclusive access
to the cacheline that is held in a store buffer. If not then there is
trouble afoot.


> >            Hmmm... Guess I need to think more about this. Dont know the
> > detail here on how they interact with cacheline exclusivity and stuff.
>
> A large number of stores to the same variable can happen concurrently,
> and the system can stitch together the order of these stores after
> the fact.

Well thats what I know. The exact way the store buffers interact with
cache coherency is what I do not know.

>
> > > > Sounds like the atomic_read needs more barriers.
> > >
> > > We all know that this won't happen.
> >
> > Well then welcome to the wonderful world of a broken kernel. Still
> > wondering what this has to do with slab allocators.
>
> The concern I have is that the compiler might be able to reorder the
> running CPU's last accesses to an object that is to be kfree()ed with
> kfree()'s accesses.  The issue being that the compiler is within its
> rights to assume pointers to different types don't alias unless one of the
> types is char * (or some such, Dmitry can correct me if I am confused).

Hmmm... Yeah if one assumes that the object is going to be handled by a
different processor then that is a valid concern but if its on the same
processor then the guarantee is that the changes become visible to the
exeucting thread in program order. That is enough.

The transfer to another processor is guarded by locks and I think that
those are enough to ensure that the cachelines become visible in a
controlled fashion.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 20:36                                   ` Paul E. McKenney
  2015-09-09 23:23                                     ` Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?) Christoph Lameter
@ 2015-09-09 23:31                                     ` Christoph Lameter
  2015-09-10  9:55                                       ` Dmitry Vyukov
  1 sibling, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-09 23:31 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, 9 Sep 2015, Paul E. McKenney wrote:

> Either way, Dmitry's tool got a hit on real code using the slab
> allocators.  If that hit is a false positive, then clearly Dmitry
> needs to fix his tool, however, I am not (yet) convinced that it is a
> false positive.  If it is not a false positive, we might well need to
> articulate the rules for use of the slab allocators.

Could I get a clear definiton as to what exactly is positive? Was this
using SLAB, SLUB or SLOB?

> > This would all use per cpu data. As soon as a handoff is required within
> > the allocators locks are being used. So I would say no.
>
> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
> in this example", right?

Actually SLUB contains a barrier already in kfree(). Has to be there
because of the way the per cpu pointer is being handled.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-09 23:23                                     ` Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?) Christoph Lameter
@ 2015-09-10  0:08                                       ` Paul E. McKenney
  2015-09-10  0:21                                         ` Christoph Lameter
  2015-09-10  7:22                                       ` Vlastimil Babka
  1 sibling, 1 reply; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10  0:08 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, Sep 09, 2015 at 06:23:11PM -0500, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> 
> > > > > A processor that can randomly defer writes to cachelines in the face of
> > > > > other processors owning cachelines exclusively does not seem sane to me.
> > > > > In fact its no longer exclusive.
> > > >
> > > > Welcome to the wonderful world of store buffers, which are present even
> > > > on strongly ordered systems such as x86 and the mainframe.
> > >
> > > Store buffers hold complete cachelines that have been written to by a
> > > processor.
> >
> > In many cases, partial cachelines.  If the cacheline is not available
> > locally, the processor cannot know the contents of the rest of the cache
> > line, only the contents of the portion that it recently stored into.
> 
> For a partial cacheline it would have to read the rest of the cacheline
> before updating. And I would expect the processor to have exclusive access
> to the cacheline that is held in a store buffer. If not then there is
> trouble afoot.

Yep.  The store buffer would hold part of the cacheline, gain exclusive
access to that cacheline, then update it.

> > >            Hmmm... Guess I need to think more about this. Dont know the
> > > detail here on how they interact with cacheline exclusivity and stuff.
> >
> > A large number of stores to the same variable can happen concurrently,
> > and the system can stitch together the order of these stores after
> > the fact.
> 
> Well thats what I know. The exact way the store buffers interact with
> cache coherency is what I do not know.

That would vary among systems and be highly optimized.

> > > > > Sounds like the atomic_read needs more barriers.
> > > >
> > > > We all know that this won't happen.
> > >
> > > Well then welcome to the wonderful world of a broken kernel. Still
> > > wondering what this has to do with slab allocators.
> >
> > The concern I have is that the compiler might be able to reorder the
> > running CPU's last accesses to an object that is to be kfree()ed with
> > kfree()'s accesses.  The issue being that the compiler is within its
> > rights to assume pointers to different types don't alias unless one of the
> > types is char * (or some such, Dmitry can correct me if I am confused).
> 
> Hmmm... Yeah if one assumes that the object is going to be handled by a
> different processor then that is a valid concern but if its on the same
> processor then the guarantee is that the changes become visible to the
> exeucting thread in program order. That is enough.

The CPU is indeed constrained in this way, but the compiler is not.
In particular, the CPU must do exact alias analysis, while the compiler
is permitted to do approximate alias analysis in some cases.  However,
in gcc builds of the Linux kernel, I believe that the -fno-strict-aliasing
gcc command-line argument forces exact alias analysis.

Dmitry, anything that I am missing?

> The transfer to another processor is guarded by locks and I think that
> those are enough to ensure that the cachelines become visible in a
> controlled fashion.

For the kfree()-to-kmalloc() path, I do believe that you are correct.
Dmitry's question was leading up to the kfree().

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  0:08                                       ` Paul E. McKenney
@ 2015-09-10  0:21                                         ` Christoph Lameter
  2015-09-10  1:10                                           ` Paul E. McKenney
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10  0:21 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, 9 Sep 2015, Paul E. McKenney wrote:

> The CPU is indeed constrained in this way, but the compiler is not.
> In particular, the CPU must do exact alias analysis, while the compiler
> is permitted to do approximate alias analysis in some cases.  However,
> in gcc builds of the Linux kernel, I believe that the -fno-strict-aliasing
> gcc command-line argument forces exact alias analysis.
>
> Dmitry, anything that I am missing?
>
> > The transfer to another processor is guarded by locks and I think that
> > those are enough to ensure that the cachelines become visible in a
> > controlled fashion.
>
> For the kfree()-to-kmalloc() path, I do believe that you are correct.
> Dmitry's question was leading up to the kfree().

The kmalloc-to-kfree path has similar bounds that ensure correctness.
First of all it is the availability of the pointer and the transfer of the
contents of the pointer to a remove processor.

Strictly speaking the processor would violate the rule that there cannnot
be a memory access to the object after kfree is called if the compiler
would move a store into kfree().

But then again kfree() contains a barrier() which would block the compiler
from moving anything into the free path.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  0:21                                         ` Christoph Lameter
@ 2015-09-10  1:10                                           ` Paul E. McKenney
  2015-09-10  1:47                                             ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10  1:10 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, Sep 09, 2015 at 07:21:34PM -0500, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> 
> > The CPU is indeed constrained in this way, but the compiler is not.
> > In particular, the CPU must do exact alias analysis, while the compiler
> > is permitted to do approximate alias analysis in some cases.  However,
> > in gcc builds of the Linux kernel, I believe that the -fno-strict-aliasing
> > gcc command-line argument forces exact alias analysis.
> >
> > Dmitry, anything that I am missing?
> >
> > > The transfer to another processor is guarded by locks and I think that
> > > those are enough to ensure that the cachelines become visible in a
> > > controlled fashion.
> >
> > For the kfree()-to-kmalloc() path, I do believe that you are correct.
> > Dmitry's question was leading up to the kfree().
> 
> The kmalloc-to-kfree path has similar bounds that ensure correctness.
> First of all it is the availability of the pointer and the transfer of the
> contents of the pointer to a remove processor.
> 
> Strictly speaking the processor would violate the rule that there cannnot
> be a memory access to the object after kfree is called if the compiler
> would move a store into kfree().
> 
> But then again kfree() contains a barrier() which would block the compiler
> from moving anything into the free path.

That barrier() is implicit in the fact that kfree() is an external
function?  Or are my eyes failing me?

But yes, a barrier() seems to me to suffice in this situation.

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  1:10                                           ` Paul E. McKenney
@ 2015-09-10  1:47                                             ` Christoph Lameter
  2015-09-10  7:38                                               ` Vlastimil Babka
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10  1:47 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Wed, 9 Sep 2015, Paul E. McKenney wrote:

> > But then again kfree() contains a barrier() which would block the compiler
> > from moving anything into the free path.
>
> That barrier() is implicit in the fact that kfree() is an external
> function?  Or are my eyes failing me?

kfree at some point calls slab_free(). That function has a barrier. All
free operations go through it.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-09 23:23                                     ` Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?) Christoph Lameter
  2015-09-10  0:08                                       ` Paul E. McKenney
@ 2015-09-10  7:22                                       ` Vlastimil Babka
  2015-09-10 16:36                                         ` Christoph Lameter
  1 sibling, 1 reply; 45+ messages in thread
From: Vlastimil Babka @ 2015-09-10  7:22 UTC (permalink / raw)
  To: Christoph Lameter, Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On 09/10/2015 01:23 AM, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> 
>> > > > A processor that can randomly defer writes to cachelines in the face of
>> > > > other processors owning cachelines exclusively does not seem sane to me.
>> > > > In fact its no longer exclusive.
>> > >
>> > > Welcome to the wonderful world of store buffers, which are present even
>> > > on strongly ordered systems such as x86 and the mainframe.
>> >
>> > Store buffers hold complete cachelines that have been written to by a
>> > processor.
>>
>> In many cases, partial cachelines.  If the cacheline is not available
>> locally, the processor cannot know the contents of the rest of the cache
>> line, only the contents of the portion that it recently stored into.
> 
> For a partial cacheline it would have to read the rest of the cacheline
> before updating. And I would expect the processor to have exclusive access
> to the cacheline that is held in a store buffer. If not then there is
> trouble afoot.

IIRC that (or something similar with same guarantees) basically happens on x86
when you use the LOCK prefix, i.e. for atomic inc etc. Doing that always would
destroy performance.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  1:47                                             ` Christoph Lameter
@ 2015-09-10  7:38                                               ` Vlastimil Babka
  2015-09-10 16:37                                                 ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Vlastimil Babka @ 2015-09-10  7:38 UTC (permalink / raw)
  To: Christoph Lameter, Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On 09/10/2015 03:47 AM, Christoph Lameter wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> 
>> > But then again kfree() contains a barrier() which would block the compiler
>> > from moving anything into the free path.
>>
>> That barrier() is implicit in the fact that kfree() is an external
>> function?  Or are my eyes failing me?

Is the "external function" not enough? Does it change e.g. with LTO, or is that
also subject to the aliasing rules (which I admit not knowing exactly)?

> 
> kfree at some point calls slab_free(). That function has a barrier. All
> free operations go through it.

SLAB doesn't have such barrier AFAICS. It will put the object on per-cpu cache
and that's it. Only flushing the full cache takes a spin lock.

> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-09 23:31                                     ` Is it OK to pass non-acquired objects to kfree? Christoph Lameter
@ 2015-09-10  9:55                                       ` Dmitry Vyukov
  2015-09-10 10:42                                         ` Jesper Dangaard Brouer
                                                           ` (2 more replies)
  0 siblings, 3 replies; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10  9:55 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>
>> Either way, Dmitry's tool got a hit on real code using the slab
>> allocators.  If that hit is a false positive, then clearly Dmitry
>> needs to fix his tool, however, I am not (yet) convinced that it is a
>> false positive.  If it is not a false positive, we might well need to
>> articulate the rules for use of the slab allocators.
>
> Could I get a clear definiton as to what exactly is positive? Was this
> using SLAB, SLUB or SLOB?
>
>> > This would all use per cpu data. As soon as a handoff is required within
>> > the allocators locks are being used. So I would say no.
>>
>> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>> in this example", right?
>
> Actually SLUB contains a barrier already in kfree(). Has to be there
> because of the way the per cpu pointer is being handled.

The positive was reporting of data races in the following code:

// kernel/pid.c
         if ((atomic_read(&pid->count) == 1) ||
              atomic_dec_and_test(&pid->count)) {
                 kmem_cache_free(ns->pid_cachep, pid);
                 put_pid_ns(ns);
         }

//drivers/tty/tty_buffer.c
while ((next = buf->head->next) != NULL) {
     tty_buffer_free(port, buf->head);
     buf->head = next;
}

Namely, the tool reported data races between usage of the object in
other threads before they released the object and kfree.

I am not sure why we are so concentrated on details like SLAB vs SLUB
vs SLOB or cache coherency protocols. This looks like waste of time to
me. General kernel code should not be safe only when working with SLxB
due to current implementation details of SLxB, it should be safe
according to memory allocator contract. And this contract seem to be:
memory allocator can do arbitrary reads and writes to the object
inside of kmalloc and kfree.
Similarly for memory model. There is officially documented kernel
memory model, which all general kernel code must adhere to. Reasoning
about whether a particular piece of code works on architecture X, or
how exactly it can break on architecture Y in unnecessary in such
context. In the end, there can be memory allocator implementation and
new architectures.

My question is about contracts, not about current implementation
details or specific architectures.

There are memory allocator implementations that do reads and writes of
the object, and there are memory allocator implementations that do not
do any barriers on fast paths. From this follows that objects must be
passed in quiescent state to kfree.
Now, kernel memory model says "A load-load control dependency requires
a full read memory barrier".
>From this follows that the following code is broken:

// kernel/pid.c
         if ((atomic_read(&pid->count) == 1) ||
              atomic_dec_and_test(&pid->count)) {
                 kmem_cache_free(ns->pid_cachep, pid);
                 put_pid_ns(ns);
         }

and it should be:

// kernel/pid.c
         if ((smp_load_acquire(&pid->count) == 1) ||
              atomic_dec_and_test(&pid->count)) {
                 kmem_cache_free(ns->pid_cachep, pid);
                 put_pid_ns(ns);
         }



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10  9:55                                       ` Dmitry Vyukov
@ 2015-09-10 10:42                                         ` Jesper Dangaard Brouer
  2015-09-10 12:08                                           ` Dmitry Vyukov
  2015-09-10 12:47                                         ` Vlastimil Babka
  2015-09-10 17:13                                         ` Paul E. McKenney
  2 siblings, 1 reply; 45+ messages in thread
From: Jesper Dangaard Brouer @ 2015-09-10 10:42 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: brouer, Christoph Lameter, Paul E. McKenney, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-mm,
	Andrey Konovalov, Alexander Potapenko, Eric Dumazet

On Thu, 10 Sep 2015 11:55:35 +0200 Dmitry Vyukov <dvyukov@google.com> wrote:

> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> >
> >> Either way, Dmitry's tool got a hit on real code using the slab
> >> allocators.  If that hit is a false positive, then clearly Dmitry
> >> needs to fix his tool, however, I am not (yet) convinced that it is a
> >> false positive.  If it is not a false positive, we might well need to
> >> articulate the rules for use of the slab allocators.
> >
> > Could I get a clear definiton as to what exactly is positive? Was this
> > using SLAB, SLUB or SLOB?
> >
> >> > This would all use per cpu data. As soon as a handoff is required within
> >> > the allocators locks are being used. So I would say no.
> >>
> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
> >> in this example", right?
> >
> > Actually SLUB contains a barrier already in kfree(). Has to be there
> > because of the way the per cpu pointer is being handled.
> 
> The positive was reporting of data races in the following code:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> //drivers/tty/tty_buffer.c
> while ((next = buf->head->next) != NULL) {
>      tty_buffer_free(port, buf->head);
>      buf->head = next;
> }
> 
> Namely, the tool reported data races between usage of the object in
> other threads before they released the object and kfree.
> 
> I am not sure why we are so concentrated on details like SLAB vs SLUB
> vs SLOB or cache coherency protocols. This looks like waste of time to
> me. General kernel code should not be safe only when working with SLxB
> due to current implementation details of SLxB, it should be safe
> according to memory allocator contract. And this contract seem to be:
> memory allocator can do arbitrary reads and writes to the object
> inside of kmalloc and kfree.
> Similarly for memory model. There is officially documented kernel
> memory model, which all general kernel code must adhere to. Reasoning
> about whether a particular piece of code works on architecture X, or
> how exactly it can break on architecture Y in unnecessary in such
> context. In the end, there can be memory allocator implementation and
> new architectures.
> 
> My question is about contracts, not about current implementation
> details or specific architectures.
> 
> There are memory allocator implementations that do reads and writes of
> the object, and there are memory allocator implementations that do not
> do any barriers on fast paths. From this follows that objects must be
> passed in quiescent state to kfree.
> Now, kernel memory model says "A load-load control dependency requires
> a full read memory barrier".
> From this follows that the following code is broken:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> and it should be:
> 
> // kernel/pid.c
>          if ((smp_load_acquire(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 

This reminds me of some code in the network stack[1] in kfree_skb()
where we have a smp_rmb().  Should we have used smp_load_acquire() ?

 void kfree_skb(struct sk_buff *skb)
 {
	if (unlikely(!skb))
		return;
	if (likely(atomic_read(&skb->users) == 1))
		smp_rmb();
	else if (likely(!atomic_dec_and_test(&skb->users)))
		return;
	trace_kfree_skb(skb, __builtin_return_address(0));
	__kfree_skb(skb);
 }
 EXPORT_SYMBOL(kfree_skb);

[1] https://github.com/torvalds/linux/blob/v4.2-rc8/net/core/skbuff.c#L690

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Sr. Network Kernel Developer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 10:42                                         ` Jesper Dangaard Brouer
@ 2015-09-10 12:08                                           ` Dmitry Vyukov
  2015-09-10 13:37                                             ` Eric Dumazet
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10 12:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Christoph Lameter, Paul E. McKenney, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-mm,
	Andrey Konovalov, Alexander Potapenko, Eric Dumazet

On Thu, Sep 10, 2015 at 12:42 PM, Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
> On Thu, 10 Sep 2015 11:55:35 +0200 Dmitry Vyukov <dvyukov@google.com> wrote:
>
>> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
>> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>> >
>> >> Either way, Dmitry's tool got a hit on real code using the slab
>> >> allocators.  If that hit is a false positive, then clearly Dmitry
>> >> needs to fix his tool, however, I am not (yet) convinced that it is a
>> >> false positive.  If it is not a false positive, we might well need to
>> >> articulate the rules for use of the slab allocators.
>> >
>> > Could I get a clear definiton as to what exactly is positive? Was this
>> > using SLAB, SLUB or SLOB?
>> >
>> >> > This would all use per cpu data. As soon as a handoff is required within
>> >> > the allocators locks are being used. So I would say no.
>> >>
>> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>> >> in this example", right?
>> >
>> > Actually SLUB contains a barrier already in kfree(). Has to be there
>> > because of the way the per cpu pointer is being handled.
>>
>> The positive was reporting of data races in the following code:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> //drivers/tty/tty_buffer.c
>> while ((next = buf->head->next) != NULL) {
>>      tty_buffer_free(port, buf->head);
>>      buf->head = next;
>> }
>>
>> Namely, the tool reported data races between usage of the object in
>> other threads before they released the object and kfree.
>>
>> I am not sure why we are so concentrated on details like SLAB vs SLUB
>> vs SLOB or cache coherency protocols. This looks like waste of time to
>> me. General kernel code should not be safe only when working with SLxB
>> due to current implementation details of SLxB, it should be safe
>> according to memory allocator contract. And this contract seem to be:
>> memory allocator can do arbitrary reads and writes to the object
>> inside of kmalloc and kfree.
>> Similarly for memory model. There is officially documented kernel
>> memory model, which all general kernel code must adhere to. Reasoning
>> about whether a particular piece of code works on architecture X, or
>> how exactly it can break on architecture Y in unnecessary in such
>> context. In the end, there can be memory allocator implementation and
>> new architectures.
>>
>> My question is about contracts, not about current implementation
>> details or specific architectures.
>>
>> There are memory allocator implementations that do reads and writes of
>> the object, and there are memory allocator implementations that do not
>> do any barriers on fast paths. From this follows that objects must be
>> passed in quiescent state to kfree.
>> Now, kernel memory model says "A load-load control dependency requires
>> a full read memory barrier".
>> From this follows that the following code is broken:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> and it should be:
>>
>> // kernel/pid.c
>>          if ((smp_load_acquire(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>
> This reminds me of some code in the network stack[1] in kfree_skb()
> where we have a smp_rmb().  Should we have used smp_load_acquire() ?
>
>  void kfree_skb(struct sk_buff *skb)
>  {
>         if (unlikely(!skb))
>                 return;
>         if (likely(atomic_read(&skb->users) == 1))
>                 smp_rmb();
>         else if (likely(!atomic_dec_and_test(&skb->users)))
>                 return;
>         trace_kfree_skb(skb, __builtin_return_address(0));
>         __kfree_skb(skb);
>  }
>  EXPORT_SYMBOL(kfree_skb);

rmb is much better than nothing :)
I generally prefer to use smp_load_acquire just because it's more
explicit (you see what memory access the barrier relates to), fewer
lines of code, agrees with modern atomic APIs in C, C++, Java, etc,
and FWIW is much better for dynamic race detectors.
As for semantic difference between rmb and smp_load_acquire, rmb does
not order stores, so stores from __kfree_skb can hoist above the
atomic_read(&skb->users) == 1 check. The only architecture that can do
that is Alpha, I don't know enough about Alpha and barrier
implementation on Alpha (maybe rmb and smp_load_acquire do the same
hardware barrier on Alpha) to say whether it can break in real life or
not. But I would still consider smp_load_acquire as safer and cleaner
alternative.


-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10  9:55                                       ` Dmitry Vyukov
  2015-09-10 10:42                                         ` Jesper Dangaard Brouer
@ 2015-09-10 12:47                                         ` Vlastimil Babka
  2015-09-10 13:17                                           ` Dmitry Vyukov
  2015-09-10 17:13                                         ` Paul E. McKenney
  2 siblings, 1 reply; 45+ messages in thread
From: Vlastimil Babka @ 2015-09-10 12:47 UTC (permalink / raw)
  To: Dmitry Vyukov, Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On 09/10/2015 11:55 AM, Dmitry Vyukov wrote:
> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
>> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>>
>>> Either way, Dmitry's tool got a hit on real code using the slab
>>> allocators.  If that hit is a false positive, then clearly Dmitry
>>> needs to fix his tool, however, I am not (yet) convinced that it is a
>>> false positive.  If it is not a false positive, we might well need to
>>> articulate the rules for use of the slab allocators.
>>
>> Could I get a clear definiton as to what exactly is positive? Was this
>> using SLAB, SLUB or SLOB?
>>
>>> > This would all use per cpu data. As soon as a handoff is required within
>>> > the allocators locks are being used. So I would say no.
>>>
>>> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>>> in this example", right?
>>
>> Actually SLUB contains a barrier already in kfree(). Has to be there
>> because of the way the per cpu pointer is being handled.
> 
> The positive was reporting of data races in the following code:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> //drivers/tty/tty_buffer.c
> while ((next = buf->head->next) != NULL) {
>      tty_buffer_free(port, buf->head);
>      buf->head = next;
> }
> 
> Namely, the tool reported data races between usage of the object in
> other threads before they released the object and kfree.
> 

[...]

> There are memory allocator implementations that do reads and writes of
> the object, and there are memory allocator implementations that do not
> do any barriers on fast paths. From this follows that objects must be
> passed in quiescent state to kfree.
> Now, kernel memory model says "A load-load control dependency requires
> a full read memory barrier".

But a load-load dependency is something different than writes from
kmem_cache_free() being visible before the atomic_read(), right?

So the problem you are seeing is a different one, that some other cpu's are
still writing to the object after they decrese the count to 1?.

> From this follows that the following code is broken:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> and it should be:
> 
> // kernel/pid.c
>          if ((smp_load_acquire(&pid->count) == 1) ||

Is that enough? Doesn't it need a pairing smp_store_release?

>               atomic_dec_and_test(&pid->count)) {

A prior release from another thread (that sets the counter to 1) would be done
by this atomic_dec_and_test() (this all is put_pid() function).
Does that act as a release? memory-barriers.txt seems to say it does.

So yeah your patch seems to be needed and I don't think it should be the sl*b
providing the necessary barrier here. It should be on the refcounting IMHO. That
has the knowledge of correct ordering depending on the pid->count, sl*b has no
such knowledge.

>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> 
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 12:47                                         ` Vlastimil Babka
@ 2015-09-10 13:17                                           ` Dmitry Vyukov
  0 siblings, 0 replies; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10 13:17 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Christoph Lameter, Paul E. McKenney, Pekka Enberg,
	David Rientjes, Joonsoo Kim, Andrew Morton, linux-mm,
	Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 2:47 PM, Vlastimil Babka <vbabka@suse.cz> wrote:
> On 09/10/2015 11:55 AM, Dmitry Vyukov wrote:
>> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
>>> On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>>>
>>>> Either way, Dmitry's tool got a hit on real code using the slab
>>>> allocators.  If that hit is a false positive, then clearly Dmitry
>>>> needs to fix his tool, however, I am not (yet) convinced that it is a
>>>> false positive.  If it is not a false positive, we might well need to
>>>> articulate the rules for use of the slab allocators.
>>>
>>> Could I get a clear definiton as to what exactly is positive? Was this
>>> using SLAB, SLUB or SLOB?
>>>
>>>> > This would all use per cpu data. As soon as a handoff is required within
>>>> > the allocators locks are being used. So I would say no.
>>>>
>>>> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>>>> in this example", right?
>>>
>>> Actually SLUB contains a barrier already in kfree(). Has to be there
>>> because of the way the per cpu pointer is being handled.
>>
>> The positive was reporting of data races in the following code:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> //drivers/tty/tty_buffer.c
>> while ((next = buf->head->next) != NULL) {
>>      tty_buffer_free(port, buf->head);
>>      buf->head = next;
>> }
>>
>> Namely, the tool reported data races between usage of the object in
>> other threads before they released the object and kfree.
>>
>
> [...]
>
>> There are memory allocator implementations that do reads and writes of
>> the object, and there are memory allocator implementations that do not
>> do any barriers on fast paths. From this follows that objects must be
>> passed in quiescent state to kfree.
>> Now, kernel memory model says "A load-load control dependency requires
>> a full read memory barrier".
>
> But a load-load dependency is something different than writes from
> kmem_cache_free() being visible before the atomic_read(), right?

Right.
As of now, the code has problem with both reads and writes from kfree
hoisting above the pid->count==1 check.
I've just demonstrated the problem for reads.

> So the problem you are seeing is a different one, that some other cpu's are
> still writing to the object after they decrese the count to 1?.

I don't see any actual manifestation of the issue.  Our tool works on
x86, so chances of actual manifestation are pretty low. But the tool
verifies code according to abstract memory model and thus can detect
potential manifestations on ARM, POWER, Alpha or whatever.


>> From this follows that the following code is broken:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> and it should be:
>>
>> // kernel/pid.c
>>          if ((smp_load_acquire(&pid->count) == 1) ||
>
> Is that enough? Doesn't it need a pairing smp_store_release?
>
>>               atomic_dec_and_test(&pid->count)) {
>
> A prior release from another thread (that sets the counter to 1) would be done
> by this atomic_dec_and_test() (this all is put_pid() function).
> Does that act as a release? memory-barriers.txt seems to say it does.

Yes, release is required and it is provided by atomic_dec_and_test.


> So yeah your patch seems to be needed and I don't think it should be the sl*b
> providing the necessary barrier here. It should be on the refcounting IMHO. That
> has the knowledge of correct ordering depending on the pid->count, sl*b has no
> such knowledge.

Thanks for confirmation!
Yes, I completely agree that code that calls kfree must provide
necessary synchronization. It would just massive pessimization if
kfree do barrier.


>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }


-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 12:08                                           ` Dmitry Vyukov
@ 2015-09-10 13:37                                             ` Eric Dumazet
  0 siblings, 0 replies; 45+ messages in thread
From: Eric Dumazet @ 2015-09-10 13:37 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Jesper Dangaard Brouer, Christoph Lameter, Paul E. McKenney,
	Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton,
	linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, 2015-09-10 at 14:08 +0200, Dmitry Vyukov wrote:
> On Thu, Sep 10, 2015 at 12:42 PM, Jesper Dangaard Brouer

> > This reminds me of some code in the network stack[1] in kfree_skb()
> > where we have a smp_rmb().  Should we have used smp_load_acquire() ?
> >
> >  void kfree_skb(struct sk_buff *skb)
> >  {
> >         if (unlikely(!skb))
> >                 return;
> >         if (likely(atomic_read(&skb->users) == 1))
> >                 smp_rmb();
> >         else if (likely(!atomic_dec_and_test(&skb->users)))
> >                 return;
> >         trace_kfree_skb(skb, __builtin_return_address(0));
> >         __kfree_skb(skb);
> >  }
> >  EXPORT_SYMBOL(kfree_skb);
> 
> rmb is much better than nothing :)
> I generally prefer to use smp_load_acquire just because it's more
> explicit (you see what memory access the barrier relates to), fewer
> lines of code, agrees with modern atomic APIs in C, C++, Java, etc,
> and FWIW is much better for dynamic race detectors.
> As for semantic difference between rmb and smp_load_acquire, rmb does
> not order stores, so stores from __kfree_skb can hoist above the
> atomic_read(&skb->users) == 1 check. The only architecture that can do
> that is Alpha, I don't know enough about Alpha and barrier
> implementation on Alpha (maybe rmb and smp_load_acquire do the same
> hardware barrier on Alpha) to say whether it can break in real life or
> not. But I would still consider smp_load_acquire as safer and cleaner
> alternative.

smp_load_acquire() is a kid compared to kfree_skb() code written decades
ago.

Sure, new code has plenty of ways to implement all this stuff, and now
we can discuss days about choosing right variant in a single spot.

In the old days, Alexei was writing thousand of lines of code per day,
and he got it mostly right, even for the Alpha ;)

Another discussion is whether or not reading the value before attempting
the lock atomic dec is a win on modern cpus, because it might incur an
additional bus transaction in the case skbs are allocated/freed on
different cpus. I believe I made tests ~4 years ago and it was worth
keeping it.


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  7:22                                       ` Vlastimil Babka
@ 2015-09-10 16:36                                         ` Christoph Lameter
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10 16:36 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Paul E. McKenney, Dmitry Vyukov, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-mm, Andrey Konovalov,
	Alexander Potapenko

On Thu, 10 Sep 2015, Vlastimil Babka wrote:

> > For a partial cacheline it would have to read the rest of the cacheline
> > before updating. And I would expect the processor to have exclusive access
> > to the cacheline that is held in a store buffer. If not then there is
> > trouble afoot.
>
> IIRC that (or something similar with same guarantees) basically happens on x86
> when you use the LOCK prefix, i.e. for atomic inc etc. Doing that always would
> destroy performance.

Well yes but it also happens anytime you try to write to a cacheline.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?)
  2015-09-10  7:38                                               ` Vlastimil Babka
@ 2015-09-10 16:37                                                 ` Christoph Lameter
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10 16:37 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: Paul E. McKenney, Dmitry Vyukov, Pekka Enberg, David Rientjes,
	Joonsoo Kim, Andrew Morton, linux-mm, Andrey Konovalov,
	Alexander Potapenko

On Thu, 10 Sep 2015, Vlastimil Babka wrote:

> > kfree at some point calls slab_free(). That function has a barrier. All
> > free operations go through it.
>
> SLAB doesn't have such barrier AFAICS. It will put the object on per-cpu cache
> and that's it. Only flushing the full cache takes a spin lock.

SLAB disables and enables interrupts. Isnt that also considered a form of
barrier?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10  9:55                                       ` Dmitry Vyukov
  2015-09-10 10:42                                         ` Jesper Dangaard Brouer
  2015-09-10 12:47                                         ` Vlastimil Babka
@ 2015-09-10 17:13                                         ` Paul E. McKenney
  2015-09-10 17:21                                           ` Paul E. McKenney
                                                             ` (2 more replies)
  2 siblings, 3 replies; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10 17:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 11:55:35AM +0200, Dmitry Vyukov wrote:
> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> >
> >> Either way, Dmitry's tool got a hit on real code using the slab
> >> allocators.  If that hit is a false positive, then clearly Dmitry
> >> needs to fix his tool, however, I am not (yet) convinced that it is a
> >> false positive.  If it is not a false positive, we might well need to
> >> articulate the rules for use of the slab allocators.
> >
> > Could I get a clear definiton as to what exactly is positive? Was this
> > using SLAB, SLUB or SLOB?
> >
> >> > This would all use per cpu data. As soon as a handoff is required within
> >> > the allocators locks are being used. So I would say no.
> >>
> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
> >> in this example", right?
> >
> > Actually SLUB contains a barrier already in kfree(). Has to be there
> > because of the way the per cpu pointer is being handled.
> 
> The positive was reporting of data races in the following code:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> //drivers/tty/tty_buffer.c
> while ((next = buf->head->next) != NULL) {
>      tty_buffer_free(port, buf->head);
>      buf->head = next;
> }
> 
> Namely, the tool reported data races between usage of the object in
> other threads before they released the object and kfree.
> 
> I am not sure why we are so concentrated on details like SLAB vs SLUB
> vs SLOB or cache coherency protocols. This looks like waste of time to
> me. General kernel code should not be safe only when working with SLxB
> due to current implementation details of SLxB, it should be safe
> according to memory allocator contract. And this contract seem to be:
> memory allocator can do arbitrary reads and writes to the object
> inside of kmalloc and kfree.

The reason we poked at this was to see if any of SLxB touched the
memory being freed.  If none of them touched the memory being freed,
and if that was a policy, then the idiom above would be legal.  However,
one of them does touch the memory being freed, so, yes, the above code
needs to be fixed.

> Similarly for memory model. There is officially documented kernel
> memory model, which all general kernel code must adhere to. Reasoning
> about whether a particular piece of code works on architecture X, or
> how exactly it can break on architecture Y in unnecessary in such
> context. In the end, there can be memory allocator implementation and
> new architectures.
> 
> My question is about contracts, not about current implementation
> details or specific architectures.
> 
> There are memory allocator implementations that do reads and writes of
> the object, and there are memory allocator implementations that do not
> do any barriers on fast paths. From this follows that objects must be
> passed in quiescent state to kfree.
> Now, kernel memory model says "A load-load control dependency requires
> a full read memory barrier".
> >From this follows that the following code is broken:
> 
> // kernel/pid.c
>          if ((atomic_read(&pid->count) == 1) ||
>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> and it should be:
> 
> // kernel/pid.c
>          if ((smp_load_acquire(&pid->count) == 1) ||

If Will Deacon's patch providing generic support for relaxed atomics
made it in, we want:

	  if ((atomic_read_acquire(&pid->count) == 1) ||

Otherwise, we need an explicit barrier.

							Thanx, Paul

>               atomic_dec_and_test(&pid->count)) {
>                  kmem_cache_free(ns->pid_cachep, pid);
>                  put_pid_ns(ns);
>          }
> 
> 
> 
> -- 
> Dmitry Vyukov, Software Engineer, dvyukov@google.com
> Google Germany GmbH, Dienerstrasse 12, 80331, Munchen
> Geschaftsfuhrer: Graham Law, Christine Elizabeth Flores
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> sind, leiten Sie diese bitte nicht weiter, informieren Sie den
> Absender und loschen Sie die E-Mail und alle Anhange. Vielen Dank.
> This e-mail is confidential. If you are not the right addressee please
> do not forward it, please inform the sender, and please erase this
> e-mail including any attachments. Thanks.
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 17:13                                         ` Paul E. McKenney
@ 2015-09-10 17:21                                           ` Paul E. McKenney
  2015-09-10 17:26                                           ` Dmitry Vyukov
  2015-09-10 18:01                                           ` Christoph Lameter
  2 siblings, 0 replies; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10 17:21 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 10:13:33AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 10, 2015 at 11:55:35AM +0200, Dmitry Vyukov wrote:
> > On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
> > > On Wed, 9 Sep 2015, Paul E. McKenney wrote:

[ . . . ]

> > There are memory allocator implementations that do reads and writes of
> > the object, and there are memory allocator implementations that do not
> > do any barriers on fast paths. From this follows that objects must be
> > passed in quiescent state to kfree.
> > Now, kernel memory model says "A load-load control dependency requires
> > a full read memory barrier".
> > >From this follows that the following code is broken:
> > 
> > // kernel/pid.c
> >          if ((atomic_read(&pid->count) == 1) ||
> >               atomic_dec_and_test(&pid->count)) {
> >                  kmem_cache_free(ns->pid_cachep, pid);
> >                  put_pid_ns(ns);
> >          }
> > 
> > and it should be:
> > 
> > // kernel/pid.c
> >          if ((smp_load_acquire(&pid->count) == 1) ||
> 
> If Will Deacon's patch providing generic support for relaxed atomics
> made it in, we want:
> 
> 	  if ((atomic_read_acquire(&pid->count) == 1) ||
> 
> Otherwise, we need an explicit barrier.

And atomic_read_acquire() is in fact now in mainline, so it is the
best choice here.

							Thanx, Paul

> >               atomic_dec_and_test(&pid->count)) {
> >                  kmem_cache_free(ns->pid_cachep, pid);
> >                  put_pid_ns(ns);
> >          }
> > 
> > 
> > 
> > -- 
> > Dmitry Vyukov, Software Engineer, dvyukov@google.com
> > Google Germany GmbH, Dienerstrasse 12, 80331, Munchen
> > Geschaftsfuhrer: Graham Law, Christine Elizabeth Flores
> > Registergericht und -nummer: Hamburg, HRB 86891
> > Sitz der Gesellschaft: Hamburg
> > Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> > sind, leiten Sie diese bitte nicht weiter, informieren Sie den
> > Absender und loschen Sie die E-Mail und alle Anhange. Vielen Dank.
> > This e-mail is confidential. If you are not the right addressee please
> > do not forward it, please inform the sender, and please erase this
> > e-mail including any attachments. Thanks.
> > 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 17:13                                         ` Paul E. McKenney
  2015-09-10 17:21                                           ` Paul E. McKenney
@ 2015-09-10 17:26                                           ` Dmitry Vyukov
  2015-09-10 17:44                                             ` Paul E. McKenney
  2015-09-10 18:01                                           ` Christoph Lameter
  2 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10 17:26 UTC (permalink / raw)
  To: Paul McKenney
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 7:13 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Thu, Sep 10, 2015 at 11:55:35AM +0200, Dmitry Vyukov wrote:
>> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
>> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
>> >
>> >> Either way, Dmitry's tool got a hit on real code using the slab
>> >> allocators.  If that hit is a false positive, then clearly Dmitry
>> >> needs to fix his tool, however, I am not (yet) convinced that it is a
>> >> false positive.  If it is not a false positive, we might well need to
>> >> articulate the rules for use of the slab allocators.
>> >
>> > Could I get a clear definiton as to what exactly is positive? Was this
>> > using SLAB, SLUB or SLOB?
>> >
>> >> > This would all use per cpu data. As soon as a handoff is required within
>> >> > the allocators locks are being used. So I would say no.
>> >>
>> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
>> >> in this example", right?
>> >
>> > Actually SLUB contains a barrier already in kfree(). Has to be there
>> > because of the way the per cpu pointer is being handled.
>>
>> The positive was reporting of data races in the following code:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> //drivers/tty/tty_buffer.c
>> while ((next = buf->head->next) != NULL) {
>>      tty_buffer_free(port, buf->head);
>>      buf->head = next;
>> }
>>
>> Namely, the tool reported data races between usage of the object in
>> other threads before they released the object and kfree.
>>
>> I am not sure why we are so concentrated on details like SLAB vs SLUB
>> vs SLOB or cache coherency protocols. This looks like waste of time to
>> me. General kernel code should not be safe only when working with SLxB
>> due to current implementation details of SLxB, it should be safe
>> according to memory allocator contract. And this contract seem to be:
>> memory allocator can do arbitrary reads and writes to the object
>> inside of kmalloc and kfree.
>
> The reason we poked at this was to see if any of SLxB touched the
> memory being freed.  If none of them touched the memory being freed,
> and if that was a policy, then the idiom above would be legal.  However,
> one of them does touch the memory being freed, so, yes, the above code
> needs to be fixed.

No. The object can be instantly reallocated and user can write to the
object. Consider:

if (READ_ONCE(p->free))
  kfree(p);
y = kmalloc(8);
// assuming p's size is 8, y is most likely equal to p and there are
no barriers on the kmalloc fast path
*(void**)y = 0;

This is equivalent to kmalloc writing to the object in this respect.



>> Similarly for memory model. There is officially documented kernel
>> memory model, which all general kernel code must adhere to. Reasoning
>> about whether a particular piece of code works on architecture X, or
>> how exactly it can break on architecture Y in unnecessary in such
>> context. In the end, there can be memory allocator implementation and
>> new architectures.
>>
>> My question is about contracts, not about current implementation
>> details or specific architectures.
>>
>> There are memory allocator implementations that do reads and writes of
>> the object, and there are memory allocator implementations that do not
>> do any barriers on fast paths. From this follows that objects must be
>> passed in quiescent state to kfree.
>> Now, kernel memory model says "A load-load control dependency requires
>> a full read memory barrier".
>> >From this follows that the following code is broken:
>>
>> // kernel/pid.c
>>          if ((atomic_read(&pid->count) == 1) ||
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>> and it should be:
>>
>> // kernel/pid.c
>>          if ((smp_load_acquire(&pid->count) == 1) ||
>
> If Will Deacon's patch providing generic support for relaxed atomics
> made it in, we want:
>
>           if ((atomic_read_acquire(&pid->count) == 1) ||
>
> Otherwise, we need an explicit barrier.
>
>                                                         Thanx, Paul
>
>>               atomic_dec_and_test(&pid->count)) {
>>                  kmem_cache_free(ns->pid_cachep, pid);
>>                  put_pid_ns(ns);
>>          }
>>
>>
>>
>> --
>> Dmitry Vyukov, Software Engineer, dvyukov@google.com
>> Google Germany GmbH, Dienerstraße 12, 80331, München
>> Geschäftsführer: Graham Law, Christine Elizabeth Flores
>> Registergericht und -nummer: Hamburg, HRB 86891
>> Sitz der Gesellschaft: Hamburg
>> Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
>> sind, leiten Sie diese bitte nicht weiter, informieren Sie den
>> Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
>> This e-mail is confidential. If you are not the right addressee please
>> do not forward it, please inform the sender, and please erase this
>> e-mail including any attachments. Thanks.
>>
>



-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 17:26                                           ` Dmitry Vyukov
@ 2015-09-10 17:44                                             ` Paul E. McKenney
  0 siblings, 0 replies; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10 17:44 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 07:26:09PM +0200, Dmitry Vyukov wrote:
> On Thu, Sep 10, 2015 at 7:13 PM, Paul E. McKenney
> <paulmck@linux.vnet.ibm.com> wrote:
> > On Thu, Sep 10, 2015 at 11:55:35AM +0200, Dmitry Vyukov wrote:
> >> On Thu, Sep 10, 2015 at 1:31 AM, Christoph Lameter <cl@linux.com> wrote:
> >> > On Wed, 9 Sep 2015, Paul E. McKenney wrote:
> >> >
> >> >> Either way, Dmitry's tool got a hit on real code using the slab
> >> >> allocators.  If that hit is a false positive, then clearly Dmitry
> >> >> needs to fix his tool, however, I am not (yet) convinced that it is a
> >> >> false positive.  If it is not a false positive, we might well need to
> >> >> articulate the rules for use of the slab allocators.
> >> >
> >> > Could I get a clear definiton as to what exactly is positive? Was this
> >> > using SLAB, SLUB or SLOB?
> >> >
> >> >> > This would all use per cpu data. As soon as a handoff is required within
> >> >> > the allocators locks are being used. So I would say no.
> >> >>
> >> >> As in "no, it is not necessary for the caller of kfree() to invoke barrier()
> >> >> in this example", right?
> >> >
> >> > Actually SLUB contains a barrier already in kfree(). Has to be there
> >> > because of the way the per cpu pointer is being handled.
> >>
> >> The positive was reporting of data races in the following code:
> >>
> >> // kernel/pid.c
> >>          if ((atomic_read(&pid->count) == 1) ||
> >>               atomic_dec_and_test(&pid->count)) {
> >>                  kmem_cache_free(ns->pid_cachep, pid);
> >>                  put_pid_ns(ns);
> >>          }
> >>
> >> //drivers/tty/tty_buffer.c
> >> while ((next = buf->head->next) != NULL) {
> >>      tty_buffer_free(port, buf->head);
> >>      buf->head = next;
> >> }
> >>
> >> Namely, the tool reported data races between usage of the object in
> >> other threads before they released the object and kfree.
> >>
> >> I am not sure why we are so concentrated on details like SLAB vs SLUB
> >> vs SLOB or cache coherency protocols. This looks like waste of time to
> >> me. General kernel code should not be safe only when working with SLxB
> >> due to current implementation details of SLxB, it should be safe
> >> according to memory allocator contract. And this contract seem to be:
> >> memory allocator can do arbitrary reads and writes to the object
> >> inside of kmalloc and kfree.
> >
> > The reason we poked at this was to see if any of SLxB touched the
> > memory being freed.  If none of them touched the memory being freed,
> > and if that was a policy, then the idiom above would be legal.  However,
> > one of them does touch the memory being freed, so, yes, the above code
> > needs to be fixed.
> 
> No. The object can be instantly reallocated and user can write to the
> object. Consider:
> 
> if (READ_ONCE(p->free))
>   kfree(p);
> y = kmalloc(8);
> // assuming p's size is 8, y is most likely equal to p and there are
> no barriers on the kmalloc fast path
> *(void**)y = 0;
> 
> This is equivalent to kmalloc writing to the object in this respect.

Fair point!

							Thanx, Paul

> >> Similarly for memory model. There is officially documented kernel
> >> memory model, which all general kernel code must adhere to. Reasoning
> >> about whether a particular piece of code works on architecture X, or
> >> how exactly it can break on architecture Y in unnecessary in such
> >> context. In the end, there can be memory allocator implementation and
> >> new architectures.
> >>
> >> My question is about contracts, not about current implementation
> >> details or specific architectures.
> >>
> >> There are memory allocator implementations that do reads and writes of
> >> the object, and there are memory allocator implementations that do not
> >> do any barriers on fast paths. From this follows that objects must be
> >> passed in quiescent state to kfree.
> >> Now, kernel memory model says "A load-load control dependency requires
> >> a full read memory barrier".
> >> >From this follows that the following code is broken:
> >>
> >> // kernel/pid.c
> >>          if ((atomic_read(&pid->count) == 1) ||
> >>               atomic_dec_and_test(&pid->count)) {
> >>                  kmem_cache_free(ns->pid_cachep, pid);
> >>                  put_pid_ns(ns);
> >>          }
> >>
> >> and it should be:
> >>
> >> // kernel/pid.c
> >>          if ((smp_load_acquire(&pid->count) == 1) ||
> >
> > If Will Deacon's patch providing generic support for relaxed atomics
> > made it in, we want:
> >
> >           if ((atomic_read_acquire(&pid->count) == 1) ||
> >
> > Otherwise, we need an explicit barrier.
> >
> >                                                         Thanx, Paul
> >
> >>               atomic_dec_and_test(&pid->count)) {
> >>                  kmem_cache_free(ns->pid_cachep, pid);
> >>                  put_pid_ns(ns);
> >>          }
> >>
> >>
> >>
> >> --
> >> Dmitry Vyukov, Software Engineer, dvyukov@google.com
> >> Google Germany GmbH, Dienerstrasse 12, 80331, Munchen
> >> Geschaftsfuhrer: Graham Law, Christine Elizabeth Flores
> >> Registergericht und -nummer: Hamburg, HRB 86891
> >> Sitz der Gesellschaft: Hamburg
> >> Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> >> sind, leiten Sie diese bitte nicht weiter, informieren Sie den
> >> Absender und loschen Sie die E-Mail und alle Anhange. Vielen Dank.
> >> This e-mail is confidential. If you are not the right addressee please
> >> do not forward it, please inform the sender, and please erase this
> >> e-mail including any attachments. Thanks.
> >>
> >
> 
> 
> 
> -- 
> Dmitry Vyukov, Software Engineer, dvyukov@google.com
> Google Germany GmbH, Dienerstrasse 12, 80331, Munchen
> Geschaftsfuhrer: Graham Law, Christine Elizabeth Flores
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> sind, leiten Sie diese bitte nicht weiter, informieren Sie den
> Absender und loschen Sie die E-Mail und alle Anhange. Vielen Dank.
> This e-mail is confidential. If you are not the right addressee please
> do not forward it, please inform the sender, and please erase this
> e-mail including any attachments. Thanks.
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 17:13                                         ` Paul E. McKenney
  2015-09-10 17:21                                           ` Paul E. McKenney
  2015-09-10 17:26                                           ` Dmitry Vyukov
@ 2015-09-10 18:01                                           ` Christoph Lameter
  2015-09-10 18:11                                             ` Dmitry Vyukov
  2 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10 18:01 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Dmitry Vyukov, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, 10 Sep 2015, Paul E. McKenney wrote:

> The reason we poked at this was to see if any of SLxB touched the
> memory being freed.  If none of them touched the memory being freed,
> and if that was a policy, then the idiom above would be legal.  However,
> one of them does touch the memory being freed, so, yes, the above code
> needs to be fixed.

The one that touches the object has a barrier() before it touches the
memory.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 18:01                                           ` Christoph Lameter
@ 2015-09-10 18:11                                             ` Dmitry Vyukov
  2015-09-10 18:13                                               ` Christoph Lameter
  0 siblings, 1 reply; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10 18:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 8:01 PM, Christoph Lameter <cl@linux.com> wrote:
> On Thu, 10 Sep 2015, Paul E. McKenney wrote:
>
>> The reason we poked at this was to see if any of SLxB touched the
>> memory being freed.  If none of them touched the memory being freed,
>> and if that was a policy, then the idiom above would be legal.  However,
>> one of them does touch the memory being freed, so, yes, the above code
>> needs to be fixed.
>
> The one that touches the object has a barrier() before it touches the
> memory.

It does not change anything, right?


-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 18:11                                             ` Dmitry Vyukov
@ 2015-09-10 18:13                                               ` Christoph Lameter
  2015-09-10 18:26                                                 ` Dmitry Vyukov
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10 18:13 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, 10 Sep 2015, Dmitry Vyukov wrote:

> On Thu, Sep 10, 2015 at 8:01 PM, Christoph Lameter <cl@linux.com> wrote:
> > On Thu, 10 Sep 2015, Paul E. McKenney wrote:
> >
> >> The reason we poked at this was to see if any of SLxB touched the
> >> memory being freed.  If none of them touched the memory being freed,
> >> and if that was a policy, then the idiom above would be legal.  However,
> >> one of them does touch the memory being freed, so, yes, the above code
> >> needs to be fixed.
> >
> > The one that touches the object has a barrier() before it touches the
> > memory.
>
> It does not change anything, right?

It changes the first word of the object after the barrier. The first word
is used in SLUB as the pointer to the next free object.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 18:13                                               ` Christoph Lameter
@ 2015-09-10 18:26                                                 ` Dmitry Vyukov
  2015-09-10 18:56                                                   ` Paul E. McKenney
  2015-09-10 22:00                                                   ` Christoph Lameter
  0 siblings, 2 replies; 45+ messages in thread
From: Dmitry Vyukov @ 2015-09-10 18:26 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 8:13 PM, Christoph Lameter <cl@linux.com> wrote:
> On Thu, 10 Sep 2015, Dmitry Vyukov wrote:
>
>> On Thu, Sep 10, 2015 at 8:01 PM, Christoph Lameter <cl@linux.com> wrote:
>> > On Thu, 10 Sep 2015, Paul E. McKenney wrote:
>> >
>> >> The reason we poked at this was to see if any of SLxB touched the
>> >> memory being freed.  If none of them touched the memory being freed,
>> >> and if that was a policy, then the idiom above would be legal.  However,
>> >> one of them does touch the memory being freed, so, yes, the above code
>> >> needs to be fixed.
>> >
>> > The one that touches the object has a barrier() before it touches the
>> > memory.
>>
>> It does not change anything, right?
>
> It changes the first word of the object after the barrier. The first word
> is used in SLUB as the pointer to the next free object.

User can also write to this object after it is reallocated. It is
equivalent to kmalloc writing to the object.
And barrier is not the kind of barrier that would make it correct.
So I do not see how it is relevant.

-- 
Dmitry Vyukov, Software Engineer, dvyukov@google.com
Google Germany GmbH, Dienerstraße 12, 80331, München
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
sind, leiten Sie diese bitte nicht weiter, informieren Sie den
Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
This e-mail is confidential. If you are not the right addressee please
do not forward it, please inform the sender, and please erase this
e-mail including any attachments. Thanks.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 18:26                                                 ` Dmitry Vyukov
@ 2015-09-10 18:56                                                   ` Paul E. McKenney
  2015-09-10 22:00                                                   ` Christoph Lameter
  1 sibling, 0 replies; 45+ messages in thread
From: Paul E. McKenney @ 2015-09-10 18:56 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, Sep 10, 2015 at 08:26:59PM +0200, Dmitry Vyukov wrote:
> On Thu, Sep 10, 2015 at 8:13 PM, Christoph Lameter <cl@linux.com> wrote:
> > On Thu, 10 Sep 2015, Dmitry Vyukov wrote:
> >
> >> On Thu, Sep 10, 2015 at 8:01 PM, Christoph Lameter <cl@linux.com> wrote:
> >> > On Thu, 10 Sep 2015, Paul E. McKenney wrote:
> >> >
> >> >> The reason we poked at this was to see if any of SLxB touched the
> >> >> memory being freed.  If none of them touched the memory being freed,
> >> >> and if that was a policy, then the idiom above would be legal.  However,
> >> >> one of them does touch the memory being freed, so, yes, the above code
> >> >> needs to be fixed.
> >> >
> >> > The one that touches the object has a barrier() before it touches the
> >> > memory.
> >>
> >> It does not change anything, right?
> >
> > It changes the first word of the object after the barrier. The first word
> > is used in SLUB as the pointer to the next free object.
> 
> User can also write to this object after it is reallocated. It is
> equivalent to kmalloc writing to the object.
> And barrier is not the kind of barrier that would make it correct.
> So I do not see how it is relevant.

I believe that the two of you are talking past each other.  It sounds
to me that Christoph is arguing that SL*B is correctly implemented,
and that Dmitry is arguing that the use case is broken.

>From what I can see, both are correct.

							Thanx, Paul

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Is it OK to pass non-acquired objects to kfree?
  2015-09-10 18:26                                                 ` Dmitry Vyukov
  2015-09-10 18:56                                                   ` Paul E. McKenney
@ 2015-09-10 22:00                                                   ` Christoph Lameter
  1 sibling, 0 replies; 45+ messages in thread
From: Christoph Lameter @ 2015-09-10 22:00 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Paul E. McKenney, Pekka Enberg, David Rientjes, Joonsoo Kim,
	Andrew Morton, linux-mm, Andrey Konovalov, Alexander Potapenko

On Thu, 10 Sep 2015, Dmitry Vyukov wrote:

> > It changes the first word of the object after the barrier. The first word
> > is used in SLUB as the pointer to the next free object.
>
> User can also write to this object after it is reallocated. It is
> equivalent to kmalloc writing to the object.
> And barrier is not the kind of barrier that would make it correct.
> So I do not see how it is relevant.

This is a compiler barrier so nothing can be moved below that into the
code where the freelist pointer is handled. That was the issue from what
I heard?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2015-09-10 22:01 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-08  7:51 Is it OK to pass non-acquired objects to kfree? Dmitry Vyukov
2015-09-08 14:13 ` Christoph Lameter
2015-09-08 14:41   ` Dmitry Vyukov
2015-09-08 15:13     ` Christoph Lameter
2015-09-08 15:23       ` Dmitry Vyukov
2015-09-08 15:33         ` Christoph Lameter
2015-09-08 15:37           ` Dmitry Vyukov
2015-09-08 17:09             ` Christoph Lameter
2015-09-08 19:24               ` Dmitry Vyukov
2015-09-09 14:02                 ` Christoph Lameter
2015-09-09 14:19                   ` Dmitry Vyukov
2015-09-09 14:36                     ` Christoph Lameter
2015-09-09 15:30                       ` Dmitry Vyukov
2015-09-09 15:44                         ` Christoph Lameter
2015-09-09 16:09                           ` Dmitry Vyukov
2015-09-09 17:56                             ` Christoph Lameter
2015-09-09 18:44                               ` Paul E. McKenney
2015-09-09 19:01                                 ` Christoph Lameter
2015-09-09 20:36                                   ` Paul E. McKenney
2015-09-09 23:23                                     ` Store Buffers (was Re: Is it OK to pass non-acquired objects to kfree?) Christoph Lameter
2015-09-10  0:08                                       ` Paul E. McKenney
2015-09-10  0:21                                         ` Christoph Lameter
2015-09-10  1:10                                           ` Paul E. McKenney
2015-09-10  1:47                                             ` Christoph Lameter
2015-09-10  7:38                                               ` Vlastimil Babka
2015-09-10 16:37                                                 ` Christoph Lameter
2015-09-10  7:22                                       ` Vlastimil Babka
2015-09-10 16:36                                         ` Christoph Lameter
2015-09-09 23:31                                     ` Is it OK to pass non-acquired objects to kfree? Christoph Lameter
2015-09-10  9:55                                       ` Dmitry Vyukov
2015-09-10 10:42                                         ` Jesper Dangaard Brouer
2015-09-10 12:08                                           ` Dmitry Vyukov
2015-09-10 13:37                                             ` Eric Dumazet
2015-09-10 12:47                                         ` Vlastimil Babka
2015-09-10 13:17                                           ` Dmitry Vyukov
2015-09-10 17:13                                         ` Paul E. McKenney
2015-09-10 17:21                                           ` Paul E. McKenney
2015-09-10 17:26                                           ` Dmitry Vyukov
2015-09-10 17:44                                             ` Paul E. McKenney
2015-09-10 18:01                                           ` Christoph Lameter
2015-09-10 18:11                                             ` Dmitry Vyukov
2015-09-10 18:13                                               ` Christoph Lameter
2015-09-10 18:26                                                 ` Dmitry Vyukov
2015-09-10 18:56                                                   ` Paul E. McKenney
2015-09-10 22:00                                                   ` Christoph Lameter

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.