All of lore.kernel.org
 help / color / mirror / Atom feed
* maybe dumb question about RCU
@ 2015-04-07 22:52 Jeff Haran
  2015-04-08  2:08 ` Rock Lee
  2015-04-09 23:35 ` Jeff Haran
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Haran @ 2015-04-07 22:52 UTC (permalink / raw)
  To: kernelnewbies

Hi all,

I've been trying to understand the usage of RCU and am struggling a bit with one issue.
Let's say one intends to use "classic RCU" as described in whatisRCU.txt where readers use
sequences of rcu_read_lock(), rcu_dereference(), rcu_read_unlock) and updaters use
rcu_assign_pointer()/synchronize_rcu().

If one uses rcu_dereference() to access an RCU protect pointer twice within the same rcu
read locked critical section, is it guaranteed that the pointers returned by rcu_dereference()
will point to the same block of memory?

For instance, in the following code (trying make this as simple as possible even if it does nothing useful):

char *global;
char x[1];
char y[1];

void reader(void)
{
                char *a;
                char *b;

                rcu_read_lock();
                a = rcu_dereference(global);
                b = rcu_dereference(global);
                rcu_read_unlock();
}

void updater(void)
{
                rcu_assign_pointer(global, x);
                rcu_assign_pointer(global, y);
}

Following the second call to rcu_deference() in reader(), is it guaranteed that a and b will be
equal (either both equal to x or both equal y)? Or is it possible what a will equal x and b will equal y?

whatisRCU.txt seems to imply that though this is bad practice because its wasteful, both calls
to rcu_dereference() will return the same pointer:

"228 rcu_dereference()
...
244         Common coding practice uses rcu_dereference() to copy an
245         RCU-protected pointer to a local variable, then dereferences
246         this local variable, for example as follows:
247
248                 p = rcu_dereference(head.next);
249                 return p->data;
250
251         However, in this case, one could just as easily combine these
252         into one statement:
253
254                 return rcu_dereference(head.next)->data;
255
256         If you are going to be fetching multiple fields from the
257         RCU-protected structure, using the local variable is of
258         course preferred.  Repeated rcu_dereference() calls look
259         ugly and incur unnecessary overhead on Alpha CPUs."

>From lines 256 to 259 I conclude that reader()'s code is considered ugly and wasteful,
but a will always equal b.

But looking at how rcu_dereference() and rcu_assign_pointer() are implemented, I'm having a
hard time seeing how reader() would always see a and b equal.

Thanks in advance,

Jeff Haran

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://lists.kernelnewbies.org/pipermail/kernelnewbies/attachments/20150407/840ba65e/attachment-0001.html 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-07 22:52 maybe dumb question about RCU Jeff Haran
@ 2015-04-08  2:08 ` Rock Lee
  2015-04-08  2:16   ` Jeff Haran
  2015-04-09 23:35 ` Jeff Haran
  1 sibling, 1 reply; 16+ messages in thread
From: Rock Lee @ 2015-04-08  2:08 UTC (permalink / raw)
  To: kernelnewbies

>
> 256         If you are going to be fetching multiple fields from the
>
> 257         RCU-protected structure, using the local variable is of
>
> 258         course preferred.  Repeated rcu_dereference() calls look
>
> 259         ugly and incur unnecessary overhead on Alpha CPUs.?
>
>  From lines 256 to 259 I conclude that reader()?s code is considered
> ugly and wasteful,
>
> but a will always equal b.
>
> But looking at how rcu_dereference() and rcu_assign_pointer() are
> implemented, I?m having a
>
> hard time seeing how reader() would always see a and b equal.
>
This is the implementation of rcu_dereference(). It is a little old, but 
useful as well.

#define rcu_dereference(p)     ({ \
				typeof(p) _________p1 = ACCESS_ONCE(p); \
				smp_read_barrier_depends(); \
				(_________p1); \
				})

It uses memory barrier to guarantee the order of code execution.
rcu_read_lock() actually disables preemption, so writer has no chance to 
modify critical section in the rcu_read_lock()/rcu_read_unlock() pair.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-08  2:08 ` Rock Lee
@ 2015-04-08  2:16   ` Jeff Haran
  2015-04-08 11:20     ` Rock Lee
  2015-04-09 17:44     ` Andev
  0 siblings, 2 replies; 16+ messages in thread
From: Jeff Haran @ 2015-04-08  2:16 UTC (permalink / raw)
  To: kernelnewbies

-----Original Message-----
From: Rock Lee [mailto:rocklee_104 at outlook.com] 
Sent: Tuesday, April 07, 2015 7:09 PM
To: Jeff Haran
Cc: kernelnewbies
Subject: Re: maybe dumb question about RCU

>
> 256         If you are going to be fetching multiple fields from the
>
> 257         RCU-protected structure, using the local variable is of
>
> 258         course preferred.  Repeated rcu_dereference() calls look
>
> 259         ugly and incur unnecessary overhead on Alpha CPUs."
>
>  From lines 256 to 259 I conclude that reader()'s code is considered 
> ugly and wasteful,
>
> but a will always equal b.
>
> But looking at how rcu_dereference() and rcu_assign_pointer() are 
> implemented, I'm having a
>
> hard time seeing how reader() would always see a and b equal.
>


>This is the implementation of rcu_dereference(). It is a little old, but useful as well.
>
>#define rcu_dereference(p)     ({ \
>				typeof(p) _________p1 = ACCESS_ONCE(p); \
>				smp_read_barrier_depends(); \
>				(_________p1); \
>				})
>
>It uses memory barrier to guarantee the order of code execution.
>rcu_read_lock() actually disables preemption, so writer has no chance to modify critical section in the rcu_read_lock()/rcu_read_unlock() pair.

Thanks for getting back to me, Rock.

Disabling preemption would prevent a writer on the same core as the reader from changing the pointer in the read critical section.

But what happens if the writer is running on another core of a multi-core system?

Seems like a writer on another core could still get in there and change the value of the pointer between the two rcu_dereference() calls in the reader.

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-08  2:16   ` Jeff Haran
@ 2015-04-08 11:20     ` Rock Lee
  2015-04-09 17:44     ` Andev
  1 sibling, 0 replies; 16+ messages in thread
From: Rock Lee @ 2015-04-08 11:20 UTC (permalink / raw)
  To: kernelnewbies

> From: Rock Lee [mailto:rocklee_104 at outlook.com]
> Sent: Tuesday, April 07, 2015 7:09 PM
> To: Jeff Haran
> Cc: kernelnewbies
> Subject: Re: maybe dumb question about RCU
>
>>
>> 256         If you are going to be fetching multiple fields from the
>>
>> 257         RCU-protected structure, using the local variable is of
>>
>> 258         course preferred.  Repeated rcu_dereference() calls look
>>
>> 259         ugly and incur unnecessary overhead on Alpha CPUs."
>>
>>   From lines 256 to 259 I conclude that reader()'s code is considered
>> ugly and wasteful,
>>
>> but a will always equal b.
>>
>> But looking at how rcu_dereference() and rcu_assign_pointer() are
>> implemented, I'm having a
>>
>> hard time seeing how reader() would always see a and b equal.
>>
>
>
>> This is the implementation of rcu_dereference(). It is a little old, but useful as well.
>>
>> #define rcu_dereference(p)     ({ \
>> 				typeof(p) _________p1 = ACCESS_ONCE(p); \
>> 				smp_read_barrier_depends(); \
>> 				(_________p1); \
>> 				})
>>
>> It uses memory barrier to guarantee the order of code execution.
>> rcu_read_lock() actually disables preemption, so writer has no chance to modify critical section in the rcu_read_lock()/rcu_read_unlock() pair.
>
> Thanks for getting back to me, Rock.
>
> Disabling preemption would prevent a writer on the same core as the reader from changing the pointer in the read critical section.
>
> But what happens if the writer is running on another core of a multi-core system?
>
> Seems like a writer on another core could still get in there and change the value of the pointer between the two rcu_dereference() calls in the reader.
>
> Jeff Haran
>

Yeeees, rcu_read_lock() and rcu_read_unlock() calls never spin or block, 
nor do they prevent the writer from changing the value of the critical 
section concurrently.

I am still confused about this part, sorry. I think this article will 
help, although my poor English hasn't make me fully understand it.

http://lwn.net/Articles/262464/

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-08  2:16   ` Jeff Haran
  2015-04-08 11:20     ` Rock Lee
@ 2015-04-09 17:44     ` Andev
  1 sibling, 0 replies; 16+ messages in thread
From: Andev @ 2015-04-09 17:44 UTC (permalink / raw)
  To: kernelnewbies

On Tue, Apr 7, 2015 at 10:16 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> Disabling preemption would prevent a writer on the same core as the reader from changing the pointer in the read critical section.
>
> But what happens if the writer is running on another core of a multi-core system?
>
> Seems like a writer on another core could still get in there and change the value of the pointer between the two rcu_dereference() calls in the reader.

Think of it this way. You have a global variable which is being
managed by RCU i.e., will be accessed and modified using rcu_*() API.

When you enter a rcu read critical section using rcu_read_lock(),
conceptually what happens is you create a copy of that variable say
global_v1. As long as you are within the critical section, any
references to global will return the value of global_v1.

Now let's say that a write has modified this variable. If another read
critical section starts after this modification, the version in that
read critical section will be global_v2 which has the new value of
global. Bear in mind that the first read critical section is not yet
complete, so RCU needs to ensure that both copies of global still
exist. Again, as long as this read critical section is active, it will
get the new value from global_v2.

At a later point, the first read critical section will complete and
RCU will then delete/remove global_v1 as this is no longer necessary.

There are some details missing, but this is the crux of the RCU concept.

-- 
Andev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-07 22:52 maybe dumb question about RCU Jeff Haran
  2015-04-08  2:08 ` Rock Lee
@ 2015-04-09 23:35 ` Jeff Haran
  2015-04-10  0:00   ` Mandeep Sandhu
  2015-04-10  2:22   ` Andev
  1 sibling, 2 replies; 16+ messages in thread
From: Jeff Haran @ 2015-04-09 23:35 UTC (permalink / raw)
  To: kernelnewbies

Hi all,

I've written some sample code to test out the question I raised earlier and it seems that if you call rcu_dereference() on a given RCU protected pointer more than once within a read critical section delimited by rcu_read_lock() and rcu_read_unlock(), you can get different values. Maybe this is well known to experienced kernel developers, but it was unclear to me from the documentation I'd read so I figured I'd share what I found with my fellow newbies.

The code for the sample kernel module is copied at the end if you'd like to try it yourself. Just insmod the resulting rcutest.ko. When I do so, I get this on my console:

[root at s01b01 ~]# [  496.970678] rcutest initialized
[  499.968538] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 1 count = 3
[  503.975844] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 2 count = 7
[  507.983410] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 3 count = 11
[  511.990554] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 4 count = 15
[  515.997486] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 5 count = 19
[  520.005163] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 6 count = 23
[  524.012020] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 7 count = 27
[  528.019520] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 8 count = 31
[  532.026716] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 9 count = 35
[  536.034235] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 10 count = 39
[  540.041010] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 11 count = 43
[  544.048265] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 12 count = 47
...

Indicating that the values obtained within the critical section can be different. So it would seem that the best practice when reading these RCU protected pointers is to get them once and only once within the critical section via a single call to rcu_dereference_pointer(), store that pointer someplace local and then operate on the copy rather than make multiple calls to rcu_dereference_pointer().

The files copied below are the source rcutest.c, a Makefile and a Kbuild. You might have to fiddle with the Makefile to get it to work on your system. We build against many kernel versions here so the Makefile is setup to point to different kernel versions.

Jeff Haran

$ cat rcutest.c
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/version.h>
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/kdev_t.h>
#include <linux/fs.h>
#include <linux/device.h>
#include <linux/cdev.h>
#include <linux/uaccess.h>
#include <linux/sched.h>
#include <linux/wait.h>
#include <linux/list.h>
#include <linux/kthread.h>

char rcu_test_1[1];
char rcu_test_2[1];
char *rcu_pointer = rcu_test_1;
struct task_struct *updater_task;
struct task_struct *reader_task;
int off_count = 0;

static int rcu_updater_thread(void *arg)
{
        int count = 0;

        while (!kthread_should_stop()) {
                if (count & 1) {
                        rcu_assign_pointer(rcu_pointer, rcu_test_2);
                } else {
                        rcu_assign_pointer(rcu_pointer, rcu_test_1);
                }

                synchronize_rcu();
                count++;
                msleep(1500);
        }

        return 0;
}

static int rcu_reader_thread(void *arg)
{
        int count = 0;

        while (!kthread_should_stop()) {
                char *a;
                char *b;
                u64 start;

                count++;
                rcu_read_lock();
                a = rcu_dereference(rcu_pointer);
                start = get_jiffies_64();
                while (get_jiffies_64() < (start + 1000));
                b = rcu_dereference(rcu_pointer);
                rcu_read_unlock();
                if (a != b) {
                        off_count++;
                        printk(KERN_INFO "rcu_reader_thread a = 0x%p b = 0x%p off_count = %d count = %d\n",
                                a, b, off_count, count);
                }
                if (need_resched()) {
                        schedule();
                }
        }

        return 0;
}

static int startrcuupdater(void)
{
        int rc = 0;

        updater_task = kthread_create(rcu_updater_thread, 0, "rcuupdater");
        if (IS_ERR(updater_task)) {
                rc = PTR_ERR(updater_task);
                printk(KERN_ERR "rcu_updater create failed %d\n", rc);
                goto err;
        }

        wake_up_process(updater_task);

err:
        return rc;
}

static int startrcureader(void)
{
        int rc = 0;

        reader_task = kthread_create(rcu_reader_thread, 0, "rcureader");
        if (IS_ERR(reader_task)) {
                rc = PTR_ERR(reader_task);
                printk(KERN_ERR "rcu_updater create failed %d\n", rc);
                goto err;
        }

        wake_up_process(reader_task);

err:
        return rc;
}

static int __init rcu_test_init(void)
{
        int rc;

        rc = startrcuupdater();
        if (rc) {
                goto err;
        }
        rc = startrcureader();
        if (rc) {
                kthread_stop(updater_task);
                goto err;
        }
        printk(KERN_INFO "rcutest initialized\n");

err:

        return rc;
}

static void __exit rcu_test_exit(void)
{
        kthread_stop(reader_task);
        kthread_stop(updater_task);
    printk(KERN_INFO "rcutest exited\n");
}

module_init(rcu_test_init);
module_exit(rcu_test_exit);

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Jeff Haran");
MODULE_DESCRIPTION("a test of RCU");

$ cat Makefile

SANDBOX:=/s/fusion/33

#UNAME-R := $(subst ${SANDBOX}/lib/modules/,,${VER})
UNAME-R:=2.6.32-504.8.1.el6.x86_64

all:
        make -C ${SANDBOX}/lib/modules/${UNAME-R}/build M=$(CURDIR) RCUTEST_BASE=${CURDIR} modules

clean:
        make -C ${SANDBOX}/lib/modules/${UNAME-R}/build M=$(CURDIR) clean
        rm -f Module.markers Module.symvers

$ cat Kbuild
ccflags-y += -I$(RCUTEST_BASE)/include -Werror

obj-m += rcutest.o

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-09 23:35 ` Jeff Haran
@ 2015-04-10  0:00   ` Mandeep Sandhu
  2015-04-10  0:15     ` Jeff Haran
  2015-04-10  2:22   ` Andev
  1 sibling, 1 reply; 16+ messages in thread
From: Mandeep Sandhu @ 2015-04-10  0:00 UTC (permalink / raw)
  To: kernelnewbies

Indeed the 2 rcu_deref's are giving different pointer values on my
machine too (which is running a much newer kernel compared to the one
you used in the example, 2.6.32 vs 3.16 on x86_64).

Here's the result from my machine:

[545078.854213] rcutest initialized
[545094.859095] rcu_reader_thread a = 0xffffffffc0ae22b9 b =
0xffffffffc0ae22b8 off_count = 1 count = 4
[545110.866479] rcu_reader_thread a = 0xffffffffc0ae22b8 b =
0xffffffffc0ae22b9 off_count = 2 count = 8
[545126.873861] rcu_reader_thread a = 0xffffffffc0ae22b9 b =
0xffffffffc0ae22b8 off_count = 3 count = 12
[545138.879398] rcu_reader_thread a = 0xffffffffc0ae22b8 b =
0xffffffffc0ae22b9 off_count = 4 count = 15

I think this warrants updating the docs, to make this point more distinct?


On Thu, Apr 9, 2015 at 4:35 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> Hi all,
>
> I've written some sample code to test out the question I raised earlier and it seems that if you call rcu_dereference() on a given RCU protected pointer more than once within a read critical section delimited by rcu_read_lock() and rcu_read_unlock(), you can get different values. Maybe this is well known to experienced kernel developers, but it was unclear to me from the documentation I'd read so I figured I'd share what I found with my fellow newbies.
>
> The code for the sample kernel module is copied at the end if you'd like to try it yourself. Just insmod the resulting rcutest.ko. When I do so, I get this on my console:
>
> [root at s01b01 ~]# [  496.970678] rcutest initialized
> [  499.968538] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 1 count = 3
> [  503.975844] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 2 count = 7
> [  507.983410] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 3 count = 11
> [  511.990554] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 4 count = 15
> [  515.997486] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 5 count = 19
> [  520.005163] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 6 count = 23
> [  524.012020] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 7 count = 27
> [  528.019520] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 8 count = 31
> [  532.026716] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 9 count = 35
> [  536.034235] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 10 count = 39
> [  540.041010] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 11 count = 43
> [  544.048265] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 12 count = 47
> ...
>
> Indicating that the values obtained within the critical section can be different. So it would seem that the best practice when reading these RCU protected pointers is to get them once and only once within the critical section via a single call to rcu_dereference_pointer(), store that pointer someplace local and then operate on the copy rather than make multiple calls to rcu_dereference_pointer().
>
> The files copied below are the source rcutest.c, a Makefile and a Kbuild. You might have to fiddle with the Makefile to get it to work on your system. We build against many kernel versions here so the Makefile is setup to point to different kernel versions.
>
> Jeff Haran
>
> $ cat rcutest.c
> #include <linux/module.h>
> #include <linux/moduleparam.h>
> #include <linux/version.h>
> #include <linux/kernel.h>
> #include <linux/types.h>
> #include <linux/kdev_t.h>
> #include <linux/fs.h>
> #include <linux/device.h>
> #include <linux/cdev.h>
> #include <linux/uaccess.h>
> #include <linux/sched.h>
> #include <linux/wait.h>
> #include <linux/list.h>
> #include <linux/kthread.h>
>
> char rcu_test_1[1];
> char rcu_test_2[1];
> char *rcu_pointer = rcu_test_1;
> struct task_struct *updater_task;
> struct task_struct *reader_task;
> int off_count = 0;
>
> static int rcu_updater_thread(void *arg)
> {
>         int count = 0;
>
>         while (!kthread_should_stop()) {
>                 if (count & 1) {
>                         rcu_assign_pointer(rcu_pointer, rcu_test_2);
>                 } else {
>                         rcu_assign_pointer(rcu_pointer, rcu_test_1);
>                 }
>
>                 synchronize_rcu();
>                 count++;
>                 msleep(1500);
>         }
>
>         return 0;
> }
>
> static int rcu_reader_thread(void *arg)
> {
>         int count = 0;
>
>         while (!kthread_should_stop()) {
>                 char *a;
>                 char *b;
>                 u64 start;
>
>                 count++;
>                 rcu_read_lock();
>                 a = rcu_dereference(rcu_pointer);
>                 start = get_jiffies_64();
>                 while (get_jiffies_64() < (start + 1000));
>                 b = rcu_dereference(rcu_pointer);
>                 rcu_read_unlock();
>                 if (a != b) {
>                         off_count++;
>                         printk(KERN_INFO "rcu_reader_thread a = 0x%p b = 0x%p off_count = %d count = %d\n",
>                                 a, b, off_count, count);
>                 }
>                 if (need_resched()) {
>                         schedule();
>                 }
>         }
>
>         return 0;
> }
>
> static int startrcuupdater(void)
> {
>         int rc = 0;
>
>         updater_task = kthread_create(rcu_updater_thread, 0, "rcuupdater");
>         if (IS_ERR(updater_task)) {
>                 rc = PTR_ERR(updater_task);
>                 printk(KERN_ERR "rcu_updater create failed %d\n", rc);
>                 goto err;
>         }
>
>         wake_up_process(updater_task);
>
> err:
>         return rc;
> }
>
> static int startrcureader(void)
> {
>         int rc = 0;
>
>         reader_task = kthread_create(rcu_reader_thread, 0, "rcureader");
>         if (IS_ERR(reader_task)) {
>                 rc = PTR_ERR(reader_task);
>                 printk(KERN_ERR "rcu_updater create failed %d\n", rc);
>                 goto err;
>         }
>
>         wake_up_process(reader_task);
>
> err:
>         return rc;
> }
>
> static int __init rcu_test_init(void)
> {
>         int rc;
>
>         rc = startrcuupdater();
>         if (rc) {
>                 goto err;
>         }
>         rc = startrcureader();
>         if (rc) {
>                 kthread_stop(updater_task);
>                 goto err;
>         }
>         printk(KERN_INFO "rcutest initialized\n");
>
> err:
>
>         return rc;
> }
>
> static void __exit rcu_test_exit(void)
> {
>         kthread_stop(reader_task);
>         kthread_stop(updater_task);
>     printk(KERN_INFO "rcutest exited\n");
> }
>
> module_init(rcu_test_init);
> module_exit(rcu_test_exit);
>
> MODULE_LICENSE("GPL");
> MODULE_AUTHOR("Jeff Haran");
> MODULE_DESCRIPTION("a test of RCU");
>
> $ cat Makefile
>
> SANDBOX:=/s/fusion/33
>
> #UNAME-R := $(subst ${SANDBOX}/lib/modules/,,${VER})
> UNAME-R:=2.6.32-504.8.1.el6.x86_64
>
> all:
>         make -C ${SANDBOX}/lib/modules/${UNAME-R}/build M=$(CURDIR) RCUTEST_BASE=${CURDIR} modules
>
> clean:
>         make -C ${SANDBOX}/lib/modules/${UNAME-R}/build M=$(CURDIR) clean
>         rm -f Module.markers Module.symvers
>
> $ cat Kbuild
> ccflags-y += -I$(RCUTEST_BASE)/include -Werror
>
> obj-m += rcutest.o
>
>
> _______________________________________________
> Kernelnewbies mailing list
> Kernelnewbies at kernelnewbies.org
> http://lists.kernelnewbies.org/mailman/listinfo/kernelnewbies

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10  0:00   ` Mandeep Sandhu
@ 2015-04-10  0:15     ` Jeff Haran
  2015-04-10  0:21       ` Mandeep Sandhu
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Haran @ 2015-04-10  0:15 UTC (permalink / raw)
  To: kernelnewbies

-----Original Message-----
>From: Mandeep Sandhu [mailto:mandeepsandhu.chd at gmail.com] 
>Sent: Thursday, April 09, 2015 5:01 PM
>To: Jeff Haran
>Cc: kernelnewbies
>Subject: Re: maybe dumb question about RCU
>
>Indeed the 2 rcu_deref's are giving different pointer values on my machine too (which is running a much newer kernel compared to the one you used in the example, 2.6.32 vs 3.16 on x86_64).

Glad somebody ran it on something newer than the Redhat EL 6.4 that I use and saw the same results. Thanks, it confirms to me that this is intended behavior.

>Here's the result from my machine:
>
>[545078.854213] rcutest initialized
>[545094.859095] rcu_reader_thread a = 0xffffffffc0ae22b9 b =
>0xffffffffc0ae22b8 off_count = 1 count = 4 [545110.866479] rcu_reader_thread a = 0xffffffffc0ae22b8 b =
>0xffffffffc0ae22b9 off_count = 2 count = 8 [545126.873861] rcu_reader_thread a = 0xffffffffc0ae22b9 b =
>0xffffffffc0ae22b8 off_count = 3 count = 12 [545138.879398] rcu_reader_thread a = 0xffffffffc0ae22b8 b =
>0xffffffffc0ae22b9 off_count = 4 count = 15
>
>I think this warrants updating the docs, to make this point more distinct?

I would seem to be a good idea to me. This quote from whatisRCU.txt in the Documentation/RCU directory really through me off:

256         If you are going to be fetching multiple fields from the
257         RCU-protected structure, using the local variable is of
258         course preferred.  Repeated rcu_dereference() calls look
259         ugly and incur unnecessary overhead on Alpha CPUs.

"look ugly and incur unnecessary overhead on Alpha CPUs" is a long way from "can return different pointer values", which is what it should say.

I guess the question would be where to put it. I note there is a rcu_dereference.txt file in the same directory. It talks about various ways gcc can mess you up when using these pointers and ways to avoid that, but I didn't see anything in it that mentions this particular issue.

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10  0:15     ` Jeff Haran
@ 2015-04-10  0:21       ` Mandeep Sandhu
  2015-04-10  0:29         ` Jeff Haran
  0 siblings, 1 reply; 16+ messages in thread
From: Mandeep Sandhu @ 2015-04-10  0:21 UTC (permalink / raw)
  To: kernelnewbies

>
> 256         If you are going to be fetching multiple fields from the
> 257         RCU-protected structure, using the local variable is of
> 258         course preferred.  Repeated rcu_dereference() calls look
> 259         ugly and incur unnecessary overhead on Alpha CPUs.
>
> "look ugly and incur unnecessary overhead on Alpha CPUs" is a long way from "can return different pointer values", which is what it should say.

I agree, that statement is not only misleading, but incorrect too.
Maybe we can delete it and update it with a "Note:" about your
findings? How about sending a doc patch for it?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10  0:21       ` Mandeep Sandhu
@ 2015-04-10  0:29         ` Jeff Haran
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Haran @ 2015-04-10  0:29 UTC (permalink / raw)
  To: kernelnewbies

-----Original Message-----
From: Mandeep Sandhu [mailto:mandeepsandhu.chd at gmail.com] 
Sent: Thursday, April 09, 2015 5:21 PM
To: Jeff Haran
Cc: kernelnewbies
Subject: Re: maybe dumb question about RCU

>
> 256         If you are going to be fetching multiple fields from the
> 257         RCU-protected structure, using the local variable is of
> 258         course preferred.  Repeated rcu_dereference() calls look
> 259         ugly and incur unnecessary overhead on Alpha CPUs.
>
> "look ugly and incur unnecessary overhead on Alpha CPUs" is a long way from "can return different pointer values", which is what it should say.

>I agree, that statement is not only misleading, but incorrect too.
>Maybe we can delete it and update it with a "Note:" about your findings? How about sending a doc patch for it?

I can try. Posting patches unmolested through my employer's Outlook email servers has been fraught with frustration in the past when I've tried it. 8^(

But it would be nice to hear from an RCU expert whether this is a feature or a bug. I suspect the former, I really can't see how it could work otherwise, but what do I know?

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-09 23:35 ` Jeff Haran
  2015-04-10  0:00   ` Mandeep Sandhu
@ 2015-04-10  2:22   ` Andev
  2015-04-10 16:47     ` Jeff Haran
  1 sibling, 1 reply; 16+ messages in thread
From: Andev @ 2015-04-10  2:22 UTC (permalink / raw)
  To: kernelnewbies

On Thu, Apr 9, 2015 at 7:35 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> Hi all,
>
> I've written some sample code to test out the question I raised earlier and it seems that if you call rcu_dereference() on a given RCU protected pointer more than once within a read critical section delimited by rcu_read_lock() and rcu_read_unlock(), you can get different values. Maybe this is well known to experienced kernel developers, but it was unclear to me from the documentation I'd read so I figured I'd share what I found with my fellow newbies.

Not exactly, but it is good that you wrote the test! Let us see what
is happening...

>
> The code for the sample kernel module is copied at the end if you'd like to try it yourself. Just insmod the resulting rcutest.ko. When I do so, I get this on my console:
>
> [root at s01b01 ~]# [  496.970678] rcutest initialized
> [  499.968538] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 1 count = 3
> [  503.975844] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 2 count = 7
> [  507.983410] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 3 count = 11
> [  511.990554] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 4 count = 15
> [  515.997486] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 5 count = 19
> [  520.005163] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 6 count = 23
> [  524.012020] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 7 count = 27
> [  528.019520] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 8 count = 31
> [  532.026716] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 9 count = 35
> [  536.034235] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 10 count = 39
> [  540.041010] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 11 count = 43
> [  544.048265] rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 12 count = 47
> ...
>
> Indicating that the values obtained within the critical section can be different. So it would seem that the best practice when reading these RCU protected pointers is to get them once and only once within the critical section via a single call to rcu_dereference_pointer(), store that pointer someplace local and then operate on the copy rather than make multiple calls to rcu_dereference_pointer().
>
> The files copied below are the source rcutest.c, a Makefile and a Kbuild. You might have to fiddle with the Makefile to get it to work on your system. We build against many kernel versions here so the Makefile is setup to point to different kernel versions.
>
> Jeff Haran
>
> $ cat rcutest.c
> #include <linux/module.h>
> #include <linux/moduleparam.h>
> #include <linux/version.h>
> #include <linux/kernel.h>
> #include <linux/types.h>
> #include <linux/kdev_t.h>
> #include <linux/fs.h>
> #include <linux/device.h>
> #include <linux/cdev.h>
> #include <linux/uaccess.h>
> #include <linux/sched.h>
> #include <linux/wait.h>
> #include <linux/list.h>
> #include <linux/kthread.h>
>
> char rcu_test_1[1];
> char rcu_test_2[1];
> char *rcu_pointer = rcu_test_1;
> struct task_struct *updater_task;
> struct task_struct *reader_task;
> int off_count = 0;
>
> static int rcu_updater_thread(void *arg)
> {
>         int count = 0;
>
>         while (!kthread_should_stop()) {
>                 if (count & 1) {
>                         rcu_assign_pointer(rcu_pointer, rcu_test_2);
>                 } else {
>                         rcu_assign_pointer(rcu_pointer, rcu_test_1);
>                 }
>
>                 synchronize_rcu();

This is the key to why you are seeing different values.
synchronize_rcu() returns only after a grace period which is the
duration for which rcu pointers are held without being modified. If
you have read my previous example global_v1 will be valid only within
the grace period when the read critical section started.

The crucial thing to note here is that grace periods cannot last
forever. Grace period is extended if a read critical section is still
active. But the period will not be extended forever. A grace period
will be forced to end after giving it a sufficient time, hence the
requirement to keep your read side critical sections short so that a
grace period can end and rcu can free/complete all the enqueued
callbacks.


>                 count++;
>                 msleep(1500);

This is a huge time. Many grace periods will be elapsed in this time

>         }
>
>         return 0;
> }
>

Great test example. Nothing like learning from experience.

-- 
Andev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10  2:22   ` Andev
@ 2015-04-10 16:47     ` Jeff Haran
  2015-04-10 20:12       ` Andev
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Haran @ 2015-04-10 16:47 UTC (permalink / raw)
  To: kernelnewbies

> -----Original Message-----
> From: Andev [mailto:debiandev at gmail.com]
> Sent: Thursday, April 09, 2015 7:22 PM
> To: Jeff Haran
> Cc: kernelnewbies
> Subject: Re: maybe dumb question about RCU
> 
> On Thu, Apr 9, 2015 at 7:35 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> > Hi all,
> >
> > I've written some sample code to test out the question I raised earlier and
> it seems that if you call rcu_dereference() on a given RCU protected pointer
> more than once within a read critical section delimited by rcu_read_lock()
> and rcu_read_unlock(), you can get different values. Maybe this is well
> known to experienced kernel developers, but it was unclear to me from the
> documentation I'd read so I figured I'd share what I found with my fellow
> newbies.
> 
> Not exactly, but it is good that you wrote the test! Let us see what is
> happening...
> 
> >
> > The code for the sample kernel module is copied at the end if you'd like to
> try it yourself. Just insmod the resulting rcutest.ko. When I do so, I get this on
> my console:
> >
> > [root at s01b01 ~]# [  496.970678] rcutest initialized [  499.968538]
> > rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475
> > off_count = 1 count = 3 [  503.975844] rcu_reader_thread a =
> > 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 2 count = 7 [
> > 507.983410] rcu_reader_thread a = 0xffffffffa00ab474 b =
> > 0xffffffffa00ab475 off_count = 3 count = 11 [  511.990554]
> > rcu_reader_thread a = 0xffffffffa00ab475 b = 0xffffffffa00ab474
> > off_count = 4 count = 15 [  515.997486] rcu_reader_thread a =
> > 0xffffffffa00ab474 b = 0xffffffffa00ab475 off_count = 5 count = 19 [
> > 520.005163] rcu_reader_thread a = 0xffffffffa00ab475 b =
> > 0xffffffffa00ab474 off_count = 6 count = 23 [  524.012020]
> > rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475
> > off_count = 7 count = 27 [  528.019520] rcu_reader_thread a =
> > 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 8 count = 31 [
> 532.026716] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475
> off_count = 9 count = 35 [  536.034235] rcu_reader_thread a =
> 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 10 count = 39 [
> 540.041010] rcu_reader_thread a = 0xffffffffa00ab474 b = 0xffffffffa00ab475
> off_count = 11 count = 43 [  544.048265] rcu_reader_thread a =
> 0xffffffffa00ab475 b = 0xffffffffa00ab474 off_count = 12 count = 47 ...
> >
> > Indicating that the values obtained within the critical section can be
> different. So it would seem that the best practice when reading these RCU
> protected pointers is to get them once and only once within the critical
> section via a single call to rcu_dereference_pointer(), store that pointer
> someplace local and then operate on the copy rather than make multiple
> calls to rcu_dereference_pointer().
> >
> > The files copied below are the source rcutest.c, a Makefile and a Kbuild.
> You might have to fiddle with the Makefile to get it to work on your system.
> We build against many kernel versions here so the Makefile is setup to point
> to different kernel versions.
> >
> > Jeff Haran
> >
> > $ cat rcutest.c
> > #include <linux/module.h>
> > #include <linux/moduleparam.h>
> > #include <linux/version.h>
> > #include <linux/kernel.h>
> > #include <linux/types.h>
> > #include <linux/kdev_t.h>
> > #include <linux/fs.h>
> > #include <linux/device.h>
> > #include <linux/cdev.h>
> > #include <linux/uaccess.h>
> > #include <linux/sched.h>
> > #include <linux/wait.h>
> > #include <linux/list.h>
> > #include <linux/kthread.h>
> >
> > char rcu_test_1[1];
> > char rcu_test_2[1];
> > char *rcu_pointer = rcu_test_1;
> > struct task_struct *updater_task;
> > struct task_struct *reader_task;
> > int off_count = 0;
> >
> > static int rcu_updater_thread(void *arg) {
> >         int count = 0;
> >
> >         while (!kthread_should_stop()) {
> >                 if (count & 1) {
> >                         rcu_assign_pointer(rcu_pointer, rcu_test_2);
> >                 } else {
> >                         rcu_assign_pointer(rcu_pointer, rcu_test_1);
> >                 }
> >
> >                 synchronize_rcu();
> 
> This is the key to why you are seeing different values.
> synchronize_rcu() returns only after a grace period which is the duration for
> which rcu pointers are held without being modified. If you have read my
> previous example global_v1 will be valid only within the grace period when
> the read critical section started.
> 
> The crucial thing to note here is that grace periods cannot last forever. Grace
> period is extended if a read critical section is still active. But the period will
> not be extended forever. A grace period will be forced to end after giving it a
> sufficient time, hence the requirement to keep your read side critical
> sections short so that a grace period can end and rcu can free/complete all
> the enqueued callbacks.
> 
> 
> >                 count++;
> >                 msleep(1500);
> 
> This is a huge time. Many grace periods will be elapsed in this time

The msleep() happens after the rcu_assign_pointer()/synchronize_rcu() sequence, not in the middle of it. All that msleep() is for is to make sure the updater isn't spinning that core and I chose the value so that the updater which thus runs every 1.5 seconds doesn't beat with the reader that is holding the rcu read lock for 1 second. Now one can argue that this in the reader is an atypically long time to hold a RCU read lock:

                rcu_read_lock();
                a = rcu_dereference(rcu_pointer);
                start = get_jiffies_64();
                while (get_jiffies_64() < (start + 1000));
                b = rcu_dereference(rcu_pointer);
                rcu_read_unlock();

That causes the RCU read "lock" to be held for 1 second (at least on systems where a jiffie is a millisecond). But if that's "too long" then how long exactly is "too long"? If 1 second is too long, is 1 millisecond too long? How is the developer using this interface to know either how long is too long or how long the code they've written will take to execute given the plethora of different systems it might run on?

The documentation I've seen says that the only restriction on RCU readers is that they don't block and the above doesn't so it satisfies the requirements of the interface near as I can tell.

The conclusion I've come to is that the read critical section is *NOT* there to make sure that the value returned by rcu_dereference() of a given pointer doesn't change within the RCU read critical section (the code delimited by the rcu_read_lock() and rcu_read_unlock()).

Rather the purpose of rcu_read_lock()/rcu_read_unlock() is there to keep synchronize_rcu()@bay so that the updater can safely free any memory block that may have been overwritten in the most recent call to rcu_assign_pointer(). The example I provided where rcu_pointer just oscillates between two different addresses of static buffers was done that way to make it simple. More typically, the pointers passed as the second parameter to rcu_assign_pointer() point to allocated memory. When updaters do that, they need to free those pointers eventually but they can't do that until they know that all readers aren't accessing them anymore. Normally, any freeing would happen after synchronize_rcu() returned and the read critical section is there to tell synchronize_rcu() when its safe to return and let the updater free the old memory block.

At least that's how I read it now.

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10 16:47     ` Jeff Haran
@ 2015-04-10 20:12       ` Andev
  2015-04-10 20:34         ` Jeff Haran
  0 siblings, 1 reply; 16+ messages in thread
From: Andev @ 2015-04-10 20:12 UTC (permalink / raw)
  To: kernelnewbies

On Fri, Apr 10, 2015 at 12:47 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
>
> The msleep() happens after the rcu_assign_pointer()/synchronize_rcu() sequence, not in the middle of it. All that msleep() is for is to make sure the updater isn't spinning that core and I chose the value so that the updater which thus runs every 1.5 seconds doesn't beat with the reader that is holding the rcu read lock for 1 second. Now one can argue that this in the reader is an atypically long time to hold a RCU read lock:
>
>                 rcu_read_lock();
>                 a = rcu_dereference(rcu_pointer);
>                 start = get_jiffies_64();
>                 while (get_jiffies_64() < (start + 1000));
>                 b = rcu_dereference(rcu_pointer);
>                 rcu_read_unlock();
>
> That causes the RCU read "lock" to be held for 1 second (at least on systems where a jiffie is a millisecond). But if that's "too long" then how long exactly is "too long"? If 1 second is too long, is 1 millisecond too long? How is the developer using this interface to know either how long is too long or how long the code they've written will take to execute given the plethora of different systems it might run on?
>
> The documentation I've seen says that the only restriction on RCU readers is that they don't block and the above doesn't so it satisfies the requirements of the interface near as I can tell.

This section of the code might still be pre-empted. Try putting a
preempt_disable() at the beginning and see if you are getting the same
value or not.

Relevant comment here:
http://lxr.free-electrons.com/source/include/linux/rcupdate.h#L837

>
> The conclusion I've come to is that the read critical section is *NOT* there to make sure that the value returned by rcu_dereference() of a given pointer doesn't change within the RCU read critical section (the code delimited by the rcu_read_lock() and rcu_read_unlock()).
>
> Rather the purpose of rcu_read_lock()/rcu_read_unlock() is there to keep synchronize_rcu() at bay so that the updater can safely free any memory block that may have been overwritten in the most recent call to rcu_assign_pointer(). The example I provided where rcu_pointer just oscillates between two different addresses of static buffers was done that way to make it simple. More typically, the pointers passed as the second parameter to rcu_assign_pointer() point to allocated memory. When updaters do that, they need to free those pointers eventually but they can't do that until they know that all readers aren't accessing them anymore. Normally, any freeing would happen after synchronize_rcu() returned and the read critical section is there to tell synchronize_rcu() when its safe to return and let the updater free the old memory block.
>
> At least that's how I read it now.
>
> Jeff Haran
>



-- 
Andev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10 20:12       ` Andev
@ 2015-04-10 20:34         ` Jeff Haran
  2015-04-13 21:00           ` Andev
  0 siblings, 1 reply; 16+ messages in thread
From: Jeff Haran @ 2015-04-10 20:34 UTC (permalink / raw)
  To: kernelnewbies

> -----Original Message-----
> From: Andev [mailto:debiandev at gmail.com]
> Sent: Friday, April 10, 2015 1:13 PM
> To: Jeff Haran
> Cc: kernelnewbies
> Subject: Re: maybe dumb question about RCU
> 
> On Fri, Apr 10, 2015 at 12:47 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> >
> > The msleep() happens after the rcu_assign_pointer()/synchronize_rcu()
> sequence, not in the middle of it. All that msleep() is for is to make sure the
> updater isn't spinning that core and I chose the value so that the updater
> which thus runs every 1.5 seconds doesn't beat with the reader that is
> holding the rcu read lock for 1 second. Now one can argue that this in the
> reader is an atypically long time to hold a RCU read lock:
> >
> >                 rcu_read_lock();
> >                 a = rcu_dereference(rcu_pointer);
> >                 start = get_jiffies_64();
> >                 while (get_jiffies_64() < (start + 1000));
> >                 b = rcu_dereference(rcu_pointer);
> >                 rcu_read_unlock();
> >
> > That causes the RCU read "lock" to be held for 1 second (at least on
> systems where a jiffie is a millisecond). But if that's "too long" then how long
> exactly is "too long"? If 1 second is too long, is 1 millisecond too long? How is
> the developer using this interface to know either how long is too long or how
> long the code they've written will take to execute given the plethora of
> different systems it might run on?
> >
> > The documentation I've seen says that the only restriction on RCU readers
> is that they don't block and the above doesn't so it satisfies the requirements
> of the interface near as I can tell.
> 
> This section of the code might still be pre-empted. Try putting a
> preempt_disable() at the beginning and see if you are getting the same value
> or not.
> 
> Relevant comment here:
> http://lxr.free-electrons.com/source/include/linux/rcupdate.h#L837

When you say "at the beginning", where do you mean? Right before the rcu_read_lock()?
And I assume a preempt_enable() should be in there too? Where would that go?

Though, it doesn't seem like this should make any difference. Disabling preemption would keep other processes from running on the core the reader is on, but the writing via rcu_assign_pointer() could (and probably is) happening on another core.

Thanks,

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-10 20:34         ` Jeff Haran
@ 2015-04-13 21:00           ` Andev
  2015-04-13 21:13             ` Jeff Haran
  0 siblings, 1 reply; 16+ messages in thread
From: Andev @ 2015-04-13 21:00 UTC (permalink / raw)
  To: kernelnewbies

On Fri, Apr 10, 2015 at 4:34 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
>> > The msleep() happens after the rcu_assign_pointer()/synchronize_rcu()
>> sequence, not in the middle of it. All that msleep() is for is to make sure the
>> updater isn't spinning that core and I chose the value so that the updater
>> which thus runs every 1.5 seconds doesn't beat with the reader that is
>> holding the rcu read lock for 1 second. Now one can argue that this in the
>> reader is an atypically long time to hold a RCU read lock:
>> >
>> >                 rcu_read_lock();
>> >                 a = rcu_dereference(rcu_pointer);
>> >                 start = get_jiffies_64();
>> >                 while (get_jiffies_64() < (start + 1000));
>> >                 b = rcu_dereference(rcu_pointer);
>> >                 rcu_read_unlock();
>> >
>> > That causes the RCU read "lock" to be held for 1 second (at least on
>> systems where a jiffie is a millisecond). But if that's "too long" then how long
>> exactly is "too long"? If 1 second is too long, is 1 millisecond too long? How is
>> the developer using this interface to know either how long is too long or how
>> long the code they've written will take to execute given the plethora of
>> different systems it might run on?


So I looked closely at your code and found the cause for your seeing different
values for the pointer.

What is happening is as follows:


         reader                               updater
--------------------------------------------------------------------------
                                    rcu_assign_pointer(rcu_pointer, p1);
                                    synchronize_rcu(); // returns immediately
                                                    // no rcu read CS active
       rcu_read_lock();
       a = rcu_pointer;//p1         // loop again
                                    rcu_assign_pointer(rcu_pointer, p2);
                                    // this is immediately visible to reader
       b = rcu_pointer; //p2
       rcu_read_unlock();
                                    synchronize_rcu(); // returns immediately
                                    // no rcu read CS active

This is just one possible run which will cause you to see different values of
rcu_pointer. There are possibly more.

Now what can you do to prevent such updates happening to pointers?

The obvious solution is to use mutex between readers and updaters. But we want
to use RCU.

Dereference the pointer only once within the read critical section. If you are
dereferencing twice, be ready to handle the case where the pointer has been
changed.

The guarantee that RCU provides is that p1 will not be freed until the read
critical section which holds a reference to p1 has completed.

-- 
Andev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* maybe dumb question about RCU
  2015-04-13 21:00           ` Andev
@ 2015-04-13 21:13             ` Jeff Haran
  0 siblings, 0 replies; 16+ messages in thread
From: Jeff Haran @ 2015-04-13 21:13 UTC (permalink / raw)
  To: kernelnewbies

> -----Original Message-----
> From: Andev [mailto:debiandev at gmail.com]
> Sent: Monday, April 13, 2015 2:01 PM
> To: Jeff Haran
> Cc: kernelnewbies
> Subject: Re: maybe dumb question about RCU
> 
> On Fri, Apr 10, 2015 at 4:34 PM, Jeff Haran <Jeff.Haran@citrix.com> wrote:
> >> > The msleep() happens after the
> >> > rcu_assign_pointer()/synchronize_rcu()
> >> sequence, not in the middle of it. All that msleep() is for is to
> >> make sure the updater isn't spinning that core and I chose the value
> >> so that the updater which thus runs every 1.5 seconds doesn't beat
> >> with the reader that is holding the rcu read lock for 1 second. Now
> >> one can argue that this in the reader is an atypically long time to hold a
> RCU read lock:
> >> >
> >> >                 rcu_read_lock();
> >> >                 a = rcu_dereference(rcu_pointer);
> >> >                 start = get_jiffies_64();
> >> >                 while (get_jiffies_64() < (start + 1000));
> >> >                 b = rcu_dereference(rcu_pointer);
> >> >                 rcu_read_unlock();
> >> >
> >> > That causes the RCU read "lock" to be held for 1 second (at least
> >> > on
> >> systems where a jiffie is a millisecond). But if that's "too long"
> >> then how long exactly is "too long"? If 1 second is too long, is 1
> >> millisecond too long? How is the developer using this interface to
> >> know either how long is too long or how long the code they've written
> >> will take to execute given the plethora of different systems it might run
> on?
> 
> 
> So I looked closely at your code and found the cause for your seeing
> different values for the pointer.
> 
> What is happening is as follows:
> 
> 
>          reader                               updater
> --------------------------------------------------------------------------
>                                     rcu_assign_pointer(rcu_pointer, p1);
>                                     synchronize_rcu(); // returns immediately
>                                                     // no rcu read CS active
>        rcu_read_lock();
>        a = rcu_pointer;//p1         // loop again
>                                     rcu_assign_pointer(rcu_pointer, p2);
>                                     // this is immediately visible to reader
>        b = rcu_pointer; //p2
>        rcu_read_unlock();
>                                     synchronize_rcu(); // returns immediately
>                                     // no rcu read CS active
> 
> This is just one possible run which will cause you to see different values of
> rcu_pointer. There are possibly more.
> 
> Now what can you do to prevent such updates happening to pointers?
> 
> The obvious solution is to use mutex between readers and updaters. But we
> want to use RCU.
> 
> Dereference the pointer only once within the read critical section. If you are
> dereferencing twice, be ready to handle the case where the pointer has
> been changed.
> 
> The guarantee that RCU provides is that p1 will not be freed until the read
> critical section which holds a reference to p1 has completed.
> 
> --
> Andev

Sounds like we have come to the same conclusion then.

Thanks,

Jeff Haran

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-04-13 21:13 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-07 22:52 maybe dumb question about RCU Jeff Haran
2015-04-08  2:08 ` Rock Lee
2015-04-08  2:16   ` Jeff Haran
2015-04-08 11:20     ` Rock Lee
2015-04-09 17:44     ` Andev
2015-04-09 23:35 ` Jeff Haran
2015-04-10  0:00   ` Mandeep Sandhu
2015-04-10  0:15     ` Jeff Haran
2015-04-10  0:21       ` Mandeep Sandhu
2015-04-10  0:29         ` Jeff Haran
2015-04-10  2:22   ` Andev
2015-04-10 16:47     ` Jeff Haran
2015-04-10 20:12       ` Andev
2015-04-10 20:34         ` Jeff Haran
2015-04-13 21:00           ` Andev
2015-04-13 21:13             ` Jeff Haran

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.