* [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro @ 2013-05-21 9:05 Roman Gushchin 2013-05-21 10:40 ` David Laight 2013-05-21 12:09 ` Paul E. McKenney 0 siblings, 2 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 9:05 UTC (permalink / raw) To: Dipankar Sarma, Paul E. McKenney Cc: zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy Hi, all! This is a fix for a problem described here: https://lkml.org/lkml/2013/4/16/371 . --- Some network functions (udp4_lib_lookup2(), for instance) use the hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting of a loop. In this case, it is strictly necessary to reread the head->first value from the memory before each scan. Without additional hints, gcc caches this value in a register. In this case, if a cached node is moved to another chain during the scan, we can loop forever getting wrong nulls values and restarting the loop uninterruptedly. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> --- include/linux/rculist_nulls.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..efd51bf 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -37,8 +37,9 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) } } -#define hlist_nulls_first_rcu(head) \ - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) +#define hlist_nulls_first_rcu(head) \ + (*((struct hlist_nulls_node __rcu __force **) \ + &((volatile typeof(*head) *)head)->first)) #define hlist_nulls_next_rcu(node) \ (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) -- 1.8.1.2 ^ permalink raw reply related [flat|nested] 57+ messages in thread
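The failure mode described above can be modelled in userspace along the following lines. This is a sketch with illustrative names (the kernel's real definitions live in include/linux/list_nulls.h and include/linux/rculist_nulls.h); it shows why a restartable nulls-list lookup must re-read head->first from memory on every pass: the end-of-chain marker encodes the hash slot, and a mismatched marker means the scan wandered onto another chain and must restart from a fresh head pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model of the restartable nulls-list scan (names are
 * illustrative, not the kernel's).  The volatile cast plays the role
 * of the patch's fix: it forces head->first to be loaded from memory
 * on every restart instead of being cached in a register. */
#define FIRST_ONCE(head) (*(struct node * volatile *)&(head)->first)

struct node {
	struct node *next;
	int key;
};

struct head {
	struct node *first;
};

/* End-of-chain "nulls" marker: low bit set, remaining bits encode
 * the hash slot the chain belongs to. */
#define MAKE_NULLS(slot) ((struct node *)(((unsigned long)(slot) << 1) | 1UL))
#define IS_A_NULLS(ptr)  ((unsigned long)(ptr) & 1UL)
#define NULLS_VALUE(ptr) ((unsigned long)(ptr) >> 1)

static struct node *lookup(struct head *head, unsigned long slot, int key)
{
	struct node *pos;
begin:
	for (pos = FIRST_ONCE(head); !IS_A_NULLS(pos); pos = pos->next)
		if (pos->key == key)
			return pos;
	/* We ended on another chain's marker: a node we followed was
	 * moved during the scan, so rescan from a fresh head->first. */
	if (NULLS_VALUE(pos) != slot)
		goto begin;
	return NULL;
}
```

If head->first were read through a plain (non-volatile) pointer, the compiler would be entitled to load it once before the `begin:` label and reuse the register on every restart, which is exactly the endless loop the report describes.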
* RE: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 9:05 [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro Roman Gushchin @ 2013-05-21 10:40 ` David Laight 2013-05-21 11:55 ` Roman Gushchin 2013-05-21 12:09 ` Paul E. McKenney 1 sibling, 1 reply; 57+ messages in thread From: David Laight @ 2013-05-21 10:40 UTC (permalink / raw) To: Roman Gushchin, Dipankar Sarma, Paul E. McKenney Cc: zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy > Some network functions (udp4_lib_lookup2(), for instance) use the > hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting > of a loop. In this case, it is strictly necessary to reread the head->first > value from the memory before each scan. > Without additional hints, gcc caches this value in a register. In this case, > if a cached node is moved to another chain during the scan, we can loop > forever getting wrong nulls values and restarting the loop uninterruptedly. Hmmm.... if either inet_ehashfn() or next_pseudo_random32() is called gcc must reread it anyway. I'm surprised gcc is generating separate code for all the conditional loop endings. So why is it caching head->first. The 'list empty' might be short-circuited - but that would only be relevant after a rescan. I suspect something else is going on. I'd also have thought that this code needs to scan the entire hash list. If things are moved under its feet this won't happen. If it can end up on a different list (because a node got moved) it is also possible for a later node to move it back. In that case it would end up on the correct list ... 
> -#define hlist_nulls_first_rcu(head) \ > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > +#define hlist_nulls_first_rcu(head) \ > + (*((struct hlist_nulls_node __rcu __force **) \ > + &((volatile typeof(*head) *)head)->first)) I'd have thought it would be better to change hlist_nulls_first_rcu(). David
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 10:40 ` David Laight @ 2013-05-21 11:55 ` Roman Gushchin 2013-05-21 13:42 ` David Laight 0 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 11:55 UTC (permalink / raw) To: David Laight Cc: Dipankar Sarma, Paul E. McKenney, zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 21.05.2013 14:40, David Laight wrote: >> Some network functions (udp4_lib_lookup2(), for instance) use the >> hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting >> of a loop. In this case, it is strictly necessary to reread the head->first >> value from the memory before each scan. >> Without additional hints, gcc caches this value in a register. In this case, >> if a cached node is moved to another chain during the scan, we can loop >> forever getting wrong nulls values and restarting the loop uninterruptedly. > > Hmmm.... if either inet_ehashfn() or next_pseudo_random32() is > called gcc must reread it anyway. > I'm surprised gcc is generating separate code for all the conditional > loop endings. So why is it caching head->first. > The 'list empty' might be short-circuited - but that would only > be relevant after a rescan. > I suspect something else is going on. What do you mean? > I'd also have thought that this code needs to scan the entire > hash list. If things are moved under its feet this won't happen. > If it can end up on a different list (because a node got moved) > it is also possible for a later node to move it back. > In that case it would end up on the correct list Things are always moved to the head of the list, so, it's not a problem. > ... 
>> -#define hlist_nulls_first_rcu(head) \ >> - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) >> +#define hlist_nulls_first_rcu(head) \ >> + (*((struct hlist_nulls_node __rcu __force **) \ >> + &((volatile typeof(*head) *)head)->first)) > > I'd have thought it would be better to change hlist_nulls_first_rcu(). It's exactly what I'm suggesting. Maybe I'm missing something? Please clarify. Regards, Roman
* RE: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 11:55 ` Roman Gushchin @ 2013-05-21 13:42 ` David Laight 0 siblings, 0 replies; 57+ messages in thread From: David Laight @ 2013-05-21 13:42 UTC (permalink / raw) To: Roman Gushchin Cc: Dipankar Sarma, Paul E. McKenney, zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy > >> -#define hlist_nulls_first_rcu(head) \ > >> - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > >> +#define hlist_nulls_first_rcu(head) \ > >> + (*((struct hlist_nulls_node __rcu __force **) \ > >> + &((volatile typeof(*head) *)head)->first)) > > > > I'd have thought it would be better to change hlist_nulls_first_rcu(). > > It's exactly what I suggest. May be I miss something? Please, clarify. Gah - chasing through too many defines in too many header files! Actually making the 'head' structure have volatile pointers might be better. Possibly in struct hlist_nulls_node itself. There are toooo many casts for sanity :-) David ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 9:05 [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro Roman Gushchin 2013-05-21 10:40 ` David Laight @ 2013-05-21 12:09 ` Paul E. McKenney 2013-05-21 12:46 ` Roman Gushchin ` (2 more replies) 1 sibling, 3 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-05-21 12:09 UTC (permalink / raw) To: Roman Gushchin Cc: Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, May 21, 2013 at 01:05:48PM +0400, Roman Gushchin wrote: > Hi, all! > > This is a fix for a problem described here: > https://lkml.org/lkml/2013/4/16/371 . > --- > > Some network functions (udp4_lib_lookup2(), for instance) use the > hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting > of a loop. In this case, it is strictly necessary to reread the head->first > value from the memory before each scan. > Without additional hints, gcc caches this value in a register. In this case, > if a cached node is moved to another chain during the scan, we can loop > forever getting wrong nulls values and restarting the loop uninterruptedly. 
> > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > --- > include/linux/rculist_nulls.h | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > index 2ae1371..efd51bf 100644 > --- a/include/linux/rculist_nulls.h > +++ b/include/linux/rculist_nulls.h > @@ -37,8 +37,9 @@ static inline void hlist_nulls_del_init_rcu(struct > hlist_nulls_node *n) > } > } > > -#define hlist_nulls_first_rcu(head) \ > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > +#define hlist_nulls_first_rcu(head) \ > + (*((struct hlist_nulls_node __rcu __force **) \ > + &((volatile typeof(*head) *)head)->first)) Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? Thanx, Paul > #define hlist_nulls_next_rcu(node) \ > (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) > -- > 1.8.1.2 > ^ permalink raw reply [flat|nested] 57+ messages in thread
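For reference, the ACCESS_ONCE() that Paul suggests is built on the same volatile-cast trick the patch open-codes, but keeps it in one place; a userspace sketch of the idiom (the kernel's version of that era lives in include/linux/compiler.h):

```c
#include <assert.h>

/* Sketch of the kernel's ACCESS_ONCE() idiom: reading through a
 * volatile-qualified lvalue forces a real load on every evaluation,
 * so the compiler cannot keep the value in a register across a loop
 * restart.  (The kernel spells __typeof__ as typeof.) */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int chain_head = 10;

/* Two reads with an intervening store: each ACCESS_ONCE() is a real
 * memory access, mirroring "re-read head->first on every restart". */
static int reread_demo(int *first, int *second)
{
	*first = ACCESS_ONCE(chain_head);
	chain_head = 20;
	*second = ACCESS_ONCE(chain_head);
	return *second - *first;
}
```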
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 12:09 ` Paul E. McKenney @ 2013-05-21 12:46 ` Roman Gushchin 2013-05-21 12:58 ` Paul E. McKenney 2013-05-21 13:37 ` Eric Dumazet 2013-05-21 13:44 ` Eric Dumazet 2 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 12:46 UTC (permalink / raw) To: paulmck Cc: Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 21.05.2013 16:09, Paul E. McKenney wrote: > On Tue, May 21, 2013 at 01:05:48PM +0400, Roman Gushchin wrote: >> Hi, all! >> >> This is a fix for a problem described here: >> https://lkml.org/lkml/2013/4/16/371 . >> --- >> >> Some network functions (udp4_lib_lookup2(), for instance) use the >> hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting >> of a loop. In this case, it is strictly necessary to reread the head->first >> value from the memory before each scan. >> Without additional hints, gcc caches this value in a register. In this case, >> if a cached node is moved to another chain during the scan, we can loop >> forever getting wrong nulls values and restarting the loop uninterruptedly. 
> >> > >>Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >>Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > >>--- > >> include/linux/rculist_nulls.h | 5 +++-- > >> 1 file changed, 3 insertions(+), 2 deletions(-) > >> > >>diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > >>index 2ae1371..efd51bf 100644 > >>--- a/include/linux/rculist_nulls.h > >>+++ b/include/linux/rculist_nulls.h > >>@@ -37,8 +37,9 @@ static inline void hlist_nulls_del_init_rcu(struct > >>hlist_nulls_node *n) > >> } > >> } > >> > >>-#define hlist_nulls_first_rcu(head) \ > >>- (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > >>+#define hlist_nulls_first_rcu(head) \ > >>+ (*((struct hlist_nulls_node __rcu __force **) \ > >>+ &((volatile typeof(*head) *)head)->first)) > > > > Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? That would be nice, but it would require keeping the old variant too (for use in hlist_nulls_add_head_rcu() as the rcu_assign_pointer() argument). Do you think it's better? Regards, Roman
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 12:46 ` Roman Gushchin @ 2013-05-21 12:58 ` Paul E. McKenney 0 siblings, 0 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-05-21 12:58 UTC (permalink / raw) To: Roman Gushchin Cc: Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Eric Dumazet, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, May 21, 2013 at 04:46:54PM +0400, Roman Gushchin wrote: > On 21.05.2013 16:09, Paul E. McKenney wrote: > >On Tue, May 21, 2013 at 01:05:48PM +0400, Roman Gushchin wrote: > >>Hi, all! > >> > >>This is a fix for a problem described here: > >>https://lkml.org/lkml/2013/4/16/371 . > >>--- > >> > >>Some network functions (udp4_lib_lookup2(), for instance) use the > >>hlist_nulls_for_each_entry_rcu macro in a way that assumes restarting > >>of a loop. In this case, it is strictly necessary to reread the head->first > >>value from the memory before each scan. > >>Without additional hints, gcc caches this value in a register. In this case, > >>if a cached node is moved to another chain during the scan, we can loop > >>forever getting wrong nulls values and restarting the loop uninterruptedly. 
> >> > >>Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >>Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > >>--- > >> include/linux/rculist_nulls.h | 5 +++-- > >> 1 file changed, 3 insertions(+), 2 deletions(-) > >> > >>diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > >>index 2ae1371..efd51bf 100644 > >>--- a/include/linux/rculist_nulls.h > >>+++ b/include/linux/rculist_nulls.h > >>@@ -37,8 +37,9 @@ static inline void hlist_nulls_del_init_rcu(struct > >>hlist_nulls_node *n) > >> } > >> } > >> > >>-#define hlist_nulls_first_rcu(head) \ > >>- (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > >>+#define hlist_nulls_first_rcu(head) \ > >>+ (*((struct hlist_nulls_node __rcu __force **) \ > >>+ &((volatile typeof(*head) *)head)->first)) > > > >Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? > > It will be nice, but will require to keep the old variant too (for > using in hlist_nulls_add_head_rcu() as in rcu_assign_pointer() > argument). Do you think, it's better? Both ACCESS_ONCE() and rcu_dereference_raw() can be used by updaters as well as readers, so yes, I do think that it is better. Better to keep the encapsulation rather than having to search for lots of volatile casts should this idiom ever need to change. Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 12:09 ` Paul E. McKenney 2013-05-21 12:46 ` Roman Gushchin @ 2013-05-21 13:37 ` Eric Dumazet 2013-05-21 13:44 ` Eric Dumazet 2 siblings, 0 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-21 13:37 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 05:09 -0700, Paul E. McKenney wrote: > > > > -#define hlist_nulls_first_rcu(head) \ > > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > > +#define hlist_nulls_first_rcu(head) \ > > + (*((struct hlist_nulls_node __rcu __force **) \ > > + &((volatile typeof(*head) *)head)->first)) > > Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? > +1 Very good catch Roman ! Thanks ! ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 12:09 ` Paul E. McKenney 2013-05-21 12:46 ` Roman Gushchin 2013-05-21 13:37 ` Eric Dumazet @ 2013-05-21 13:44 ` Eric Dumazet 2013-05-21 14:47 ` Roman Gushchin 2 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-21 13:44 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 05:09 -0700, Paul E. McKenney wrote: > > > > -#define hlist_nulls_first_rcu(head) \ > > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > > +#define hlist_nulls_first_rcu(head) \ > > + (*((struct hlist_nulls_node __rcu __force **) \ > > + &((volatile typeof(*head) *)head)->first)) > > Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? More exactly we have : #define list_entry_rcu(ptr, type, member) \ ({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \ container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \ }) #define list_for_each_entry_rcu(pos, head, member) \ for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ &pos->member != (head); \ pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) << and >> #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) We need to change hlist_nulls_for_each_entry_rcu() to use same construct, so that the rcu_dereference_raw() is performed at the right place. ^ permalink raw reply [flat|nested] 57+ messages in thread
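The difference Eric is pointing at is where the dereference happens: list_entry_rcu() receives the address of the pointer and performs the load inside the accessor, so it cannot be hoisted to a cached value. A userspace model of that shape (illustrative names, with a volatile read standing in for rcu_dereference_raw(); this is a sketch, not the eventual kernel patch):

```c
#include <assert.h>
#include <stddef.h>

/* Model of the list_entry_rcu() construct: the accessor takes the
 * *address* of the pointer and does the load itself, so the
 * dereference happens at each use site instead of on a value the
 * compiler may have cached in a register. */

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Stand-in for rcu_dereference_raw(): a volatile load through the
 * pointer's address. */
#define deref_once(pptr) (*(void * volatile *)(pptr))

#define entry_once(pptr, type, member) \
	container_of((struct hnode *)deref_once(pptr), type, member)

struct hnode { struct hnode *next; };
struct hhead { struct hnode *first; };
struct item  { int value; struct hnode node; };
```

A loop built on entry_once(&head->first, ...) re-reads the first pointer on every restart, because the load is part of the accessor rather than of a previously computed value.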
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 13:44 ` Eric Dumazet @ 2013-05-21 14:47 ` Roman Gushchin 2013-05-21 15:16 ` Eric Dumazet 2013-05-21 15:38 ` Eric Dumazet 0 siblings, 2 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 14:47 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 21.05.2013 17:44, Eric Dumazet wrote: > On Tue, 2013-05-21 at 05:09 -0700, Paul E. McKenney wrote: > >>> >>> -#define hlist_nulls_first_rcu(head) \ >>> - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) >>> +#define hlist_nulls_first_rcu(head) \ >>> + (*((struct hlist_nulls_node __rcu __force **) \ >>> + &((volatile typeof(*head) *)head)->first)) >> >> Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? > > More exactly we have : > > #define list_entry_rcu(ptr, type, member) \ > ({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \ > container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \ > }) > > #define list_for_each_entry_rcu(pos, head, member) \ > for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ > &pos->member != (head); \ > pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) > << and >> > > #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > (!is_a_nulls(pos)) && \ > ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > We need to change hlist_nulls_for_each_entry_rcu() to use same construct, > so that the rcu_dereference_raw() is performed at the right place. No. This code has the same mistake: it is rcu_dereference_raw(head->first), so there is nothing that prevents gcc to store the (head->first) value in a register. 
Let's see on assembler (3.4.46, net/ipv4/udp.c, udp4_lib_lookup2()): (0): here we get head address from stack (1): here we get head->first (2): this is a place, where we jump (by mistake) in original version from (3), (4) and (5). But in (4) we make same as in (1), so it's not a problem. In patched version we always jump to (1). 1) original version: 0000000000002840 <udp4_lib_lookup2.isra.35>: 2840: 41 57 push %r15 2842: 41 89 d7 mov %edx,%r15d 2845: 41 56 push %r14 2847: 41 55 push %r13 2849: 41 54 push %r12 284b: 55 push %rbp 284c: 48 89 fd mov %rdi,%rbp 284f: 53 push %rbx 2850: 48 83 ec 28 sub $0x28,%rsp (0) 2854: 48 8b 44 24 60 mov 0x60(%rsp),%rax 2859: 44 8b 64 24 68 mov 0x68(%rsp),%r12d (1) 285e: 4c 8b 28 mov (%rax),%r13 (2) 2861: 31 ff xor %edi,%edi 2863: 41 f6 c5 01 test $0x1,%r13b 2867: 4d 89 ea mov %r13,%r10 286a: bb ff ff ff ff mov $0xffffffff,%ebx 286f: 74 32 je 28a3 <udp4_lib_lookup2.isra.35+0x63> 2871: e9 20 02 00 00 jmpq 2a96 <udp4_lib_lookup2.isra.35+0x256> ... 2940: 49 d1 ea shr %r10 2943: 4d 39 e2 cmp %r12,%r10 (3) 2946: 0f 85 15 ff ff ff jne 2861 <udp4_lib_lookup2.isra.35+0x21> 294c: 48 85 ff test %rdi,%rdi 294f: 74 27 je 2978 <udp4_lib_lookup2.isra.35+0x138> 2951: 41 89 db mov %ebx,%r11d 2954: 48 8d 5f 4c lea 0x4c(%rdi),%rbx 2958: 41 ba 02 00 00 00 mov $0x2,%r10d 295e: 66 90 xchg %ax,%ax 2960: 45 8d 6a 01 lea 0x1(%r10),%r13d 2964: 44 89 d0 mov %r10d,%eax 2967: f0 44 0f b1 2b lock cmpxchg %r13d,(%rbx) 296c: 41 39 c2 cmp %eax,%r10d 296f: 74 36 je 29a7 <udp4_lib_lookup2.isra.35+0x167> 2971: 85 c0 test %eax,%eax 2973: 41 89 c2 mov %eax,%r10d 2976: 75 e8 jne 2960 <udp4_lib_lookup2.isra.35+0x120> 2978: 31 ff xor %edi,%edi 297a: 48 83 c4 28 add $0x28,%rsp 297e: 48 89 f8 mov %rdi,%rax 2981: 5b pop %rbx 2982: 5d pop %rbp 2983: 41 5c pop %r12 2985: 41 5d pop %r13 2987: 41 5e pop %r14 2989: 41 5f pop %r15 298b: c3 retq 298c: 0f 1f 40 00 nopl 0x0(%rax) 2990: 4d 8b b2 58 02 00 00 mov 0x258(%r10),%r14 2997: 41 f6 46 6e 10 testb $0x10,0x6e(%r14) 299c: 0f 85 de 
fe ff ff jne 2880 <udp4_lib_lookup2.isra.35+0x40> 29a2: e9 17 ff ff ff jmpq 28be <udp4_lib_lookup2.isra.35+0x7e> 29a7: 48 3b 6f 30 cmp 0x30(%rdi),%rbp 29ab: 74 43 je 29f0 <udp4_lib_lookup2.isra.35+0x1b0> 29ad: b8 ff ff ff ff mov $0xffffffff,%eax 29b2: 41 39 c3 cmp %eax,%r11d 29b5: 7e c3 jle 297a <udp4_lib_lookup2.isra.35+0x13a> 29b7: 89 4c 24 10 mov %ecx,0x10(%rsp) 29bb: 89 74 24 18 mov %esi,0x18(%rsp) 29bf: 44 89 44 24 08 mov %r8d,0x8(%rsp) 29c4: 44 89 0c 24 mov %r9d,(%rsp) 29c8: e8 c3 dc ff ff callq 690 <sock_put> 29cd: 48 8b 44 24 60 mov 0x60(%rsp),%rax 29d2: 8b 4c 24 10 mov 0x10(%rsp),%ecx 29d6: 8b 74 24 18 mov 0x18(%rsp),%esi 29da: 44 8b 44 24 08 mov 0x8(%rsp),%r8d 29df: 44 8b 0c 24 mov (%rsp),%r9d 29e3: 4c 8b 28 mov (%rax),%r13 (4) 29e6: e9 76 fe ff ff jmpq 2861 <udp4_lib_lookup2.isra.35+0x21> 29eb: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 29f0: 44 0f b7 57 0c movzwl 0xc(%rdi),%r10d 29f5: 66 41 83 fa 0a cmp $0xa,%r10w 29fa: 74 7f je 2a7b <udp4_lib_lookup2.isra.35+0x23b> 29fc: 3b 4f 04 cmp 0x4(%rdi),%ecx 29ff: b8 ff ff ff ff mov $0xffffffff,%eax 2a04: 75 ac jne 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a06: 0f b7 9f 7a 02 00 00 movzwl 0x27a(%rdi),%ebx 2a0d: 41 39 d8 cmp %ebx,%r8d 2a10: 75 a0 jne 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a12: 8b 1f mov (%rdi),%ebx 2a14: 66 41 83 fa 02 cmp $0x2,%r10w 2a19: 41 0f 94 c2 sete %r10b 2a1d: 45 0f b6 d2 movzbl %r10b,%r10d 2a21: 85 db test %ebx,%ebx 2a23: 44 89 d0 mov %r10d,%eax 2a26: 74 0d je 2a35 <udp4_lib_lookup2.isra.35+0x1f5> 2a28: 39 de cmp %ebx,%esi 2a2a: b8 ff ff ff ff mov $0xffffffff,%eax 2a2f: 75 81 jne 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a31: 41 8d 42 02 lea 0x2(%r10),%eax 2a35: 44 0f b7 97 78 02 00 movzwl 0x278(%rdi),%r10d 2a3c: 00 2a3d: 66 45 85 d2 test %r10w,%r10w 2a41: 74 0d je 2a50 <udp4_lib_lookup2.isra.35+0x210> 2a43: 66 45 39 d7 cmp %r10w,%r15w 2a47: 0f 85 60 ff ff ff jne 29ad <udp4_lib_lookup2.isra.35+0x16d> 2a4d: 83 c0 02 add $0x2,%eax 2a50: 44 8b 57 10 mov 0x10(%rdi),%r10d 2a54: 45 85 d2 test %r10d,%r10d 
2a57: 0f 84 55 ff ff ff je 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a5d: 8d 58 02 lea 0x2(%rax),%ebx 2a60: 45 39 d1 cmp %r10d,%r9d 2a63: b8 ff ff ff ff mov $0xffffffff,%eax 2a68: 0f 44 c3 cmove %ebx,%eax 2a6b: e9 42 ff ff ff jmpq 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a70: 41 bb ff ff ff ff mov $0xffffffff,%r11d 2a76: e9 05 fe ff ff jmpq 2880 <udp4_lib_lookup2.isra.35+0x40> 2a7b: 48 8b 9f 70 02 00 00 mov 0x270(%rdi),%rbx 2a82: b8 ff ff ff ff mov $0xffffffff,%eax 2a87: f6 43 6e 10 testb $0x10,0x6e(%rbx) 2a8b: 0f 85 21 ff ff ff jne 29b2 <udp4_lib_lookup2.isra.35+0x172> 2a91: e9 66 ff ff ff jmpq 29fc <udp4_lib_lookup2.isra.35+0x1bc> 2a96: 4c 89 e8 mov %r13,%rax 2a99: 48 d1 e8 shr %rax 2a9c: 4c 39 e0 cmp %r12,%rax (5) 2a9f: 0f 85 bc fd ff ff jne 2861 <udp4_lib_lookup2.isra.35+0x21> 2aa5: e9 ce fe ff ff jmpq 2978 <udp4_lib_lookup2.isra.35+0x138> 2aaa: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) 2) patched version: 0000000000002840 <udp4_lib_lookup2>: 2840: 41 57 push %r15 2842: 41 89 d7 mov %edx,%r15d 2845: 41 56 push %r14 2847: 41 55 push %r13 2849: 41 54 push %r12 284b: 55 push %rbp 284c: 48 89 fd mov %rdi,%rbp 284f: 53 push %rbx 2850: 48 83 ec 28 sub $0x28,%rsp 2854: 44 8b 6c 24 68 mov 0x68(%rsp),%r13d (0) 2859: 4c 8b 64 24 60 mov 0x60(%rsp),%r12 (1) 285e: 4d 8b 14 24 mov (%r12),%r10 (2) 2862: 31 ff xor %edi,%edi 2864: bb ff ff ff ff mov $0xffffffff,%ebx 2869: 41 f6 c2 01 test $0x1,%r10b 286d: 74 2c je 289b <udp4_lib_lookup2+0x5b> 286f: e9 0a 02 00 00 jmpq 2a7e <udp4_lib_lookup2+0x23e> ... 
2930: 49 d1 ea shr %r10 2933: 4d 39 d5 cmp %r10,%r13 (3) 2936: 0f 85 22 ff ff ff jne 285e <udp4_lib_lookup2+0x1e> 293c: 48 85 ff test %rdi,%rdi 293f: 74 27 je 2968 <udp4_lib_lookup2+0x128> 2941: 41 89 db mov %ebx,%r11d 2944: 48 8d 5f 4c lea 0x4c(%rdi),%rbx 2948: 41 ba 02 00 00 00 mov $0x2,%r10d 294e: 66 90 xchg %ax,%ax 2950: 45 8d 72 01 lea 0x1(%r10),%r14d 2954: 44 89 d0 mov %r10d,%eax 2957: f0 44 0f b1 33 lock cmpxchg %r14d,(%rbx) 295c: 41 39 c2 cmp %eax,%r10d 295f: 74 36 je 2997 <udp4_lib_lookup2+0x157> 2961: 85 c0 test %eax,%eax 2963: 41 89 c2 mov %eax,%r10d 2966: 75 e8 jne 2950 <udp4_lib_lookup2+0x110> 2968: 31 ff xor %edi,%edi 296a: 48 83 c4 28 add $0x28,%rsp 296e: 48 89 f8 mov %rdi,%rax 2971: 5b pop %rbx 2972: 5d pop %rbp 2973: 41 5c pop %r12 2975: 41 5d pop %r13 2977: 41 5e pop %r14 2979: 41 5f pop %r15 297b: c3 retq 297c: 0f 1f 40 00 nopl 0x0(%rax) 2980: 4d 8b b2 58 02 00 00 mov 0x258(%r10),%r14 2987: 41 f6 46 6e 10 testb $0x10,0x6e(%r14) 298c: 0f 85 e6 fe ff ff jne 2878 <udp4_lib_lookup2+0x38> 2992: e9 1f ff ff ff jmpq 28b6 <udp4_lib_lookup2+0x76> 2997: 48 3b 6f 30 cmp 0x30(%rdi),%rbp 299b: 74 3b je 29d8 <udp4_lib_lookup2+0x198> 299d: b8 ff ff ff ff mov $0xffffffff,%eax 29a2: 41 39 c3 cmp %eax,%r11d 29a5: 7e c3 jle 296a <udp4_lib_lookup2+0x12a> 29a7: 89 4c 24 10 mov %ecx,0x10(%rsp) 29ab: 89 74 24 18 mov %esi,0x18(%rsp) 29af: 44 89 44 24 08 mov %r8d,0x8(%rsp) 29b4: 44 89 0c 24 mov %r9d,(%rsp) 29b8: e8 d3 dc ff ff callq 690 <sock_put> 29bd: 8b 4c 24 10 mov 0x10(%rsp),%ecx 29c1: 8b 74 24 18 mov 0x18(%rsp),%esi 29c5: 44 8b 44 24 08 mov 0x8(%rsp),%r8d 29ca: 44 8b 0c 24 mov (%rsp),%r9d (4) 29ce: e9 8b fe ff ff jmpq 285e <udp4_lib_lookup2+0x1e> 29d3: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) 29d8: 44 0f b7 57 0c movzwl 0xc(%rdi),%r10d 29dd: 66 41 83 fa 0a cmp $0xa,%r10w 29e2: 74 7f je 2a63 <udp4_lib_lookup2+0x223> 29e4: 3b 4f 04 cmp 0x4(%rdi),%ecx 29e7: b8 ff ff ff ff mov $0xffffffff,%eax 29ec: 75 b4 jne 29a2 <udp4_lib_lookup2+0x162> 29ee: 0f b7 9f 7a 02 00 00 movzwl 
0x27a(%rdi),%ebx 29f5: 41 39 d8 cmp %ebx,%r8d 29f8: 75 a8 jne 29a2 <udp4_lib_lookup2+0x162> 29fa: 8b 1f mov (%rdi),%ebx 29fc: 66 41 83 fa 02 cmp $0x2,%r10w 2a01: 41 0f 94 c2 sete %r10b 2a05: 45 0f b6 d2 movzbl %r10b,%r10d 2a09: 85 db test %ebx,%ebx 2a0b: 44 89 d0 mov %r10d,%eax 2a0e: 74 0d je 2a1d <udp4_lib_lookup2+0x1dd> 2a10: 39 de cmp %ebx,%esi 2a12: b8 ff ff ff ff mov $0xffffffff,%eax 2a17: 75 89 jne 29a2 <udp4_lib_lookup2+0x162> 2a19: 41 8d 42 02 lea 0x2(%r10),%eax 2a1d: 44 0f b7 97 78 02 00 movzwl 0x278(%rdi),%r10d 2a24: 00 2a25: 66 45 85 d2 test %r10w,%r10w 2a29: 74 0d je 2a38 <udp4_lib_lookup2+0x1f8> 2a2b: 66 45 39 d7 cmp %r10w,%r15w 2a2f: 0f 85 68 ff ff ff jne 299d <udp4_lib_lookup2+0x15d> 2a35: 83 c0 02 add $0x2,%eax 2a38: 44 8b 57 10 mov 0x10(%rdi),%r10d 2a3c: 45 85 d2 test %r10d,%r10d 2a3f: 0f 84 5d ff ff ff je 29a2 <udp4_lib_lookup2+0x162> 2a45: 8d 58 02 lea 0x2(%rax),%ebx 2a48: 45 39 d1 cmp %r10d,%r9d 2a4b: b8 ff ff ff ff mov $0xffffffff,%eax 2a50: 0f 44 c3 cmove %ebx,%eax 2a53: e9 4a ff ff ff jmpq 29a2 <udp4_lib_lookup2+0x162> 2a58: 41 bb ff ff ff ff mov $0xffffffff,%r11d 2a5e: e9 15 fe ff ff jmpq 2878 <udp4_lib_lookup2+0x38> 2a63: 48 8b 9f 70 02 00 00 mov 0x270(%rdi),%rbx 2a6a: b8 ff ff ff ff mov $0xffffffff,%eax 2a6f: f6 43 6e 10 testb $0x10,0x6e(%rbx) 2a73: 0f 85 29 ff ff ff jne 29a2 <udp4_lib_lookup2+0x162> 2a79: e9 66 ff ff ff jmpq 29e4 <udp4_lib_lookup2+0x1a4> 2a7e: 49 d1 ea shr %r10 2a81: 4d 39 d5 cmp %r10,%r13 (5) 2a84: 0f 85 d4 fd ff ff jne 285e <udp4_lib_lookup2+0x1e> 2a8a: e9 d9 fe ff ff jmpq 2968 <udp4_lib_lookup2+0x128> 2a8f: 90 nop > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > ^ permalink raw reply [flat|nested] 57+ messages in thread
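For readers following the listings: the recurring `shr` / `cmp` pair at points (3) and (5) is the nulls check. It strips the end-of-chain marker bit and compares the slot number encoded in the remaining bits; a minimal C rendering of those two operations (illustrative, matching the kernel's nulls encoding):

```c
#include <assert.h>

/* The 'test $0x1,...' / 'shr' / 'cmp' sequences in both listings are
 * the nulls-marker operations: the low bit flags end-of-chain, the
 * remaining bits encode which hash slot the chain belongs to.  When
 * the comparison fails the scan restarts; in the patched build the
 * restart target includes the reload of head->first, while the buggy
 * build jumps past it and reuses the cached register. */

static int is_a_nulls(unsigned long ptr_bits)
{
	return ptr_bits & 1UL;   /* the 'test $0x1,%r13b' */
}

static unsigned long get_nulls_value(unsigned long ptr_bits)
{
	return ptr_bits >> 1;    /* the single 'shr' */
}
```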
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 14:47 ` Roman Gushchin @ 2013-05-21 15:16 ` Eric Dumazet 2013-05-21 15:51 ` Roman Gushchin 2013-05-21 15:38 ` Eric Dumazet 1 sibling, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-21 15:16 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 18:47 +0400, Roman Gushchin wrote: > On 21.05.2013 17:44, Eric Dumazet wrote: > > On Tue, 2013-05-21 at 05:09 -0700, Paul E. McKenney wrote: > > > >>> > >>> -#define hlist_nulls_first_rcu(head) \ > >>> - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > >>> +#define hlist_nulls_first_rcu(head) \ > >>> + (*((struct hlist_nulls_node __rcu __force **) \ > >>> + &((volatile typeof(*head) *)head)->first)) > >> > >> Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? > > > > More exactly we have : > > > > #define list_entry_rcu(ptr, type, member) \ > > ({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \ > > container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \ > > }) > > > > #define list_for_each_entry_rcu(pos, head, member) \ > > for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ > > &pos->member != (head); \ > > pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) > > << and >> > > > > #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > > for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > > (!is_a_nulls(pos)) && \ > > ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > > pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > > > We need to change hlist_nulls_for_each_entry_rcu() to use same construct, > > so that the rcu_dereference_raw() is performed at the right place. > > No. 
> > This code has the same mistake: it is rcu_dereference_raw(head->first), > so there is nothing that prevents gcc to store the (head->first) value > in a register. Please read again what I wrote; you misunderstood. hlist_nulls_for_each_entry_rcu() should use the same construct as list_for_each_entry_rcu(), and not use rcu_dereference_raw(). Is that clear, or do you want me to send the patch?
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 15:16 ` Eric Dumazet @ 2013-05-21 15:51 ` Roman Gushchin 0 siblings, 0 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 15:51 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 21.05.2013 19:16, Eric Dumazet wrote: > On Tue, 2013-05-21 at 18:47 +0400, Roman Gushchin wrote: >> On 21.05.2013 17:44, Eric Dumazet wrote: >>> On Tue, 2013-05-21 at 05:09 -0700, Paul E. McKenney wrote: >>> >>>>> >>>>> -#define hlist_nulls_first_rcu(head) \ >>>>> - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) >>>>> +#define hlist_nulls_first_rcu(head) \ >>>>> + (*((struct hlist_nulls_node __rcu __force **) \ >>>>> + &((volatile typeof(*head) *)head)->first)) >>>> >>>> Why not use ACCESS_ONCE() or (better) rcu_dereference_raw() here? >>> >>> More exactly we have : >>> >>> #define list_entry_rcu(ptr, type, member) \ >>> ({typeof (*ptr) __rcu *__ptr = (typeof (*ptr) __rcu __force *)ptr; \ >>> container_of((typeof(ptr))rcu_dereference_raw(__ptr), type, member); \ >>> }) >>> >>> #define list_for_each_entry_rcu(pos, head, member) \ >>> for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ >>> &pos->member != (head); \ >>> pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) >>> << and >> >>> >>> #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ >>> for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ >>> (!is_a_nulls(pos)) && \ >>> ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ >>> pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) >>> >>> We need to change hlist_nulls_for_each_entry_rcu() to use same construct, >>> so that the rcu_dereference_raw() is performed at the right place. >> >> No. 
>> >> This code has the same mistake: it is rcu_dereference_raw(head->first), >> so there is nothing that prevents gcc from storing the (head->first) value >> in a register. > > Please read again what I wrote; you misunderstood. > > hlist_nulls_for_each_entry_rcu() should use the same construct as > list_for_each_entry_rcu(), and not use rcu_dereference_raw() > > Is that clear, or do you want me to send the patch? If you think it will solve the problem, please send a patch. I think you are wrong here. If you only think it will look better, I agree. Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 14:47 ` Roman Gushchin 2013-05-21 15:16 ` Eric Dumazet @ 2013-05-21 15:38 ` Eric Dumazet 2013-05-21 15:51 ` Roman Gushchin 2013-05-21 18:12 ` [PATCH v2] " Roman Gushchin 1 sibling, 2 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-21 15:38 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 18:47 +0400, Roman Gushchin wrote: > This code has the same mistake: it is rcu_dereference_raw(head->first), > so there is nothing that prevents gcc from storing the (head->first) value > in a register. If other rcu accessors have the same problem, a more complete patch is needed. Thanks ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 15:38 ` Eric Dumazet @ 2013-05-21 15:51 ` Roman Gushchin 2013-05-21 18:12 ` [PATCH v2] " Roman Gushchin 1 sibling, 0 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 15:51 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 21.05.2013 19:38, Eric Dumazet wrote: > On Tue, 2013-05-21 at 18:47 +0400, Roman Gushchin wrote: > >> This code has the same mistake: it is rcu_dereference_raw(head->first), >> so there is nothing that prevents gcc from storing the (head->first) value >> in a register. > > If other rcu accessors have the same problem, a more complete patch is > needed. Ok, I'll try to provide a complete patch ASAP. > > Thanks > > > ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 15:38 ` Eric Dumazet 2013-05-21 15:51 ` Roman Gushchin @ 2013-05-21 18:12 ` Roman Gushchin 2013-05-22 2:01 ` Eric Dumazet 1 sibling, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-21 18:12 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy > If other rcu accessors have the same problem, a more complete patch is > needed. [PATCH] rcu: fix a race in rcu lists traverse macros Some network functions (udp4_lib_lookup2(), for instance) use the rcu lists traverse macros (hlist_nulls_for_each_entry_rcu, for instance) in a way that assumes restarting of a loop. In this case, it is strictly necessary to reread the head->first value from the memory before each scan. Without additional hints, gcc caches this value in a register. In this case, if a cached node is moved to another chain during the scan, we can loop forever getting wrong nulls values and restarting the loop uninterruptedly. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> --- include/linux/compiler.h | 6 ++++++ include/linux/rculist.h | 6 ++++-- include/linux/rculist_nulls.h | 3 ++- 3 files changed, 12 insertions(+), 3 deletions(-) diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 92669cd..0e05d7c 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -351,6 +351,12 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect); */ #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) +/* + * Prevent the compiler from optimizing accesses to structure member. 
+ */ +#define GET_STRUCT_FIELD_VOLATILE(struct_ptr, field) \ + (((volatile typeof(*struct_ptr) *)struct_ptr)->field) + /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */ #ifdef CONFIG_KPROBES # define __kprobes __attribute__((__section__(".kprobes.text"))) diff --git a/include/linux/rculist.h b/include/linux/rculist.h index 8089e35..6c3eea7 100644 --- a/include/linux/rculist.h +++ b/include/linux/rculist.h @@ -282,7 +282,8 @@ static inline void list_splice_init_rcu(struct list_head *list, * as long as the traversal is guarded by rcu_read_lock(). */ #define list_for_each_entry_rcu(pos, head, member) \ - for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ + for (pos = list_entry_rcu(GET_STRUCT_FIELD_VOLATILE(head, next), \ + typeof(*pos), member); \ &pos->member != (head); \ pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) @@ -348,7 +349,8 @@ static inline void hlist_replace_rcu(struct hlist_node *old, /* * return the first or the next element in an RCU protected hlist */ -#define hlist_first_rcu(head) (*((struct hlist_node __rcu **)(&(head)->first))) +#define hlist_first_rcu(head) (*((struct hlist_node __rcu **) \ + (&GET_STRUCT_FIELD_VOLATILE(head, first)))) #define hlist_next_rcu(node) (*((struct hlist_node __rcu **)(&(node)->next))) #define hlist_pprev_rcu(node) (*((struct hlist_node __rcu **)((node)->pprev))) diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..b04fd0a 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -38,7 +38,8 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) } #define hlist_nulls_first_rcu(head) \ - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) + (*((struct hlist_nulls_node __rcu __force **) \ + &(GET_STRUCT_FIELD_VOLATILE(head, first)))) #define hlist_nulls_next_rcu(node) \ (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) -- 1.8.1.2 ^ permalink raw reply 
related [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-21 18:12 ` [PATCH v2] " Roman Gushchin @ 2013-05-22 2:01 ` Eric Dumazet 2013-05-22 5:49 ` Eric Dumazet 2013-05-22 9:58 ` Paul E. McKenney 0 siblings, 2 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 2:01 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 22:12 +0400, Roman Gushchin wrote: > > If other rcu accessors have the same problem, a more complete patch is > > needed. > > [PATCH] rcu: fix a race in rcu lists traverse macros > > Some network functions (udp4_lib_lookup2(), for instance) use the > rcu lists traverse macros (hlist_nulls_for_each_entry_rcu, for instance) > in a way that assumes restarting of a loop. In this case, it is strictly > necessary to reread the head->first value from the memory before each scan. > Without additional hints, gcc caches this value in a register. In this case, > if a cached node is moved to another chain during the scan, we can loop > forever getting wrong nulls values and restarting the loop uninterruptedly. > > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > --- > include/linux/compiler.h | 6 ++++++ > include/linux/rculist.h | 6 ++++-- > include/linux/rculist_nulls.h | 3 ++- > 3 files changed, 12 insertions(+), 3 deletions(-) > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > index 92669cd..0e05d7c 100644 > --- a/include/linux/compiler.h > +++ b/include/linux/compiler.h > @@ -351,6 +351,12 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect); > */ > #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) > > +/* > + * Prevent the compiler from optimizing accesses to structure member. 
> + */ > +#define GET_STRUCT_FIELD_VOLATILE(struct_ptr, field) \ > + (((volatile typeof(*struct_ptr) *)struct_ptr)->field) > + > /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */ > #ifdef CONFIG_KPROBES > # define __kprobes __attribute__((__section__(".kprobes.text"))) > diff --git a/include/linux/rculist.h b/include/linux/rculist.h > index 8089e35..6c3eea7 100644 > --- a/include/linux/rculist.h > +++ b/include/linux/rculist.h > @@ -282,7 +282,8 @@ static inline void list_splice_init_rcu(struct list_head *list, > * as long as the traversal is guarded by rcu_read_lock(). > */ > #define list_for_each_entry_rcu(pos, head, member) \ > - for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ > + for (pos = list_entry_rcu(GET_STRUCT_FIELD_VOLATILE(head, next), \ > + typeof(*pos), member); \ > &pos->member != (head); \ > pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) > > @@ -348,7 +349,8 @@ static inline void hlist_replace_rcu(struct hlist_node *old, > /* > * return the first or the next element in an RCU protected hlist > */ > -#define hlist_first_rcu(head) (*((struct hlist_node __rcu **)(&(head)->first))) > +#define hlist_first_rcu(head) (*((struct hlist_node __rcu **) \ > + (&GET_STRUCT_FIELD_VOLATILE(head, first)))) > #define hlist_next_rcu(node) (*((struct hlist_node __rcu **)(&(node)->next))) > #define hlist_pprev_rcu(node) (*((struct hlist_node __rcu **)((node)->pprev))) > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > index 2ae1371..b04fd0a 100644 > --- a/include/linux/rculist_nulls.h > +++ b/include/linux/rculist_nulls.h > @@ -38,7 +38,8 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) > } > > #define hlist_nulls_first_rcu(head) \ > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > + (*((struct hlist_nulls_node __rcu __force **) \ > + &(GET_STRUCT_FIELD_VOLATILE(head, first)))) > > #define hlist_nulls_next_rcu(node) \ > 
(*((struct hlist_nulls_node __rcu __force **)&(node)->next)) Please use ACCESS_ONCE(), which is the standard way to deal with this, and remove the rcu_dereference_raw() in hlist_nulls_for_each_entry_rcu() something like : (for the nulls part only) diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..1d485d3 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -40,6 +40,9 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) #define hlist_nulls_first_rcu(head) \ (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) +#define hlist_nulls_first_rcu_once(head) \ + ACCESS_ONCE(*((struct hlist_nulls_node __rcu __force **)&(head)->first)) + #define hlist_nulls_next_rcu(node) \ (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) @@ -107,7 +110,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ + for (pos = hlist_nulls_first_rcu_once(head); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 2:01 ` Eric Dumazet @ 2013-05-22 5:49 ` Eric Dumazet 2013-05-22 11:58 ` Roman Gushchin 2013-05-22 9:58 ` Paul E. McKenney 1 sibling, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 5:49 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-21 at 19:01 -0700, Eric Dumazet wrote: > Please use ACCESS_ONCE(), which is the standard way to deal with this, > and remove the rcu_dereference_raw() in > hlist_nulls_for_each_entry_rcu() > > something like : (for the nulls part only) Thinking a bit more about this, I am a bit worried by other uses of ACCESS_ONCE(ptr->field) The intent of rcu_dereference(ptr->field) is to reload ptr->field every time from memory. Do we really need a #define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) and change all rcu_dereference(ptr->field) occurrences? I'm probably missing something obvious. Anyway, the following patch also seems to solve the problem; I need a sleep to understand why.
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 0bf5d39..10b5c54 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -410,7 +410,7 @@ static inline int compute_score2(struct sock *sk, struct net *net, static struct sock *udp4_lib_lookup2(struct net *net, __be32 saddr, __be16 sport, __be32 daddr, unsigned int hnum, int dif, - struct udp_hslot *hslot2, unsigned int slot2) + struct hlist_nulls_head *head, unsigned int slot2) { struct sock *sk, *result; struct hlist_nulls_node *node; @@ -420,7 +420,7 @@ static struct sock *udp4_lib_lookup2(struct net *net, begin: result = NULL; badness = 0; - udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { + udp_portaddr_for_each_entry_rcu(sk, node, head) { score = compute_score2(sk, net, saddr, sport, daddr, hnum, dif); if (score > badness) { @@ -483,7 +483,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr, result = udp4_lib_lookup2(net, saddr, sport, daddr, hnum, dif, - hslot2, slot2); + &hslot2->head, slot2); if (!result) { hash2 = udp4_portaddr_hash(net, htonl(INADDR_ANY), hnum); slot2 = hash2 & udptable->mask; @@ -493,7 +493,7 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr, result = udp4_lib_lookup2(net, saddr, sport, htonl(INADDR_ANY), hnum, dif, - hslot2, slot2); + &hslot2->head, slot2); } rcu_read_unlock(); return result; diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 42923b1..b5bc667 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -200,7 +200,7 @@ static inline int compute_score2(struct sock *sk, struct net *net, static struct sock *udp6_lib_lookup2(struct net *net, const struct in6_addr *saddr, __be16 sport, const struct in6_addr *daddr, unsigned int hnum, int dif, - struct udp_hslot *hslot2, unsigned int slot2) + struct hlist_nulls_head *head, unsigned int slot2) { struct sock *sk, *result; struct hlist_nulls_node *node; @@ -210,7 +210,7 @@ static struct sock *udp6_lib_lookup2(struct net *net, begin: result = NULL; badness = -1; - udp_portaddr_for_each_entry_rcu(sk, 
node, &hslot2->head) { + udp_portaddr_for_each_entry_rcu(sk, node, head) { score = compute_score2(sk, net, saddr, sport, daddr, hnum, dif); if (score > badness) { @@ -274,7 +274,7 @@ struct sock *__udp6_lib_lookup(struct net *net, result = udp6_lib_lookup2(net, saddr, sport, daddr, hnum, dif, - hslot2, slot2); + &hslot2->head, slot2); if (!result) { hash2 = udp6_portaddr_hash(net, &in6addr_any, hnum); slot2 = hash2 & udptable->mask; @@ -284,7 +284,7 @@ struct sock *__udp6_lib_lookup(struct net *net, result = udp6_lib_lookup2(net, saddr, sport, &in6addr_any, hnum, dif, - hslot2, slot2); + &hslot2->head, slot2); } rcu_read_unlock(); return result; ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 5:49 ` Eric Dumazet @ 2013-05-22 11:58 ` Roman Gushchin 2013-05-22 12:30 ` Eric Dumazet 0 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-22 11:58 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 22.05.2013 09:49, Eric Dumazet wrote: > On Tue, 2013-05-21 at 19:01 -0700, Eric Dumazet wrote: > >> Please use ACCESS_ONCE(), which is the standard way to deal with this, >> and remove the rcu_dereference_raw() in >> hlist_nulls_for_each_entry_rcu() >> >> something like : (for the nulls part only) > > Thinking a bit more about this, I am a bit worried by other uses of > ACCESS_ONCE(ptr->field) > > The intent of rcu_dereference(ptr->field) is to reload ptr->field every time > from memory. Exactly. > > Do we really need a > > #define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) > > #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) > > and change all rcu_dereference(ptr->field) occurrences? Yes. But not all: only "head->first". The node variable (or any other iterator) is always a local variable that changes on every iteration, so we can rely on gcc there. > > I'm probably missing something obvious. > > Anyway, the following patch also seems to solve the problem; I need a sleep to understand why. > > struct hlist_nulls_node *node; > @@ -420,7 +420,7 @@ static struct sock *udp4_lib_lookup2(struct net *net, > begin: > result = NULL; > badness = 0; > - udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { > + udp_portaddr_for_each_entry_rcu(sk, node, head) { > score = compute_score2(sk, net, saddr, sport, > daddr, hnum, dif); > if (score > badness) { IMHO, it doesn't solve the problem. OK, OK, it can actually solve it, just as almost ANY modification of the related code can.
For instance, adding something like unsigned int c = 0; udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { ... c++; if (c > 1000000) printk(...); } also "solves" the problem. It seems semantically strange to use the ACCESS_(FIELD)_ONCE macro to tell gcc to reread something every time. So, I created corresponding rcu_dereference_field(_raw/_bh) macros. Please find my third attempt at a patch below. Regards, Roman ----------------------------------------------------------------------- rcu: fix a race in rcu lists traverse macros Some network functions (udp4_lib_lookup2(), for instance) use the rcu lists traverse macros (hlist_nulls_for_each_entry_rcu, for instance) in a way that assumes restarting of a loop. In this case, it is strictly necessary to reread the head->first value from the memory before each scan. Without additional hints, gcc caches this value in a register. In this case, if a cached node is moved to another chain during the scan, we can loop forever getting wrong nulls values and restarting the loop uninterruptedly. Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> --- include/linux/compiler.h | 6 ++++++ include/linux/rculist.h | 9 +++++---- include/linux/rculist_nulls.h | 2 +- include/linux/rcupdate.h | 9 +++++++++ 4 files changed, 21 insertions(+), 5 deletions(-) diff --git a/include/linux/compiler.h b/include/linux/compiler.h index 92669cd..4109fab 100644 --- a/include/linux/compiler.h +++ b/include/linux/compiler.h @@ -351,6 +351,12 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect); */ #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) +/* + * Same as ACCESS_ONCE(), but used for accessing field of a structure. + * The main goal is preventing compiler to store &ptr->field in a register.
+ */ +#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) + /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */ #ifdef CONFIG_KPROBES # define __kprobes __attribute__((__section__(".kprobes.text"))) diff --git a/include/linux/rculist.h b/include/linux/rculist.h index 8089e35..7582cfe 100644 --- a/include/linux/rculist.h +++ b/include/linux/rculist.h @@ -282,7 +282,8 @@ static inline void list_splice_init_rcu(struct list_head *list, * as long as the traversal is guarded by rcu_read_lock(). */ #define list_for_each_entry_rcu(pos, head, member) \ - for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ + for (pos = list_entry_rcu(ACCESS_FIELD_ONCE(head, next), \ + typeof(*pos), member); \ &pos->member != (head); \ pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) @@ -439,7 +440,7 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev, } #define __hlist_for_each_rcu(pos, head) \ - for (pos = rcu_dereference(hlist_first_rcu(head)); \ + for (pos = rcu_dereference_field(head, first); \ pos; \ pos = rcu_dereference(hlist_next_rcu(pos))) @@ -454,7 +455,7 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev, * as long as the traversal is guarded by rcu_read_lock(). */ #define hlist_for_each_entry_rcu(pos, head, member) \ - for (pos = hlist_entry_safe (rcu_dereference_raw(hlist_first_rcu(head)),\ + for (pos = hlist_entry_safe(rcu_dereference_field_raw(head, first), \ typeof(*(pos)), member); \ pos; \ pos = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(\ @@ -471,7 +472,7 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev, * as long as the traversal is guarded by rcu_read_lock(). 
*/ #define hlist_for_each_entry_rcu_bh(pos, head, member) \ - for (pos = hlist_entry_safe(rcu_dereference_bh(hlist_first_rcu(head)),\ + for (pos = hlist_entry_safe(rcu_dereference_field_bh(head, first), \ typeof(*(pos)), member); \ pos; \ pos = hlist_entry_safe(rcu_dereference_bh(hlist_next_rcu(\ diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..ef431b4 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -107,7 +107,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ + for (pos = rcu_dereference_field_raw(head, first); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h index 4ccd68e..6fef0c2 100644 --- a/include/linux/rcupdate.h +++ b/include/linux/rcupdate.h @@ -640,6 +640,9 @@ static inline void rcu_preempt_sleep_check(void) #define rcu_dereference_raw(p) rcu_dereference_check(p, 1) /*@@@ needed? 
@@@*/ +#define rcu_dereference_field_raw(PTR, FIELD) \ + rcu_dereference_raw(ACCESS_FIELD_ONCE(PTR, FIELD)) + /** * rcu_access_index() - fetch RCU index with no dereferencing * @p: The index to read @@ -704,6 +707,9 @@ static inline void rcu_preempt_sleep_check(void) */ #define rcu_dereference(p) rcu_dereference_check(p, 0) +#define rcu_dereference_field(PTR, FIELD) \ + rcu_dereference(ACCESS_FIELD_ONCE(PTR, FIELD)) + /** * rcu_dereference_bh() - fetch an RCU-bh-protected pointer for dereferencing * @p: The pointer to read, prior to dereferencing @@ -712,6 +718,9 @@ static inline void rcu_preempt_sleep_check(void) */ #define rcu_dereference_bh(p) rcu_dereference_bh_check(p, 0) +#define rcu_dereference_field_bh(PTR, FIELD) \ + rcu_dereference_bh(ACCESS_FIELD_ONCE(PTR, FIELD)) + /** * rcu_dereference_sched() - fetch RCU-sched-protected pointer for dereferencing * @p: The pointer to read, prior to dereferencing -- 1.8.1.2 ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 11:58 ` Roman Gushchin @ 2013-05-22 12:30 ` Eric Dumazet 2013-05-22 13:07 ` Roman Gushchin 2013-05-22 13:27 ` David Laight 0 siblings, 2 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 12:30 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, 2013-05-22 at 15:58 +0400, Roman Gushchin wrote: > +/* > + * Same as ACCESS_ONCE(), but used for accessing field of a structure. > + * The main goal is preventing compiler to store &ptr->field in a register. But &ptr->field is a constant during the whole duration of udp4_lib_lookup2() and could be in a register, in my case field is at offset 0, and ptr is a parameter (so could be in a 'register') The bug you found is that compiler caches the indirection (ptr->field) into a register, not that compiler stores &ptr->field into a register. > + */ > +#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) > + Here we force the compiler to consider ptr as volatile, but semantically it is not required in rcu_dereference(ptr->field) We want field to be reloaded, not ptr. So yes, the patch appears to fix the bug, but it sounds not logical to me. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 12:30 ` Eric Dumazet @ 2013-05-22 13:07 ` Roman Gushchin 2013-05-22 17:45 ` Paul E. McKenney 2013-05-22 13:27 ` David Laight 1 sibling, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-22 13:07 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 22.05.2013 16:30, Eric Dumazet wrote: > On Wed, 2013-05-22 at 15:58 +0400, Roman Gushchin wrote: > >> +/* >> + * Same as ACCESS_ONCE(), but used for accessing field of a structure. >> + * The main goal is preventing compiler to store &ptr->field in a register. > > But &ptr->field is a constant during the whole duration of > udp4_lib_lookup2() and could be in a register, in my case field is at > offset 0, and ptr is a parameter (so could be in a 'register') > > The bug you found is that compiler caches the indirection (ptr->field) > into a register, not that compiler stores &ptr->field into a register. > >> + */ >> +#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) >> + > > Here we force the compiler to consider ptr as volatile, but semantically > it is not required in rcu_dereference(ptr->field) Actually, we need to mark the "address of the place" where the field value is located as volatile before dereferencing. I have no idea how to do it any other way, except using multiple casts and offsetof()s, and, IMHO, that would be even more complex: ACCESS_ONCE(typeof(&ptr->field)((char*)ptr + offsetof(typeof(*ptr), field))) > > We want field to be reloaded, not ptr. > > So yes, the patch appears to fix the bug, but it sounds not logical to > me. > Maybe we can enhance it by providing better/more detailed comments here? Do you have any suggestions? Thanks, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 13:07 ` Roman Gushchin @ 2013-05-22 17:45 ` Paul E. McKenney 2013-05-22 19:17 ` Roman Gushchin 0 siblings, 1 reply; 57+ messages in thread From: Paul E. McKenney @ 2013-05-22 17:45 UTC (permalink / raw) To: Roman Gushchin Cc: Eric Dumazet, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, May 22, 2013 at 05:07:07PM +0400, Roman Gushchin wrote: > On 22.05.2013 16:30, Eric Dumazet wrote: > >On Wed, 2013-05-22 at 15:58 +0400, Roman Gushchin wrote: > > > >>+/* > >>+ * Same as ACCESS_ONCE(), but used for accessing field of a structure. > >>+ * The main goal is preventing compiler to store &ptr->field in a register. > > > >But &ptr->field is a constant during the whole duration of > >udp4_lib_lookup2() and could be in a register, in my case field is at > >offset 0, and ptr is a parameter (so could be in a 'register') > > > >The bug you found is that compiler caches the indirection (ptr->field) > >into a register, not that compiler stores &ptr->field into a register. > > > >>+ */ > >>+#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) > >>+ > > > >Here we force the compiler to consider ptr as volatile, but semantically > >it is not required in rcu_dereference(ptr->field) > > Actually, we need to mark an "address of a place" where the field value is > located as volatile before dereferencing. I have no idea how to do it in another way, > except using multiple casts and offsetof's, but, IMHO, it will be even more complex: > ACCESS_ONCE(typeof(&ptr->field)((char*)ptr + offsetof(typeof(*ptr), field))) Why not just ACCESS_ONCE(ptr->field)? Or if it is the thing that ptr->field points to that is subject to change, ACCESS_ONCE(*ptr->field)? Or rcu_dereference(ptr->field), as appropriate? Thanx, Paul > >We want field to be reloaded, not ptr. 
> > > >So yes, the patch appears to fix the bug, but it sounds not logical to > >me. > > > > Maybe we can enhance it by providing better/more detailed comments here? > Do you have any suggestions? > > Thanks, > Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 17:45 ` Paul E. McKenney @ 2013-05-22 19:17 ` Roman Gushchin 2013-05-25 11:37 ` Paul E. McKenney 0 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-22 19:17 UTC (permalink / raw) To: paulmck Cc: Eric Dumazet, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 22.05.2013 21:45, Paul E. McKenney wrote: > On Wed, May 22, 2013 at 05:07:07PM +0400, Roman Gushchin wrote: >> On 22.05.2013 16:30, Eric Dumazet wrote: >>> On Wed, 2013-05-22 at 15:58 +0400, Roman Gushchin wrote: >>> >>>> +/* >>>> + * Same as ACCESS_ONCE(), but used for accessing field of a structure. >>>> + * The main goal is preventing compiler to store &ptr->field in a register. >>> >>> But &ptr->field is a constant during the whole duration of >>> udp4_lib_lookup2() and could be in a register, in my case field is at >>> offset 0, and ptr is a parameter (so could be in a 'register') >>> >>> The bug you found is that compiler caches the indirection (ptr->field) >>> into a register, not that compiler stores &ptr->field into a register. >>> >>>> + */ >>>> +#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) >>>> + >>> >>> Here we force the compiler to consider ptr as volatile, but semantically >>> it is not required in rcu_dereference(ptr->field) >> >> Actually, we need to mark an "address of a place" where the field value is >> located as volatile before dereferencing. I have no idea how to do it in another way, >> except using multiple casts and offsetof's, but, IMHO, it will be even more complex: >> ACCESS_ONCE(typeof(&ptr->field)((char*)ptr + offsetof(typeof(*ptr), field))) Probably I miss one more ACCESS_ONCE() in this expression. Should be something like ACCESS_ONCE(typeof(&ptr->field)((char*)ACCESS_ONCE(ptr) + offsetof(typeof(*ptr), field))) . 
But this is not a working example, just an illustration against using ACCESS_ONCE() here. > Why not just ACCESS_ONCE(ptr->field)? Or if it is the thing that > ptr->field points to that is subject to change, ACCESS_ONCE(*ptr->field)? > > Or rcu_dereference(ptr->field), as appropriate? It's not enough. That is how the code is written now, and it doesn't work as expected. You can't write (ptr->field) without ptr being marked as a volatile pointer. Let me explain the problem once more from scratch: 1) We have the following pseudo-code (based on udp4_lib_lookup2()): static void some_func(struct hlist_nulls_head *head) { struct hlist_nulls_node *node; begin: for (node = rcu_dereference(head->first); !is_nulls(node) && ...; node = rcu_dereference(node->next)) { <...> } if (restart_condition) goto begin; } 2) A problem occurs when restart_condition is true and we jump to the begin label. We do not reload the value at the (head + offsetof(head, first)) address; we just dereference the OLD (head->first) pointer again. So, we get a node that WAS the first node at some previous time, instead of the current first node. That node can be dead or, for instance, can be the head of another chain. It is correct from gcc's point of view, since it doesn't expect the head pointer to change, and this address is just (head + constant offset). 3) If we start with a wrong first element, restart_condition can always be true, so we get an infinite loop. In any case, we do not scan the whole (socket) chain, as expected. This scenario is absolutely real and sometimes causes our DNS servers to hang under high load. Note that there are no problems if we don't restart the loop. Also, it is highly dependent on gcc options and the code in the body of the loop. Even small changes in the code (like adding a debugging print) can prevent the issue from reproducing. Thanks! Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 19:17 ` Roman Gushchin @ 2013-05-25 11:37 ` Paul E. McKenney 2013-05-27 11:34 ` Roman Gushchin 2013-05-27 17:55 ` Roman Gushchin 0 siblings, 2 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-05-25 11:37 UTC (permalink / raw) To: Roman Gushchin Cc: Eric Dumazet, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, May 22, 2013 at 11:17:46PM +0400, Roman Gushchin wrote: > On 22.05.2013 21:45, Paul E. McKenney wrote: > >On Wed, May 22, 2013 at 05:07:07PM +0400, Roman Gushchin wrote: > >>On 22.05.2013 16:30, Eric Dumazet wrote: > >>>On Wed, 2013-05-22 at 15:58 +0400, Roman Gushchin wrote: > >>> > >>>>+/* > >>>>+ * Same as ACCESS_ONCE(), but used for accessing field of a structure. > >>>>+ * The main goal is preventing compiler to store &ptr->field in a register. > >>> > >>>But &ptr->field is a constant during the whole duration of > >>>udp4_lib_lookup2() and could be in a register, in my case field is at > >>>offset 0, and ptr is a parameter (so could be in a 'register') > >>> > >>>The bug you found is that compiler caches the indirection (ptr->field) > >>>into a register, not that compiler stores &ptr->field into a register. > >>> > >>>>+ */ > >>>>+#define ACCESS_FIELD_ONCE(PTR, FIELD) (((volatile typeof(*PTR) *)PTR)->FIELD) > >>>>+ > >>> > >>>Here we force the compiler to consider ptr as volatile, but semantically > >>>it is not required in rcu_dereference(ptr->field) > >> > >>Actually, we need to mark an "address of a place" where the field value is > >>located as volatile before dereferencing. I have no idea how to do it in another way, > >>except using multiple casts and offsetof's, but, IMHO, it will be even more complex: > >> ACCESS_ONCE(typeof(&ptr->field)((char*)ptr + offsetof(typeof(*ptr), field))) > > Probably I miss one more ACCESS_ONCE() in this expression. 
Should be something like > ACCESS_ONCE(typeof(&ptr->field)((char*)ACCESS_ONCE(ptr) + offsetof(typeof(*ptr), field))) . > But this is not a working example, just an illustration against using ACCESS_ONCE() here. > > >Why not just ACCESS_ONCE(ptr->field)? Or if it is the thing that > >ptr->field points to that is subject to change, ACCESS_ONCE(*ptr->field)? > > > >Or rcu_dereference(ptr->field), as appropriate? > > It's not enough. So is the code now, and it doesn't work as expected. > You can't write (ptr->field) without ptr being marked as a volatile pointer. > > I try to explain the problem once more from scratch: > > 1) We have the following pseudo-code (based on udp4_lib_lookup2()): > > static void some_func(struct hlist_nulls_head *head) { > struct hlist_nulls_node *node; > > begin: > for (node = rcu_dereference(head->first); > !is_nulls(node) & ...; > node = rcu_dereference(node->next)) { > <...> > } > > if (restart_condition) > goto begin; > > 2) A problem occurs when restart_condition is true and we jump to the begin label. > We do not recalculate (head + offsetof(head, first)) address, we just dereference > again the OLD (head->first) pointer. So, we get a node, that WAS the first node in a > previous time instead of current first node. That node can be dead, or, for instance, > can be a head of another chain. OK, this is what I was referring to when I said that the RCU list macros assume that the list header is static (or equivalently, appropriately protected). With some_func() as written above, you would need to return some sort of failure indication from some_func(), and have the caller refetch head. Otherwise, as far as gcc is concerned, the value of the parameter head is constant throughout some_func(). > It is correct from gcc's point of view, since it doesn't expect the head pointer > to change, and this address is just (head + constant offset). Agreed. How does the caller calculate the value to pass in through the argument "head"? 
> 3) If we start with a wrong first element, restart_condition can be always true, so, > we get an infinite loop. In any case, we do not scan the whole (socket) chain, > as expected. Agreed. > This scenario is absolutely real and causes our DNS servers to hang > sometimes under high load. I completely believe that such a hang could happen. > Note, that there are no problems if we don't restart a loop. Also, it is highly > dependent on gcc options, and the code in the body of the loop. Even small changes > in the code (like adding debugging print) preventing reproducing of the issue. Again, I believe that your retry logic needs to extend back into the calling function for your some_func() example above. Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
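Paul's alternative — returning a failure indication so the caller refetches the head — might look roughly like this (a hypothetical sketch, not what the kernel does; all names and the retry condition are invented):

```c
#include <assert.h>
#include <stddef.h>

struct node { struct node *next; };
struct bucket { struct node *first; };

/*
 * One traversal attempt.  Returns the number of nodes seen, or -1 when
 * the final nulls value would show we ended up on the wrong chain and
 * the caller must retry (the check itself is elided in this sketch).
 */
static int scan_once(struct bucket *b, int chain_ok)
{
	struct node *p;
	int n = 0;

	for (p = b->first; p; p = p->next)
		n++;
	return chain_ok ? n : -1;
}

/*
 * The retry loop lives in the caller, so &table[slot] is recomputed on
 * every attempt and each call of scan_once() receives a freshly fetched
 * head pointer, which is the structure Paul describes.
 */
static int scan_with_retry(struct bucket *table, unsigned int slot)
{
	int ret;

	do {
		ret = scan_once(&table[slot], 1);
	} while (ret < 0);
	return ret;
}
```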
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-25 11:37 ` Paul E. McKenney @ 2013-05-27 11:34 ` Roman Gushchin 2013-05-27 17:55 ` Roman Gushchin 1 sibling, 0 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-27 11:34 UTC (permalink / raw) To: paulmck, Eric Dumazet Cc: Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 25.05.2013 15:37, Paul E. McKenney wrote: >> 2) A problem occurs when restart_condition is true and we jump to the begin label. >> We do not recalculate (head + offsetof(head, first)) address, we just dereference >> again the OLD (head->first) pointer. So, we get a node, that WAS the first node in a >> previous time instead of current first node. That node can be dead, or, for instance, >> can be a head of another chain. > > OK, this is what I was referring to when I said that the RCU list macros > assume that the list header is static (or equivalently, appropriately > protected). > > With some_func() as written above, you would need to return some sort > of failure indication from some_func(), and have the caller refetch head. > Otherwise, as far as gcc is concerned, the value of the parameter head > is constant throughout some_func(). Personally I have nothing against this approach, but I'm not sure, if networking maintainers will adopt corresponding patch. I'll try to find it out. > >> It is correct from gcc's point of view, since it doesn't expect the head pointer >> to change, and this address is just (head + constant offset). > > Agreed. > > How does the caller calculate the value to pass in through the argument "head"? struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport, __be32 daddr, __be16 dport, int dif, struct udp_table *udptable) { ... 
unsigned int hash2, slot2, slot = udp_hashfn(net, hnum, udptable->mask); struct udp_hslot *hslot2, *hslot = &udptable->hash[slot]; int score, badness; rcu_read_lock(); if (hslot->count > 10) { hash2 = udp4_portaddr_hash(net, daddr, hnum); slot2 = hash2 & udptable->mask; hslot2 = &udptable->hash2[slot2]; ... result = udp4_lib_lookup2(net, saddr, sport, daddr, hnum, dif, hslot2, slot2); ... } static struct sock *udp4_lib_lookup2(struct net *net, __be32 saddr, __be16 sport, __be32 daddr, unsigned int hnum, int dif, struct udp_hslot *hslot2, unsigned int slot2) { ... udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { ... Thank you! Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-25 11:37 ` Paul E. McKenney 2013-05-27 11:34 ` Roman Gushchin @ 2013-05-27 17:55 ` Roman Gushchin 2013-05-28 0:12 ` Eric Dumazet 1 sibling, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-27 17:55 UTC (permalink / raw) To: paulmck Cc: Eric Dumazet, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy Hi, Paul! > On 25.05.2013 15:37, Paul E. McKenney wrote: >> Again, I believe that your retry logic needs to extend back into the >> calling function for your some_func() example above. And what do you think about the following approach (diff below)? It seems to me, it's enough clear (especially with good accompanying comments) and produces a good binary code (without significant overhead). Also, we will remove a hidden reef in using rcu-protected (h)list traverses with restarts. diff --git a/include/linux/rculist.h b/include/linux/rculist.h index 6f95e24..8d8b1eb 100644 --- a/include/linux/rculist.h +++ b/include/linux/rculist.h @@ -269,7 +269,8 @@ static inline void list_splice_init_rcu(struct list_head *list, * as long as the traversal is guarded by rcu_read_lock(). */ #define list_for_each_entry_rcu(pos, head, member) \ - for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ + for (ACCESS_ONCE(*(head)), \ + pos = list_entry_rcu((head)->next, typeof(*pos), member); \ &pos->member != (head); \ pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) @@ -459,7 +460,8 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev, * as long as the traversal is guarded by rcu_read_lock(). 
*/ #define hlist_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_first_rcu(head)); \ + for (ACCESS_ONCE(*(head)), \ + pos = rcu_dereference_raw(hlist_first_rcu(head)); \ pos && \ ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_next_rcu(pos))) @@ -476,7 +478,7 @@ static inline void hlist_add_after_rcu(struct hlist_node *prev, * as long as the traversal is guarded by rcu_read_lock(). */ #define hlist_for_each_entry_rcu_bh(tpos, pos, head, member) \ - for (pos = rcu_dereference_bh((head)->first); \ + for (ACCESS_ONCE(*(head)), pos = rcu_dereference_bh((head)->first); \ pos && \ ({ tpos = hlist_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_bh(pos->next)) diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..4af5ee5 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -107,7 +107,8 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ + for (ACCESS_ONCE(*(head)), \ + pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) --- Thanks! Regards, Roman ^ permalink raw reply related [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-27 17:55 ` Roman Gushchin @ 2013-05-28 0:12 ` Eric Dumazet 2013-05-28 9:10 ` Roman Gushchin 0 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-28 0:12 UTC (permalink / raw) To: Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Mon, 2013-05-27 at 21:55 +0400, Roman Gushchin wrote: > Hi, Paul! > > > On 25.05.2013 15:37, Paul E. McKenney wrote: > >> Again, I believe that your retry logic needs to extend back into the > >> calling function for your some_func() example above. > > And what do you think about the following approach (diff below)? > > It seems to me, it's enough clear (especially with good accompanying comments) > and produces a good binary code (without significant overhead). > Also, we will remove a hidden reef in using rcu-protected (h)list traverses with restarts. > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > index 2ae1371..4af5ee5 100644 > --- a/include/linux/rculist_nulls.h > +++ b/include/linux/rculist_nulls.h > @@ -107,7 +107,8 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, > * > */ > #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > + for (ACCESS_ONCE(*(head)), \ > + pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > (!is_a_nulls(pos)) && \ > ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) It looks like this still relies on gcc being friendly here. I repeat again : @head here is a constant. 
Macro already uses ACCESS_ONCE(), we only have to instruct gcc that caching the value is forbidden if we restart the loop (aka "goto begin;" see Documentation/RCU/rculist_nulls.txt line 146) Adding a barrier() is probably what we want. I cooked followed patch and it fixes the problem. diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..4dc51b2 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -105,8 +105,12 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * @head: the head for your list. * @member: the name of the hlist_nulls_node within the struct. * + * The barrier() is needed to make sure compiler doesn't cache first element, + * as this loop can be restarted. + * (cf Documentation/RCU/rculist_nulls.txt around line 146) */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ + barrier(); \ for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ ^ permalink raw reply related [flat|nested] 57+ messages in thread
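How the barrier() interacts with a restarted traversal can be modelled in userspace (barrier() written out as the usual empty asm with a memory clobber; the list types and restart condition are invented for the demo):

```c
#include <assert.h>
#include <stddef.h>

#define barrier() __asm__ __volatile__("" : : : "memory")

struct node { struct node *next; };
struct head { struct node *first; };

/*
 * Walk the chain, restarting up to max_restarts times the way
 * udp4_lib_lookup2() restarts on a nulls mismatch.  The barrier()
 * after "begin:" tells gcc that memory may have changed, so h->first
 * is reloaded on every pass instead of being kept in a register
 * across the goto.
 */
static int walk(struct head *h, int max_restarts)
{
	struct node *p;
	int n;

begin:
	barrier();
	n = 0;
	for (p = h->first; p; p = p->next)
		n++;
	if (max_restarts-- > 0)
		goto begin;
	return n;
}
```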
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-28 0:12 ` Eric Dumazet @ 2013-05-28 9:10 ` Roman Gushchin 2013-05-29 0:34 ` Eric Dumazet 2013-05-29 1:19 ` Paul E. McKenney 0 siblings, 2 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-28 9:10 UTC (permalink / raw) To: Eric Dumazet Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 28.05.2013 04:12, Eric Dumazet wrote: > On Mon, 2013-05-27 at 21:55 +0400, Roman Gushchin wrote: >> Hi, Paul! >> >>> On 25.05.2013 15:37, Paul E. McKenney wrote: >>>> Again, I believe that your retry logic needs to extend back into the >>>> calling function for your some_func() example above. >> >> And what do you think about the following approach (diff below)? >> >> It seems to me, it's enough clear (especially with good accompanying comments) >> and produces a good binary code (without significant overhead). >> Also, we will remove a hidden reef in using rcu-protected (h)list traverses with restarts. >> > >> diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h >> index 2ae1371..4af5ee5 100644 >> --- a/include/linux/rculist_nulls.h >> +++ b/include/linux/rculist_nulls.h >> @@ -107,7 +107,8 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, >> * >> */ >> #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ >> - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ >> + for (ACCESS_ONCE(*(head)), \ >> + pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ >> (!is_a_nulls(pos)) && \ >> ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ >> pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > It looks like this still relies on gcc being friendly here. > > I repeat again : @head here is a constant. No. 
Actually, there are two volatile objects: the pointer to the first element (as part of the head structure), and the first element itself. So, to be strict, head as a structure contains a volatile field. head->first should be treated as a volatile pointer to volatile data. So, the whole head object is volatile. > > Macro already uses ACCESS_ONCE(), we only have to instruct gcc that > caching the value is forbidden if we restart the loop > (aka "goto begin;" see Documentation/RCU/rculist_nulls.txt line 146) My patch seems to be correct, because ACCESS_ONCE(*(head)) will cause gcc to (re)read the head data from memory. According to the gcc documentation: "A scalar volatile object is read when it is accessed in a void context: volatile int *src = somevalue; *src; Such expressions are rvalues, and GCC implements this as a read of the volatile object being pointed to." And this is exactly our case. > Adding a barrier() is probably what we want. I agree, inserting barrier() is also a correct and working fix. Thanks! Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-28 9:10 ` Roman Gushchin @ 2013-05-29 0:34 ` Eric Dumazet 2013-05-29 1:31 ` Paul E. McKenney 2013-05-29 9:17 ` Roman Gushchin 2013-05-29 1:19 ` Paul E. McKenney 1 sibling, 2 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-29 0:34 UTC (permalink / raw) To: Roman Gushchin, Jesper Dangaard Brouer Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-28 at 13:10 +0400, Roman Gushchin wrote: > On 28.05.2013 04:12, Eric Dumazet wrote: > > Adding a barrier() is probably what we want. > > I agree, inserting barrier() is also a correct and working fix. Yeah, but I can not find a clean way to put it inside the "for (;;)" for (barrier();;) -> error: expected expression before ‘__asm__’ No user currently does : if (condition) hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) But who knows... ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 0:34 ` Eric Dumazet @ 2013-05-29 1:31 ` Paul E. McKenney 2013-05-29 5:08 ` Eric Dumazet 2013-05-29 9:17 ` Roman Gushchin 1 sibling, 1 reply; 57+ messages in thread From: Paul E. McKenney @ 2013-05-29 1:31 UTC (permalink / raw) To: Eric Dumazet Cc: Roman Gushchin, Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, May 28, 2013 at 05:34:53PM -0700, Eric Dumazet wrote: > On Tue, 2013-05-28 at 13:10 +0400, Roman Gushchin wrote: > > On 28.05.2013 04:12, Eric Dumazet wrote: > > > > Adding a barrier() is probably what we want. > > > > I agree, inserting barrier() is also a correct and working fix. > > Yeah, but I can not find a clean way to put it inside the "for (;;)" > > for (barrier();;) -> > > error: expected expression before ‘__asm__’ > > No user currently does : > > if (condition) > hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) > > But who knows... I still have my earlier question, but I suggest "({ barrier(); XXX })" to put the barrier into the for loop, either in the second or third clause, where XXX was the original second or third clause. Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 1:31 ` Paul E. McKenney @ 2013-05-29 5:08 ` Eric Dumazet 2013-05-29 10:09 ` Roman Gushchin 0 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-29 5:08 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, 2013-05-28 at 18:31 -0700, Paul E. McKenney wrote: > On Tue, May 28, 2013 at 05:34:53PM -0700, Eric Dumazet wrote: > > On Tue, 2013-05-28 at 13:10 +0400, Roman Gushchin wrote: > > > On 28.05.2013 04:12, Eric Dumazet wrote: > > > > > > Adding a barrier() is probably what we want. > > > > > > I agree, inserting barrier() is also a correct and working fix. > > > > Yeah, but I can not find a clean way to put it inside the "for (;;)" > > > > for (barrier();;) -> > > > > error: expected expression before ‘__asm__’ > > > > No user currently does : > > > > if (condition) > > hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) > > > > But who knows... > > I still have my earlier question, but I suggest "({ barrier(); XXX })" > to put the barrier into the for loop, either in the second or third > clause, where XXX was the original second or third clause. > > Hmm, it doesn't work, I wanted to replace : barrier(); for (expr1 ; expr2; expr3) { by : for ( some_clever_thing ; expr2; expr3) { So barrier() should not be in second or third clause : it must be done once and before "expr1". Not a big deal anyway, we're not adding new hlist_nulls_for_each_entry_rcu() users :) About your earlier question, I really don't know why compiler would cache a memory read if we explicitly use barrier() to prevent this from happening. BTW Roman patch generates a double load as in : 2cb1: 49 8b 07 mov (%r15),%rax 2cb4: 49 8b 07 mov (%r15),%rax ... 
2ea2: e8 f9 dc ff ff callq ba0 <sock_put> 2ea7: 8b 0c 24 mov (%rsp),%ecx 2eaa: e9 02 fe ff ff jmpq 2cb1 <udp4_lib_lookup2+0x91> because of ACCESS_ONCE() used twice, once explicitly in hlist_nulls_for_each_entry_rcu(), and once in rcu_dereference_raw() While barrier();ptr = rcu_dereference(X); does generate a single load. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 5:08 ` Eric Dumazet @ 2013-05-29 10:09 ` Roman Gushchin 2013-05-29 19:06 ` Eric Dumazet 0 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-29 10:09 UTC (permalink / raw) To: Eric Dumazet, paulmck Cc: Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 29.05.2013 09:08, Eric Dumazet wrote: > On Tue, 2013-05-28 at 18:31 -0700, Paul E. McKenney wrote: >> On Tue, May 28, 2013 at 05:34:53PM -0700, Eric Dumazet wrote: >>> On Tue, 2013-05-28 at 13:10 +0400, Roman Gushchin wrote: >>>> On 28.05.2013 04:12, Eric Dumazet wrote: > About your earlier question, I really don't know why compiler would > cache a memory read if we explicitly use barrier() to prevent this from > happening. I agree. > BTW Roman patch generates a double load as in : > > 2cb1: 49 8b 07 mov (%r15),%rax > 2cb4: 49 8b 07 mov (%r15),%rax > > > ... > 2ea2: e8 f9 dc ff ff callq ba0 <sock_put> > 2ea7: 8b 0c 24 mov (%rsp),%ecx > 2eaa: e9 02 fe ff ff jmpq 2cb1 <udp4_lib_lookup2+0x91> > > because of ACCESS_ONCE() used twice, once explicitly in > hlist_nulls_for_each_entry_rcu(), and once in rcu_dereference_raw() > > While barrier();ptr = rcu_dereference(X); does generate a single load. It's true. Unfortunately, using barrier() can also prevent gcc from making some (acceptable) code optimizations, because barrier() has a global effect, and everything we want to reload is the (head->first) pointer. So, to be absolutely precise, we have to introduce and use the ACCESS_FIELD_ONCE() macro. In any case, it doesn't look like a big problem. In my mind, the best solution is to use the ACCESS_FIELD_ONCE() macro, but using barrier() is also an acceptable solution. Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 10:09 ` Roman Gushchin @ 2013-05-29 19:06 ` Eric Dumazet 2013-05-30 8:25 ` Roman Gushchin 0 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-29 19:06 UTC (permalink / raw) To: Roman Gushchin, David Miller Cc: paulmck, Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > Unfortunately, using barrier() can also prevent gcc from making some > (acceptable) code optimizations, because barrier() has a global effect, > and everything we want to reload is the (head->first) pointer. > So, to be absolutely precise, we have to introduce and use > the ACCESS_FIELD_ONCE() macro. Yep, ACCESS_ONCE() seems to not work for a field, at least the ACCESS_ONCE() used in __rcu_dereference_check() > > In any case, it doesn't look like a big problem. > In my mind, the best solution is to use the ACCESS_FIELD_ONCE() macro, > but using barrier() is also an acceptable solution. > True, these lookup functions are usually structured the same around the hlist_nulls_for_each_entry_rcu() loop. A barrier() right before the loop seems to be a benefit, the size of assembly code is reduced by 48 bytes. And its one of the documented way to handle this kind of problems (Documentation/atomic_ops.txt line 114) I guess we should amend this documentation, eventually. Thanks, please add you "Signed-off-by" if you agree with the patch. [PATCH] net: force a reload of first item in hlist_nulls_for_each_entry_rcu Roman Gushchin discovered that udp4_lib_lookup2() was not reloading first item in the rcu protected list, in case the loop was restarted. 
This produced soft lockups as in https://lkml.org/lkml/2013/4/16/371 rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is ptr->field : In some cases, gcc caches the value of ptr->field in a register. Use a barrier() to disallow such caching, as documented in Documentation/atomic_ops.txt line 114 Thanks a lot to Roman for providing analysis and numerous patches. Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru> Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> --- include/linux/rculist_nulls.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..c7557fa 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -105,9 +105,14 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * @head: the head for your list. * @member: the name of the hlist_nulls_node within the struct. * + * The barrier() is needed to make sure compiler doesn't cache first element [1], + * as this loop can be restarted [2] + * [1] Documentation/atomic_ops.txt around line 114 + * [2] Documentation/RCU/rculist_nulls.txt around line 146 */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ + for (({barrier();}), \ + pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) ^ permalink raw reply related [flat|nested] 57+ messages in thread
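The `({barrier();}),` in the patch's first for-clause is a GNU C statement expression used as the left operand of a comma operator — that is what lets a statement-like barrier sit inside a for-initializer. A minimal standalone demo of the construct (the loop body is invented):

```c
#include <assert.h>

#define barrier() __asm__ __volatile__("" : : : "memory")

/*
 * A statement expression in the for-initializer, as in Eric's patch:
 * the barrier runs once, before the initial loads, and the comma
 * operator chains it with the ordinary initializers.
 */
static int sum(const int *a, int n)
{
	int s, i;

	for (({ barrier(); }), s = 0, i = 0; i < n; i++)
		s += a[i];
	return s;
}
```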
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 19:06 ` Eric Dumazet @ 2013-05-30 8:25 ` Roman Gushchin 2013-06-02 23:31 ` Eric Dumazet 0 siblings, 1 reply; 57+ messages in thread From: Roman Gushchin @ 2013-05-30 8:25 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, paulmck, Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy, David Laight On 29.05.2013 23:06, Eric Dumazet wrote: > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > > True, these lookup functions are usually structured the same around the > hlist_nulls_for_each_entry_rcu() loop. > > A barrier() right before the loop seems to be a benefit, the size of > assembly code is reduced by 48 bytes. > > And its one of the documented way to handle this kind of problems > (Documentation/atomic_ops.txt line 114) > > I guess we should amend this documentation, eventually. > > Thanks, please add you "Signed-off-by" if you agree with the patch. > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> Many thanks to you, Paul E. McKenney and David Laight for your patches, help and participation in this discussion. > > [PATCH] net: force a reload of first item in hlist_nulls_for_each_entry_rcu > > Roman Gushchin discovered that udp4_lib_lookup2() was not reloading > first item in the rcu protected list, in case the loop was restarted. > > This produced soft lockups as in https://lkml.org/lkml/2013/4/16/37 > > rcu_dereference(X)/ACCESS_ONCE(X) seem to not work as intended if X is > ptr->field : > > In some cases, gcc caches the value or ptr->field in a register. > > Use a barrier() to disallow such caching, as documented in > Documentation/atomic_ops.txt line 114 > > Thanks a lot to Roman for providing analysis and numerous patches. 
> > Diagnosed-by: Roman Gushchin <klamm@yandex-team.ru> > Signed-off-by: Eric Dumazet <edumazet@google.com> > Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > --- > include/linux/rculist_nulls.h | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > index 2ae1371..c7557fa 100644 > --- a/include/linux/rculist_nulls.h > +++ b/include/linux/rculist_nulls.h > @@ -105,9 +105,14 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, > * @head: the head for your list. > * @member: the name of the hlist_nulls_node within the struct. > * > + * The barrier() is needed to make sure compiler doesn't cache first element [1], > + * as this loop can be restarted [2] > + * [1] Documentation/atomic_ops.txt around line 114 > + * [2] Documentation/RCU/rculist_nulls.txt around line 146 > */ > #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > + for (({barrier();}), \ > + pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > (!is_a_nulls(pos)) && \ > ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-30 8:25 ` Roman Gushchin @ 2013-06-02 23:31 ` Eric Dumazet 2013-06-03 2:58 ` David Miller 0 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-06-02 23:31 UTC (permalink / raw) To: Roman Gushchin Cc: David Miller, paulmck, Jesper Dangaard Brouer, Dipankar Sarma, zhmurov, linux-kernel, netdev, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy, David Laight On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > On 29.05.2013 23:06, Eric Dumazet wrote: > > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > > > > True, these lookup functions are usually structured the same around the > > hlist_nulls_for_each_entry_rcu() loop. > > > > A barrier() right before the loop seems to be a benefit, the size of > > assembly code is reduced by 48 bytes. > > > > And its one of the documented way to handle this kind of problems > > (Documentation/atomic_ops.txt line 114) > > > > I guess we should amend this documentation, eventually. > > > > Thanks, please add you "Signed-off-by" if you agree with the patch. > > > > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > > Many thanks to you, Paul E. McKenney and David Laight for your > patches, help and participation in this discussion. Thanks to you ! David, is there any problem with the patch ? http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please tell me what is needed to get it merged. Thanks ! ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-02 23:31 ` Eric Dumazet @ 2013-06-03 2:58 ` David Miller 2013-06-03 3:12 ` Eric Dumazet 2013-06-03 3:42 ` Paul E. McKenney 0 siblings, 2 replies; 57+ messages in thread From: David Miller @ 2013-06-03 2:58 UTC (permalink / raw) To: eric.dumazet Cc: klamm, paulmck, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sun, 02 Jun 2013 16:31:35 -0700 > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: >> On 29.05.2013 23:06, Eric Dumazet wrote: >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: >> > >> > True, these lookup functions are usually structured the same around the >> > hlist_nulls_for_each_entry_rcu() loop. >> > >> > A barrier() right before the loop seems to be a benefit, the size of >> > assembly code is reduced by 48 bytes. >> > >> > And its one of the documented way to handle this kind of problems >> > (Documentation/atomic_ops.txt line 114) >> > >> > I guess we should amend this documentation, eventually. >> > >> > Thanks, please add you "Signed-off-by" if you agree with the patch. >> > >> >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> >> >> Many thanks to you, Paul E. McKenney and David Laight for your >> patches, help and participation in this discussion. > > Thanks to you ! > > David, is there any problem with the patch ? > > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > tell me what is needed to get it merged. It's not a networking patch, it's a patch for generic RCU upstream. So it either goes through Paul McKenney, or directly via Linus on linux-kernel. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 2:58 ` David Miller @ 2013-06-03 3:12 ` Eric Dumazet 2013-06-03 3:27 ` David Miller 2013-06-03 3:48 ` Paul E. McKenney 2013-06-03 3:42 ` Paul E. McKenney 1 sibling, 2 replies; 57+ messages in thread From: Eric Dumazet @ 2013-06-03 3:12 UTC (permalink / raw) To: David Miller Cc: klamm, paulmck, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Sun, 02 Jun 2013 16:31:35 -0700 > > > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > >> On 29.05.2013 23:06, Eric Dumazet wrote: > >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > >> > > >> > True, these lookup functions are usually structured the same around the > >> > hlist_nulls_for_each_entry_rcu() loop. > >> > > >> > A barrier() right before the loop seems to be a benefit, the size of > >> > assembly code is reduced by 48 bytes. > >> > > >> > And its one of the documented way to handle this kind of problems > >> > (Documentation/atomic_ops.txt line 114) > >> > > >> > I guess we should amend this documentation, eventually. > >> > > >> > Thanks, please add you "Signed-off-by" if you agree with the patch. > >> > > >> > >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >> > >> Many thanks to you, Paul E. McKenney and David Laight for your > >> patches, help and participation in this discussion. > > > > Thanks to you ! > > > > David, is there any problem with the patch ? > > > > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > > tell me what is needed to get it merged. > > It's not a networking patch, it's a patch for generic RCU upstream. > So it either goes through Paul McKenney, or directly via Linus on > linux-kernel. Well, whole discussion went on linux kernel, and you were the original committer of this patch five years ago. 
http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 So please Paul or David make sure the patch goes in. My only concern is that Paul seems quite busy these days. Thanks ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:12 ` Eric Dumazet @ 2013-06-03 3:27 ` David Miller 2013-06-03 3:42 ` Paul E. McKenney 2013-06-03 3:48 ` Paul E. McKenney 1 sibling, 1 reply; 57+ messages in thread From: David Miller @ 2013-06-03 3:27 UTC (permalink / raw) To: eric.dumazet Cc: klamm, paulmck, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight From: Eric Dumazet <eric.dumazet@gmail.com> Date: Sun, 02 Jun 2013 20:12:49 -0700 > On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: >> From: Eric Dumazet <eric.dumazet@gmail.com> >> Date: Sun, 02 Jun 2013 16:31:35 -0700 >> >> > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: >> >> On 29.05.2013 23:06, Eric Dumazet wrote: >> >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: >> >> > >> >> > True, these lookup functions are usually structured the same around the >> >> > hlist_nulls_for_each_entry_rcu() loop. >> >> > >> >> > A barrier() right before the loop seems to be a benefit, the size of >> >> > assembly code is reduced by 48 bytes. >> >> > >> >> > And its one of the documented way to handle this kind of problems >> >> > (Documentation/atomic_ops.txt line 114) >> >> > >> >> > I guess we should amend this documentation, eventually. >> >> > >> >> > Thanks, please add you "Signed-off-by" if you agree with the patch. >> >> > >> >> >> >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> >> >> >> >> Many thanks to you, Paul E. McKenney and David Laight for your >> >> patches, help and participation in this discussion. >> > >> > Thanks to you ! >> > >> > David, is there any problem with the patch ? >> > >> > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please >> > tell me what is needed to get it merged. >> >> It's not a networking patch, it's a patch for generic RCU upstream. >> So it either goes through Paul McKenney, or directly via Linus on >> linux-kernel. 
> > Well, whole discussion went on linux kernel, and you were the original > committer of this patch five years ago. > > http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 > > So please Paul or David make sure the patch goes in. > > My only concern is that Paul seems quite busy these days. Ok I can take it then, no problem. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:27 ` David Miller @ 2013-06-03 3:42 ` Paul E. McKenney 2013-06-03 3:47 ` Eric Dumazet 2013-06-03 3:49 ` David Miller 0 siblings, 2 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-06-03 3:42 UTC (permalink / raw) To: David Miller Cc: eric.dumazet, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, Jun 02, 2013 at 08:27:03PM -0700, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Sun, 02 Jun 2013 20:12:49 -0700 > > > On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: > >> From: Eric Dumazet <eric.dumazet@gmail.com> > >> Date: Sun, 02 Jun 2013 16:31:35 -0700 > >> > >> > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > >> >> On 29.05.2013 23:06, Eric Dumazet wrote: > >> >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > >> >> > > >> >> > True, these lookup functions are usually structured the same around the > >> >> > hlist_nulls_for_each_entry_rcu() loop. > >> >> > > >> >> > A barrier() right before the loop seems to be a benefit, the size of > >> >> > assembly code is reduced by 48 bytes. > >> >> > > >> >> > And its one of the documented way to handle this kind of problems > >> >> > (Documentation/atomic_ops.txt line 114) > >> >> > > >> >> > I guess we should amend this documentation, eventually. > >> >> > > >> >> > Thanks, please add you "Signed-off-by" if you agree with the patch. > >> >> > > >> >> > >> >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >> >> > >> >> Many thanks to you, Paul E. McKenney and David Laight for your > >> >> patches, help and participation in this discussion. > >> > > >> > Thanks to you ! > >> > > >> > David, is there any problem with the patch ? > >> > > >> > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > >> > tell me what is needed to get it merged. 
> >> > >> It's not a networking patch, it's a patch for generic RCU upstream. > >> So it either goes through Paul McKenney, or directly via Linus on > >> linux-kernel. > > > > Well, whole discussion went on linux kernel, and you were the original > > committer of this patch five years ago. > > > > http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 > > > > So please Paul or David make sure the patch goes in. > > > > My only concern is that Paul seems quite busy these days. > > Ok I can take it then, no problem. Or if you would like to take it: Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Just let me know. ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:42 ` Paul E. McKenney @ 2013-06-03 3:47 ` Eric Dumazet 2013-06-03 3:49 ` David Miller 1 sibling, 0 replies; 57+ messages in thread From: Eric Dumazet @ 2013-06-03 3:47 UTC (permalink / raw) To: paulmck Cc: David Miller, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, 2013-06-02 at 20:42 -0700, Paul E. McKenney wrote: > Or if you would like to take it: > > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Thanks guys ! ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:42 ` Paul E. McKenney 2013-06-03 3:47 ` Eric Dumazet @ 2013-06-03 3:49 ` David Miller 2013-06-03 6:05 ` Paul E. McKenney 2013-06-10 18:29 ` Boris B. Zhmurov 1 sibling, 2 replies; 57+ messages in thread From: David Miller @ 2013-06-03 3:49 UTC (permalink / raw) To: paulmck Cc: eric.dumazet, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Date: Sun, 2 Jun 2013 20:42:54 -0700 > On Sun, Jun 02, 2013 at 08:27:03PM -0700, David Miller wrote: >> From: Eric Dumazet <eric.dumazet@gmail.com> >> Date: Sun, 02 Jun 2013 20:12:49 -0700 >> >> > On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: >> >> From: Eric Dumazet <eric.dumazet@gmail.com> >> >> Date: Sun, 02 Jun 2013 16:31:35 -0700 >> >> >> >> > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: >> >> >> On 29.05.2013 23:06, Eric Dumazet wrote: >> >> >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: >> >> >> > >> >> >> > True, these lookup functions are usually structured the same around the >> >> >> > hlist_nulls_for_each_entry_rcu() loop. >> >> >> > >> >> >> > A barrier() right before the loop seems to be a benefit, the size of >> >> >> > assembly code is reduced by 48 bytes. >> >> >> > >> >> >> > And its one of the documented way to handle this kind of problems >> >> >> > (Documentation/atomic_ops.txt line 114) >> >> >> > >> >> >> > I guess we should amend this documentation, eventually. >> >> >> > >> >> >> > Thanks, please add you "Signed-off-by" if you agree with the patch. >> >> >> > >> >> >> >> >> >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> >> >> >> >> >> >> Many thanks to you, Paul E. McKenney and David Laight for your >> >> >> patches, help and participation in this discussion. >> >> > >> >> > Thanks to you ! >> >> > >> >> > David, is there any problem with the patch ? 
>> >> > >> >> > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please >> >> > tell me what is needed to get it merged. >> >> >> >> It's not a networking patch, it's a patch for generic RCU upstream. >> >> So it either goes through Paul McKenney, or directly via Linus on >> >> linux-kernel. >> > >> > Well, whole discussion went on linux kernel, and you were the original >> > committer of this patch five years ago. >> > >> > http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 >> > >> > So please Paul or David make sure the patch goes in. >> > >> > My only concern is that Paul seems quite busy these days. >> >> Ok I can take it then, no problem. > > Or if you would like to take it: > > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Let's end this deadlock by me taking it :-) ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:49 ` David Miller @ 2013-06-03 6:05 ` Paul E. McKenney 2013-06-10 18:29 ` Boris B. Zhmurov 1 sibling, 0 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-06-03 6:05 UTC (permalink / raw) To: David Miller Cc: eric.dumazet, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, Jun 02, 2013 at 08:49:15PM -0700, David Miller wrote: > From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> > Date: Sun, 2 Jun 2013 20:42:54 -0700 > > > On Sun, Jun 02, 2013 at 08:27:03PM -0700, David Miller wrote: > >> From: Eric Dumazet <eric.dumazet@gmail.com> > >> Date: Sun, 02 Jun 2013 20:12:49 -0700 > >> > >> > On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: > >> >> From: Eric Dumazet <eric.dumazet@gmail.com> > >> >> Date: Sun, 02 Jun 2013 16:31:35 -0700 > >> >> > >> >> > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > >> >> >> On 29.05.2013 23:06, Eric Dumazet wrote: > >> >> >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > >> >> >> > > >> >> >> > True, these lookup functions are usually structured the same around the > >> >> >> > hlist_nulls_for_each_entry_rcu() loop. > >> >> >> > > >> >> >> > A barrier() right before the loop seems to be a benefit, the size of > >> >> >> > assembly code is reduced by 48 bytes. > >> >> >> > > >> >> >> > And its one of the documented way to handle this kind of problems > >> >> >> > (Documentation/atomic_ops.txt line 114) > >> >> >> > > >> >> >> > I guess we should amend this documentation, eventually. > >> >> >> > > >> >> >> > Thanks, please add you "Signed-off-by" if you agree with the patch. > >> >> >> > > >> >> >> > >> >> >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >> >> >> > >> >> >> Many thanks to you, Paul E. McKenney and David Laight for your > >> >> >> patches, help and participation in this discussion. > >> >> > > >> >> > Thanks to you ! 
> >> >> > > >> >> > David, is there any problem with the patch ? > >> >> > > >> >> > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > >> >> > tell me what is needed to get it merged. > >> >> > >> >> It's not a networking patch, it's a patch for generic RCU upstream. > >> >> So it either goes through Paul McKenney, or directly via Linus on > >> >> linux-kernel. > >> > > >> > Well, whole discussion went on linux kernel, and you were the original > >> > committer of this patch five years ago. > >> > > >> > http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 > >> > > >> > So please Paul or David make sure the patch goes in. > >> > > >> > My only concern is that Paul seems quite busy these days. > >> > >> Ok I can take it then, no problem. > > > > Or if you would like to take it: > > > > Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Let's end this deadlock by me taking it :-) You got it! ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:49 ` David Miller 2013-06-03 6:05 ` Paul E. McKenney @ 2013-06-10 18:29 ` Boris B. Zhmurov 2013-06-10 18:51 ` Eric Dumazet 1 sibling, 1 reply; 57+ messages in thread From: Boris B. Zhmurov @ 2013-06-10 18:29 UTC (permalink / raw) To: David Miller Cc: paulmck, eric.dumazet, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight, stable, gregkh On 06/03/2013 07:49 AM, David Miller wrote: >>>> Well, whole discussion went on linux kernel, and you were the original >>>> committer of this patch five years ago. >>>> >>>> http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 >>>> >>>> So please Paul or David make sure the patch goes in. >>>> >>>> My only concern is that Paul seems quite busy these days. >>> >>> Ok I can take it then, no problem. >> >> Or if you would like to take it: >> >> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Let's end this deadlock by me taking it :-) Hello David, Do you think, this patch is "stable material" and should be included in 3.0-stable and 3.4-stable trees? Thanks. -- Boris B. Zhmurov Yandex.Search SRE Team ************* т.: +7 (495) 739-70-00 ext.7160 http://yandex.ru ************* ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-10 18:29 ` Boris B. Zhmurov @ 2013-06-10 18:51 ` Eric Dumazet 0 siblings, 0 replies; 57+ messages in thread From: Eric Dumazet @ 2013-06-10 18:51 UTC (permalink / raw) To: zhmurov Cc: David Miller, paulmck, klamm, brouer, dipankar, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight, stable, gregkh On Mon, 2013-06-10 at 22:29 +0400, Boris B. Zhmurov wrote: > Hello David, > > Do you think, this patch is "stable material" and should be included in > 3.0-stable and 3.4-stable trees? Thanks. Yes, its in the stable queue : http://patchwork.ozlabs.org/bundle/davem/stable/?state=* ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 3:12 ` Eric Dumazet 2013-06-03 3:27 ` David Miller @ 2013-06-03 3:48 ` Paul E. McKenney 1 sibling, 0 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-06-03 3:48 UTC (permalink / raw) To: Eric Dumazet Cc: David Miller, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, Jun 02, 2013 at 08:12:49PM -0700, Eric Dumazet wrote: > On Sun, 2013-06-02 at 19:58 -0700, David Miller wrote: > > From: Eric Dumazet <eric.dumazet@gmail.com> > > Date: Sun, 02 Jun 2013 16:31:35 -0700 > > > > > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > > >> On 29.05.2013 23:06, Eric Dumazet wrote: > > >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > > >> > > > >> > True, these lookup functions are usually structured the same around the > > >> > hlist_nulls_for_each_entry_rcu() loop. > > >> > > > >> > A barrier() right before the loop seems to be a benefit, the size of > > >> > assembly code is reduced by 48 bytes. > > >> > > > >> > And its one of the documented way to handle this kind of problems > > >> > (Documentation/atomic_ops.txt line 114) > > >> > > > >> > I guess we should amend this documentation, eventually. > > >> > > > >> > Thanks, please add you "Signed-off-by" if you agree with the patch. > > >> > > > >> > > >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > > >> > > >> Many thanks to you, Paul E. McKenney and David Laight for your > > >> patches, help and participation in this discussion. > > > > > > Thanks to you ! > > > > > > David, is there any problem with the patch ? > > > > > > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > > > tell me what is needed to get it merged. > > > > It's not a networking patch, it's a patch for generic RCU upstream. > > So it either goes through Paul McKenney, or directly via Linus on > > linux-kernel. 
> > Well, whole discussion went on linux kernel, and you were the original > committer of this patch five years ago. > > http://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=bbaffaca4810de1a25e32ecaf836eeaacc7a3d11 > > So please Paul or David make sure the patch goes in. > > My only concern is that Paul seems quite busy these days. One reason for the longer-than-usual response delays is my current timezone, IST. ;-) Back to PDT on the 10th. Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-06-03 2:58 ` David Miller 2013-06-03 3:12 ` Eric Dumazet @ 2013-06-03 3:42 ` Paul E. McKenney 1 sibling, 0 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-06-03 3:42 UTC (permalink / raw) To: David Miller Cc: eric.dumazet, klamm, brouer, dipankar, zhmurov, linux-kernel, netdev, kuznet, jmorris, yoshfuji, kaber, David.Laight On Sun, Jun 02, 2013 at 07:58:26PM -0700, David Miller wrote: > From: Eric Dumazet <eric.dumazet@gmail.com> > Date: Sun, 02 Jun 2013 16:31:35 -0700 > > > On Thu, 2013-05-30 at 12:25 +0400, Roman Gushchin wrote: > >> On 29.05.2013 23:06, Eric Dumazet wrote: > >> > On Wed, 2013-05-29 at 14:09 +0400, Roman Gushchin wrote: > >> > > >> > True, these lookup functions are usually structured the same around the > >> > hlist_nulls_for_each_entry_rcu() loop. > >> > > >> > A barrier() right before the loop seems to be a benefit, the size of > >> > assembly code is reduced by 48 bytes. > >> > > >> > And its one of the documented way to handle this kind of problems > >> > (Documentation/atomic_ops.txt line 114) > >> > > >> > I guess we should amend this documentation, eventually. > >> > > >> > Thanks, please add you "Signed-off-by" if you agree with the patch. > >> > > >> > >> Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > >> > >> Many thanks to you, Paul E. McKenney and David Laight for your > >> patches, help and participation in this discussion. > > > > Thanks to you ! > > > > David, is there any problem with the patch ? > > > > http://patchwork.ozlabs.org/patch/247360/ says "Not applicable", please > > tell me what is needed to get it merged. > > It's not a networking patch, it's a patch for generic RCU upstream. > So it either goes through Paul McKenney, or directly via Linus on > linux-kernel. I would be happy to take it. Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-29 0:34 ` Eric Dumazet 2013-05-29 1:31 ` Paul E. McKenney @ 2013-05-29 9:17 ` Roman Gushchin 1 sibling, 0 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-29 9:17 UTC (permalink / raw) To: Eric Dumazet Cc: Jesper Dangaard Brouer, paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 29.05.2013 04:34, Eric Dumazet wrote: > On Tue, 2013-05-28 at 13:10 +0400, Roman Gushchin wrote: >> On 28.05.2013 04:12, Eric Dumazet wrote: > >>> Adding a barrier() is probably what we want. >> >> I agree, inserting barrier() is also a correct and working fix. > > Yeah, but I can not find a clean way to put it inside the "for (;;)" diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h index 2ae1371..74d5065 100644 --- a/include/linux/rculist_nulls.h +++ b/include/linux/rculist_nulls.h @@ -107,7 +107,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, * */ #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ + for (({barrier();}), pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ (!is_a_nulls(pos)) && \ ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) ^ permalink raw reply related [flat|nested] 57+ messages in thread
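[Editor's note: for readers following the barrier() placement above, here is a minimal single-threaded sketch of the restartable nulls-list lookup pattern the thread is discussing. This is not the kernel code — the node layout, `make_nulls()` and `demo_lookup()` are invented for illustration — but it shows where the compiler barrier sits relative to the restart label so that `head->first` is re-read from memory on every rescan instead of being reused from a register.]

```c
#include <stddef.h>

/* Toy stand-in for hlist_nulls: a chain ends in a "nulls" marker -- an
 * odd pointer value encoding the slot number -- instead of NULL. */
struct tnode {
    struct tnode *next;
    int key;
};

#define barrier() __asm__ __volatile__("" ::: "memory")
#define is_a_nulls(p)      ((unsigned long)(p) & 1UL)
#define get_nulls_value(p) ((unsigned long)(p) >> 1)
#define make_nulls(slot)   ((struct tnode *)(((unsigned long)(slot) << 1) | 1UL))

struct thead { struct tnode *first; };

/* Restartable lookup in the style of udp4_lib_lookup2(): if the nulls
 * value we end on is not the slot we started in, the last node was
 * moved to another chain mid-scan and we must rescan.  The barrier()
 * forces the compiler to reload head->first on each restart. */
static struct tnode *lookup(struct thead *head, int key, unsigned long slot)
{
    struct tnode *pos;
begin:
    barrier();
    for (pos = head->first; !is_a_nulls(pos); pos = pos->next)
        if (pos->key == key)
            return pos;
    if (get_nulls_value(pos) != slot)
        goto begin;
    return NULL;
}

/* Build slot 7 as: head -> {1} -> {2} -> nulls(7), then look keys up. */
int demo_lookup(void)
{
    struct tnode n2 = { make_nulls(7), 2 };
    struct tnode n1 = { &n2, 1 };
    struct thead head = { &n1 };

    return lookup(&head, 2, 7) == &n2 && lookup(&head, 5, 7) == NULL;
}
```

A single-threaded demo cannot exhibit the race itself; it only illustrates the control flow that makes the cached-head problem possible and where the fix lands.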
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-28 9:10 ` Roman Gushchin 2013-05-29 0:34 ` Eric Dumazet @ 2013-05-29 1:19 ` Paul E. McKenney 1 sibling, 0 replies; 57+ messages in thread From: Paul E. McKenney @ 2013-05-29 1:19 UTC (permalink / raw) To: Roman Gushchin Cc: Eric Dumazet, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, May 28, 2013 at 01:10:46PM +0400, Roman Gushchin wrote: > On 28.05.2013 04:12, Eric Dumazet wrote: > >On Mon, 2013-05-27 at 21:55 +0400, Roman Gushchin wrote: > >>Hi, Paul! > >> > >>>On 25.05.2013 15:37, Paul E. McKenney wrote: > >>>>Again, I believe that your retry logic needs to extend back into the > >>>>calling function for your some_func() example above. > >> > >>And what do you think about the following approach (diff below)? > >> > >>It seems to me, it's enough clear (especially with good accompanying comments) > >>and produces a good binary code (without significant overhead). > >>Also, we will remove a hidden reef in using rcu-protected (h)list traverses with restarts. > >> > > > >>diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > >>index 2ae1371..4af5ee5 100644 > >>--- a/include/linux/rculist_nulls.h > >>+++ b/include/linux/rculist_nulls.h > >>@@ -107,7 +107,8 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, > >> * > >> */ > >> #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > >>- for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > >>+ for (ACCESS_ONCE(*(head)), \ > >>+ pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > >> (!is_a_nulls(pos)) && \ > >> ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > >> pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > > >It looks like this still relies on gcc being friendly here. > > > >I repeat again : @head here is a constant. > > No. 
> Actually, there are two volatile objects: pointer to the first element (as a part of the head structure), > and the first element by itself. So, to be strict, head as a structure contains a volatile field. > Head->first should be treated as a volatile pointer to a volatile data. So, the whole head object is volatile. > > > > >Macro already uses ACCESS_ONCE(), we only have to instruct gcc that > >caching the value is forbidden if we restart the loop > >(aka "goto begin;" see Documentation/RCU/rculist_nulls.txt line 146) > > My patch seems to be correct, because, ACCESS_ONCE(*(head)) will cause gcc to (re)read head data from the memory. > According to gcc documentation: > "A scalar volatile object is read when it is accessed in a void context: > volatile int *src = somevalue; > *src; > Such expressions are rvalues, and GCC implements this as a read of the volatile object being pointed to." > And this is exactly our case. > > >Adding a barrier() is probably what we want. > > I agree, inserting barrier() is also a correct and working fix. I have to ask this question of both of you... Are either of Eric's or Roman's fixes really guaranteed to work if gcc elects not to inline the function containing the RCU list traversal? Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
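[Editor's note: Roman's argument rests on the semantics of the kernel's `ACCESS_ONCE()` macro of that era. A compilable sketch of the macro and of the void-context volatile read he cites from the gcc documentation (the function names here are invented for illustration):]

```c
/* gcc-style definition, as in include/linux/compiler.h circa 2013 */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* A void-context read of a volatile lvalue is itself an access, so the
 * first ACCESS_ONCE(*p) below forces a load even though the value is
 * discarded.  In Roman's proposed patch, ACCESS_ONCE(*(head)) plays
 * this role: it makes gcc re-read the head before each traversal. */
int read_twice(int *p)
{
    ACCESS_ONCE(*p);           /* discarded load, but still a load */
    return ACCESS_ONCE(*p);    /* second, independent load */
}

int demo_access_once(void)
{
    int v = 9;
    return read_twice(&v);
}
```

Whether the optimizer honors a qualifier added only by cast is exactly what David Laight questions later in the thread, so this sketch shows the intended semantics, not a guarantee.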

* RE: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro @ 2013-05-22 13:27 ` David Laight 0 siblings, 0 replies; 57+ messages in thread From: David Laight @ 2013-05-22 13:27 UTC (permalink / raw) To: Eric Dumazet, Roman Gushchin Cc: paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy > So yes, the patch appears to fix the bug, but it sounds not logical to > me. I was confused because the copy of the code I found was different (it has some checks for reusaddr - which force a function call in the loop). The code being compiled is: begin: result = NULL; badness = -1; udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { score = compute_score2(sk, net, saddr, sport, daddr, hnum, dif); if (score > badness) { result = sk; badness = score; if (score == SCORE2_MAX) goto exact_match; } } /* * if the nulls value we got at the end of this lookup is * not the expected one, we must restart lookup. * We probably met an item that was moved to another chain. */ if (get_nulls_value(node) != slot2) goto begin; Which is entirely inlined - so the compiler is allowed to assume that no other code modifies any of the data. Hence it is allowed to cache the list head value. Indeed it could convert the last line to "for (;;);". A asm volatile ("":::"memory") somewhere would fix it. David ^ permalink raw reply [flat|nested] 57+ messages in thread
* RE: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 13:27 ` David Laight (?) @ 2013-05-22 13:36 ` Eric Dumazet 2013-05-22 14:23 ` David Laight -1 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 13:36 UTC (permalink / raw) To: David Laight Cc: Roman Gushchin, paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, 2013-05-22 at 14:27 +0100, David Laight wrote: > > So yes, the patch appears to fix the bug, but it sounds not logical to > > me. > > I was confused because the copy of the code I found was different > (it has some checks for reusaddr - which force a function call in the > loop). > > The code being compiled is: > > begin: > result = NULL; > badness = -1; > udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { Here this loops begin by someptr = rcu_dereference(somelocation); May claim is rcu_dereference() should force the compiler to read again somelocation. Its done thanks to ACCESS_ONCE(). But apparently in the specific case of &hslot->head, it doesnt work. When using head directly (ie the same value), the ACCESS_ONCE() does correctly its job. > score = compute_score2(sk, net, saddr, sport, > daddr, hnum, dif); > if (score > badness) { > result = sk; > badness = score; > if (score == SCORE2_MAX) > goto exact_match; > } > } > /* > * if the nulls value we got at the end of this lookup is > * not the expected one, we must restart lookup. > * We probably met an item that was moved to another chain. > */ > if (get_nulls_value(node) != slot2) > goto begin; > > Which is entirely inlined - so the compiler is allowed to assume > that no other code modifies any of the data. You missed the rcu_dereference()/ACCESS_ONCE() we have in this loop. ^ permalink raw reply [flat|nested] 57+ messages in thread
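[Editor's note: for context on Eric's claim, here is a simplified model of what `rcu_dereference()` boiled down to at the time. The real macro adds lockdep/sparse checking and, on Alpha, a data-dependency barrier; this sketch keeps only the `ACCESS_ONCE()` part under discussion, and the struct and function names are invented:]

```c
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

/* ACCESS_ONCE() makes the pointer load itself volatile, so each
 * evaluation of the macro re-reads the location p.  Eric's point is
 * that the loop's first step is such a load of head->first; the open
 * question in the thread is why gcc still caches it when the head
 * address is a loop-invariant constant -- hence the barrier() fix. */
#define rcu_dereference_sketch(p) \
    ({ __typeof__(p) _p = ACCESS_ONCE(p); _p; })

struct head_sketch { int *first; };

int demo_deref(void)
{
    int value = 42;
    struct head_sketch h = { &value };
    int *got = rcu_dereference_sketch(h.first);
    return *got;
}
```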
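[David's doubt - whether a cast can add a type qualifier that the optimiser will honour - concerns exactly this construct. A minimal sketch with an invented struct: the cast is well-formed C and reads the current value here; the open question in the thread is only what the optimiser is allowed to assume about it.]

```c
struct foo { int first; };

/* Dereference through a volatile-qualified view of the same object.
 * The front-end accepts the added qualifier; the concern is whether
 * the optimiser may still reason from the original, non-volatile
 * declared type of the object. */
int read_first_volatile(struct foo *p)
{
	return ((volatile struct foo *)p)->first;
}
```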
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 13:27 ` David Laight @ 2013-05-22 13:55 ` Roman Gushchin 0 siblings, 0 replies; 57+ messages in thread From: Roman Gushchin @ 2013-05-22 13:55 UTC (permalink / raw) To: David Laight Cc: Eric Dumazet, paulmck, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On 22.05.2013 17:27, David Laight wrote: >> So yes, the patch appears to fix the bug, but it sounds not logical to >> me. > > I was confused because the copy of the code I found was different > (it has some checks for reuseaddr - which forces a function call in the > loop). > > The code being compiled is: > > begin: > result = NULL; > badness = -1; > udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { > score = compute_score2(sk, net, saddr, sport, > daddr, hnum, dif); > if (score > badness) { > result = sk; > badness = score; > if (score == SCORE2_MAX) > goto exact_match; > } > } > /* > * if the nulls value we got at the end of this lookup is > * not the expected one, we must restart lookup. > * We probably met an item that was moved to another chain. > */ > if (get_nulls_value(node) != slot2) > goto begin; > > Which is entirely inlined - so the compiler is allowed to assume > that no other code modifies any of the data. > Hence it is allowed to cache the list head value. > Indeed it could convert the last line to "for (;;);". > > An asm volatile ("":::"memory") somewhere would fix it. > Yes, it does. That was my first attempt to fix the issue. But putting such instructions into every such loop isn't a good idea, IMHO. Regards, Roman ^ permalink raw reply [flat|nested] 57+ messages in thread
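[The asm volatile ("":::"memory") David suggests is the kernel's barrier() compiler barrier. A sketch of what it does, using gcc inline assembly rather than the kernel header:]

```c
/* Compiler barrier: an empty asm statement with a "memory" clobber.
 * It emits no instructions, but tells gcc that any memory location may
 * have changed across it, so register-cached values must be reloaded
 * afterwards.  This is a compiler-only barrier; it orders nothing at
 * the CPU level. */
#define barrier() __asm__ __volatile__("" ::: "memory")

int flag;

int set_then_reread(void)
{
	flag = 1;
	barrier();	/* the read below must come from memory again */
	return flag;
}
```

Placed between the failed nulls check and the jump back to begin:, such a barrier would force head->first to be re-read on each restart, which is the fix Roman says he tried first.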
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 2:01 ` Eric Dumazet 2013-05-22 5:49 ` Eric Dumazet @ 2013-05-22 9:58 ` Paul E. McKenney 2013-05-22 12:28 ` Eric Dumazet 1 sibling, 1 reply; 57+ messages in thread From: Paul E. McKenney @ 2013-05-22 9:58 UTC (permalink / raw) To: Eric Dumazet Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Tue, May 21, 2013 at 07:01:20PM -0700, Eric Dumazet wrote: > On Tue, 2013-05-21 at 22:12 +0400, Roman Gushchin wrote: > > > If other rcu accessors have the same problem, a more complete patch is > > > needed. > > > > [PATCH] rcu: fix a race in rcu lists traverse macros > > > > Some network functions (udp4_lib_lookup2(), for instance) use the > > rcu lists traverse macros (hlist_nulls_for_each_entry_rcu, for instance) > > in a way that assumes restarting of a loop. In this case, it is strictly > > necessary to reread the head->first value from the memory before each scan. > > Without additional hints, gcc caches this value in a register. In this case, > > if a cached node is moved to another chain during the scan, we can loop > > forever getting wrong nulls values and restarting the loop uninterruptedly. 
> > > > Signed-off-by: Roman Gushchin <klamm@yandex-team.ru> > > Reported-by: Boris Zhmurov <zhmurov@yandex-team.ru> > > --- > > include/linux/compiler.h | 6 ++++++ > > include/linux/rculist.h | 6 ++++-- > > include/linux/rculist_nulls.h | 3 ++- > > 3 files changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/include/linux/compiler.h b/include/linux/compiler.h > > index 92669cd..0e05d7c 100644 > > --- a/include/linux/compiler.h > > +++ b/include/linux/compiler.h > > @@ -351,6 +351,12 @@ void ftrace_likely_update(struct ftrace_branch_data *f, int val, int expect); > > */ > > #define ACCESS_ONCE(x) (*(volatile typeof(x) *)&(x)) > > > > +/* > > + * Prevent the compiler from optimizing accesses to structure member. > > + */ > > +#define GET_STRUCT_FIELD_VOLATILE(struct_ptr, field) \ > > + (((volatile typeof(*struct_ptr) *)struct_ptr)->field) > > + > > /* Ignore/forbid kprobes attach on very low level functions marked by this attribute: */ > > #ifdef CONFIG_KPROBES > > # define __kprobes __attribute__((__section__(".kprobes.text"))) > > diff --git a/include/linux/rculist.h b/include/linux/rculist.h > > index 8089e35..6c3eea7 100644 > > --- a/include/linux/rculist.h > > +++ b/include/linux/rculist.h > > @@ -282,7 +282,8 @@ static inline void list_splice_init_rcu(struct list_head *list, > > * as long as the traversal is guarded by rcu_read_lock(). 
> > */ > > #define list_for_each_entry_rcu(pos, head, member) \ > > - for (pos = list_entry_rcu((head)->next, typeof(*pos), member); \ > > + for (pos = list_entry_rcu(GET_STRUCT_FIELD_VOLATILE(head, next), \ > > + typeof(*pos), member); \ > > &pos->member != (head); \ > > pos = list_entry_rcu(pos->member.next, typeof(*pos), member)) > > > > @@ -348,7 +349,8 @@ static inline void hlist_replace_rcu(struct hlist_node *old, > > /* > > * return the first or the next element in an RCU protected hlist > > */ > > -#define hlist_first_rcu(head) (*((struct hlist_node __rcu **)(&(head)->first))) > > +#define hlist_first_rcu(head) (*((struct hlist_node __rcu **) \ > > + (&GET_STRUCT_FIELD_VOLATILE(head, first)))) > > #define hlist_next_rcu(node) (*((struct hlist_node __rcu **)(&(node)->next))) > > #define hlist_pprev_rcu(node) (*((struct hlist_node __rcu **)((node)->pprev))) > > > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > > index 2ae1371..b04fd0a 100644 > > --- a/include/linux/rculist_nulls.h > > +++ b/include/linux/rculist_nulls.h > > @@ -38,7 +38,8 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) > > } > > > > #define hlist_nulls_first_rcu(head) \ > > - (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > > + (*((struct hlist_nulls_node __rcu __force **) \ > > + &(GET_STRUCT_FIELD_VOLATILE(head, first)))) > > > > #define hlist_nulls_next_rcu(node) \ > > (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) Now that I am more awake... The RCU list macros assume that the list header is either statically allocated (in which case no ACCESS_ONCE() or whatever is needed) or that the caller did whatever was necessary to protect the list header, whether that be holding the right lock, using rcu_dereference() when traversing the pointer to the list header, or whatever. Maybe I need to document this assumption... 
Thanx, Paul > Please use ACCESS_ONCE(), which is the standard way to deal with this, > and remove the rcu_dereference_raw() in > hlist_nulls_for_each_entry_rcu() > > something like : (for the nulls part only) > > diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h > index 2ae1371..1d485d3 100644 > --- a/include/linux/rculist_nulls.h > +++ b/include/linux/rculist_nulls.h > @@ -40,6 +40,9 @@ static inline void hlist_nulls_del_init_rcu(struct hlist_nulls_node *n) > #define hlist_nulls_first_rcu(head) \ > (*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > > +#define hlist_nulls_first_rcu_once(head) \ > + ACCESS_ONCE(*((struct hlist_nulls_node __rcu __force **)&(head)->first)) > + > #define hlist_nulls_next_rcu(node) \ > (*((struct hlist_nulls_node __rcu __force **)&(node)->next)) > > @@ -107,7 +110,7 @@ static inline void hlist_nulls_add_head_rcu(struct hlist_nulls_node *n, > * > */ > #define hlist_nulls_for_each_entry_rcu(tpos, pos, head, member) \ > - for (pos = rcu_dereference_raw(hlist_nulls_first_rcu(head)); \ > + for (pos = hlist_nulls_first_rcu_once(head); \ > (!is_a_nulls(pos)) && \ > ({ tpos = hlist_nulls_entry(pos, typeof(*tpos), member); 1; }); \ > pos = rcu_dereference_raw(hlist_nulls_next_rcu(pos))) > > > ^ permalink raw reply [flat|nested] 57+ messages in thread
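[Roman's GET_STRUCT_FIELD_VOLATILE() takes the complementary approach to ACCESS_ONCE(): instead of volatile-qualifying the member lvalue, it volatile-qualifies the whole struct pointer before selecting the member. A user-space sketch of the macro from the patch (GNU __typeof__; the list type here is a stand-in, not the kernel's):]

```c
/* Access a member through a volatile-qualified view of the whole
 * struct, so the compiler cannot cache the member load across loop
 * restarts. */
#define GET_STRUCT_FIELD_VOLATILE(struct_ptr, field) \
	(((volatile __typeof__(*(struct_ptr)) *)(struct_ptr))->field)

struct list_head { struct list_head *next; };

/* Each call performs a fresh load of head->next. */
struct list_head *first_entry(struct list_head *head)
{
	return GET_STRUCT_FIELD_VOLATILE(head, next);
}
```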
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 9:58 ` Paul E. McKenney @ 2013-05-22 12:28 ` Eric Dumazet 2013-05-22 13:00 ` Paul E. McKenney 0 siblings, 1 reply; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 12:28 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, 2013-05-22 at 02:58 -0700, Paul E. McKenney wrote: > Now that I am more awake... > > The RCU list macros assume that the list header is either statically > allocated (in which case no ACCESS_ONCE() or whatever is needed) or > that the caller did whatever was necessary to protect the list header, > whether that be holding the right lock, using rcu_dereference() when > traversing the pointer to the list header, or whatever. Not sure what you mean, we do hold rcu_read_lock() here. But when we jump back to begin, we do not do "rcu_read_unlock()/rcu_read_lock()" pair. ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 12:28 ` Eric Dumazet @ 2013-05-22 13:00 ` Paul E. McKenney 2013-05-22 14:16 ` Eric Dumazet 0 siblings, 1 reply; 57+ messages in thread From: Paul E. McKenney @ 2013-05-22 13:00 UTC (permalink / raw) To: Eric Dumazet Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, May 22, 2013 at 05:28:47AM -0700, Eric Dumazet wrote: > On Wed, 2013-05-22 at 02:58 -0700, Paul E. McKenney wrote: > > > Now that I am more awake... > > > > The RCU list macros assume that the list header is either statically > > allocated (in which case no ACCESS_ONCE() or whatever is needed) or > > that the caller did whatever was necessary to protect the list header, > > whether that be holding the right lock, using rcu_dereference() when > > traversing the pointer to the list header, or whatever. > > Not sure what you mean, we do hold rcu_read_lock() here. > > But when we jump back to begin, we do not do > "rcu_read_unlock()/rcu_read_lock()" pair. Right, rcu_read_lock() is part of the protection, but rcu_dereference() is the other part. All that aside, I can't claim that I understand what problem the various patches would solve. ;-) Thanx, Paul ^ permalink raw reply [flat|nested] 57+ messages in thread
* Re: [PATCH v2] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro 2013-05-22 13:00 ` Paul E. McKenney @ 2013-05-22 14:16 ` Eric Dumazet 0 siblings, 0 replies; 57+ messages in thread From: Eric Dumazet @ 2013-05-22 14:16 UTC (permalink / raw) To: paulmck Cc: Roman Gushchin, Dipankar Sarma, zhmurov, linux-kernel, netdev, David S. Miller, Alexey Kuznetsov, James Morris, Hideaki YOSHIFUJI, Patrick McHardy On Wed, 2013-05-22 at 06:00 -0700, Paul E. McKenney wrote: > Right, rcu_read_lock() is part of the protection, but rcu_dereference() > is the other part. > > All that aside, I can't claim that I understand what problem the various > patches would solve. ;-) The problem is that rcu_dereference(expr) might be optimized by the compiler to cache the dereferenced data in certain circumstances. The following patch shows the difference if you look at the generated code: it fixes the problem, and it's not really obvious why! diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 0bf5d39..6aa8088 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -416,11 +416,12 @@ static struct sock *udp4_lib_lookup2(struct net *net, struct hlist_nulls_node *node; int score, badness, matches = 0, reuseport = 0; u32 hash = 0; + struct hlist_nulls_head *head = &hslot2->head; begin: result = NULL; badness = 0; - udp_portaddr_for_each_entry_rcu(sk, node, &hslot2->head) { + udp_portaddr_for_each_entry_rcu(sk, node, head) { score = compute_score2(sk, net, saddr, sport, daddr, hnum, dif); if (score > badness) { ^ permalink raw reply related [flat|nested] 57+ messages in thread
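[For readers following the restart condition get_nulls_value(node) != slot2: an hlist_nulls list is terminated not by NULL but by a "nulls" marker that encodes the chain's slot number in the pointer value itself. A sketch of that encoding, matching the scheme used in include/linux/list_nulls.h:]

```c
#include <stdint.h>

/* A nulls end marker: the slot value shifted left one bit, with bit 0
 * set.  Real node pointers are at least 2-byte aligned, so their bit 0
 * is clear, which distinguishes nodes from end markers. */
#define NULLS_MARKER(v) ((void *)(((uintptr_t)(v) << 1) | 1))

int is_a_nulls(const void *ptr)
{
	return (uintptr_t)ptr & 1;
}

uintptr_t get_nulls_value(const void *ptr)
{
	return (uintptr_t)ptr >> 1;
}
```

If a socket migrates to another chain mid-scan, the walker ends up on a marker whose encoded slot differs from the one it started in, and the lookup jumps back to begin: - which is exactly where a stale, register-cached head->first becomes fatal.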
end of thread, other threads:[~2013-06-10 18:52 UTC | newest] Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2013-05-21 9:05 [PATCH] rcu: fix a race in hlist_nulls_for_each_entry_rcu macro Roman Gushchin 2013-05-21 10:40 ` David Laight 2013-05-21 11:55 ` Roman Gushchin 2013-05-21 13:42 ` David Laight 2013-05-21 12:09 ` Paul E. McKenney 2013-05-21 12:46 ` Roman Gushchin 2013-05-21 12:58 ` Paul E. McKenney 2013-05-21 13:37 ` Eric Dumazet 2013-05-21 13:44 ` Eric Dumazet 2013-05-21 14:47 ` Roman Gushchin 2013-05-21 15:16 ` Eric Dumazet 2013-05-21 15:51 ` Roman Gushchin 2013-05-21 15:38 ` Eric Dumazet 2013-05-21 15:51 ` Roman Gushchin 2013-05-21 18:12 ` [PATCH v2] " Roman Gushchin 2013-05-22 2:01 ` Eric Dumazet 2013-05-22 5:49 ` Eric Dumazet 2013-05-22 11:58 ` Roman Gushchin 2013-05-22 12:30 ` Eric Dumazet 2013-05-22 13:07 ` Roman Gushchin 2013-05-22 17:45 ` Paul E. McKenney 2013-05-22 19:17 ` Roman Gushchin 2013-05-25 11:37 ` Paul E. McKenney 2013-05-27 11:34 ` Roman Gushchin 2013-05-27 17:55 ` Roman Gushchin 2013-05-28 0:12 ` Eric Dumazet 2013-05-28 9:10 ` Roman Gushchin 2013-05-29 0:34 ` Eric Dumazet 2013-05-29 1:31 ` Paul E. McKenney 2013-05-29 5:08 ` Eric Dumazet 2013-05-29 10:09 ` Roman Gushchin 2013-05-29 19:06 ` Eric Dumazet 2013-05-30 8:25 ` Roman Gushchin 2013-06-02 23:31 ` Eric Dumazet 2013-06-03 2:58 ` David Miller 2013-06-03 3:12 ` Eric Dumazet 2013-06-03 3:27 ` David Miller 2013-06-03 3:42 ` Paul E. McKenney 2013-06-03 3:47 ` Eric Dumazet 2013-06-03 3:49 ` David Miller 2013-06-03 6:05 ` Paul E. McKenney 2013-06-10 18:29 ` Boris B. Zhmurov 2013-06-10 18:51 ` Eric Dumazet 2013-06-03 3:48 ` Paul E. McKenney 2013-06-03 3:42 ` Paul E. McKenney 2013-05-29 9:17 ` Roman Gushchin 2013-05-29 1:19 ` Paul E. 
McKenney 2013-05-22 13:27 ` David Laight 2013-05-22 13:27 ` David Laight 2013-05-22 13:36 ` Eric Dumazet 2013-05-22 14:23 ` David Laight 2013-05-22 14:23 ` David Laight 2013-05-22 13:55 ` Roman Gushchin 2013-05-22 9:58 ` Paul E. McKenney 2013-05-22 12:28 ` Eric Dumazet 2013-05-22 13:00 ` Paul E. McKenney 2013-05-22 14:16 ` Eric Dumazet