linux-kernel.vger.kernel.org archive mirror
* Q: locking mechanisms
@ 2006-07-01  5:58 Urs Thuermann
  2006-07-01  9:36 ` Thomas Gleixner
From: Urs Thuermann @ 2006-07-01  5:58 UTC
  To: linux-kernel

I need to lock concurrent access to a list and I am unsure what the
best locking mechanism is in this case.

I have 3 functions which access the list, 2 functions called from a
syscall, i.e. in process context, need write access, and one called
from the softirq for network packet reception.

Currently, I have something like this (much simplified):

	HLIST_HEAD(head);
	rwlock_t lock = RW_LOCK_UNLOCKED;
    
	add_item(...)  /* called from syscall */
	{
		...
		write_lock_bh(&lock);
		some_read_operations_on_the_list();
		if (some_condition) {
			p = kmalloc(...);
			initialize(p);
			hlist_add_head(&p->list, &head);
		}
		write_unlock_bh(&lock);
	}
    
	del_item(...)  /* called from syscall */
	{
		...
		write_lock_bh(&lock);
		p = find_item_to_delete();
		hlist_del(&p->list);
		kfree(p);
		write_unlock_bh(&lock);
	}
    
	receive_function(...)
	{
		...
		read_lock(&lock);
		hlist_for_each_entry(p, n, &head, list) {
			deliver_packet_to_recv_queue(p);
		}
		read_unlock(&lock);
	}

The problem here is that receive_function() may have to wait a very
long time for the lock while add_item() holds it and blocks in the
call to kmalloc().  For several reasons it's not easy to move the
kmalloc() outside the locked region.

I have also thought about using RCU, but I don't understand it
thoroughly enough.

The straightforward way would be to remove the calls to write_lock(),
replace the list operations with their _rcu counterparts, replace the
calls to read_lock()/read_unlock() with rcu_read_lock()/rcu_read_unlock(),
and replace the call to kfree() with a call to call_rcu(..., kfree, p).
Right?
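
To make that concrete, here is a minimal sketch of the conversion I
have in mind, keeping the pseudocode placeholders from above.  The
struct name, the embedded rcu_head, and the item_free_rcu() wrapper are
only illustrative (as far as I can see, call_rcu() takes an rcu_head
plus a callback rather than an arbitrary pointer), and I assume the
updaters still need some lock to serialize against each other:

	struct item {
		struct hlist_node list;
		struct rcu_head rcu;
		/* ... payload ... */
	};

	HLIST_HEAD(head);
	spinlock_t update_lock = SPIN_LOCK_UNLOCKED; /* writers still serialize */

	static void item_free_rcu(struct rcu_head *rcu)
	{
		kfree(container_of(rcu, struct item, rcu));
	}

	add_item(...)  /* called from syscall */
	{
		...
		spin_lock_bh(&update_lock);
		some_read_operations_on_the_list();
		if (some_condition) {
			p = kmalloc(...);	/* still inside the lock, as above */
			initialize(p);
			hlist_add_head_rcu(&p->list, &head);
		}
		spin_unlock_bh(&update_lock);
	}

	del_item(...)  /* called from syscall */
	{
		...
		spin_lock_bh(&update_lock);
		p = find_item_to_delete();
		hlist_del_rcu(&p->list);
		spin_unlock_bh(&update_lock);
		call_rcu(&p->rcu, item_free_rcu); /* kfree after a grace period */
	}

	receive_function(...)
	{
		...
		rcu_read_lock();
		hlist_for_each_entry_rcu(p, n, &head, list) {
			deliver_packet_to_recv_queue(p);
		}
		rcu_read_unlock();
	}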

This would make the list traversal for packet delivery atomic by
disabling preemption on that CPU.  However, since the list may contain
quite a number of receivers, preemption may be disabled for longer
than I'd like.

So my question is: is it really necessary for the list traversal to be
atomic, i.e. to disable preemption?  According to "Linux Device
Drivers", this is needed for the callback function, so that it can be
called only after the scheduler has run on all CPUs and no reader is
still accessing the list item to be freed.  Is it right that
rcu_read_lock() wouldn't be necessary if I only called
list_add_rcu() and list_del_rcu(), since these make atomic changes and
can run in parallel anyway, even with rcu_read_lock(), on an SMP
system?

If so, I could possibly find another way to kfree() the list item once
no one is using it any more, without having to disable preemption for
too long a time.


Yet another solution would be to have two locks: one to serialize the
multiple processes trying to modify the list, similar to the code
above, and an rwlock to synchronize between writes from the processes
and the softirq routine receiving packets.  The receiving function
would use read_lock() as in the code above, and the del_item() function
would take the write lock just for the two lines

    write_lock_bh(&other_lock);
    hlist_del(&p->list);
    kfree(p);
    write_unlock_bh(&other_lock);
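
A rough sketch of how I imagine that variant (the writer_mutex name is
made up; a semaphore would do as well).  The point would be that the
possibly sleeping kmalloc() happens under the writer-side mutex only,
while the rwlock is held just around the actual list manipulation:

    DEFINE_MUTEX(writer_mutex);              /* serializes add_item()/del_item() */
    rwlock_t other_lock = RW_LOCK_UNLOCKED;  /* taken by the softirq reader */

    add_item(...)  /* called from syscall */
    {
        ...
        mutex_lock(&writer_mutex);
        some_read_operations_on_the_list();
        if (some_condition) {
            p = kmalloc(...);        /* may sleep; rwlock not held here */
            initialize(p);
            write_lock_bh(&other_lock);
            hlist_add_head(&p->list, &head);
            write_unlock_bh(&other_lock);
        }
        mutex_unlock(&writer_mutex);
    }

    del_item(...)  /* called from syscall */
    {
        ...
        mutex_lock(&writer_mutex);
        p = find_item_to_delete();
        write_lock_bh(&other_lock);
        hlist_del(&p->list);
        kfree(p);
        write_unlock_bh(&other_lock);
        mutex_unlock(&writer_mutex);
    }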

However, if possible, I would prefer avoiding a second lock.

BTW, while working on this I thought of two functions I would like to
see in the kernel: one to upgrade a read_lock to a write_lock and one
to downgrade a write_lock to a read_lock.


urs


* Re: Q: locking mechanisms
  2006-07-01  5:58 Q: locking mechanisms Urs Thuermann
@ 2006-07-01  9:36 ` Thomas Gleixner
  2006-07-02  5:26   ` Urs Thuermann
  2006-07-04 12:58   ` Urs Thuermann
From: Thomas Gleixner @ 2006-07-01  9:36 UTC
  To: Urs Thuermann; +Cc: linux-kernel

Urs,

On Sat, 2006-07-01 at 07:58 +0200, Urs Thuermann wrote:
> So my question is: is it really necessary for the list traversal to be
> atomic, i.e. to disable preemption?  According to "Linux Device
> Drivers", this is needed for the callback function, so that it can be
> called only after the scheduler has run on all CPUs and no reader is
> still accessing the list item to be freed.  Is it right that
> rcu_read_lock() wouldn't be necessary if I only called
> list_add_rcu() and list_del_rcu(), since these make atomic changes and
> can run in parallel anyway, even with rcu_read_lock(), on an SMP
> system?

Does Documentation/listRCU.txt answer your questions ?

	tglx




* Re: Q: locking mechanisms
  2006-07-01  9:36 ` Thomas Gleixner
@ 2006-07-02  5:26   ` Urs Thuermann
  2006-07-04 12:58   ` Urs Thuermann
From: Urs Thuermann @ 2006-07-02  5:26 UTC
  To: linux-kernel

Thomas Gleixner <tglx@linutronix.de> writes:

> Does Documentation/listRCU.txt answer your questions ?

I haven't seen that whole RCU directory yet.  Thanks for pointing it
out.

urs


* Re: Q: locking mechanisms
  2006-07-01  9:36 ` Thomas Gleixner
  2006-07-02  5:26   ` Urs Thuermann
@ 2006-07-04 12:58   ` Urs Thuermann
  2006-07-05  6:11     ` Paul E. McKenney
From: Urs Thuermann @ 2006-07-04 12:58 UTC
  To: linux-kernel

Thomas Gleixner <tglx@linutronix.de> writes:

> Does Documentation/listRCU.txt answer your questions ?

It doesn't answer my question.  I have code that receives network
packets by registering with dev_add_pack().  Each packet received is
then delivered to a list of receivers, where this list can contain quite
a lot of items:

	receive_function(struct sk_buff *skb, struct net_device *dev,
			struct packet_type *pt, struct net_device *orig_dev)
	{
		...
		rcu_read_lock();
		head = find_list(dev);
		hlist_for_each_entry_rcu(p, n, head, list) {
			deliver_packet_to_receiver(skb, p);
		}
		rcu_read_unlock();
	}

The deliver_packet_to_receiver() function finally ends up in a call to
sock_queue_rcv_skb().

My question was whether I should worry about "holding" the
rcu_read_lock for the duration of the list traversal, since the list
can be quite long and preemption is disabled between rcu_read_lock()
and rcu_read_unlock().


urs


* Re: Q: locking mechanisms
  2006-07-04 12:58   ` Urs Thuermann
@ 2006-07-05  6:11     ` Paul E. McKenney
  2006-07-05  9:35       ` Urs Thuermann
From: Paul E. McKenney @ 2006-07-05  6:11 UTC
  To: Urs Thuermann; +Cc: linux-kernel

On Tue, Jul 04, 2006 at 02:58:42PM +0200, Urs Thuermann wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
> 
> > Does Documentation/listRCU.txt answer your questions ?
> 
> It doesn't answer my question.  I have code that receives network
> packets by registering with dev_add_pack().  Each packet received is
> then delivered to a list of receivers, where this list can contain quite
> a lot of items:
> 
> 	receive_function(struct sk_buff *skb, struct net_device *dev,
> 			struct packet_type *pt, struct net_device *orig_dev)
> 	{
> 		...
> 		rcu_read_lock();
>                 head = find_list(dev);
> 		hlist_for_each_entry_rcu(p, n, head, list) {
> 			deliver_packet_to_receiver(skb, p);
> 		}
> 		rcu_read_unlock();
> 	}
> 
> The deliver_packet_to_receiver() function finally ends up in a call to
> sock_queue_rcv_skb().
> 
> My question was whether I should worry about "holding" the
> rcu_read_lock for the duration of the list traversal, since the list
> can be quite long and preemption is disabled between rcu_read_lock()
> and rcu_read_unlock().

"Holding" rcu_read_lock() for long time periods is much less of a
concern than holding other types of synchronization mechanisms.
The main concern is the effect on realtime latency in CONFIG_PREEMPT
(but -not- CONFIG_PREEMPT_RT) kernels.  This concern is due to the
fact that rcu_read_lock() suppresses preemption in CONFIG_PREEMPT
kernels.

But I have to ask: roughly how long is "quite long"?

							Thanx, Paul


* Re: Q: locking mechanisms
  2006-07-05  6:11     ` Paul E. McKenney
@ 2006-07-05  9:35       ` Urs Thuermann
  2006-07-05 14:13         ` Paul E. McKenney
From: Urs Thuermann @ 2006-07-05  9:35 UTC
  To: linux-kernel; +Cc: Paul E. McKenney

"Paul E. McKenney" <paulmck@us.ibm.com> writes:

> > I have code that receives network packets by registering with
> > dev_add_pack().  Each packet received is then delivered to a list
> > of receivers, where this list can contain quite a lot of items:
> > 
> > 	receive_function(struct sk_buff *skb, struct net_device *dev,
> > 			struct packet_type *pt, struct net_device *orig_dev)
> > 	{
> > 		...
> > 		rcu_read_lock();
> >             head = find_list(dev);
> > 		hlist_for_each_entry_rcu(p, n, head, list) {
> > 			deliver_packet_to_receiver(skb, p);
> > 		}
> > 		rcu_read_unlock();
> > 	}
> > 
> > The deliver_packet_to_receiver() function finally ends up in a call to
> > sock_queue_rcv_skb().

> "Holding" rcu_read_lock() for long time periods is much less of a
> concern than holding other types of synchronization mechanisms.

Why is that?  I thought that if I hold a spinlock (or rwlock, in my
case) I only block other threads that try to take that same lock,
whereas with rcu_read_lock() I disable preemption, which I thought
affects more (potentially all) other parts of the kernel.

> But I have to ask: roughly how long is "quite long"?

That depends on how many and what types of sockets are open and which
packets they want to receive.  The code is part of a new protocol
family implementation for CAN (Controller Area Network), which you can
see at belios.de, project name socket-can.  It implements several
types of PF_CAN sockets, which register for packets with certain CAN
IDs; those packets are then lightly processed (filtered) and eventually
delivered into a queue using sock_queue_rcv_skb().  Usually, the list
has one receiver per open PF_CAN socket.  Typical usage here has
shown list lengths of 30-100 entries, i.e. 30-100 packets delivered
with sock_queue_rcv_skb().  But it all depends on what user space
does, i.e. how many PF_CAN sockets are opened by all processes.

Is that value of "quite long" also "too long" for doing an
rcu_read_lock()?


urs


BTW, while implementing PF_CAN and reading kernel code and
documentation I often run into questions on details like this which
I'd like answered to get a better understanding of kernel internals,
sometimes not directly related to some concrete implementation
problem.  Is LKML the right place for these questions?


* Re: Q: locking mechanisms
  2006-07-05  9:35       ` Urs Thuermann
@ 2006-07-05 14:13         ` Paul E. McKenney
From: Paul E. McKenney @ 2006-07-05 14:13 UTC
  To: Urs Thuermann; +Cc: linux-kernel

On Wed, Jul 05, 2006 at 11:35:52AM +0200, Urs Thuermann wrote:
> "Paul E. McKenney" <paulmck@us.ibm.com> writes:
> 
> > > I have code that receives network packets by registering with
> > > dev_add_pack().  Each packet received is then delivered to a list
> > > of receivers, where this list can contain quite a lot of items:
> > > 
> > > 	receive_function(struct sk_buff *skb, struct net_device *dev,
> > > 			struct packet_type *pt, struct net_device *orig_dev)
> > > 	{
> > > 		...
> > > 		rcu_read_lock();
> > >             head = find_list(dev);
> > > 		hlist_for_each_entry_rcu(p, n, head, list) {
> > > 			deliver_packet_to_receiver(skb, p);
> > > 		}
> > > 		rcu_read_unlock();
> > > 	}
> > > 
> > > The deliver_packet_to_receiver() function finally ends up in a call to
> > > sock_queue_rcv_skb().
> 
> > "Holding" rcu_read_lock() for long time periods is much less of a
> > concern than holding other types of synchronization mechanisms.
> 
> Why is that?  I thought that if I hold a spinlock (or rwlock, in my
> case) I only block other threads that try to take that same lock,
> whereas with rcu_read_lock() I disable preemption, which I thought
> affects more (potentially all) other parts of the kernel.

In any kernel in which rcu_read_lock() disables preemption, both
spin_lock() and read_lock() (and friends) also disable preemption.
In addition, as you pointed out above, spin_lock(), read_lock(), and
friends also block any other task that tries to acquire the lock in a
manner that conflicts with the task(s) already holding it.

For reference, the various kernel configurations handle preemption as follows:

o	non-CONFIG_PREEMPT:  preemption is already disabled anyway,
	so any time anywhere in the kernel counts against realtime
	latency.

o	CONFIG_PREEMPT: all spinlock acquisitions, as well as rcu_read_lock(),
	disable preemption.

o	CONFIG_PREEMPT_RT (in Ingo Molnar's -rt patchset): neither "normal"
	spinlock acquisitions nor rcu_read_lock() disable preemption.
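
To illustrate the CONFIG_PREEMPT case, the classic RCU read-side
primitives boil down to roughly the following (simplified, not the
exact source):

	/* Roughly what a CONFIG_PREEMPT kernel does for classic RCU;
	 * the real definitions live in include/linux/rcupdate.h. */
	#define rcu_read_lock()		preempt_disable()
	#define rcu_read_unlock()	preempt_enable()

So the read side costs no atomic operations and never contends with
other CPUs, but it does hold off preemption on the local CPU for the
full duration of the critical section, which is exactly the latency
concern above.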

> > But I have to ask: roughly how long is "quite long"?
> 
> That depends on how many and what types of sockets are open and which
> packets they want to receive.  The code is part of a new protocol
> family implementation for CAN (Controller Area Network), which you can
> see at belios.de, project name socket-can.  It implements several
> types of PF_CAN sockets, which register for packets with certain CAN
> IDs; those packets are then lightly processed (filtered) and eventually
> delivered into a queue using sock_queue_rcv_skb().  Usually, the list
> has one receiver per open PF_CAN socket.  Typical usage here has
> shown list lengths of 30-100 entries, i.e. 30-100 packets delivered
> with sock_queue_rcv_skb().  But it all depends on what user space
> does, i.e. how many PF_CAN sockets are opened by all processes.
> 
> Is that value of "quite long" also "too long" for doing an
> rcu_read_lock()?

Seems like 30-100 entries would be OK in many cases -- but is it possible
to use a hash table, a tree, or some such if problems arise?
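
For example (just a sketch with made-up names -- RCV_HASH_SIZE,
hash_can_id(), and rcv_list() are not existing symbols, and canid_t
stands in for whatever type your code uses), hashing the receivers by
CAN ID would shorten each per-packet traversal:

	#define RCV_HASH_SIZE 64	/* made-up size */

	static struct hlist_head rcv_hash[RCV_HASH_SIZE];

	static inline struct hlist_head *rcv_list(canid_t id)
	{
		return &rcv_hash[hash_can_id(id) % RCV_HASH_SIZE];
	}

	/* reader side: id is the received packet's CAN ID, so only
	 * the matching bucket is traversed under rcu_read_lock() */
	rcu_read_lock();
	hlist_for_each_entry_rcu(p, n, rcv_list(id), list)
		deliver_packet_to_receiver(skb, p);
	rcu_read_unlock();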

> urs
> 
> BTW, while implementing PF_CAN and reading kernel code and
> documentation I often run into questions on details like this which
> I'd like answered to get a better understanding of kernel internals,
> sometimes not directly related to some concrete implementation
> problem.  Is LKML the right place for these questions?

Yes, though it is almost always worthwhile to google "xxx site:lwn.net"
or "xxx site:lkml.org" for topic "xxx".  The contents of the
Documentation directory are worth checking as well -- sometimes a
little grepping in that directory can be quite helpful.  In addition,
there are a number of books on Linux kernel internals.

						Thanx, Paul
