archive mirror
 help / color / mirror / Atom feed
From: Ingo Oeser <>
To: "Richard B. Johnson" <>
Cc: Roger Larsson <>,
	Linux kernel <>
Subject: Re: SMP spin-locks
Date: Fri, 15 Jun 2001 14:10:47 +0200	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <> <>
In-Reply-To: <>; from on Thu, Jun 14, 2001 at 05:05:07PM -0400

On Thu, Jun 14, 2001 at 05:05:07PM -0400, Richard B. Johnson wrote:
> The problem is that a data acquisition board across the PCI bus
> gives a data transfer rate of 10 to 11 megabytes per second
> with a UP kernel, and the transfer drops to 5-6 megabytes per
> second with a SMP kernel. The ISR is really simple and copies
> data, that's all.
> The 'read()' routine uses a spinlock when it modifies pointers.
> I started to look into where all the CPU clocks were going. The
> SMP spinlock code is where it's going. There is often contention
> for the lock because interrupts normally occur at 50 to 60 kHz.

Then you need another (better?) queueing mechanism.

Use multiple queues and a _overflowable_ sequence number as
global variable between the queues. 

N Queues (N := no. of CPUs + 1), which have a spin_lock for each

optionally: One reader packet reassembly priority queue (APQ) ordered by
   sequence number (implicitly or explicitly), if this shouldn't
   be done in user space.

In the writer ISR: 

   Foreach Queue in RR order (start with remebered one):
   - Try to lock it with spin_trylock (totally inline!)
     + Failed
        * if we failed to find a free queue for x "rounds", disable
          device (we have no reader) and notify user space somehow
       * increment "rounds" 
       * next queue
     + Succeed
       * Increment sequence number
       * Put data record into queue
      (* remember this queue as last queue used)
      (* mark queue "not empty")
       * do other IRQ work...

In the reader routine:
   Foreach Queue in RR order (start with remebered one):
   - No data counter above threshold -> EAGAIN [1] 
   - Try to lock it with spin_trylock (totally inline!)
     + Failed -> next queue
     + Succeed
       * if queue empty, unlock and try next one
      (* remember this queue as last queue used)
       * Get one data record from queue (in queue order!)
       * Move data record into APQ
       * Unlock queue
       * Deliver as much data from the APQ, as the user wants and
         is available
    - if all queues empty or locked -> increment "no data round"

   The "last queue used" variable is static, but local to routine.
   It is there to decrease the number of iterations and distribute
   the data to all queues as more equally.

   Statistics about lock contention per queue, per round and per
   try would be nice here to estimate the number of queues

   The APQ can be quite large, if the sequences are bad
   distributed and some queues tend to be always locked, if the
   reader wants to read from this queue.

   The above can be solved by 2^N "One entry queues" (aka slots)
   and sequence numbers mapping to this slots. If you need many
   slots (more then 256, I would say) then this is again 
   inaccaptable, because of the iteration cost in the ISR.
What do you think? After some polishing this should decrease lock
contention noticibly.


Ingo Oeser

[1] Blocking will be harder to implement here, since we need to
   notify the reader routine, that we have data available, which
   involves some latency you cannot afford. Maybe this could be
   done via schedule_task(), if needed.
Use ReiserFS to get a faster fsck and Ext2 to fsck slowly and gently.

  parent reply	other threads:[~2001-06-15 12:10 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-06-14 17:26 Richard B. Johnson
2001-06-14 17:32 ` David S. Miller
2001-06-14 17:35 ` Kurt Garloff
2001-06-15  6:51   ` Doug Ledford
2001-06-14 20:42 ` Roger Larsson
2001-06-14 21:05   ` Richard B. Johnson
2001-06-14 21:30     ` Roger Larsson
2001-06-15  3:21       ` Richard B. Johnson
2001-06-15  2:33         ` David Lang
2001-06-15 10:35         ` David Schwartz
2001-06-15 13:26           ` Richard B. Johnson
2001-06-15 12:10     ` Ingo Oeser [this message]
2001-06-15 12:49       ` Richard B. Johnson
2001-06-15 15:52     ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \
    --subject='Re: SMP spin-locks' \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).