From: Tejun Heo <tj@kernel.org>
To: Christoph Lameter <cl@linux.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Pekka Enberg <penberg@kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org,
	Thomas Gleixner <tglx@linutronix.de>
Subject: Re: [GIT PULL] slab fixes for 3.2-rc4
Date: Wed, 21 Dec 2011 09:05:35 -0800
Message-ID: <20111221170535.GB9213@google.com>
In-Reply-To: <alpine.DEB.2.00.1112210851230.9601@router.home>

Hello, Christoph.

On Wed, Dec 21, 2011 at 09:16:24AM -0600, Christoph Lameter wrote:
> __this_cpu ops are generally the most useless. You can basically do the
> same thing by open coding it. But then on x86 you'd miss out on generating
> a simple inc seg:var instruction that does not impact registers. Plus you
> avoid the necessity of calculating the address first. Instead of one
> instruction you'd have 5.
>
> Dropping preemption protected ones is going to be difficult given their
> use in key subsystems.
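
To make the "open coding" point above concrete, here is a minimal userspace sketch of what the open-coded form has to do. It is only a model: sim_counters, sim_current_cpu and sim_open_coded_inc() are hypothetical stand-ins invented for illustration, not kernel interfaces.

```c
#include <assert.h>

#define SIM_NR_CPUS 4

static long sim_counters[SIM_NR_CPUS];  /* one slot per simulated CPU */
static int  sim_current_cpu;            /* hypothetical stand-in for the current CPU id */

/* Open-coded form: find this CPU's slot, then read-modify-write it. */
static void sim_open_coded_inc(void)
{
	int cpu = sim_current_cpu;      /* step 1: which CPU am I on? */
	long *slot;

	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	slot = &sim_counters[cpu];      /* step 2: compute the slot's address */
	*slot = *slot + 1;              /* step 3: load, add, store */
}
```

Each commented step is separate work for the compiler to emit, which is what the quoted text contrasts with a single segment-prefixed inc on x86.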

The thing is that the irqsafe ones are the "complete" ones.  An irqsafe
op can always stand in for a preempt-safe one, but not the other way
around.  That matters only if flipping irqs is noticeably more
expensive than inc/dec'ing the preempt count, but I suspect there are
enough such machines.  (cc'ing arch) Does anyone have better insight
here?  How much more expensive is local irq save/restore than
inc/dec'ing the preempt count on various archs?
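
For readers following along, the two protection levels being compared can be sketched roughly like this. It is a userspace model only: every sim_* name is a hypothetical stand-in, and the bodies merely imitate "inc/dec the preempt count" and "save/restore the irq state"; none of it is the real kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for kernel state; NOT the real kernel primitives. */
static int  sim_preempt_count;          /* > 0 means preemption is disabled */
static bool sim_irqs_enabled = true;
static long sim_counter;                /* pretend this is one CPU's slot */

static void sim_preempt_disable(void) { sim_preempt_count++; }

static void sim_preempt_enable(void)
{
	assert(sim_preempt_count > 0);  /* enable must pair with a disable */
	sim_preempt_count--;
}

static bool sim_local_irq_save(void)
{
	bool flags = sim_irqs_enabled;  /* remember the previous irq state */

	sim_irqs_enabled = false;
	return flags;
}

static void sim_local_irq_restore(bool flags) { sim_irqs_enabled = flags; }

/* Unprotected op: the caller must already provide protection. */
static void sim_raw_inc(void) { sim_counter++; }

/* Preempt-safe op: cost is one inc and one dec of the preempt count. */
static void sim_preempt_safe_inc(void)
{
	sim_preempt_disable();
	sim_raw_inc();
	sim_preempt_enable();
}

/* Irqsafe op: cost is an irq save/restore around the same update. */
static void sim_irqsafe_inc(void)
{
	bool flags = sim_local_irq_save();

	sim_raw_inc();
	sim_local_irq_restore(flags);
}
```

In this model sim_irqsafe_inc() is safe anywhere sim_preempt_safe_inc() is, since nothing can interrupt the update at all, which is the sense in which the irqsafe ops are "complete". The open question is only what the irq save/restore costs on a given arch.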

> > Christoph, what do you think?  What would be the minimal set that you
> > can work with?
> 
> If you are just talking about the slub allocator and the this_cpu_cmpxchg
> variants there, then the irqsafe variants of cmpxchg and cmpxchg_double
> are sufficient.
> 
> However, the this_cpu ops are widely used in many subsystems for keeping
> statistics. Their main role is to keep the overhead of incrementing/adding
> to counters as minimal as possible. Changes there would cause instructions
> to be generated that are longer in size and also would cause higher
> latency of execution. Generally the irqsafe variants are not needed for
> counters so we may be able to toss those.
> 
> this_cpu ops are not sloppy unless one intentionally uses __this_cpu_xxx
> in a non-preempt-safe context, which was the case for the vmstat counters
> for a while.
> 
> The number of this_cpu functions may be excessive because I tried to cover
> all possible use cases rather than the forms actually used in the kernel.
> So a lot of things could be weeded out. this_cpu ops are a way to
> experiment with different forms of synchronization that are particularly
> important for fastpaths implementing per cpu caching. This could be of
> relevance to many of the allocators in the future.
> 
> The way that the cmpxchg things are used is also similar to transactional
> memory that is becoming available in the next generation of processors by
> Intel and that is already available in the current generation of powerpc
> processors by IBM. It is a way to avoid locking overhead.
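
The "transaction-like" cmpxchg pattern described above looks roughly like the following in portable C11. This is a sketch under stated assumptions: it uses atomic_compare_exchange_weak() from <stdatomic.h> rather than the kernel's this_cpu_cmpxchg, and struct sim_object, sim_freelist, sim_push() and sim_pop() are hypothetical names for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

struct sim_object {
	struct sim_object *next;
};

/* Head of a lockless free list; zero-initialized, so it starts out empty. */
static _Atomic(struct sim_object *) sim_freelist;

/* Push: read the head, link to it, commit only if the head is unchanged. */
static void sim_push(struct sim_object *obj)
{
	struct sim_object *old = atomic_load(&sim_freelist);

	do {
		obj->next = old;        /* on failure 'old' was refreshed; relink */
	} while (!atomic_compare_exchange_weak(&sim_freelist, &old, obj));
}

/* Pop: read the head, try to swing it to head->next, retry on conflict. */
static struct sim_object *sim_pop(void)
{
	struct sim_object *old = atomic_load(&sim_freelist);

	while (old &&
	       !atomic_compare_exchange_weak(&sim_freelist, &old, old->next))
		;                       /* 'old' now holds the current head; retry */
	return old;
}
```

No lock is taken; a conflicting update just makes the compare fail and the loop retry. Note that a single-word compare like this is exposed to the ABA problem when objects are recycled concurrently. Comparing a second word alongside the pointer, which is presumably the role of the cmpxchg_double form mentioned above, is one way to guard against that.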

Hmmm... how about removing the ones which aren't currently in use?
The percpu API in general needs a lot more cleanup, but I think that
would be a good starting point.

Thanks.

-- 
tejun

