* using prefetch
@ 2013-02-15 10:16 Kevin Wilson
  2013-02-15 16:15 ` Valdis.Kletnieks at vt.edu
  2013-02-15 16:42 ` michi1 at michaelblizek.twilightparadox.com
  0 siblings, 2 replies; 4+ messages in thread
From: Kevin Wilson @ 2013-02-15 10:16 UTC (permalink / raw)
  To: kernelnewbies

Hello, kernelnewbies
There are many cases in drivers where I see calls to prefetch() on
some variable.
For example,
    prefetch(&skb->end); in bnx2_tx_int(),
http://lxr.free-electrons.com/source/drivers/net/ethernet/broadcom/bnx2.c

AFAIK, what prefetch does is get a variable from memory and put it in
cache (L2 cache I believe).
Is the prefetch operation synchronous? I mean, after calling it, are
we guaranteed that the variable is
indeed in the cache?

So this is probably for improving performance, assuming that you will
need this variable in the near
future.
The comment there says:
/* prefetch skb_end_pointer() to speedup skb_shinfo(skb) */

According to this logic, anywhere that we want to call skb_shinfo(skb)
we better do a prefetch before.

In fact, if we prefetch any variable that we want to use then we end up
with a performance boost.

So - any hints, what are the guidelines for using prefetch()?

rgs,
Kevin


* using prefetch
  2013-02-15 10:16 using prefetch Kevin Wilson
@ 2013-02-15 16:15 ` Valdis.Kletnieks at vt.edu
  2013-02-15 16:42 ` michi1 at michaelblizek.twilightparadox.com
  1 sibling, 0 replies; 4+ messages in thread
From: Valdis.Kletnieks at vt.edu @ 2013-02-15 16:15 UTC (permalink / raw)
  To: kernelnewbies

On Fri, 15 Feb 2013 12:16:02 +0200, Kevin Wilson said:

> Is the prefetch operation synchronous? I mean, after calling it, are
> we guaranteed that the variable is
> indeed in the cache?

No, the whole *point* is that it's asynchronous.  You issue the prefetch
several lines of code before you need it to be in cache, so that you
can get several lines of hopefully not data-dependent code to run while
the cache line fetch happens, rather than take a stall when you reference
the variable.  The prefetch may in fact not complete in time, but at
worst you end up just stalling for a cache miss the same as you would have
otherwise.
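
A rough user-space sketch of that pattern, not the bnx2 code: it uses GCC's
__builtin_prefetch(), which as far as I know is what the kernel's generic
prefetch() expands to when the architecture provides nothing better. The
names struct packet, tail_info and process() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure: tail_info sits 256 bytes in, so it is likely
 * to live on a different cache line than the start of the struct. */
struct packet {
    char header[256];
    long tail_info;
};

long process(const struct packet *pkt, const long *scratch, size_t n)
{
    long acc = 0;

    assert(pkt != NULL);

    /* Hint only: starts the fetch (if the CPU honours it) and returns
     * immediately without waiting for the data. */
    __builtin_prefetch(&pkt->tail_info);

    /* Independent work that does not touch tail_info, giving the
     * cache line fetch time to complete in the background. */
    for (size_t i = 0; i < n; i++)
        acc += scratch[i];

    /* If the line has not arrived yet, this load simply stalls the
     * same way it would have without the prefetch. */
    return acc + pkt->tail_info;
}
```

The result is the same with or without the prefetch line; only the timing
can differ, and whether it helps has to be measured.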

> According to this logic, anywhere that we want to call skb_shinfo(skb)
> we better do a prefetch before.

No, because most references to skb will be cache-hot because you're in the
middle of the IP stack, which touches the skb struct all over the place, and
therefore it's probably in L2 already.

> In fact, if we prefetch any variable that we want to use then we end up
> with a performance boost.

Nope. Not as true as you might think.  If you play around with the 'perf'
command you'll find out that on modern processors you'll see a 98% or so
hit rate on the L2 cache - so 98% of the time you'll *waste* a cycle
issuing the opcode needlessly.

If you look carefully at some of the other structs in the net/ subtree,
you'll see where they've put variables together so that once you reference
one field of the struct, all/most of the needed stuff gets sucked in on
the same cache line.  That's probably more productive than trying to add
prefetch calls all over the place.
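
A minimal sketch of that layout idea, assuming a 64-byte cache line (common
on x86 but not universal). struct flow_stats and on_same_line() are
hypothetical names, not taken from net/; the kernel has its own annotations
such as ____cacheline_aligned for the alignment part.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64 /* assumption: adjust for the target CPU */

/* Hypothetical per-flow record. The fields touched on every packet are
 * grouped at the front; the rarely used description comes last. */
struct flow_stats {
    unsigned long packets;
    unsigned long bytes;
    unsigned int last_seen;
    unsigned int flags;
    char description[128];
} __attribute__((aligned(CACHE_LINE_SIZE)));

/* Fail the build if the hot fields spill past the first cache line. */
static_assert(offsetof(struct flow_stats, flags) + sizeof(unsigned int)
              <= CACHE_LINE_SIZE, "hot fields must share one cache line");

/* Two offsets within a cache-line-aligned object share a line when
 * they fall into the same CACHE_LINE_SIZE-sized block. */
int on_same_line(size_t off_a, size_t off_b)
{
    return off_a / CACHE_LINE_SIZE == off_b / CACHE_LINE_SIZE;
}
```

With this layout, touching packets pulls in bytes, last_seen and flags with
the same miss, so no explicit prefetch is needed for them.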

> So - any hints, what are the guidelines for using prefetch()?

Only use it if you have good reason to believe that you *will* need
that variable (in other words, it's not in the unlikely half of an if
statement or something) *and* there's a good chance that the
variable/memory is cache-cold.

* using prefetch
  2013-02-15 10:16 using prefetch Kevin Wilson
  2013-02-15 16:15 ` Valdis.Kletnieks at vt.edu
@ 2013-02-15 16:42 ` michi1 at michaelblizek.twilightparadox.com
  2013-02-15 20:18   ` Kevin Wilson
  1 sibling, 1 reply; 4+ messages in thread
From: michi1 at michaelblizek.twilightparadox.com @ 2013-02-15 16:42 UTC (permalink / raw)
  To: kernelnewbies

Hi!

On 12:16 Fri 15 Feb     , Kevin Wilson wrote:
...
> AFAIK, what prefetch does is get a variable from memory and put it in
> cache (L2 cache I believe).

Yes, this is true. See:
http://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Other-Builtins.html
I am not so sure about the cache level it is fetched to.

> Is the prefetch operation synchronous? I mean, after calling it, are
> we guaranteed that the variable is
> indeed in the cache?

No, the variable is definitely not guaranteed to be in the cache. This would
not make any sense. The purpose of the prefetch is to fetch data in the
background while executing something else.

Actually it is not guaranteed to fetch anything at all. The target cpu might
not support the feature at all. Even if it does there are cases where it will
not be prefetched, e.g. when it triggers a page fault. Also the cpu itself
might decide not to do the prefetch, e.g. when the cache line is present (and
locked by cache coherency) in the cache of a different cpu/core.
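
To illustrate the "hint only" nature, here is a small sketch using
__builtin_prefetch() from the GCC page linked above. According to that
documentation, the prefetch itself does not fault on an invalid address;
only the expression that computes the address has to be valid. struct node,
link_nodes() and count_nodes() are made-up names for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list node. */
struct node {
    struct node *next;
    long value;
};

/* Chain nodes[0..n-1] into a NULL-terminated list and return its head. */
struct node *link_nodes(struct node *nodes, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i + 1 < n; i++)
        nodes[i].next = &nodes[i + 1];
    nodes[n - 1].next = NULL;
    return &nodes[0];
}

size_t count_nodes(const struct node *head)
{
    size_t count = 0;

    for (const struct node *n = head; n; n = n->next) {
        /* On the last node n->next is NULL. Reading n->next is valid
         * because n is valid; the prefetch of NULL is just a hint that
         * fetches nothing and does not fault. */
        __builtin_prefetch(n->next);
        count++;
    }
    return count;
}
```

That a NULL prefetch is harmless for correctness does not make it free, so
this is not a recommendation to prefetch in every list walk.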

> So this is probably for improving performance, assuming that you will
> need this variable in the near
> future.
> The comment there says:
> /* prefetch skb_end_pointer() to speedup skb_shinfo(skb) */
> 
> According to this logic, anywhere that we want to call skb_shinfo(skb)
> we better do a prefetch before.
> 
> In fact, if we prefetch any variable that we want to use then we end up
> with a performance boost.
> 
> So - any hints, what are the guidelines for using prefetch()?

You really should *not* prefetch() all variables you want to use. Prefetch
itself generates code which needs cpu cycles. It can quickly make your program
slower. Use it only in places where
- the data is very unlikely to be in the cache of either the current or any
  other cpu in the system *and*
- you can add the prefetch instruction at least 100ns before the actual use
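
In a loop, the second condition usually turns into a "prefetch distance":
you ask for the data of a later iteration while working on the current one.
A rough sketch for an index-driven (non-sequential) access pattern follows;
gather_sum() and PREFETCH_AHEAD are hypothetical, and the value 8 is a
placeholder that would have to be tuned by measurement.

```c
#include <assert.h>
#include <stddef.h>

#define PREFETCH_AHEAD 8 /* hypothetical distance, tune by benchmarking */

/* Sum table[idx[i]] for i in [0, n). The table accesses jump around,
 * which is the kind of pattern a hardware prefetcher cannot predict. */
long gather_sum(const long *table, const size_t *idx, size_t n)
{
    long total = 0;

    assert(n == 0 || (table != NULL && idx != NULL));

    for (size_t i = 0; i < n; i++) {
        /* Request the element needed PREFETCH_AHEAD iterations from
         * now (read access, low temporal locality). The bounds check
         * keeps the idx[] read itself inside the array. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&table[idx[i + PREFETCH_AHEAD]], 0, 1);

        total += table[idx[i]];
    }
    return total;
}
```

If the distance is too short the data arrives late; if it is too long the
line may be evicted again before it is used.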

Also, if you access a reasonably large memory array sequentially (either
forward or backward), you should not use prefetch() at all. The cpus have
hardware prefetchers which are faster in this case.


General advice for performance optimisation: run benchmarks
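
A minimal sketch of such a benchmark for the sequential-array case, assuming
a user-space test with GCC or Clang. sum_plain(), sum_prefetch() and
time_sum() are made-up helpers, and the distance of 16 elements is an
arbitrary placeholder. clock() measures CPU time and is coarse, so use large
arrays and several runs.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Plain sequential sum: relies on the hardware prefetcher only. */
long sum_plain(const long *a, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Same sum with an explicit software prefetch 16 elements ahead. */
long sum_prefetch(const long *a, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16]);
        total += a[i];
    }
    return total;
}

/* Run fn once over a[0..n-1], store its result in *out and return the
 * CPU time it used, in seconds. */
double time_sum(long (*fn)(const long *, size_t),
                const long *a, size_t n, long *out)
{
    clock_t start;

    assert(fn != NULL && out != NULL);

    start = clock();
    *out = fn(a, n);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_sum() with both functions on the same large array and compare the
returned times; the two sums must match, and only the measurement can tell
you whether the explicit prefetch helps or hurts on your CPU.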

	-Michi
-- 
programing a layer 3+4 network protocol for mesh networks
see http://michaelblizek.twilightparadox.com


* using prefetch
  2013-02-15 16:42 ` michi1 at michaelblizek.twilightparadox.com
@ 2013-02-15 20:18   ` Kevin Wilson
  0 siblings, 0 replies; 4+ messages in thread
From: Kevin Wilson @ 2013-02-15 20:18 UTC (permalink / raw)
  To: kernelnewbies

Thanks!
KW


