* Re: 8xx: Work around CPU15 erratum.
@ 2008-05-14 14:52 Ben Gardiner
  2008-05-14 16:23 ` Dan Malek
  0 siblings, 1 reply; 3+ messages in thread
From: Ben Gardiner @ 2008-05-14 14:52 UTC
  To: linuxppc-embedded

Hello,

This is my first post to the linuxppc-embedded list. Please forgive me
for jumping in late on the CPU15 workaround discussions, but my
company is very interested in finding an efficient fix for this silicon
erratum on our devices.

I applied the patch submitted by Scott Wood to our patched denx 2.4.24 
tree and did some timing measurements to compare its effect on 
performance. We noted a worst-case 37% slowdown. This was during
program load, where the PC is moving all over the place, so the penalty
for all the tlbie instructions is felt heavily. This is solely anecdotal;
I don't mean to imply that this fix will result in any particular
slowdown on anyone else's systems.

I would like to start a discussion on the possibility of implementing a 
selective invalidation of pages on i-tlb miss. On July 20th, 2007 Scott 
Wood wrote:
 > The only lower-overhead workaround I know of requires compiler
 > modifications (and I made it configurable to allow for that possibility).

So there likely are reasons why the following is not possible:
In the errata document from Freescale
(http://www.freescale.com/files/32bit/doc/errata/MPC860CE.pdf?fpsp=1),
they list an alternative to the "invalidate previous and next page" itlb
miss handler (see pg 57), where they use the SPS/SH/CI bits of an L2 page
entry to indicate whether: 1) a page is unchecked, 2) it does not
contain a bad branch, 3) it does contain a bad branch, or 4) it has a
previous page with a bad branch. These bits are zeroed before being
written to MI_RPN. In loose terms, the itlb miss handler does the
following: if the page is unchecked, it checks the last instruction of
the current page or the previous page and writes the status bits to the
L2 entry accordingly. If the page has been checked, the status bits are
used to selectively invalidate the next or previous page as needed.
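
For illustration, the decision logic described above might be modelled
roughly as follows. This is a toy C sketch, not the Freescale listing:
the names (page_state, itlb_miss_action, ends_in_bad_branch) are
hypothetical, and two points are assumptions on my part: that a bad
branch in the current page takes priority over one in the previous
page, and that "has a bad branch" maps to invalidating the next page
while "previous page has a bad branch" maps to invalidating the
previous one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page states, following the four cases listed above. */
enum page_state {
    PAGE_UNCHECKED = 0,
    PAGE_NO_BAD_BRANCH,
    PAGE_HAS_BAD_BRANCH,
    PAGE_PREV_HAS_BAD_BRANCH
};

/* Which neighbouring page the miss handler should invalidate. */
enum { INVAL_NONE = 0, INVAL_PREV = 1, INVAL_NEXT = 2 };

/* Toy model of a page: the cached state plus a flag standing in for
 * "the last instruction of this page is a problematic branch". */
struct page {
    enum page_state state;
    int ends_in_bad_branch;
};

/* Classify an unchecked page once. Assumption: a bad branch in the
 * current page takes priority over one in the previous page. */
static enum page_state classify(const struct page *pages, size_t idx)
{
    if (pages[idx].ends_in_bad_branch)
        return PAGE_HAS_BAD_BRANCH;
    if (idx > 0 && pages[idx - 1].ends_in_bad_branch)
        return PAGE_PREV_HAS_BAD_BRANCH;
    return PAGE_NO_BAD_BRANCH;
}

/* Model of one itlb miss on page idx: classify the page if it is still
 * unchecked, then report which neighbour (if any) to invalidate.
 * Assumption: the state-to-action mapping below is my reading of the
 * summary, not taken from the errata listing. */
static int itlb_miss_action(struct page *pages, size_t npages, size_t idx)
{
    assert(idx < npages);
    if (pages[idx].state == PAGE_UNCHECKED)
        pages[idx].state = classify(pages, idx);

    switch (pages[idx].state) {
    case PAGE_HAS_BAD_BRANCH:
        return INVAL_NEXT;
    case PAGE_PREV_HAS_BAD_BRANCH:
        return INVAL_PREV;
    default:
        return INVAL_NONE;
    }
}
```

The point of the cached state is that the instruction inspection
happens only on the first miss for a page; later misses only read the
stored state and issue at most one invalidate.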

Since MI_RPN is always written with bits 24, 25, 26 and 27 set, could we
use bits 24, 25 and 26 in memory to indicate the four possible page states
and perform selective page invalidation as demonstrated in their example 
listing?
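
To make the bit arithmetic concrete, here is a small C sketch of storing
a state value in bits 24-26 of the in-memory entry and forcing bits
24-27 on in the value that goes to MI_RPN. It uses PowerPC bit numbering
(bit 0 is the most significant bit of the 32-bit word). The macro and
function names are hypothetical, and whether Linux actually leaves these
bits free in the PTE is exactly the open question; the sketch only shows
that the stored state would not reach the hardware register if the bits
are always set on the way out.

```c
#include <assert.h>
#include <stdint.h>

/* PowerPC numbers bits from the most significant end: bit 0 is the MSB. */
#define PPC_BIT(n)       (UINT32_C(1) << (31 - (n)))

/* Hypothetical layout: page state kept in bits 24..26 of the PTE word. */
#define PTE_STATE_SHIFT  5  /* bit 26 corresponds to 1 << 5 */
#define PTE_STATE_MASK   (PPC_BIT(24) | PPC_BIT(25) | PPC_BIT(26)) /* 0xE0 */

/* Bits 24..27 are always written as ones to MI_RPN. */
#define RPN_FORCED_SET   (PTE_STATE_MASK | PPC_BIT(27))            /* 0xF0 */

/* Store a state value (0..7) in the in-memory PTE word. */
static uint32_t pte_set_state(uint32_t pte, unsigned state)
{
    assert(state <= 7);
    return (pte & ~PTE_STATE_MASK) | ((uint32_t)state << PTE_STATE_SHIFT);
}

/* Read the state value back from the in-memory PTE word. */
static unsigned pte_get_state(uint32_t pte)
{
    return (unsigned)((pte & PTE_STATE_MASK) >> PTE_STATE_SHIFT);
}

/* Value handed to MI_RPN: the state bits are overwritten with ones, so
 * the register sees the same value whatever state is stored. */
static uint32_t pte_to_mi_rpn(uint32_t pte)
{
    return pte | RPN_FORCED_SET;
}
```

With this layout, pte_to_mi_rpn() returns the same value for every
stored state, which is the property the question relies on.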

Regards,
Ben Gardiner

Nanometrics Inc.
250 Herzberg Rd.
Kanata ON
K2K 2A1
Telephone 613 592 6776 ext 239


* Re: 8xx: Work around CPU15 erratum.
From: Dan Malek @ 2008-05-14 16:23 UTC
  To: Ben Gardiner; +Cc: linuxppc-embedded


On May 14, 2008, at 10:52 AM, Ben Gardiner wrote:

> So there likely are reasons why the following is not possible:

That's way too much code for a tlb exception handler.
From a system resource perspective, you are much better
off with a small and efficient piece of tlb loading code,
always invalidating pages on both ends and taking the
tlb exception fault.   Unfortunately, this could cause some
thrashing edge cases, so a little intelligence would be
needed.   Exception processing isn't free, and it quickly
destroys the cache footprint of your application, further
slowing down the entire system.  The tlb reload handler
goal should be maximum of 8 instructions and 4 memory
accesses, not 4K of elaborate conditional testing.  :-)

Thanks.

	-- Dan


* Re: 8xx: Work around CPU15 erratum.
From: Ben Gardiner @ 2008-05-14 20:17 UTC
  To: Dan Malek; +Cc: linuxppc-embedded

Dan Malek wrote:
>
> On May 14, 2008, at 10:52 AM, Ben Gardiner wrote:
>
>> So there likely are reasons why the following is not possible:
>
> That's way too much code for a tlb exception handler.
> From a system resource perspective, you are much better
> off with a small and efficient piece of tlb loading code,
> always invalidating pages on both ends and taking the
> tlb exception fault.   Unfortunately, this could cause some
> thrashing edge cases, so a little intelligence would be
> needed.   Exception processing isn't free, and it quickly
> destroys the cache footprint of your application, further
> slowing down the entire system.  The tlb reload handler
> goal should be maximum of 8 instructions and 4 memory
> accesses, not 4K of elaborate conditional testing.  :-)
>
> Thanks.
>
>     -- Dan
>
Hi Dan,
Thanks for the rapid reply :) I really appreciate you giving me an
answer "from the horse's mouth." I would still like to experiment a
little, but I'm not really sure whether it is safe to use any bits in a PTE.

Assuming I were crazy enough to ruin my cache footprint, are there any
three bits in the PTE that are safe to use for some page status 
information?

Best Regards,
Ben
