linux-kernel.vger.kernel.org archive mirror
* memcpy to videoram eats too much CPU on ATI cards (cache trashing?)
@ 2001-08-27 18:13 Peter Surda
  2001-08-27 19:22 ` Alan Cox
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Surda @ 2001-08-27 18:13 UTC (permalink / raw)
  To: linux-kernel

Dear kernel gurus,

First of all I want to apologise for writing here, but I think this is the
place with the greatest chance of getting some help. I have done extensive
research on the problem and talked to many people, including driver developers,
but have not been able to find a solution yet.

So, first a little intro: when watching videos with Xv under XFree86, a certain
function is called to transfer the data from system RAM to video RAM. This
function is driver-specific, but in all the drivers I checked (mga, tdfx,
mach64, r128) it has the same contents and looks copy-pasted. It basically does
a for (h--) memcpy (blah, blah, blah).
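
A minimal C sketch of the per-line copy loop described above. The function
name, the parameters, and the separate source and destination pitches are
assumptions made for illustration; this is not code taken from any of the
drivers named.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy an image of h lines, each w bytes wide, one line at a time.
 * The pitch is the number of bytes between the starts of two consecutive
 * lines; it may differ between source and destination, which is why a
 * single large memcpy is not enough. All names here are hypothetical.
 */
static void copy_lines(unsigned char *dst, const unsigned char *src,
                       size_t dst_pitch, size_t src_pitch,
                       size_t w, size_t h)
{
    assert(w <= dst_pitch && w <= src_pitch);
    while (h--) {
        memcpy(dst, src, w);
        src += src_pitch;
        dst += dst_pitch;
    }
}
```

In a real driver the destination would point into the mapped framebuffer, so
every store in the loop goes across the bus to the card.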

The point is that with mga, tdfx and, from what I heard, nvidia too, this doesn't
cause any CPU load (or more precisely, non-measurable load). However, with
mach64 and r128, it DOES. I did some more research.

memcpy-ing 380kB at 25fps takes about 5ms per frame and causes X to eat 1% cpu
time (time measurements were done by tsc)
memcpy-ing 760kB at 25fps takes about 11ms per frame, but instead of eating
2% CPU time, it eats 35% (yes, that's 35 times more)

The speed isn't the real problem (when you multiply it out you get about 70MB/s
and that's definitely enough). The problem is that this eats CPU time, and
it shouldn't (or at least not so much).
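
The throughput figure can be checked with a few lines of C. This is only a
sketch of the arithmetic; the function name is made up, and it assumes
1 kB = 1000 bytes, as the round "70MB/s" figure suggests.

```c
#include <assert.h>

/*
 * Transfer rate in kB/s while a copy is in progress, given the amount of
 * data per frame (in kB) and the time one copy takes (in ms).
 * Integer division truncates the result.
 */
static long copy_rate_kb_per_s(long frame_kb, long copy_ms)
{
    assert(copy_ms > 0);
    return frame_kb * 1000 / copy_ms;
}
```

With the measurements above, 760 kB in 11 ms comes to roughly 69 MB/s and
380 kB in 5 ms to 76 MB/s, which is consistent with "about 70MB/s".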

This happens on both of my systems, one with a PIIMobile/366 and mach64, and one
with a Duron 650 and r128. I had a voodoo before for tests, and its CPU load
wasn't measurable; from what I heard the same is true of mga and nvidia, so it
is something ATI-specific. Some other people with ATI cards have the same
problem (from what I read on the gatos-devel list), and I have never heard
anyone explicitly say "the problem doesn't exist on my ATI".

MTRR is enabled correctly; disabling it only worsens the problem.

I have been in close contact with the driver developer and the XFree86
maintainer, but neither of them seems to know exactly how to solve it. The
current theory is that this is caused by some cache being trashed, but I have
no idea how to fix it.

Oh yes, I already tried using a memcpy written in assembly utilizing MMX, but
it made no difference.

I would be very grateful for ideas.

Please CC me, I'm not on the list.

Bye,

Peter Surda (Shurdeek) <shurdeek@panorama.sth.ac.at>, ICQ 10236103, +436505122023

--
Failure is not an option. It comes bundled with your Microsoft product.


* Re: memcpy to videoram eats too much CPU on ATI cards (cache trashing?)
  2001-08-27 18:13 memcpy to videoram eats too much CPU on ATI cards (cache trashing?) Peter Surda
@ 2001-08-27 19:22 ` Alan Cox
  2001-08-27 21:40   ` Rogier Wolff
  0 siblings, 1 reply; 7+ messages in thread
From: Alan Cox @ 2001-08-27 19:22 UTC (permalink / raw)
  To: Peter Surda; +Cc: linux-kernel

> cause any CPU load (or more precisely, non-measurable load). However, with
> mach64 and r128, it DOES. I did some more research.

Makes sense

> memcpy-ing 380kB at 25fps takes about 5ms per frame and causes X to eat 1% cpu
> time (time measurements were done by tsc)
> memcpy-ing 760kB at 25fps takes about 11ms per frame, but instead of eating
> 2% CPU time, it eats 35% (yes, that's 35 times more)

So presumably at this bandwidth you are beginning to collide with the graphics
controllers and it will stall your cycles, at which point it's going to stall
the CPU and so effectively eat CPU time

Sounds like a reasonable explanation to me

* Re: memcpy to videoram eats too much CPU on ATI cards (cache trashing?)
  2001-08-27 19:22 ` Alan Cox
@ 2001-08-27 21:40   ` Rogier Wolff
  2001-08-27 22:01     ` Peter Surda
  0 siblings, 1 reply; 7+ messages in thread
From: Rogier Wolff @ 2001-08-27 21:40 UTC (permalink / raw)
  To: Alan Cox; +Cc: Peter Surda, linux-kernel

Alan Cox wrote:
> > cause any CPU load (or more precisely, non-measurable load). However, with
> > mach64 and r128, it DOES. I did some more research.
> 
> Makes sense
> 
> > memcpy-ing 380kB at 25fps takes about 5ms per frame and causes X to eat 1% cpu
> > time (time measurements were done by tsc)

1%? 5 ms/frame * 25 frames/s = 125 ms per second = 12.5% of your CPU.

However, as the X server manages to finish doing what it has to do
before the next timer tick, it will almost never get a timer tick
accounted to it.
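
The real CPU share follows directly from the per-frame time and the frame
rate. A small C sketch of that arithmetic (the function name is an
assumption, and the result is in tenths of a percent to stay in integers):

```c
#include <assert.h>

/*
 * Real CPU share of a task that is busy for busy_ms milliseconds in each
 * of fps frames per second. The task is busy for busy_ms * fps ms out of
 * every 1000 ms, which is the share in tenths of a percent.
 */
static long busy_tenths_of_percent(long busy_ms, long fps)
{
    assert(busy_ms >= 0 && fps >= 0);
    return busy_ms * fps;
}
```

For 5 ms per frame at 25 fps this gives 125, i.e. 12.5%; for 11 ms per frame
it gives 275, i.e. 27.5%.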

> > memcpy-ing 760kB at 25fps takes about 11ms per frame, but instead of eating
> > 2% CPU time, it eats 35% (yes, that's 35 times more)

So at 2.2 times as much CPU time, I'd expect around 27% real
usage. But instead of managing to miss the timer tick almost every
time, it is now managing to hit slightly more than average on the
timer ticks. Or it's using a little bit more than you measured.

Doing work that takes on the order of a timer tick, and triggering it
off a timer event, means that CPU time measurements can become
highly inaccurate.
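
The effect can be illustrated with a toy model in C. This is a sketch under
stated assumptions, not kernel code: it assumes a 10 ms timer tick, charges
each whole tick to whatever is running at the tick instant, and lets the task
wake a fixed offset after a tick once per frame period. The function name and
all parameters are hypothetical, and the model ignores preemption and
everything else running on the machine.

```c
#include <assert.h>

/*
 * Toy model of tick-sampled CPU accounting. Every tick_us microseconds the
 * whole tick is charged to the task if it is running at that instant. The
 * task wakes wake_offset_us after the start of each period_us period and
 * runs for run_us. Returns the accounted CPU share in percent (truncated)
 * over the given number of periods.
 */
static long accounted_percent(long tick_us, long period_us,
                              long wake_offset_us, long run_us, long periods)
{
    long charged_ticks = 0;
    long total_ticks;

    assert(tick_us > 0 && period_us > 0 && periods > 0);
    total_ticks = periods * period_us / tick_us;

    for (long p = 0; p < periods; p++) {
        long start = p * period_us + wake_offset_us;
        long end = start + run_us;
        /* First tick at or after the moment the task starts running. */
        long t = (start + tick_us - 1) / tick_us * tick_us;

        for (; t < end; t += tick_us)
            charged_ticks++;
    }
    return charged_ticks * 100 / total_ticks;
}
```

With a 10000 us tick, a 40000 us period (25 fps) and a wake-up 1 us after a
tick, a 5000 us run is never sampled and is accounted as 0%, although it
really uses 12.5%. An 11000 us run crosses exactly one tick per frame and is
accounted as 25%, against 27.5% real usage. The model does not reproduce the
35% reported in the thread; it only shows how a small change in run time can
flip the accounted figure from almost nothing to a large value.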

If you buy CPUtime on a large unix computer, and have a tight
mainloop, you can measure the number of iterations you can do inside
your mainloop in 9ms. Then schedule a "usleep (1)" every time you hit
that many iterations. Cheap computing... ;-)

			Roger. 


-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2137555 **
*-- BitWizard writes Linux device drivers for any device you may have! --*
* There are old pilots, and there are bold pilots. 
* There are also old, bald pilots. 

* Re: memcpy to videoram eats too much CPU on ATI cards (cache trashing?)
  2001-08-27 21:40   ` Rogier Wolff
@ 2001-08-27 22:01     ` Peter Surda
  2001-09-06  6:18       ` (solved) memcpy to videoram eats too much CPU on ATI cards Peter Surda
  2001-09-06 11:37       ` Daniel Egger
  0 siblings, 2 replies; 7+ messages in thread
From: Peter Surda @ 2001-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel

On Mon, Aug 27, 2001 at 11:40:44PM +0200, Rogier Wolff wrote:
> However, as the X server manages to finish doing what it has to do before
> the next timer tick, it will almost never get a timer tick accounted to it.
Yes, I also realised that, and other people also seem to think it is this way.

So the conclusion is basically that the card can't chew data that fast and I
should use busmastering instead of memcpy (and other drivers should do that
too because "hidden load" occurs anyway). I'm working on it.

Thanks to all who replied. I am, as always, pleased by the cooperation in
the open-source world and wish everyone good luck. Yeah, and linux rocks of
course :-)

> 			Roger. 
Bye,

Peter Surda (Shurdeek) <shurdeek@panorama.sth.ac.at>, ICQ 10236103, +436505122023

--
               I believe the technical term is "Oops!"


* (solved) memcpy to videoram eats too much CPU on ATI cards
  2001-08-27 22:01     ` Peter Surda
@ 2001-09-06  6:18       ` Peter Surda
  2001-09-06 11:37       ` Daniel Egger
  1 sibling, 0 replies; 7+ messages in thread
From: Peter Surda @ 2001-09-06  6:18 UTC (permalink / raw)
  To: linux-kernel

On Tue, Aug 28, 2001 at 12:01:27AM +0200, Peter Surda wrote:
> So the conclusion is basically that the card can't chew data that fast and I
> should use busmastering instead of memcpy (and other drivers should do that
> too because "hidden load" occurs anyway). I'm working on it.
Just to end this thread in a victorious manner ;-), thanks to Michel Dänzer
<michdaen@iiic.ethz.ch> and me, there is now a working implementation of
busmastered video transfers for the r128 driver, and it has been submitted to
all relevant lists and maintainers. It did indeed solve the problem: the CPU
time eaten by video transfers is negligible, and DVD and "DivX ;-)" playback
has never been so smooth. With a software-only DVD decoder, watching fullscreen
DVD leaves 50-60% CPU time idle on a Duron 650, even in action-packed scenes.
If I catch someone claiming again that Linux isn't suitable for multimedia, I
can just laugh now :-).

Bye,

Peter Surda (Shurdeek) <shurdeek@panorama.sth.ac.at>, ICQ 10236103, +436505122023

--
               Dudes! May the Open Source be with you.


* Re: (solved) memcpy to videoram eats too much CPU on ATI cards
  2001-08-27 22:01     ` Peter Surda
  2001-09-06  6:18       ` (solved) memcpy to videoram eats too much CPU on ATI cards Peter Surda
@ 2001-09-06 11:37       ` Daniel Egger
  2001-09-06 13:10         ` Peter Surda
  1 sibling, 1 reply; 7+ messages in thread
From: Daniel Egger @ 2001-09-06 11:37 UTC (permalink / raw)
  To: Peter Surda; +Cc: linux-kernel

On 06 Sep 2001 08:18:43 +0200, Peter Surda wrote:

> Just to end this thread in a victorious manner ;-), thanks to Michel Dänzer
> <michdaen@iiic.ethz.ch> and me, there is now a working implementation of
> busmastered video transfers for the r128 driver, and it has been submitted to
> all relevant lists and maintainers.

I kept checking the relevant mailing lists and project CVS repositories but
failed to find the described implementation. Would you please provide some
additional hints on where to get it?

Servus,
       Daniel


* Re: (solved) memcpy to videoram eats too much CPU on ATI cards
  2001-09-06 11:37       ` Daniel Egger
@ 2001-09-06 13:10         ` Peter Surda
  0 siblings, 0 replies; 7+ messages in thread
From: Peter Surda @ 2001-09-06 13:10 UTC (permalink / raw)
  To: linux-kernel; +Cc: Daniel Egger

On Thu, Sep 06, 2001 at 01:37:07PM +0200, Daniel Egger wrote:
> > Just to end this thread in a victorious manner ;-), thanks to Michel Dänzer
> > <michdaen@iiic.ethz.ch> and me, there is now a working implementation of
> > busmastered video transfers for the r128 driver, and it has been submitted to
> > all relevant lists and maintainers.
> I kept checking the relevant mailing lists and project CVS repositories but
> failed to find the described implementation. Would you please provide some
> additional hints on where to get it?
dri-devel should have it, as well as livid-gatos (which unfortunately seems to
have full disks and is offline at the moment).

For the very lazy, I found a link to the announcement in dri-devel:
http://www.geocrawler.com/lists/3/SourceForge/680/0/6536416/

> Servus,
>        Daniel
Kind regards

Peter Surda (Shurdeek) <shurdeek@panorama.sth.ac.at>, ICQ 10236103, +436505122023

--
   "One world, one web, one program"  -- Microsoft promotional ad
         "Ein Volk, ein Reich, ein Fuehrer"  -- Adolf Hitler
