All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Christian König" <deathsimple@vodafone.de>
To: "Michel Dänzer" <michel@daenzer.net>,
	"Alex Deucher" <alexdeucher@gmail.com>
Cc: mesa-dev@lists.freedesktop.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT
Date: Wed, 23 Jul 2014 09:32:04 +0200	[thread overview]
Message-ID: <53CF64F4.5030809@vodafone.de> (raw)
In-Reply-To: <53CF626B.4030201@daenzer.net>

Am 23.07.2014 09:21, schrieb Michel Dänzer:
> On 23.07.2014 15:42, Christian König wrote:
>> Am 23.07.2014 05:54, schrieb Michel Dänzer:
>>> On 21.07.2014 17:07, Christian König wrote:
>>>> Am 19.07.2014 03:15, schrieb Michel Dänzer:
>>>>> On 19.07.2014 00:47, Christian König wrote:
>>>>>> Am 18.07.2014 05:07, schrieb Michel Dänzer:
>>>>>>>>> [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers on >= SI
>>>>>>>> I'm still not very keen with this change since I still don't
>>>>>>>> understand
>>>>>>>> the reason why it's faster than with GTT. Definitely needs more
>>>>>>>> testing
>>>>>>>> on a wider range of systems.
>>>>>>> Sure. If anyone wants to give this patch a spin and see if they can
>>>>>>> measure any performance difference, good or bad, that would be
>>>>>>> interesting.
>>>>>>>
>>>>>>>> Maybe limit it to APUs for now?
>>>>>>> But IIRC, CPU writes to VRAM vs. write-combined GTT are actually an
>>>>>>> even
>>>>>>> bigger win with dedicated GPUs than with the Kaveri built-in GPU
>>>>>>> on my
>>>>>>> system. I suspect it may depend on the bandwidth available for
>>>>>>> PCIe vs.
>>>>>>> system memory though.
>>>>>> I've made a few tests today with the kernel part of the patches
>>>>>> running
>>>>>> Xonotic on Ultra in 1920 x 1080.
>>>>>>
>>>>>> Without any patches I get around ~47.0fps on average with my dedicated
>>>>>> HD7870.
>>>>>>
>>>>>> Adding only "drm/radeon: Use write-combined CPU mappings of rings and
>>>>>> IBs on >= SI" and that goes down to ~45.3fps.
>>>>>>
>>>>>> Adding on to off that "drm/radeon: Use VRAM for indirect buffers on >=
>>>>>> SI" and the frame rate goes down to ~27.74fps.
>>>>> Hmm, looks like I'll need to do more benchmarking of 3D workloads as
>>>>> well.
>>> I haven't been able to consistently[0] measure any significant
>>> difference between all placements of the rings and IBs with Xonotic or
>>> Reaction Quake with my Bonaire. I'd expect Xonotic to be shader / GPU
>>> memory bandwidth bound rather than CS bound anyway, so a ~40% hit from
>>> that kernel patch alone is very surprising. Are you sure it wasn't just
>>> the same kind of variation as described below?
>> Yes, I've measured that multiple times and the results where quite
>> consistent.
>>
>> But I didn't measured it on a Bonaire, where the bottleneck probably
>> isn't the CPU load. I measured it on a fast Pitcairn
> Ahem, my Bonaire is cranking out ~90fps of Xonotic Ultra at 1920x1080.
> :) (And AFAIK there are even faster Bonaire variants)

My Bonaire only makes something around 17fps with Xonotic Ultra at 
1920x1080, might be a good idea to figure out why at some point.

>
>> and there Xonotic was clearly affected by the patches.
> Okay, I hadn't realized we're not doing any command stream checking as
> of CIK, that probably explains the difference.

Good point, I should probably test the putting IBs in VRAM patch with my 
Bonaire as well.

>>>> My tests clearly show that we still can use USWC for the ring buffer on
>>>> SI and probably earlier chips as well.
>>> Yeah, that might be the safest approach for now.
>> How about using USWC for the rings on all chips since R600
> Any particular reason against doing it for older chips which support
> unsnooped access as well?

Not really, I just didn't noticed that older chips can do this as well.

Christian.

>
>> and for the IB only on CIK? As far as I can see that should do the trick
>> quite well.
> Yeah, sounds good.
>
>

  reply	other threads:[~2014-07-23  7:32 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-17 10:01 [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT Michel Dänzer
2014-07-17 10:01 ` [PATCH 1/5] drm/radeon: Remove radeon_gart_restore() Michel Dänzer
2014-07-17 10:01 ` [PATCH 2/5] drm/radeon: Pass GART page flags to radeon_gart_set_page() explicitly Michel Dänzer
2014-07-17 10:01 ` [PATCH 3/5] drm/radeon: Allow write-combined CPU mappings of BOs in GTT Michel Dänzer
2014-07-17 10:01 ` [PATCH 4/5] drm/radeon: Use write-combined CPU mappings of rings and IBs on >= SI Michel Dänzer
2014-07-17 10:01 ` [PATCH 5/5] drm/radeon: Use VRAM for indirect buffers " Michel Dänzer
2014-07-17 10:01 ` [PATCH 1/5] winsys/radeon: Use separate caching buffer managers for VRAM and GTT Michel Dänzer
2014-07-17 10:01 ` [PATCH 2/5] r600g/radeonsi: Use write-combined CPU mappings of some BOs in GTT Michel Dänzer
2014-07-17 10:01 ` [PATCH 3/5] r600g/radeonsi: Prefer VRAM for CPU -> GPU streaming buffers Michel Dänzer
2014-07-17 10:01 ` [PATCH 4/5] r600g, radeonsi: Use write-combined persistent GTT mappings Michel Dänzer
2014-07-17 10:56   ` Grigori Goronzy
2014-07-17 12:00   ` Marek Olšák
2014-07-18  3:19     ` [Mesa-dev] " Michel Dänzer
2014-07-18 11:45       ` Marek Olšák
2014-07-18 19:50         ` [Mesa-dev] " Grigori Goronzy
2014-07-18 20:52           ` Marek Olšák
2014-07-23  6:32         ` Michel Dänzer
2014-07-23 11:52           ` Marek Olšák
2014-07-18 20:55       ` Marek Olšák
2014-07-17 10:01 ` [PATCH 5/5] r600g, radeonsi: Prefer VRAM for persistent mappings Michel Dänzer
2014-07-17 12:17   ` Marek Olšák
2014-07-17 10:09 ` [PATCH 0/5] radeon: Write-combined CPU mappings of BOs in GTT Christian König
2014-07-18  3:07   ` Michel Dänzer
2014-07-18  3:58     ` Dieter Nützel
2014-07-18  9:52       ` Michel Dänzer
2014-07-18 15:47     ` Christian König
2014-07-18 17:47       ` Marek Olšák
2014-07-18 22:03         ` Marek Olšák
2014-07-19  1:15       ` Michel Dänzer
2014-07-21  8:07         ` Christian König
2014-07-23  3:54           ` Michel Dänzer
2014-07-23  6:42             ` Christian König
2014-07-23  7:21               ` Michel Dänzer
2014-07-23  7:32                 ` Christian König [this message]
2014-07-23 12:07                 ` Marek Olšák
2014-07-23 13:59             ` Alex Deucher
2014-07-17 12:25 ` Marek Olšák
2014-07-17 13:35 ` Alex Deucher

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53CF64F4.5030809@vodafone.de \
    --to=deathsimple@vodafone.de \
    --cc=alexdeucher@gmail.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=mesa-dev@lists.freedesktop.org \
    --cc=michel@daenzer.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.