From: Eric Anholt <eric@anholt.net>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: Daniel Vetter <daniel@ffwll.ch>,
	devicetree@vger.kernel.org,
	Stephen Warren <swarren@wwwdotorg.org>,
	Lee Jones <lee@kernel.org>,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-rpi-kernel@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 3/7] drm/vc4: Add KMS support for Raspberry Pi.
Date: Thu, 13 Aug 2015 16:03:00 -0700
Message-ID: <87zj1uiyfv.fsf@eliezer.anholt.net>
In-Reply-To: <20150813212936.GT7557@n2100.arm.linux.org.uk>


Russell King - ARM Linux <linux@arm.linux.org.uk> writes:

> On Thu, Aug 13, 2015 at 01:44:03PM -0700, Eric Anholt wrote:
>> Struct mutex is here because this code is from the V3D series, with the
>> in-kernel BO cache ripped out (it turns out that the CMA allocator is
>> slow, and you can't just userspace cache since we have to do allocations
>> within the kernel to the tune of a couple per draw and that's too much).
>
> The CMA allocator is fast until you have pinned pages in its region,
> where it becomes _very_ slow to do allocations, sometimes getting up
> to the order of seconds.
>
> The main culprits are GFP_HIGHUSER_MOVABLE allocations which
> then pin the page.  It doesn't take many of those to make CMA really
> inefficient.
>
> The problem is that CMA doesn't get any information back from the
> internal page migration about which pages couldn't be moved, so it
> dumbly just tries incrementing the allocation by one page (subject
> to alignment constraints) and retrying again - repeating over the
> entire CMA region.  The bigger the region, the more time this takes.
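
[Editor's illustration] The retry behaviour described above can be
sketched as a toy model. This is not the kernel's CMA code: toy_scan(),
window_is_free() and the pinned[] array are hypothetical names made up
for this sketch, and "migration" is reduced to a yes/no check. It only
shows why the number of attempts grows with the region size when the
scan learns nothing about which page blocked it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for "try to migrate everything out of this window".
 * As in the description above, the caller learns only success or
 * failure, not which page was pinned. */
bool window_is_free(const bool *pinned, size_t start, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (pinned[start + i])
            return false;
    return true;
}

/* Scan a region of region_pages pages for count contiguous free pages.
 * On failure, step forward by one alignment unit and retry.
 * Returns the start page or -1; *attempts counts the windows tried. */
long toy_scan(const bool *pinned, size_t region_pages, size_t count,
              size_t align, size_t *attempts)
{
    assert(count > 0 && align > 0);
    *attempts = 0;
    for (size_t start = 0; start + count <= region_pages; start += align) {
        (*attempts)++;
        if (window_is_free(pinned, start, count))
            return (long)start;
    }
    return -1;
}
```

In this model a handful of pinned pages spread through the region is
enough to make every window fail, so the scan walks the whole region
before giving up.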

Ouch.

Since I can work around the allocation cost, the main problem I have
right now is that I've got a set of small allocations for 3D that all
need to have the same high 4 bits of paddr, because someone cleverly
packed some address bits in a GPU-managed structure.  Any
recommendations for ways to handle this with CMA?
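
[Editor's illustration] The constraint above can be written down as a
small check. This is a sketch under an assumption the message does not
state: that physical addresses are 32 bits wide, so the high 4 bits
select one of sixteen 256 MiB windows. window_of() and
buffer_in_window() are hypothetical helper names, not kernel or CMA
APIs, and the sketch says nothing about how CMA could be made to honour
the constraint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define HIGH_BITS    4u
#define WINDOW_SHIFT (32u - HIGH_BITS)  /* 28: each window spans 256 MiB */

/* The high 4 bits of a 32-bit physical address (assumed width). */
uint32_t window_of(uint32_t paddr)
{
    return paddr >> WINDOW_SHIFT;
}

/* True if the whole buffer [paddr, paddr + size) lies in the given
 * window, i.e. its first and last byte share those high 4 bits. */
bool buffer_in_window(uint32_t paddr, uint32_t size, uint32_t window)
{
    assert(size > 0);
    uint32_t last = paddr + (size - 1);
    if (last < paddr)           /* address arithmetic wrapped around */
        return false;
    return window_of(paddr) == window && window_of(last) == window;
}
```

A set of buffers satisfies the requirement when buffer_in_window()
holds for every one of them with the same window value.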

