From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <20201110034934.70898-1-john.stultz@linaro.org>
In-Reply-To: <20201110034934.70898-1-john.stultz@linaro.org>
From: Lee Jones
Date: Fri, 21 May 2021 10:40:24 +0100
Subject: Re: [PATCH v5 0/7] dma-buf: Performance improvements for system heap & a system-uncached implementation
To: John Stultz
Content-Type: multipart/alternative; boundary="00000000000003ce7105c2d3da5e"
List-Id: Direct Rendering Infrastructure - Development
Cc: Sandeep Patil, dri-devel, Ezequiel Garcia, Robin Murphy, James Jones, lkml, Liam Mark, Laura Abbott, Chris Goldsworthy, Hridya Valsaraju,
 Ørjan Eide, "open list:DMA BUFFER SHARING FRAMEWORK", Suren Baghdasaryan, Daniel Mentz
Errors-To: dri-devel-bounces@lists.freedesktop.org
Sender: "dri-devel"

--00000000000003ce7105c2d3da5e
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, 10 Nov 2020 at 03:49, John Stultz wrote:

> Hey All,
>   So just wanted to send my last revision of my patch series
> of performance optimizations to the dma-buf system heap.
>
> This series reworks the system heap to use sgtables, and then
> consolidates the pagelist method from the heap-helpers into the
> CMA heap. After which the heap-helpers logic is removed (as it
> is unused). I'd still like to find a better way to avoid some of
> the logic duplication in implementing the entire dma_buf_ops
> handlers per heap. But unfortunately that code is tied somewhat
> to how the buffer's memory is tracked. As more heaps show up I
> think we'll have a better idea how to best share code, so for
> now I think this is ok.
>
> After this, the series introduces an optimization that
> Ørjan Eide implemented for ION that avoids calling sync on
> attachments that don't have a mapping.
>
> Next, an optimization to use larger order pages for the system
> heap. This change brings us closer to the current performance
> of the ION allocation code (though there still is a gap due
> to ION using a mix of deferred-freeing and page pools, I'll be
> looking at integrating those eventually).
>
> Finally, a reworked version of my uncached system heap
> implementation I was submitting a few weeks back. Since it
> duplicated a lot of the now reworked system heap code, I
> realized it would be much simpler to add the functionality to
> the system_heap implementation itself.
>
> While not improving the core allocation performance, the
> uncached heap allocations do result in *much* improved
> performance on HiKey960 as it avoids a lot of flushing and
> invalidating buffers that the cpu doesn't touch often.
>
> Feedback on these would be great!
>
> thanks
> -john
>
> New in v5:
> * Added a comment explaining why the order sizes are
>   chosen as they are
>
> Cc: Sumit Semwal
> Cc: Liam Mark
> Cc: Laura Abbott
> Cc: Brian Starkey
> Cc: Hridya Valsaraju
> Cc: Suren Baghdasaryan
> Cc: Sandeep Patil
> Cc: Daniel Mentz
> Cc: Chris Goldsworthy
> Cc: Ørjan Eide
> Cc: Robin Murphy
> Cc: Ezequiel Garcia
> Cc: Simon Ser
> Cc: James Jones
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
>
> John Stultz (7):
>   dma-buf: system_heap: Rework system heap to use sgtables instead of
>     pagelists
>   dma-buf: heaps: Move heap-helper logic into the cma_heap
>     implementation
>   dma-buf: heaps: Remove heap-helpers code
>   dma-buf: heaps: Skip sync if not mapped
>   dma-buf: system_heap: Allocate higher order pages if available
>   dma-buf: dma-heap: Keep track of the heap device struct
>   dma-buf: system_heap: Add a system-uncached heap re-using the system
>     heap
>
>  drivers/dma-buf/dma-heap.c           |  33 +-
>  drivers/dma-buf/heaps/Makefile       |   1 -
>  drivers/dma-buf/heaps/cma_heap.c     | 324 +++++++++++++++---
>  drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
>  drivers/dma-buf/heaps/heap-helpers.h |  53 ---
>  drivers/dma-buf/heaps/system_heap.c  | 494 ++++++++++++++++++++++++---
>  include/linux/dma-heap.h             |   9 +
>  7 files changed, 753 insertions(+), 431 deletions(-)
>  delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
>  delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

John, did this ever make it past v5?  I don't see a follow-up.

-- 
Lee Jones [李琼斯]
Linaro Services Senior Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

--00000000000003ce7105c2d3da5e
Content-Type: text/html; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Tue, 10 Nov 2020 at 03:49, John Stultz <john.stultz@linaro.org> wrote:
Hey All,
  So just wanted to send my last revision of my patch series
of performance optimizations to the dma-buf system heap.

This series reworks the system heap to use sgtables, and then
consolidates the pagelist method from the heap-helpers into the
CMA heap. After which the heap-helpers logic is removed (as it
is unused). I'd still like to find a better way to avoid some of
the logic duplication in implementing the entire dma_buf_ops
handlers per heap. But unfortunately that code is tied somewhat
to how the buffer's memory is tracked. As more heaps show up I
think we'll have a better idea how to best share code, so for
now I think this is ok.

After this, the series introduces an optimization that
Ørjan Eide implemented for ION that avoids calling sync on
attachments that don't have a mapping.

Next, an optimization to use larger order pages for the system
heap. This change brings us closer to the current performance
of the ION allocation code (though there still is a gap due
to ION using a mix of deferred-freeing and page pools, I'll be
looking at integrating those eventually).

Finally, a reworked version of my uncached system heap
implementation I was submitting a few weeks back. Since it
duplicated a lot of the now reworked system heap code, I
realized it would be much simpler to add the functionality to
the system_heap implementation itself.

While not improving the core allocation performance, the
uncached heap allocations do result in *much* improved
performance on HiKey960 as it avoids a lot of flushing and
invalidating buffers that the cpu doesn't touch often.

Feedback on these would be great!

thanks
-john

New in v5:
* Added a comment explaining why the order sizes are
  chosen as they are

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (7):
  dma-buf: system_heap: Rework system heap to use sgtables instead of
    pagelists
  dma-buf: heaps: Move heap-helper logic into the cma_heap
    implementation
  dma-buf: heaps: Remove heap-helpers code
  dma-buf: heaps: Skip sync if not mapped
  dma-buf: system_heap: Allocate higher order pages if available
  dma-buf: dma-heap: Keep track of the heap device struct
  dma-buf: system_heap: Add a system-uncached heap re-using the system
    heap

 drivers/dma-buf/dma-heap.c           |  33 +-
 drivers/dma-buf/heaps/Makefile       |   1 -
 drivers/dma-buf/heaps/cma_heap.c     | 324 +++++++++++++++---
 drivers/dma-buf/heaps/heap-helpers.c | 270 ---------------
 drivers/dma-buf/heaps/heap-helpers.h |  53 ---
 drivers/dma-buf/heaps/system_heap.c  | 494 ++++++++++++++++++++++++---
 include/linux/dma-heap.h             |   9 +
 7 files changed, 753 insertions(+), 431 deletions(-)
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.c
 delete mode 100644 drivers/dma-buf/heaps/heap-helpers.h

John, did this ever make it past v5?  I don't see a follow-up.

--
Lee Jones [李琼斯]
Linaro Services Senior Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
--00000000000003ce7105c2d3da5e--