All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Zhao, Yong" <Yong.Zhao@amd.com>
To: "Kuehling, Felix" <Felix.Kuehling@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Subject: Re: [PATCH] drm/amdkfd: Consolidate duplicated bo alloc flags
Date: Thu, 5 Mar 2020 20:55:49 +0000	[thread overview]
Message-ID: <CH2PR12MB3926C6062226E00D4626CC43F0E20@CH2PR12MB3926.namprd12.prod.outlook.com> (raw)
In-Reply-To: <48c5b228-99e3-2041-92c8-efc87de0586f@amd.com>


[-- Attachment #1.1: Type: text/plain, Size: 11051 bytes --]

[AMD Official Use Only - Internal Distribution Only]

Okay, I will change back to its original format.

Yong

________________________________
From: Kuehling, Felix <Felix.Kuehling@amd.com>
Sent: Thursday, March 5, 2020 3:49 PM
To: amd-gfx@lists.freedesktop.org <amd-gfx@lists.freedesktop.org>; Zhao, Yong <Yong.Zhao@amd.com>
Subject: Re: [PATCH] drm/amdkfd: Consolidate duplicated bo alloc flags

On 2020-03-04 3:21 p.m., Yong Zhao wrote:
> ALLOC_MEM_FLAGS_* used are the same as the KFD_IOC_ALLOC_MEM_FLAGS_*,
> but they are interweavedly used in kernel driver, resulting in bad
> readability. For example, KFD_IOC_ALLOC_MEM_FLAGS_COHERENT is totally
> not referenced in kernel, and it functions in the kernel through
> ALLOC_MEM_FLAGS_COHERENT, causing unnecessary confusion.
>
> Replace all occurrences of ALLOC_MEM_FLAGS_* by
> KFD_IOC_ALLOC_MEM_FLAGS_* to solve the problem.
>
> Change-Id: Iced6ed3698167296c97b14e7e4569883859d619c
> Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
> ---
>   drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c    |  9 +++--
>   .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  | 38 +++++++++++--------
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c      | 13 ++++---
>   .../gpu/drm/amd/include/kgd_kfd_interface.h   | 21 ----------
>   4 files changed, 36 insertions(+), 45 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> index 726c91ab6761..affaa0d4b636 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
> @@ -29,6 +29,7 @@
>   #include <linux/module.h>
>   #include <linux/dma-buf.h>
>   #include "amdgpu_xgmi.h"
> +#include <uapi/linux/kfd_ioctl.h>
>
>   static const unsigned int compute_vmid_bitmap = 0xFF00;
>
> @@ -500,11 +501,13 @@ int amdgpu_amdkfd_get_dmabuf_info(struct kgd_dev *kgd, int dma_buf_fd,
>                r = amdgpu_bo_get_metadata(bo, metadata_buffer, buffer_size,
>                                           metadata_size, &metadata_flags);
>        if (flags) {
> -             *flags = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
> -                     ALLOC_MEM_FLAGS_VRAM : ALLOC_MEM_FLAGS_GTT;
> +             if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM)
> +                     *flags = KFD_IOC_ALLOC_MEM_FLAGS_VRAM;
> +             else
> +                     *flags = KFD_IOC_ALLOC_MEM_FLAGS_GTT;

You're sneaking in some personal preference (changing the trinary (cond
? a : b) operator to if-else) with the renaming change. Personally I
find the trinary operator just as readable and more concise.


>
>                if (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)
> -                     *flags |= ALLOC_MEM_FLAGS_PUBLIC;
> +                     *flags |= KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC;
>        }
>
>   out_put:
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index e4481caed648..c81fe7011e88 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -29,6 +29,7 @@
>   #include "amdgpu_vm.h"
>   #include "amdgpu_amdkfd.h"
>   #include "amdgpu_dma_buf.h"
> +#include <uapi/linux/kfd_ioctl.h>
>
>   /* BO flag to indicate a KFD userptr BO */
>   #define AMDGPU_AMDKFD_USERPTR_BO (1ULL << 63)
> @@ -400,18 +401,18 @@ static int vm_update_pds(struct amdgpu_vm *vm, struct amdgpu_sync *sync)
>   static uint64_t get_pte_flags(struct amdgpu_device *adev, struct kgd_mem *mem)
>   {
>        struct amdgpu_device *bo_adev = amdgpu_ttm_adev(mem->bo->tbo.bdev);
> -     bool coherent = mem->alloc_flags & ALLOC_MEM_FLAGS_COHERENT;
> +     bool coherent = mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_COHERENT;
>        uint32_t mapping_flags;
>
>        mapping_flags = AMDGPU_VM_PAGE_READABLE;
> -     if (mem->alloc_flags & ALLOC_MEM_FLAGS_WRITABLE)
> +     if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE)
>                mapping_flags |= AMDGPU_VM_PAGE_WRITEABLE;
> -     if (mem->alloc_flags & ALLOC_MEM_FLAGS_EXECUTABLE)
> +     if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE)
>                mapping_flags |= AMDGPU_VM_PAGE_EXECUTABLE;
>
>        switch (adev->asic_type) {
>        case CHIP_ARCTURUS:
> -             if (mem->alloc_flags & ALLOC_MEM_FLAGS_VRAM) {
> +             if (mem->alloc_flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
>                        if (bo_adev == adev)
>                                mapping_flags |= coherent ?
>                                        AMDGPU_VM_MTYPE_CC : AMDGPU_VM_MTYPE_RW;
> @@ -1160,24 +1161,24 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>        /*
>         * Check on which domain to allocate BO
>         */
> -     if (flags & ALLOC_MEM_FLAGS_VRAM) {
> +     if (flags & KFD_IOC_ALLOC_MEM_FLAGS_VRAM) {
>                domain = alloc_domain = AMDGPU_GEM_DOMAIN_VRAM;
>                alloc_flags = AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE;
> -             alloc_flags |= (flags & ALLOC_MEM_FLAGS_PUBLIC) ?
> +             alloc_flags |= (flags & KFD_IOC_ALLOC_MEM_FLAGS_PUBLIC) ?
>                        AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED :
>                        AMDGPU_GEM_CREATE_NO_CPU_ACCESS;
> -     } else if (flags & ALLOC_MEM_FLAGS_GTT) {
> +     } else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_GTT) {
>                domain = alloc_domain = AMDGPU_GEM_DOMAIN_GTT;
>                alloc_flags = 0;
> -     } else if (flags & ALLOC_MEM_FLAGS_USERPTR) {
> +     } else if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
>                domain = AMDGPU_GEM_DOMAIN_GTT;
>                alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
>                alloc_flags = 0;
>                if (!offset || !*offset)
>                        return -EINVAL;
>                user_addr = untagged_addr(*offset);
> -     } else if (flags & (ALLOC_MEM_FLAGS_DOORBELL |
> -                     ALLOC_MEM_FLAGS_MMIO_REMAP)) {
> +     } else if (flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
> +                     KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
>                domain = AMDGPU_GEM_DOMAIN_GTT;
>                alloc_domain = AMDGPU_GEM_DOMAIN_CPU;
>                bo_type = ttm_bo_type_sg;
> @@ -1198,7 +1199,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
>        }
>        INIT_LIST_HEAD(&(*mem)->bo_va_list);
>        mutex_init(&(*mem)->lock);
> -     (*mem)->aql_queue = !!(flags & ALLOC_MEM_FLAGS_AQL_QUEUE_MEM);
> +     (*mem)->aql_queue = !!(flags & KFD_IOC_ALLOC_MEM_FLAGS_AQL_QUEUE_MEM);
>
>        /* Workaround for AQL queue wraparound bug. Map the same
>         * memory twice. That means we only actually allocate half
> @@ -1652,6 +1653,7 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct kgd_dev *kgd,
>        struct drm_gem_object *obj;
>        struct amdgpu_bo *bo;
>        struct amdgpu_vm *avm = (struct amdgpu_vm *)vm;
> +     uint32_t flags;
>
>        if (dma_buf->ops != &amdgpu_dmabuf_ops)
>                /* Can't handle non-graphics buffers */
> @@ -1680,10 +1682,16 @@ int amdgpu_amdkfd_gpuvm_import_dmabuf(struct kgd_dev *kgd,
>
>        INIT_LIST_HEAD(&(*mem)->bo_va_list);
>        mutex_init(&(*mem)->lock);
> -     (*mem)->alloc_flags =
> -             ((bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
> -              ALLOC_MEM_FLAGS_VRAM : ALLOC_MEM_FLAGS_GTT) |
> -             ALLOC_MEM_FLAGS_WRITABLE | ALLOC_MEM_FLAGS_EXECUTABLE;
> +
> +     flags = KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE
> +             | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;
> +
> +     if (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM)
> +             flags |= KFD_IOC_ALLOC_MEM_FLAGS_VRAM;
> +     else
> +             flags |= KFD_IOC_ALLOC_MEM_FLAGS_GTT;
> +
> +     (*mem)->alloc_flags = flags;

Same as above. The original code was only 4 lines and a single
statement. Your code is 8 lines, four statements, plus an extra local
variable. I don't see how this is an improvement.

Regards,
   Felix


>
>        (*mem)->bo = amdgpu_bo_ref(bo);
>        (*mem)->va = va;
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 22abdbc6dfd7..1c7bfc842f06 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -327,10 +327,10 @@ static int kfd_process_alloc_gpuvm(struct kfd_process_device *pdd,
>   static int kfd_process_device_reserve_ib_mem(struct kfd_process_device *pdd)
>   {
>        struct qcm_process_device *qpd = &pdd->qpd;
> -     uint32_t flags = ALLOC_MEM_FLAGS_GTT |
> -                      ALLOC_MEM_FLAGS_NO_SUBSTITUTE |
> -                      ALLOC_MEM_FLAGS_WRITABLE |
> -                      ALLOC_MEM_FLAGS_EXECUTABLE;
> +     uint32_t flags = KFD_IOC_ALLOC_MEM_FLAGS_GTT |
> +                     KFD_IOC_ALLOC_MEM_FLAGS_NO_SUBSTITUTE |
> +                     KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE |
> +                     KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;
>        void *kaddr;
>        int ret;
>
> @@ -692,8 +692,9 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
>   {
>        struct kfd_dev *dev = pdd->dev;
>        struct qcm_process_device *qpd = &pdd->qpd;
> -     uint32_t flags = ALLOC_MEM_FLAGS_GTT |
> -             ALLOC_MEM_FLAGS_NO_SUBSTITUTE | ALLOC_MEM_FLAGS_EXECUTABLE;
> +     uint32_t flags = KFD_IOC_ALLOC_MEM_FLAGS_GTT
> +                     | KFD_IOC_ALLOC_MEM_FLAGS_NO_SUBSTITUTE
> +                     | KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;
>        void *kaddr;
>        int ret;
>
> diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> index 0cee79d56075..a3c238c39ef5 100644
> --- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> +++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
> @@ -167,27 +167,6 @@ struct tile_config {
>
>   #define KFD_MAX_NUM_OF_QUEUES_PER_DEVICE_DEFAULT 4096
>
> -/*
> - * Allocation flag domains
> - * NOTE: This must match the corresponding definitions in kfd_ioctl.h.
> - */
> -#define ALLOC_MEM_FLAGS_VRAM         (1 << 0)
> -#define ALLOC_MEM_FLAGS_GTT          (1 << 1)
> -#define ALLOC_MEM_FLAGS_USERPTR              (1 << 2)
> -#define ALLOC_MEM_FLAGS_DOORBELL     (1 << 3)
> -#define ALLOC_MEM_FLAGS_MMIO_REMAP   (1 << 4)
> -
> -/*
> - * Allocation flags attributes/access options.
> - * NOTE: This must match the corresponding definitions in kfd_ioctl.h.
> - */
> -#define ALLOC_MEM_FLAGS_WRITABLE     (1 << 31)
> -#define ALLOC_MEM_FLAGS_EXECUTABLE   (1 << 30)
> -#define ALLOC_MEM_FLAGS_PUBLIC               (1 << 29)
> -#define ALLOC_MEM_FLAGS_NO_SUBSTITUTE        (1 << 28) /* TODO */
> -#define ALLOC_MEM_FLAGS_AQL_QUEUE_MEM        (1 << 27)
> -#define ALLOC_MEM_FLAGS_COHERENT     (1 << 26) /* For GFXv9 or later */
> -
>   /**
>    * struct kfd2kgd_calls
>    *

[-- Attachment #1.2: Type: text/html, Size: 21621 bytes --]

[-- Attachment #2: Type: text/plain, Size: 154 bytes --]

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

  reply	other threads:[~2020-03-05 20:55 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-04 20:21 [PATCH] drm/amdkfd: Consolidate duplicated bo alloc flags Yong Zhao
2020-03-05 20:49 ` Felix Kuehling
2020-03-05 20:55   ` Zhao, Yong [this message]
2020-03-06 23:30 Yong Zhao
2020-03-09 21:45 ` Felix Kuehling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CH2PR12MB3926C6062226E00D4626CC43F0E20@CH2PR12MB3926.namprd12.prod.outlook.com \
    --to=yong.zhao@amd.com \
    --cc=Felix.Kuehling@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.