All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arunpravin Paneer Selvam <arunpravin.paneerselvam@amd.com>
To: "Christian König" <ckoenig.leichtzumerken@gmail.com>,
	"Wang, Yang(Kevin)" <KevinYang.Wang@amd.com>,
	"Koenig, Christian" <Christian.Koenig@amd.com>,
	"amd-gfx@lists.freedesktop.org" <amd-gfx@lists.freedesktop.org>
Cc: "Deucher, Alexander" <Alexander.Deucher@amd.com>,
	"Zhang, Hawking" <Hawking.Zhang@amd.com>
Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm
Date: Thu, 24 Nov 2022 20:26:41 +0530	[thread overview]
Message-ID: <e40d1956-0fb6-2972-6a85-193f4e89c328@amd.com> (raw)
In-Reply-To: <dea21b09-35ff-a180-7f64-a5b12adcebf1@gmail.com>



On 11/24/2022 1:17 PM, Christian König wrote:
> Am 24.11.22 um 08:45 schrieb Wang, Yang(Kevin):
>> [AMD Official Use Only - General]
>>
>> -----Original Message-----
>> From: Koenig, Christian <Christian.Koenig@amd.com>
>> Sent: Thursday, November 24, 2022 3:25 PM
>> To: Wang, Yang(Kevin) <KevinYang.Wang@amd.com>; 
>> amd-gfx@lists.freedesktop.org; Paneer Selvam, Arunpravin 
>> <Arunpravin.PaneerSelvam@amd.com>
>> Cc: Zhang, Hawking <Hawking.Zhang@amd.com>; Deucher, Alexander 
>> <Alexander.Deucher@amd.com>
>> Subject: Re: [PATCH] drm/amdgpu: add amdgpu vram usage information 
>> into amdgpu_vram_mm
>>
>> Am 24.11.22 um 06:49 schrieb Yang Wang:
>>> add vram usage information into dri debugfs amdgpu_vram_mm node.
>>>
>>> Background:
>>> when amdgpu driver introduces drm buddy allocator, the kernel driver
>>> (and developer) is difficult to track the vram usage information.
>>>
>>> Field:
>>> 0xaaaaaaaa-0xbbbbbbbb: vram usaged range.
>>> type: kernel, device, sg
>>> usage: normal, vm, user
>>> domain: C-CPU, G-GTT, V-VRAM, P-PRIV
>>> @xxxxx: the address of "amdgpu_bo" object in kernel space.
>>> 4096: vram range range.
>>>
>>> Example:
>>> 0x00000003fea68000-0x00000003fea68fff: (type:kernel usage:vm       
>>> domain:--V- --V-) @000000001d33dfee 4096 bytes
>>> 0x00000003fea69000-0x00000003fea69fff: (type:kernel usage:vm       
>>> domain:--V- --V-) @00000000a79155b5 4096 bytes
>>> 0x00000003fea6b000-0x00000003fea6bfff: (type:kernel usage:vm       
>>> domain:--V- --V-) @0000000038ad633b 4096 bytes
>>> 0x00000003fea6c000-0x00000003fea6cfff: (type:device usage:user     
>>> domain:--V- --V-) @00000000e302f90b 4096 bytes
>>> 0x00000003fea6d000-0x00000003fea6dfff: (type:device usage:user     
>>> domain:--V- --V-) @00000000e664c172 4096 bytes
>>> 0x00000003fea6e000-0x00000003fea6efff: (type:kernel usage:vm       
>>> domain:--V- --V-) @000000004528cb2f 4096 bytes
>>> 0x00000003fea6f000-0x00000003fea6ffff: (type:kernel usage:vm       
>>> domain:--V- --V-) @00000000a446bdbf 4096 bytes
>>> 0x00000003fea70000-0x00000003fea7ffff: (type:device usage:user     
>>> domain:--V- --V-) @0000000078fae42f 65536 bytes
>>> 0x00000003fead8000-0x00000003feadbfff: (type:kernel usage:normal   
>>> domain:--V- --V-) @000000001327b7ff 16384 bytes
>>> 0x00000003feadc000-0x00000003feadcfff: (type:kernel usage:normal   
>>> domain:--V- --V-) @000000001327b7ff 4096 bytes
>>> 0x00000003feadd000-0x00000003feaddfff: (type:kernel usage:normal   
>>> domain:--V- --V-) @00000000b9706fc1 4096 bytes
>>> 0x00000003feade000-0x00000003feadefff: (type:kernel usage:vm       
>>> domain:--V- --V-) @0000000071a25571 4096 bytes
>>>
>>> Note:
>>> although some vram ranges can be merged in the example above, but this
>>> can reflect the actual distribution of drm buddy allocator.
>> Well completely NAK. This is way to much complexity for simple 
>> debugging.
>>
>> Question is what are your requirements here? E.g. what information do 
>> you want and why doesn't the buddy allocator already expose this?
>>
>> Regards,
>> Christian.
>>
>> [Kevin]:
>>
>> For KMD debug.
>> The DRM buddy interface doesn't provide an interface to query which 
>> ranges of ram(resource) are used.
>> It is not easy to debug in KMD side if driver create BO fail at 
>> specific location.
>> and from the view of KMD, the VRAM at some locations has special 
>> purposes.
>> with this patch, we can know which range of vram are actually used.
>
> Well that's not a good reason to add this complexity. Debugging 
> doesn't justify that.
>
> Please work with Arun to add the necessary information to the buddy 
> allocator interface.
>
> Regards,
> Christian.
>
Hi Kevin,

I will check and list down some of the necessary information that we can 
add to the
buddy allocator interface.

Regards,
Arun.
>>
>> Best Regards,
>> Kevin
>>> Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
>>> ---
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.c   |   6 +-
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_object.h   |   3 +
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c | 130 
>>> ++++++++++++++++++-
>>>    drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h |   1 +
>>>    4 files changed, 136 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> index 90eb07106609..117c754409b3 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
>>> @@ -53,7 +53,7 @@
>>>     *
>>>     */
>>>
>>> -static void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>> +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo)
>>>    {
>>>        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
>>>
>>> @@ -66,7 +66,7 @@ static void amdgpu_bo_destroy(struct 
>>> ttm_buffer_object *tbo)
>>>        kvfree(bo);
>>>    }
>>>
>>> -static void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
>>> +void amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo)
>>>    {
>>>        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo);
>>>        struct amdgpu_bo_user *ubo;
>>> @@ -76,7 +76,7 @@ static void amdgpu_bo_user_destroy(struct 
>>> ttm_buffer_object *tbo)
>>>        amdgpu_bo_destroy(tbo);
>>>    }
>>>
>>> -static void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo)
>>> +void amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo)
>>>    {
>>>        struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
>>>        struct amdgpu_bo *bo = ttm_to_amdgpu_bo(tbo); diff --git
>>> a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> index 147b79c10cbb..3f6a687309a7 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
>>> @@ -332,6 +332,9 @@ int amdgpu_bo_restore_shadow(struct amdgpu_bo 
>>> *shadow,
>>>                             struct dma_fence **fence);
>>>    uint32_t amdgpu_bo_get_preferred_domain(struct amdgpu_device *adev,
>>>                                            uint32_t domain);
>>> +void amdgpu_bo_destroy(struct ttm_buffer_object *tbo); void
>>> +amdgpu_bo_user_destroy(struct ttm_buffer_object *tbo); void
>>> +amdgpu_bo_vm_destroy(struct ttm_buffer_object *tbo);
>>>
>>>    /*
>>>     * sub allocation
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> index 27159f1d112e..165f4f1a8141 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
>>> @@ -23,9 +23,11 @@
>>>     */
>>>
>>>    #include <linux/dma-mapping.h>
>>> +#include <linux/interval_tree_generic.h>
>>>    #include <drm/ttm/ttm_range_manager.h>
>>>
>>>    #include "amdgpu.h"
>>> +#include "amdgpu_object.h"
>>>    #include "amdgpu_vm.h"
>>>    #include "amdgpu_res_cursor.h"
>>>    #include "amdgpu_atomfirmware.h"
>>> @@ -38,6 +40,122 @@ struct amdgpu_vram_reservation {
>>>        struct list_head blocks;
>>>    };
>>>
>>> +struct amdgpu_vram_node {
>>> +     struct rb_node node;
>>> +     u64 start;
>>> +     u64 last;
>>> +     u64 __subtree_last;
>>> +     struct ttm_buffer_object *tbo;
>>> +};
>>> +
>>> +#define START(node) ((node)->start)
>>> +#define LAST(node) ((node)->last)
>>> +
>>> +INTERVAL_TREE_DEFINE(struct amdgpu_vram_node, node, u64, 
>>> __subtree_last,
>>> +                  START, LAST, static, amdgpu_vram_it)
>>> +
>>> +#undef START
>>> +#undef LAST
>>> +
>>> +#define for_each_vram_mm_node(node, mgr) \
>>> +     for (node = amdgpu_vram_it_iter_first(&(mgr)->root, 0, 
>>> U64_MAX); node; \
>>> +          node = amdgpu_vram_it_iter_next(node, 0, U64_MAX))
>>> +
>>> +static void amdgpu_vram_mm_add_block(struct drm_buddy_block *block,
>>> +struct amdgpu_vram_mgr *mgr, struct ttm_buffer_object *tbo) {
>>> +     struct amdgpu_vram_node *node;
>>> +
>>> +     node = kvzalloc(sizeof(*node), GFP_KERNEL);
>>> +     if (!node)
>>> +             return;
>>> +
>>> +     node->start = amdgpu_vram_mgr_block_start(block);
>>> +     node->last = node->start + amdgpu_vram_mgr_block_size(block) - 1;
>>> +     node->tbo = tbo;
>>> +
>>> +     amdgpu_vram_it_insert(node, &mgr->root); }
>>> +
>>> +static void amdgpu_vram_mm_remove_block(struct drm_buddy_block
>>> +*block, struct amdgpu_vram_mgr *mgr) {
>>> +     struct amdgpu_vram_node *node;
>>> +     u64 start, last;
>>> +
>>> +     start = amdgpu_vram_mgr_block_start(block);
>>> +     last = start + amdgpu_vram_mgr_block_size(block) - 1;
>>> +
>>> +     node = amdgpu_vram_it_iter_first(&mgr->root, start, last);
>>> +     if (node) {
>>> +             amdgpu_vram_it_remove(node, &mgr->root);
>>> +             kvfree(node);
>>> +     }
>>> +}
>>> +
>>> +static inline const char* ttm_bo_type2str(enum ttm_bo_type type) {
>>> +     switch (type) {
>>> +     case ttm_bo_type_kernel:
>>> +             return "kernel";
>>> +     case ttm_bo_type_device:
>>> +             return "device";
>>> +     case ttm_bo_type_sg:
>>> +             return "sg";
>>> +     default:
>>> +             return "unknow";
>>> +     }
>>> +}
>>> +
>>> +static inline const char* amdgpu_vram_domain_str(u32 domain, char
>>> +*tmp) {
>>> +     int index = 0;
>>> +
>>> +     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_CPU ? 'C' : '-';
>>> +     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_GTT ? 'G' : '-';
>>> +     tmp[index++] = domain & AMDGPU_GEM_DOMAIN_VRAM ? 'V' : '-';
>>> +     tmp[index++] = domain & (AMDGPU_GEM_DOMAIN_GDS | 
>>> AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA) ?
>>> +             'P' : '-';
>>> +     tmp[index++] = '\0';
>>> +
>>> +     return tmp;
>>> +}
>>> +
>>> +static inline const char* amdgpu_vram_bo_usage(struct
>>> +ttm_buffer_object *tbo) {
>>> +     if (tbo->destroy == &amdgpu_bo_destroy)
>>> +             return "normal";
>>> +     else if (tbo->destroy == &amdgpu_bo_user_destroy)
>>> +             return "user";
>>> +     else if (tbo->destroy == &amdgpu_bo_vm_destroy)
>>> +             return "vm";
>>> +     else
>>> +             return "unknow";
>>> +}
>>> +
>>> +static void amdgpu_vram_mm_debug(struct amdgpu_vram_mgr *mgr, struct
>>> +drm_printer *p) {
>>> +     struct amdgpu_vram_node *node;
>>> +     struct ttm_buffer_object *tbo;
>>> +     struct amdgpu_bo *abo;
>>> +     char tmp[5];
>>> +
>>> +     for_each_vram_mm_node(node, mgr) {
>>> +             tbo = node->tbo;
>>> +             abo = ttm_to_amdgpu_bo(tbo);
>>> +             drm_printf(p, "%#018llx-%#018llx:", node->start, 
>>> node->last);
>>> +             if (abo)
>>> +                     drm_printf(p, " (type:%-5s usage:%-8s 
>>> domain:%s %s) @%p",
>>> + ttm_bo_type2str(tbo->type),
>>> +                                amdgpu_vram_bo_usage(tbo),
>>> + amdgpu_vram_domain_str(abo->preferred_domains, tmp),
>>> + amdgpu_vram_domain_str(abo->allowed_domains, tmp),
>>> +                                abo);
>>> +             else
>>> +                     drm_printf(p, " (reserved)");
>>> +             drm_printf(p, " %llu bytes\n",
>>> +                        node->last - node->start + 1);
>>> +     }
>>> +}
>>> +
>>>    static inline struct amdgpu_vram_mgr *
>>>    to_vram_mgr(struct ttm_resource_manager *man)
>>>    {
>>> @@ -288,6 +406,7 @@ static void amdgpu_vram_mgr_do_reserve(struct 
>>> ttm_resource_manager *man)
>>>                dev_dbg(adev->dev, "Reservation 0x%llx - %lld, 
>>> Succeeded\n",
>>>                        rsv->start, rsv->size);
>>>
>>> +             amdgpu_vram_mm_add_block(block, mgr, NULL);
>>>                vis_usage = amdgpu_vram_mgr_vis_size(adev, block);
>>>                atomic64_add(vis_usage, &mgr->vis_usage);
>>>                spin_lock(&man->bdev->lru_lock);
>>> @@ -540,6 +659,8 @@ static int amdgpu_vram_mgr_new(struct 
>>> ttm_resource_manager *man,
>>>                vres->base.start = max(vres->base.start, start);
>>>
>>>                vis_usage += amdgpu_vram_mgr_vis_size(adev, block);
>>> +
>>> +             amdgpu_vram_mm_add_block(block, mgr, tbo);
>>>        }
>>>
>>>        if (amdgpu_is_vram_mgr_blocks_contiguous(&vres->blocks))
>>> @@ -583,8 +704,10 @@ static void amdgpu_vram_mgr_del(struct 
>>> ttm_resource_manager *man,
>>>        uint64_t vis_usage = 0;
>>>
>>>        mutex_lock(&mgr->lock);
>>> -     list_for_each_entry(block, &vres->blocks, link)
>>> +     list_for_each_entry(block, &vres->blocks, link) {
>>>                vis_usage += amdgpu_vram_mgr_vis_size(adev, block);
>>> +             amdgpu_vram_mm_remove_block(block, mgr);
>>> +     }
>>>
>>>        amdgpu_vram_mgr_do_reserve(man);
>>>
>>> @@ -747,6 +870,9 @@ static void amdgpu_vram_mgr_debug(struct 
>>> ttm_resource_manager *man,
>>>        drm_printf(printer, "reserved:\n");
>>>        list_for_each_entry(block, &mgr->reserved_pages, link)
>>>                drm_buddy_block_print(mm, block, printer);
>>> +     drm_printf(printer, "vram usage:\n");
>>> +     amdgpu_vram_mm_debug(mgr, printer);
>>> +
>>>        mutex_unlock(&mgr->lock);
>>>    }
>>>
>>> @@ -769,6 +895,8 @@ int amdgpu_vram_mgr_init(struct amdgpu_device 
>>> *adev)
>>>        struct ttm_resource_manager *man = &mgr->manager;
>>>        int err;
>>>
>>> +     mgr->root = RB_ROOT_CACHED;
>>> +
>>>        ttm_resource_manager_init(man, &adev->mman.bdev,
>>>                                  adev->gmc.real_vram_size);
>>>
>>> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
>>> b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
>>> index 0e04e42cf809..a14c56e1e407 100644
>>> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
>>> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.h
>>> @@ -28,6 +28,7 @@
>>>
>>>    struct amdgpu_vram_mgr {
>>>        struct ttm_resource_manager manager;
>>> +     struct rb_root_cached root;
>>>        struct drm_buddy mm;
>>>        /* protects access to buffer objects */
>>>        struct mutex lock;
>


  reply	other threads:[~2022-11-24 14:56 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-24  5:49 [PATCH] drm/amdgpu: add amdgpu vram usage information into amdgpu_vram_mm Yang Wang
2022-11-24  7:24 ` Christian König
2022-11-24  7:45   ` Wang, Yang(Kevin)
2022-11-24  7:47     ` Christian König
2022-11-24 14:56       ` Arunpravin Paneer Selvam [this message]
2022-11-25  2:35         ` 回复: " Wang, Yang(Kevin)
2022-11-27  7:34           ` Arunpravin Paneer Selvam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e40d1956-0fb6-2972-6a85-193f4e89c328@amd.com \
    --to=arunpravin.paneerselvam@amd.com \
    --cc=Alexander.Deucher@amd.com \
    --cc=Christian.Koenig@amd.com \
    --cc=Hawking.Zhang@amd.com \
    --cc=KevinYang.Wang@amd.com \
    --cc=amd-gfx@lists.freedesktop.org \
    --cc=ckoenig.leichtzumerken@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.