nouveau.lists.freedesktop.org archive mirror
 help / color / mirror / Atom feed
From: "Liam R. Howlett" <Liam.Howlett@Oracle.com>
To: Danilo Krummrich <dakr@redhat.com>
Cc: matthew.brost@intel.com, willy@infradead.org, daniel@ffwll.ch,
	dri-devel@lists.freedesktop.org, corbet@lwn.net,
	nouveau@lists.freedesktop.org, ogabbay@kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	mripard@kernel.org, linux-mm@kvack.org,
	boris.brezillon@collabora.com, bskeggs@redhat.com,
	alexdeucher@gmail.com, Dave Airlie <airlied@redhat.com>,
	bagasdotme@gmail.com, christian.koenig@amd.com,
	jason@jlekstrand.net
Subject: Re: [Nouveau] [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings
Date: Wed, 1 Mar 2023 21:38:29 -0500	[thread overview]
Message-ID: <20230302023829.kcxyukt7guacr7xg@revolver> (raw)
In-Reply-To: <0868bbeb-11b4-b832-a601-f289278e3e76@redhat.com>

* Danilo Krummrich <dakr@redhat.com> [230227 08:17]:

...
> > > Would this variant be significantly more efficient?
> > 
> > Well, what you are doing is walking the tree to see if there's anything
> > there... then re-walking the tree to store it.  So, yes, it's much more
> > efficient..  However, writing is heavier.  How much of the time is spent
> > walking vs writing depends on the size of the tree, but it's rather easy
> > to do this in a single walk of the tree so why wouldn't you?
> 
> I will, I was just curious about how much of an impact it has.
> 
> > 
> > > 
> > > Also, would this also work while already walking the tree?
> > 
> > Yes, to an extent.  If you are at the correct location in the tree, you
> > can write to that location.  If you are not in the correct location and
> > try to write to the tree then things will go poorly..  In this scenario,
> > we are very much walking the tree and writing to it in two steps.
> > 
> > > 
> > > To remove an entry while walking the tree I have a separate function
> > > drm_gpuva_iter_remove(). Would I need something similar for inserting
> > > entries?
> > 
> > I saw that.  Your remove function uses the erase operation which is
> > implemented as a walk to that location and a store of a null over the
> > range that is returned.  You do not need a function to insert an entry
> > if the maple state is at the correct location, and that doesn't just
> > mean setting mas.index/mas.last to the correct value.  There is a node &
> > offset saved in the maple state that needs to be in the correct
> > location.  If you store to that node then the node may be replaced, so
> > other iterators that you have may become stale, but the one you used
> > execute the store operation will now point to the new node with the new
> > entry.
> > 
> > > 
> > > I already provided this example in a separate mail thread, but it may makes
> > > sense to move this to the mailing list:
> > > 
> > > In __drm_gpuva_sm_map() we're iterating a given range of the tree, where the
> > > given range is the size of the newly requested mapping. __drm_gpuva_sm_map()
> > > invokes a callback for each sub-operation that needs to be taken in order to
> > > fulfill this mapping request. In most cases such a callback just creates a
> > > drm_gpuva_op object and stores it in a list.
> > > 
> > > However, drivers can also implement the callback, such that they directly
> > > execute this operation within the callback.
> > > 
> > > Let's have a look at the following example:
> > > 
> > >       0     a     2
> > > old: |-----------|       (bo_offset=n)
> > > 
> > >             1     b     3
> > > req:       |-----------| (bo_offset=m)
> > > 
> > >       0  a' 1     b     3
> > > new: |-----|-----------| (a.bo_offset=n,b.bo_offset=m)
> > > 
> > > This would result in the following operations.
> > > 
> > > __drm_gpuva_sm_map() finds entry "a" and calls back into the driver
> > > suggesting to re-map "a" with the new size. The driver removes entry "a"
> > > from the tree and adds "a'"
> > 
> > What you have here won't work.  The driver will cause your iterators
> > maple state to point to memory that is freed.  You will either need to
> > pass through your iterator so that the modifications can occur with that
> > maple state so it remains valid, or you will need to invalidate the
> > iterator on every modification by the driver.
> > 
> > I'm sure the first idea you have will be to invalidate the iterator, but
> > that is probably not the way to proceed.  Even ignoring the unclear
> > locking of two maple states trying to modify the tree, this is rather
> > inefficient - each invalidation means a re-walk of the tree.  You may as
> > well not use an iterator in this case.
> > 
> > Depending on how/when the lookups occur, you could still iterate over
> > the tree and let the driver modify the ending of "a", but leave the tree
> > alone and just store b over whatever - but the failure scenarios may
> > cause you grief.
> > 
> > If you pass the iterator through, then you can just use it to do your
> > writes and keep iterating as if nothing changed.
> 
> Passing through the iterater clearly seems to be the way to go.
> 
> I assume that if the entry to insert isn't at the location of the iterator
> (as in the following example) we can just keep walking to this location my
> changing the index of the mas and calling mas_walk()?

no.  You have to mas_set() to the value and walk from the top of the
tree.  mas_walk() walks down, not from side to side - well, it does go
forward within a node (increasing offset), but if you hit the node limit
then you have gotten yourself in trouble.

> This would also imply
> that the "outer" tree walk continues after the entry we just inserted,
> right?

I don't understand the "outer" tree walk statement.

> 
>            1     a     3
> old:       |-----------| (bo_offset=n)
> 
>      0     b     2
> req: |-----------|       (bo_offset=m)
> 
>      0     b     2  a' 3
> new: |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)
> 
> Again, after finding "a", we want to remove it and insert "a'" instead.

Ah, so you could walk to 0, see that it's NULL from 0 - 1, call
mas_next() and get "a" from 1 - 3, write "a'" from 2 - 3:

        0     1  a   2  a' 3
broken: |-----|------|-----| (a is broken in this 1/2 step)

mas_set_range(&mas, 0, 2); /* Resets the tree location to MAS_START */
mas_store(&mas, b);
        0     b     2  a' 3
new:    |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)


You can *probably* also get away with this:

walk to 0, see that it's NULL from 0 - 1, call mas_next() and get "a"
from 1 - 3, write "a'" from 2 - 3:

        0     1  a   2  a' 3
broken: |-----|------|-----| (a is broken in this 1/2 step)

mas_prev(&mas, 0); /* Looking at broken a from 1-2.
mas_store(&mas, NULL); /* NULL is expanded on write to 0-2.
            0    NULL   2  a' 3
broken':    |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)

mas_store(&mas, b);
        0     b     2  a' 3
new:    |-----------|-----| (b.bo_offset=m,a.bo_offset=n+2)

You may want to iterate backwards and do the writes as you go until you
have enough room.. it really depends how you want to go about doing
things.

> 
> > 
> > > 
> > > __drm_gpuva_sm_map(), ideally, continues the loop searching for nodes
> > > starting from the end of "a" (which is 2) till the end of the requested
> > > mapping "b" (which is 3). Since it doesn't find any other mapping within
> > > this range it calls back into the driver suggesting to finally map "b".
> > > 
> > > If there would have been another mapping between 2 and 3 it would have
> > > called back into the driver asking to unmap this mapping beforehand.
> > > 
> > > So, it boils down to re-mapping as described at the beginning (and
> > > analogously at the end) of a new mapping range and removing of entries that
> > > are enclosed by the new mapping range.
> > 
> > I assume the unmapped area is no longer needed, and the 're-map' is
> > really a removal of information?  Otherwise I'd suggest searching for a
> > gap which fits your request.  What you have here is a lot like
> > "MAP_FIXED" vs top-down/bottom-up search in the VMA code, this seems to
> > be like your __drm_gpuva_sm_map() and the drm mm range allocator with
> > DRM_MM_INSERT_LOW, and DRM_MM_INSERT_HIGH.
> > 
> > Why can these split/unmappings fail?  Is it because they are still
> > needed?
> > 
> 
> You mean the check before the mas_*() operations in drm_gpuva_insert()?

Yes, the callbacks.

> 
> Removing entries should never fail, inserting entries should fail when the
> caller tries to store to an area outside of the VA space (it doesn't
> necessarily span the whole 64-bit space), a kernel reserved area of the VA
> space, is not in any pre-allocated range of the VA space (if regions are
> enabled) or an entry already exists at that location.

In the mmap code, I have to deal with splitting the start/end VMA and
removing any VMAs in the way.  I do this by making a 'detached' tree
that is dealt with later, then just overwriting the area with one
mas_store() operation.  Would something like that work for you?

> 
> > > 
> > > > > +	if (unlikely(ret))
> > > > > +		return ret;
> > > > > +
> > > > > +	va->mgr = mgr;
> > > > > +	va->region = reg;
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +EXPORT_SYMBOL(drm_gpuva_insert);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuva_remove - remove a &drm_gpuva
> > > > > + * @va: the &drm_gpuva to remove
> > > > > + *
> > > > > + * This removes the given &va from the underlaying tree.
> > > > > + */
> > > > > +void
> > > > > +drm_gpuva_remove(struct drm_gpuva *va)
> > > > > +{
> > > > > +	MA_STATE(mas, &va->mgr->va_mt, va->va.addr, 0);
> > > > > +
> > > > > +	mas_erase(&mas);
> > > > > +}
> > > > > +EXPORT_SYMBOL(drm_gpuva_remove);
> > > > > +
> > > > ...
> > > > 
> > > > > +/**
> > > > > + * drm_gpuva_find_first - find the first &drm_gpuva in the given range
> > > > > + * @mgr: the &drm_gpuva_manager to search in
> > > > > + * @addr: the &drm_gpuvas address
> > > > > + * @range: the &drm_gpuvas range
> > > > > + *
> > > > > + * Returns: the first &drm_gpuva within the given range
> > > > > + */
> > > > > +struct drm_gpuva *
> > > > > +drm_gpuva_find_first(struct drm_gpuva_manager *mgr,
> > > > > +		     u64 addr, u64 range)
> > > > > +{
> > > > > +	MA_STATE(mas, &mgr->va_mt, addr, 0);
> > > > > +
> > > > > +	return mas_find(&mas, addr + range - 1);
> > > > > +}
> > > > > +EXPORT_SYMBOL(drm_gpuva_find_first);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuva_find - find a &drm_gpuva
> > > > > + * @mgr: the &drm_gpuva_manager to search in
> > > > > + * @addr: the &drm_gpuvas address
> > > > > + * @range: the &drm_gpuvas range
> > > > > + *
> > > > > + * Returns: the &drm_gpuva at a given &addr and with a given &range
> > > > 
> > > > Note that mas_find() will continue upwards in the address space if there
> > > > isn't anything at @addr.  This means that &drm_gpuva may not be at
> > > > &addr.  If you want to check just at &addr, use mas_walk().
> > > 
> > > Good catch. drm_gpuva_find() should then either also check for 'va->va.addr
> > > == addr' as well or, alternatively, use mas_walk(). As above, any reason to
> > > prefer mas_walk()?

I think I missed this question last time..

Internally, mas_find() is just a mas_walk() on the first call, then
mas_next() for each call after that.  If, during the mas_walk(), there
is no value at addr, it immediately calls mas_next() to get a value to
return.  It will continue upwards until the limit is reached (addr +
range - 1 in your case).

So if you only want to know if there is something at addr, then it's
best to use mas_walk() and keep things a bit more efficient.  Then you
can check mas.last for your end value.

If you do want the first VMA within the range passed in, then mas_find()
is the function you want.

> > > 
> > > > 
> > > > > + */
> > > > > +struct drm_gpuva *
> > > > > +drm_gpuva_find(struct drm_gpuva_manager *mgr,
> > > > > +	       u64 addr, u64 range)
> > > > > +{
> > > > > +	struct drm_gpuva *va;
> > > > > +
> > > > > +	va = drm_gpuva_find_first(mgr, addr, range);
> > > > > +	if (!va)
> > > > > +		goto out;
> > > > > +
> > > > > +	if (va->va.range != range)
> > > > > +		goto out;
> > > > > +
> > > > > +	return va;
> > > > > +
> > > > > +out:
> > > > > +	return NULL;
> > > > > +}
> > > > > +EXPORT_SYMBOL(drm_gpuva_find);
> > > > > +
> > > > > +/**
> > > > > + * drm_gpuva_find_prev - find the &drm_gpuva before the given address
> > > > > + * @mgr: the &drm_gpuva_manager to search in
> > > > > + * @start: the given GPU VA's start address
> > > > > + *
> > > > > + * Find the adjacent &drm_gpuva before the GPU VA with given &start address.
> > > > > + *
> > > > > + * Note that if there is any free space between the GPU VA mappings no mapping
> > > > > + * is returned.
> > > > > + *
> > > > > + * Returns: a pointer to the found &drm_gpuva or NULL if none was found
> > > > > + */
> > > > > +struct drm_gpuva *
> > > > > +drm_gpuva_find_prev(struct drm_gpuva_manager *mgr, u64 start)
> > > > 
> > > > find_prev() usually continues beyond 1 less than the address. I found
> > > > this name confusing.
> > > 
> > > Don't really get that, mind explaining?
> > 
> > When I ask for the previous one in a list or tree, I think the one
> > before.. but since you are limiting your search from start to start - 1,
> > you may as well walk to start - 1 and see if one exists.
> > 
> > Is that what you meant to do here?
> 
> Yes, I want to know whether there is a previous entry which ends right
> before the current entry, without a gap between the two.
> 
> > 
> > > 
> > > > You may as well use mas_walk(), it would be faster.
> > > 
> > > How would I use mas_walk() for that? If I understand it correctly,
> > > mas_walk() requires me to know that start address, which I don't know for
> > > the previous entry.
> > 
> > mas_walk() walks to the value you specify and returns the entry at that
> > address, not necessarily the start address, but any address in the
> > range.
> > 
> > If you have a tree and store A = [0x1000 - 0x2000] and set your maple
> > state to walk to 0x1500, mas_walk() will return A, and the maple state
> > will have mas.index = 0x1000 and mas.last = 0x2000.
> > 
> > You have set the maple state to start at "start" and called
> > mas_prev(&mas, start - 1).  start - 1 is the lower limit, so the
> > internal implementation will walk to start then go to the previous entry
> > until start - 1.. it will stop at start - 1 and return NULL if there
> > isn't one there.
> 
> Thanks for the clarification and all the other very helpful comments and
> explanations!
> 

Always glad to help.  The more users the tree has, the more I can see
where we may need to expand the interface to help others.

...


  reply	other threads:[~2023-05-04 12:33 UTC|newest]

Thread overview: 64+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-17 13:44 [Nouveau] [PATCH drm-next v2 00/16] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI Danilo Krummrich
2023-02-17 13:44 ` [Nouveau] [PATCH drm-next v2 01/16] drm: execution context for GEM buffers Danilo Krummrich
2023-02-17 16:00   ` Christian König
2023-02-21 14:56     ` Danilo Krummrich
2023-02-17 13:44 ` [Nouveau] [PATCH drm-next v2 02/16] drm/exec: fix memory leak in drm_exec_prepare_obj() Danilo Krummrich
2023-02-17 13:44 ` [Nouveau] [PATCH drm-next v2 03/16] maple_tree: split up MA_STATE() macro Danilo Krummrich
2023-02-17 18:34   ` Liam R. Howlett
2023-02-20 13:48     ` Danilo Krummrich
2023-02-21 16:52       ` Liam R. Howlett
2023-02-17 19:45   ` Matthew Wilcox
2023-02-20 13:48     ` Danilo Krummrich
2023-02-17 13:44 ` [Nouveau] [PATCH drm-next v2 04/16] maple_tree: add flag MT_FLAGS_LOCK_NONE Danilo Krummrich
2023-02-17 18:18   ` Liam R. Howlett
2023-02-17 19:38   ` Matthew Wilcox
2023-02-20 14:00     ` Danilo Krummrich
2023-02-20 15:10       ` Matthew Wilcox
2023-02-20 17:06         ` Danilo Krummrich
2023-02-20 20:33           ` Matthew Wilcox
2023-02-21 14:37             ` Danilo Krummrich
2023-02-21 18:31               ` Matthew Wilcox
2023-02-22 16:11                 ` Danilo Krummrich
2023-02-22 16:32                   ` Matthew Wilcox
2023-02-22 17:28                     ` Danilo Krummrich
2023-02-27 17:39                 ` Danilo Krummrich
2023-02-27 18:36                   ` Matthew Wilcox
2023-02-27 18:59                     ` Danilo Krummrich
2023-02-17 13:44 ` [Nouveau] [PATCH drm-next v2 05/16] drm: manager to keep track of GPUs VA mappings Danilo Krummrich
2023-02-18  1:05   ` kernel test robot
2023-02-21 18:20   ` Liam R. Howlett
2023-02-22 18:13     ` Danilo Krummrich
2023-02-23 19:09       ` Liam R. Howlett
2023-02-27 12:23         ` Danilo Krummrich
2023-03-02  2:38           ` Liam R. Howlett [this message]
2023-03-06 15:46             ` Danilo Krummrich
2023-03-07 22:43               ` Liam R. Howlett
2023-03-13 23:46                 ` Danilo Krummrich
2023-03-20 19:16                   ` Liam R. Howlett
2023-02-28  2:17     ` Danilo Krummrich
2023-02-28 16:24       ` Liam R. Howlett
2023-03-06 13:39         ` Danilo Krummrich
2023-02-22 10:25   ` Christian König
2023-02-22 15:07     ` Danilo Krummrich
2023-02-22 15:14       ` Christian König
2023-02-22 16:40         ` Danilo Krummrich
2023-02-23  7:06           ` Christian König
2023-02-23 14:12             ` Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 06/16] drm: debugfs: provide infrastructure to dump a DRM GPU VA space Danilo Krummrich
2023-02-18  2:47   ` kernel test robot
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 07/16] drm/nouveau: new VM_BIND uapi interfaces Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 08/16] drm/nouveau: get vmm via nouveau_cli_vmm() Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 09/16] drm/nouveau: bo: initialize GEM GPU VA interface Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 10/16] drm/nouveau: move usercopy helpers to nouveau_drv.h Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 11/16] drm/nouveau: fence: fail to emit when fence context is killed Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 12/16] drm/nouveau: chan: provide nouveau_channel_kill() Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 13/16] drm/nouveau: nvkm/vmm: implement raw ops to manage uvmm Danilo Krummrich
2023-02-18  1:16   ` kernel test robot
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 14/16] drm/nouveau: implement uvmm for user mode bindings Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 15/16] drm/nouveau: implement new VM_BIND UAPI Danilo Krummrich
2023-02-17 13:48 ` [Nouveau] [PATCH drm-next v2 16/16] drm/nouveau: debugfs: implement DRM GPU VA debugfs Danilo Krummrich
2023-03-09  9:12 ` [Nouveau] [PATCH drm-next v2 00/16] [RFC] DRM GPUVA Manager & Nouveau VM_BIND UAPI Boris Brezillon
2023-03-09  9:48   ` Boris Brezillon
2023-03-10 16:45     ` Danilo Krummrich
2023-03-10 17:25       ` Boris Brezillon
2023-03-10 20:06         ` Danilo Krummrich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230302023829.kcxyukt7guacr7xg@revolver \
    --to=liam.howlett@oracle.com \
    --cc=airlied@redhat.com \
    --cc=alexdeucher@gmail.com \
    --cc=bagasdotme@gmail.com \
    --cc=boris.brezillon@collabora.com \
    --cc=bskeggs@redhat.com \
    --cc=christian.koenig@amd.com \
    --cc=corbet@lwn.net \
    --cc=dakr@redhat.com \
    --cc=daniel@ffwll.ch \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=jason@jlekstrand.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.brost@intel.com \
    --cc=mripard@kernel.org \
    --cc=nouveau@lists.freedesktop.org \
    --cc=ogabbay@kernel.org \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).