From: Thomas Hellstrom <thellstrom@vmware.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: pv-drivers@vmware.com, linux-graphics-maintainer@vmware.com,
	dri-devel@lists.freedesktop.org
Subject: Re: [PATCH -fixes 5/5] drm/vmwgfx: Fix a buffer object eviction regression
Date: Thu, 13 Sep 2018 18:52:43 +0200
Message-ID: <c575fa18-424c-bff7-48f2-9fea759fa627@vmware.com>
In-Reply-To: <20180913152811.GA11574@bombadil.infradead.org>

On 09/13/2018 05:28 PM, Matthew Wilcox wrote:
> On Thu, Sep 13, 2018 at 04:56:53PM +0200, Thomas Hellstrom wrote:
>> Hi,
>>
>> On 09/13/2018 04:10 PM, Matthew Wilcox wrote:
>>> On Thu, Sep 13, 2018 at 01:58:37PM +0200, Thomas Hellstrom wrote:
>>>> Commit 4eb085e42fde ("drm/vmwgfx: Convert to new IDA API") introduced
>>>> an incorrect return value from the function vmw_gmrid_man_get_node(),
>>>> when we run out of integer ids. Instead of returning 0 (meaning
>>>> non-fatal error) we forward the ida_simple_get error code -ENOSPC.
>>>> This causes TTM not to retry allocation after buffer eviction and
>>>> instead return -ENOSPC to user-space.
>>>>
>>>> Fix this by returning 0 when ida_simple_get() returns -ENOSPC.
>>> Thanks.  I got confused by the convoluted code that was there before ;-(
>>>
>>> I think this could be better though ... if ida_alloc() ever starts
>>> returning a different errno in the future, you'll hit the same problem,
>>> right?  So how about this ...
>>>
>>>    	id = ida_alloc_max(&gman->gmr_ida, gman->max_gmr_ids - 1, GFP_KERNEL);
>>> +	if (id == -ENOMEM)
>>> +		return -ENOMEM;
>>> +	if (id < 0)
>>> +		return 0;
>>>    	spin_lock(&gman->lock);
>>>
>>> But I wonder ... why is -ENOMEM seen as a fatal error?  If you free up
>>> some memory, you'll free up an ID, so the next time around you should
>>> be able to allocate an ID.  So shouldn't this function just have
>>> been doing this all along?
>>>
>>>    	id = ida_alloc_max(&gman->gmr_ida, gman->max_gmr_ids - 1, GFP_KERNEL);
>>> +	if (id < 0)
>>> +		return 0;
>>>
>> Non-fatal errors are errors that can be remedied by GPU buffer eviction, and
>> buffer eviction will free up IDA space, so basically we need to target only
>> the error code that indicates we've run out of IDA space.
> Yes, but the following situation can happen:
>
>   - Allocate 1024 IDs
>   - Run very low on memory
>   - Allocating ID 1025 will fail (very very unlikely)
>   - ida_alloc_max() returns -ENOMEM
>
> In this situation, we want ttm_mem_evict_first() to be called which will
> free up one of the 1024 existing IDs and then we can allocate that ID for
> our new node.
>
> I'm assuming we're analysing the behaviour of ttm_bo_mem_force_space()
> here.

Well, that's true, but I guess that situation depends very much on the
radix tree implementation of IDA? Also, I would expect the eviction paths
to try to allocate more memory here and there, so to me the preferred
option when -ENOMEM happens is really to back off as soon as possible,
to avoid interfering with ongoing shrinker work and the like.
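
To make the retry behaviour under discussion concrete, here is a rough
userspace sketch. It is not the TTM code: mock_get_node(),
mock_evict_first(), mock_force_space(), struct mock_man and the inject_err
field are all hypothetical stand-ins. It only assumes the convention
described above: returning 0 without a node means "evict and retry", while
a negative return is fatal and ends the loop without any eviction.

```c
#include <errno.h>
#include <stdbool.h>

#define MOCK_MAX_IDS 4 /* hypothetical stand-in for gman->max_gmr_ids */

/* Hypothetical mock of an id manager; not the vmwgfx structure. */
struct mock_man {
	bool used[MOCK_MAX_IDS];
	int inject_err; /* if non-zero, get_node fails with this errno */
};

/*
 * Returns a negative errno on fatal error. Otherwise returns 0 and sets
 * *id to the allocated id, or to -1 when no id is free (non-fatal).
 */
static inline int mock_get_node(struct mock_man *man, int *id)
{
	*id = -1;
	if (man->inject_err)
		return man->inject_err;
	for (int i = 0; i < MOCK_MAX_IDS; i++) {
		if (!man->used[i]) {
			man->used[i] = true;
			*id = i;
			break;
		}
	}
	return 0;
}

/* Frees the lowest id in use; -EBUSY if there is nothing to evict. */
static inline int mock_evict_first(struct mock_man *man)
{
	for (int i = 0; i < MOCK_MAX_IDS; i++) {
		if (man->used[i]) {
			man->used[i] = false;
			return 0;
		}
	}
	return -EBUSY;
}

/* Retry allocation after eviction until it succeeds or fails fatally. */
static inline int mock_force_space(struct mock_man *man, int *id)
{
	for (;;) {
		int ret = mock_get_node(man, id);

		if (ret)
			return ret; /* fatal: no eviction is attempted */
		if (*id >= 0)
			return 0;   /* got a node */
		ret = mock_evict_first(man);
		if (ret)
			return ret; /* nothing left to evict */
	}
}
```

With all ids taken, mock_force_space() evicts one entry and hands out the
freed id. If get_node forwards an error code instead, the caller sees that
error straight away, which is the shape of the regression being fixed.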

>> If we're worried that ida_alloc_max() will change return value, I guess we
>> will have to increase the IDA space and detect the error ourselves:  error
>> if (id >= gman->max_gmr_ids)
> My point was that your solution (detect the one error which should be
> deemed as non-fatal) was not as robust as its inverse (detect the one
> error which the previous code deemed as fatal).  But I now believe no
> error from the IDA should be seen as fatal.

If you insist, I can test on -ENOMEM instead of -ENOSPC to mimic the
pre-change behaviour. We should really focus on the IDA API changes
here, and defer changing -ENOMEM to non-fatal to a follow-up patch if
needed.
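
For comparison, the two error mappings can be sketched as below. Both
helper names are hypothetical and neither is the actual patch; they only
illustrate which IDA results would be treated as fatal, assuming the
convention that 0 means non-fatal (evict and retry) and a negative value
is returned to the caller.

```c
#include <errno.h>

/*
 * Hypothetical helper: treat only -ENOSPC (out of ids) as non-fatal and
 * forward every other error from the IDA unchanged.
 */
static inline int gmrid_ret_enospc_nonfatal(int id)
{
	if (id >= 0)
		return 0;
	return (id == -ENOSPC) ? 0 : id;
}

/*
 * Hypothetical helper mimicking the pre-change behaviour: treat only
 * -ENOMEM as fatal and map any other IDA error to 0.
 */
static inline int gmrid_ret_enomem_fatal(int id)
{
	if (id == -ENOMEM)
		return -ENOMEM;
	return 0;
}
```

The two differ only for error codes other than -ENOSPC and -ENOMEM: the
first forwards them as fatal, the second lets eviction retry.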

Thanks,

Thomas



