[PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects
@ 2011-06-24  8:03 KOSAKI Motohiro
  2011-06-30  3:53   ` Keith Packard
  0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2011-06-24  8:03 UTC (permalink / raw)
  To: linux-kernel, keithp, airlied, dri-devel; +Cc: kosaki.motohiro

Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
can't take a lock. Otherwise, vmscan is getting a lot of confusing
because vmscan can't distinguish "can't take a lock temporary" and
"we've shrank all of i915 objects".

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Keith Packard <keithp@keithp.com>
Cc: David Airlie <airlied@linux.ie>
Cc: dri-devel@lists.freedesktop.org
---

I've found this issue by reviewing. I hope i915 developers confirm this.
Thx.


 drivers/gpu/drm/i915/i915_gem.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 94c84d7..2f9a9b2 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4104,7 +4104,7 @@ i915_gem_inactive_shrink(struct shrinker *shrinker, struct shrink_control *sc)
 	int cnt;

 	if (!mutex_trylock(&dev->struct_mutex))
-		return 0;
+		return nr_to_scan ? -1 : 0;

 	/* "fast-path" to count number of available objects */
 	if (nr_to_scan == 0) {
-- 
1.7.3.1




^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects
  2011-06-24  8:03 [PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects KOSAKI Motohiro
@ 2011-06-30  3:53   ` Keith Packard
  0 siblings, 0 replies; 15+ messages in thread
From: Keith Packard @ 2011-06-30  3:53 UTC (permalink / raw)
  To: KOSAKI Motohiro, linux-kernel, airlied, dri-devel; +Cc: kosaki.motohiro

[-- Attachment #1: Type: text/plain, Size: 591 bytes --]

On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> can't take a lock. Otherwise, vmscan is getting a lot of confusing
> because vmscan can't distinguish "can't take a lock temporary" and
> "we've shrank all of i915 objects".

This doesn't look like the cleanest change possible. I think it would be
better if the shrink function could uniformly return an error
indication so that we wouldn't need the weird looking conditional return.

-- 
keith.packard@intel.com

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects
@ 2011-06-30  3:53   ` Keith Packard
  0 siblings, 0 replies; 15+ messages in thread
From: Keith Packard @ 2011-06-30  3:53 UTC (permalink / raw)
  To: linux-kernel, airlied, dri-devel; +Cc: kosaki.motohiro


[-- Attachment #1.1: Type: text/plain, Size: 591 bytes --]

On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> can't take a lock. Otherwise, vmscan is getting a lot of confusing
> because vmscan can't distinguish "can't take a lock temporary" and
> "we've shrank all of i915 objects".

This doesn't look like the cleanest change possible. I think it would be
better if the shrink function could uniformly return an error
indication so that we wouldn't need the weird looking conditional return.

-- 
keith.packard@intel.com

[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 159 bytes --]

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-06-30  3:53   ` Keith Packard
@ 2011-06-30  8:55     ` Chris Wilson
  -1 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2011-06-30  8:55 UTC (permalink / raw)
  To: Keith Packard, KOSAKI Motohiro, linux-kernel, airlied, dri-devel
  Cc: kosaki.motohiro

On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> > can't take a lock. Otherwise, vmscan is getting a lot of confusing
> > because vmscan can't distinguish "can't take a lock temporary" and
> > "we've shrank all of i915 objects".
> 
> This doesn't look like the cleanest change possible. I think it would be
> better if the shrink function could uniformly return an error
> indication so that we wouldn't need the weird looking conditional return.

Unless I am mistaken, and there are more patches in flight, the return
code from i915_gem_inactive_shrink() is promoted to unsigned long and then
used in the calculation of how may objects to evict...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
@ 2011-06-30  8:55     ` Chris Wilson
  0 siblings, 0 replies; 15+ messages in thread
From: Chris Wilson @ 2011-06-30  8:55 UTC (permalink / raw)
  To: Keith Packard, linux-kernel, airlied, dri-devel; +Cc: kosaki.motohiro

On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> > can't take a lock. Otherwise, vmscan is getting a lot of confusing
> > because vmscan can't distinguish "can't take a lock temporary" and
> > "we've shrank all of i915 objects".
> 
> This doesn't look like the cleanest change possible. I think it would be
> better if the shrink function could uniformly return an error
> indication so that we wouldn't need the weird looking conditional return.

Unless I am mistaken, and there are more patches in flight, the return
code from i915_gem_inactive_shrink() is promoted to unsigned long and then
used in the calculation of how may objects to evict...
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-06-30  8:55     ` Chris Wilson
  (?)
@ 2011-07-12  9:36     ` KOSAKI Motohiro
  2011-07-12 10:06       ` Chris Wilson
  -1 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2011-07-12  9:36 UTC (permalink / raw)
  To: chris; +Cc: keithp, linux-kernel, airlied, dri-devel

Hi,

sorry for the delay.

> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>>
>>> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
>>> can't take a lock. Otherwise, vmscan is getting a lot of confusing
>>> because vmscan can't distinguish "can't take a lock temporary" and
>>> "we've shrank all of i915 objects".
>>
>> This doesn't look like the cleanest change possible. I think it would be
>> better if the shrink function could uniformly return an error
>> indication so that we wouldn't need the weird looking conditional return.

shrink_icache_memory() is good sample code.
It doesn't take a lock if sc->nr_to_scan==0. i915_gem_inactive_shrink() should do
it too, ideally.

My patch only take a first-aid.

Plus, if I understand correctly, i915_gem_inactive_shrink() have more fundamental
issue. actually, shrinker code shouldn't use mutex. Instead, use spinlock.
IOW, Don't call kmalloc(GFP_KERNEL) while taking dev->struct_mutex. Otherwise,
vmscan in its call path completely fail to shrink i915 cache and it makes big
memory reclaim confusing if i915 have a lot of shrinkable pages.

> Unless I am mistaken, and there are more patches in flight, the return
> code from i915_gem_inactive_shrink() is promoted to unsigned long and then
> used in the calculation of how may objects to evict...

shrinker->shrink has int type value. you can't change i915_gem_inactive_shrink()
unless generic shrinker code.
Do you really want to change it?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-12  9:36     ` KOSAKI Motohiro
@ 2011-07-12 10:06       ` Chris Wilson
  2011-07-13  0:19         ` KOSAKI Motohiro
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2011-07-12 10:06 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: keithp, linux-kernel, airlied, dri-devel

On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> Hi,
> 
> sorry for the delay.
> 
> > On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> >> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >>
> >>> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
> >>> can't take a lock. Otherwise, vmscan is getting a lot of confusing
> >>> because vmscan can't distinguish "can't take a lock temporary" and
> >>> "we've shrank all of i915 objects".
> >>
> >> This doesn't look like the cleanest change possible. I think it would be
> >> better if the shrink function could uniformly return an error
> >> indication so that we wouldn't need the weird looking conditional return.
> 
> shrink_icache_memory() is good sample code.
> It doesn't take a lock if sc->nr_to_scan==0. i915_gem_inactive_shrink() should do
> it too, ideally.
> 
> My patch only take a first-aid.
> 
> Plus, if I understand correctly, i915_gem_inactive_shrink() have more fundamental
> issue. actually, shrinker code shouldn't use mutex. Instead, use spinlock.
Why? The shrinker code is run in a non-atomic context that is explicitly
allowed to wait, or so I thought. Where's the caveat that prevents mutex?
Why doesn't the kernel complain?

> IOW, Don't call kmalloc(GFP_KERNEL) while taking dev->struct_mutex. Otherwise,
> vmscan in its call path completely fail to shrink i915 cache and it makes big
> memory reclaim confusing if i915 have a lot of shrinkable pages.

i915 can have several GiB of shrinkable pages. Of which 2 GiB may be tied
up in the GTT upon which we have to wait for the GPU to release. In the
future, we will be able to tie up all of physical memory.

There is only a single potential kmalloc in the shrinker path, for which
we could preallocate a request so that we always have one available here.
 
> > Unless I am mistaken, and there are more patches in flight, the return
> > code from i915_gem_inactive_shrink() is promoted to unsigned long and then
> > used in the calculation of how may objects to evict...
> 
> shrinker->shrink has int type value. you can't change i915_gem_inactive_shrink()
> unless generic shrinker code.
> Do you really want to change it?

No, just pointing out that the patch causes warnings from the shrinker
code as it tries to process (unsigned long)-1 objects. shrink_slab() does
not use <0 as an error code!
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-12 10:06       ` Chris Wilson
@ 2011-07-13  0:19         ` KOSAKI Motohiro
  2011-07-13  7:41           ` Chris Wilson
  0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2011-07-13  0:19 UTC (permalink / raw)
  To: chris; +Cc: keithp, linux-kernel, airlied, dri-devel

(2011/07/12 19:06), Chris Wilson wrote:
> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>> Hi,
>>
>> sorry for the delay.
>>
>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>>>>
>>>>> Now, i915_gem_inactive_shrink() should return -1 instead of 0 if it
>>>>> can't take a lock. Otherwise, vmscan is getting a lot of confusing
>>>>> because vmscan can't distinguish "can't take a lock temporary" and
>>>>> "we've shrank all of i915 objects".
>>>>
>>>> This doesn't look like the cleanest change possible. I think it would be
>>>> better if the shrink function could uniformly return an error
>>>> indication so that we wouldn't need the weird looking conditional return.
>>
>> shrink_icache_memory() is good sample code.
>> It doesn't take a lock if sc->nr_to_scan==0. i915_gem_inactive_shrink() should do
>> it too, ideally.
>>
>> My patch only take a first-aid.
>>
>> Plus, if I understand correctly, i915_gem_inactive_shrink() have more fundamental
>> issue. actually, shrinker code shouldn't use mutex. Instead, use spinlock.
> Why? The shrinker code is run in a non-atomic context that is explicitly
> allowed to wait, or so I thought. Where's the caveat that prevents mutex?
> Why doesn't the kernel complain?

The matter is not in contention. The problem is happen if the mutex is taken
by shrink_slab calling thread. i915_gem_inactive_shrink() have no way to shink
objects. How do you detect such case?

>> IOW, Don't call kmalloc(GFP_KERNEL) while taking dev->struct_mutex. Otherwise,
>> vmscan in its call path completely fail to shrink i915 cache and it makes big
>> memory reclaim confusing if i915 have a lot of shrinkable pages.
> 
> i915 can have several GiB of shrinkable pages. Of which 2 GiB may be tied
> up in the GTT upon which we have to wait for the GPU to release. In the
> future, we will be able to tie up all of physical memory.
> 
> There is only a single potential kmalloc in the shrinker path, for which
> we could preallocate a request so that we always have one available here.

Again, waiting is no problem if it is enough little time. btw, I think
preallocation must be implemented, otherwise shrinker have no guarantee to
shrink.

thanks.


>>> Unless I am mistaken, and there are more patches in flight, the return
>>> code from i915_gem_inactive_shrink() is promoted to unsigned long and then
>>> used in the calculation of how may objects to evict...
>>
>> shrinker->shrink has int type value. you can't change i915_gem_inactive_shrink()
>> unless generic shrinker code.
>> Do you really want to change it?
> 
> No, just pointing out that the patch causes warnings from the shrinker
> code as it tries to process (unsigned long)-1 objects. shrink_slab() does
> not use <0 as an error code!

Look.

unsigned long shrink_slab(struct shrink_control *shrink,
                          unsigned long nr_pages_scanned,
                          unsigned long lru_pages)
{
(snip)
                while (total_scan >= SHRINK_BATCH) {
                        long this_scan = SHRINK_BATCH;
                        int shrink_ret;
                        int nr_before;

                        nr_before = do_shrinker_shrink(shrinker, shrink, 0);
                        shrink_ret = do_shrinker_shrink(shrinker, shrink,
                                                        this_scan);
                        if (shrink_ret == -1)
                                break;


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13  0:19         ` KOSAKI Motohiro
@ 2011-07-13  7:41           ` Chris Wilson
  2011-07-13  8:19             ` KOSAKI Motohiro
  0 siblings, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2011-07-13  7:41 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: keithp, linux-kernel, airlied, dri-devel

On Wed, 13 Jul 2011 09:19:24 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> (2011/07/12 19:06), Chris Wilson wrote:
> > On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >> Hi,
> >>
> >> sorry for the delay.
> >>
> >>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> >>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> The matter is not in contention. The problem is happen if the mutex is taken
> by shrink_slab calling thread. i915_gem_inactive_shrink() have no way to shink
> objects. How do you detect such case?

In the primary allocator for the backing pages whilst the mutex is held we
do __NORETRY and a manual shrinkage of our buffers before failing. That's
the largest allocator, all the others are tiny and short-lived by
comparison and left to fail.

For a second process to hit shrink_slab whilst the driver is blocked on
the GPU, that is... unfortunate. Dropping that lock across that wait is
achievable, just very complicated.

> > No, just pointing out that the patch causes warnings from the shrinker
> > code as it tries to process (unsigned long)-1 objects. shrink_slab() does
> > not use <0 as an error code!
> 
> Look.
> 
> unsigned long shrink_slab(struct shrink_control *shrink,
>                           unsigned long nr_pages_scanned,
>                           unsigned long lru_pages)
> {
> (snip)
>                 while (total_scan >= SHRINK_BATCH) {
>                         long this_scan = SHRINK_BATCH;
>                         int shrink_ret;
>                         int nr_before;
> 
>                         nr_before = do_shrinker_shrink(shrinker, shrink, 0);
>                         shrink_ret = do_shrinker_shrink(shrinker, shrink,
>                                                         this_scan);
>                         if (shrink_ret == -1)
>                                 break;
> 

And fifteen lines above that you have:
  unsigned long max_pass = do_shrinker_shrink(shrinker, shrinker, 0);
  ...
  shrinker->nr += f(max_pass);
  if (shrinker->nr < 0) printk(KERN_ERR "...");

That's the *error* I hit when I originally returned -1.
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13  7:41           ` Chris Wilson
@ 2011-07-13  8:19             ` KOSAKI Motohiro
  2011-07-13  8:40               ` Chris Wilson
  2011-07-13 10:42               ` Dave Chinner
  0 siblings, 2 replies; 15+ messages in thread
From: KOSAKI Motohiro @ 2011-07-13  8:19 UTC (permalink / raw)
  To: chris; +Cc: keithp, linux-kernel, airlied, dri-devel

(2011/07/13 16:41), Chris Wilson wrote:
> On Wed, 13 Jul 2011 09:19:24 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>> (2011/07/12 19:06), Chris Wilson wrote:
>>> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>>>> Hi,
>>>>
>>>> sorry for the delay.
>>>>
>>>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
>>>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>> The matter is not in contention. The problem is happen if the mutex is taken
>> by shrink_slab calling thread. i915_gem_inactive_shrink() have no way to shink
>> objects. How do you detect such case?
> 
> In the primary allocator for the backing pages whilst the mutex is held we
> do __NORETRY and a manual shrinkage of our buffers before failing. That's
> the largest allocator, all the others are tiny and short-lived by
> comparison and left to fail.

__NORETRY perhaps might help to avoid false positive oom. But, __NORETRY still makes
full page reclaim and may drop a lot of innocent page cache, and then system may
become slow down.

Of course, you don't meet such worst case scenario so easy. But you may need to
think worst case if you touch memory management code.

> For a second process to hit shrink_slab whilst the driver is blocked on
> the GPU, that is... unfortunate. Dropping that lock across that wait is
> achievable, just very complicated.

I think that's no problem. waiting and complicated slow path have no matter
if it's only exceptional case. That don't makes false positive memory starvation.

thx.


>>> No, just pointing out that the patch causes warnings from the shrinker
>>> code as it tries to process (unsigned long)-1 objects. shrink_slab() does
>>> not use <0 as an error code!
>>
>> Look.
>>
>> unsigned long shrink_slab(struct shrink_control *shrink,
>>                           unsigned long nr_pages_scanned,
>>                           unsigned long lru_pages)
>> {
>> (snip)
>>                 while (total_scan >= SHRINK_BATCH) {
>>                         long this_scan = SHRINK_BATCH;
>>                         int shrink_ret;
>>                         int nr_before;
>>
>>                         nr_before = do_shrinker_shrink(shrinker, shrink, 0);
>>                         shrink_ret = do_shrinker_shrink(shrinker, shrink,
>>                                                         this_scan);
>>                         if (shrink_ret == -1)
>>                                 break;
>>
> 
> And fifteen lines above that you have:
>   unsigned long max_pass = do_shrinker_shrink(shrinker, shrinker, 0);
>   ...
>   shrinker->nr += f(max_pass);
>   if (shrinker->nr < 0) printk(KERN_ERR "...");
> 
> That's the *error* I hit when I originally returned -1.

You misunderstand the code. The third argument is critically important.
Only if it's 0 (ie sc->nr_to_scan==0), shrinker must not return negative.
Thus, my patch checked nr_to_scan argument. and I've suggested look at
shrink_icache_memory().

If you are thinking the shrinker protocol is too complicated, doc update
patch is really welcome.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13  8:19             ` KOSAKI Motohiro
@ 2011-07-13  8:40               ` Chris Wilson
  2011-07-13 11:34                 ` Dave Chinner
  2011-07-13 10:42               ` Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Chris Wilson @ 2011-07-13  8:40 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: keithp, linux-kernel, airlied, dri-devel

On Wed, 13 Jul 2011 17:19:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> (2011/07/13 16:41), Chris Wilson wrote:
> > On Wed, 13 Jul 2011 09:19:24 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >> (2011/07/12 19:06), Chris Wilson wrote:
> >>> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >>>> Hi,
> >>>>
> >>>> sorry for the delay.
> >>>>
> >>>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> >>>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> >> The matter is not in contention. The problem is happen if the mutex is taken
> >> by shrink_slab calling thread. i915_gem_inactive_shrink() have no way to shink
> >> objects. How do you detect such case?
> > 
> > In the primary allocator for the backing pages whilst the mutex is held we
> > do __NORETRY and a manual shrinkage of our buffers before failing. That's
> > the largest allocator, all the others are tiny and short-lived by
> > comparison and left to fail.
> 
> __NORETRY perhaps might help to avoid false positive oom. But, __NORETRY still makes
> full page reclaim and may drop a lot of innocent page cache, and then system may
> become slow down.

But in this context, that is memory the user has requested to be used with
the GPU, so the page cache is sacrificed to meet the allocation, if
possible.
 
> Of course, you don't meet such worst case scenario so easy. But you may need to
> think worst case if you touch memory management code.

Actually we'd much rather you took us into account when designing the mm.

> > That's the *error* I hit when I originally returned -1.
> 
> You misunderstand the code. The third argument is critically important.
> Only if it's 0 (ie sc->nr_to_scan==0), shrinker must not return negative.
> Thus, my patch checked nr_to_scan argument. and I've suggested look at
> shrink_icache_memory().

Ok.
 
> If you are thinking the shrinker protocol is too complicated, doc update
> patch is really welcome.

What I don't understand is the disconnect between objects to shrink and
the number of pages released. We may have tens of thousands of single page
objects that are expensive to free in comparison to a few 10-100MiB
objects that are just sitting idle. Would it be better to report the
estimated number of shrinkable pages instead?
-Chris

-- 
Chris Wilson, Intel Open Source Technology Centre

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13  8:19             ` KOSAKI Motohiro
  2011-07-13  8:40               ` Chris Wilson
@ 2011-07-13 10:42               ` Dave Chinner
  2011-07-14  2:48                 ` KOSAKI Motohiro
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2011-07-13 10:42 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: chris, keithp, linux-kernel, airlied, dri-devel

On Wed, Jul 13, 2011 at 05:19:22PM +0900, KOSAKI Motohiro wrote:
> (2011/07/13 16:41), Chris Wilson wrote:
> >> (snip)
> >>                 while (total_scan >= SHRINK_BATCH) {
> >>                         long this_scan = SHRINK_BATCH;
> >>                         int shrink_ret;
> >>                         int nr_before;
> >>
> >>                         nr_before = do_shrinker_shrink(shrinker, shrink, 0);
> >>                         shrink_ret = do_shrinker_shrink(shrinker, shrink,
> >>                                                         this_scan);
> >>                         if (shrink_ret == -1)
> >>                                 break;
> >>
> > 
> > And fifteen lines above that you have:
> >   unsigned long max_pass = do_shrinker_shrink(shrinker, shrinker, 0);
> >   ...
> >   shrinker->nr += f(max_pass);
> >   if (shrinker->nr < 0) printk(KERN_ERR "...");
> > 
> > That's the *error* I hit when I originally returned -1.
> 
> You misunderstand the code. The third argument is critically important.
> Only if it's 0 (ie sc->nr_to_scan==0), shrinker must not return negative.

And once again the shitty shrinker API bites a user.

> Thus, my patch checked nr_to_scan argument. and I've suggested look at
> shrink_icache_memory().

Which is going away real soon - it's not the model of perfection
that you make it out to be. ;)

> If you are thinking the shrinker protocol is too complicated, doc update
> patch is really welcome.

Slab shrinkers have a nasty, crappy interface and the shrink_slab()
code is full of bugs.  Rather that telling people to "update the
documentation" because it's too complex, how about we fix the
interface and the bugs?

Indeed, how hard is it to require a subsystem to supply two shrinker
methods, one to return the count of reclaimable objects, the other
to scan the reclaimable objects to reclaim them? After all, that's
exactly the interface I'm exposing to filesystems underneath the
shrinker API in the per-sb shrinker patchset that gets rid of
shrink_icache_memory() rather than propagating the insanity....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13  8:40               ` Chris Wilson
@ 2011-07-13 11:34                 ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2011-07-13 11:34 UTC (permalink / raw)
  To: Chris Wilson; +Cc: KOSAKI Motohiro, keithp, linux-kernel, airlied, dri-devel

On Wed, Jul 13, 2011 at 09:40:31AM +0100, Chris Wilson wrote:
> On Wed, 13 Jul 2011 17:19:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > (2011/07/13 16:41), Chris Wilson wrote:
> > > On Wed, 13 Jul 2011 09:19:24 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > >> (2011/07/12 19:06), Chris Wilson wrote:
> > >>> On Tue, 12 Jul 2011 18:36:50 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > >>>> Hi,
> > >>>>
> > >>>> sorry for the delay.
> > >>>>
> > >>>>> On Wed, 29 Jun 2011 20:53:54 -0700, Keith Packard <keithp@keithp.com> wrote:
> > >>>>>> On Fri, 24 Jun 2011 17:03:22 +0900, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> > >> The matter is not in contention. The problem is happen if the mutex is taken
> > >> by shrink_slab calling thread. i915_gem_inactive_shrink() have no way to shink
> > >> objects. How do you detect such case?
> > > 
> > > In the primary allocator for the backing pages whilst the mutex is held we
> > > do __NORETRY and a manual shrinkage of our buffers before failing. That's
> > > the largest allocator, all the others are tiny and short-lived by
> > > comparison and left to fail.
> > 
> > __NORETRY perhaps might help to avoid false positive oom. But, __NORETRY still makes
> > full page reclaim and may drop a lot of innocent page cache, and then system may
> > become slow down.
> 
> But in this context, that is memory the user has requested to be used with
> the GPU, so the page cache is sacrificed to meet the allocation, if
> possible.
>  
> > Of course, you don't meet such worst case scenario so easy. But you may need to
> > think worst case if you touch memory management code.
> 
> Actually we'd much rather you took us into account when designing the mm.

Heh. Now where have I heard that before? :/

> > If you are thinking the shrinker protocol is too complicated, doc update
> > patch is really welcome.
> 
> What I don't understand is the disconnect between objects to shrink and
> the number of pages released. We may have tens of thousands of single page
> objects that are expensive to free in comparison to a few 10-100MiB
> objects that are just sitting idle. Would it be better to report the
> estimated number of shrinkable pages instead?

Maybe. Then again, maybe not.

The shrinker API is designed for slab caches which have a fixed
object size, not a variable amount of memory per object, so if you
report the number of shrinkable pages, you can make whatever
decision you want as to the object(s) you free to free up that many
pages.

However, that means that if you have 1000 reclaimable pages, and the
dentry cache has 1000 reclaimable dentries, the same shrinker calls
will free 1000 pages from your cache, but maybe none from the dentry
cache due to slab fragmentation. Hence your cache could end up being
blown to pieces by light memory pressure by telling the shrinker how
many shrinkable pages you have cached. In that case, you want to
report a much smaller number so the cache is harder to reclaim under
light memory pressure, or don't reclaim as much as the shrinker is
asked to reclaim.

This is one of the issues I faced when converting the XFS buffer
cache to use an internal LRU and a shrinker to reclaim buffers that
hold one or more pages.  We used to cache the metadata in the page
cache and let the VM reclaim from there, but that was a crap-shoot
because page reclaim kept trashing the working set of metadata pages
and it was simply not fixable. Hence I changed the lifecycle of
buffers to include a priority based LRU for reclaiming buffer
objects and moved away from using the page cache from holding cached
metadata.

I let the shrinker know how many reclaimable buffers there are, but
it has no idea how much memory each buffer pіns. I don't even keep
track of it because from a performance perspective it is irrelevant;
what matters is maintaining a minimial working set of metadata
buffers under memory pressure.

In most cases the buffers hold only one or two pages, but because of
the reclaim reference count it can take up to 7 attempts to free a
buffer before it is finally reclaimed. Hence the buffer cache tends
to hold onto a critical working set quite well under different
levels of memory pressure because buffers more likely to be reused
are much harder to reclaim than those that are likely to be used ony
once.

As a result, the LRU resists aggressive reclaim to maintain the
necessary working set of buffers quite well.  The working set gets
smaller as memory pressure goes up, but the shrinker is not able to
completely trash the cache like the previous page-cache based
version did.  It's a very specific solution to the problem of tuning
a shrinker for good system behaviour, but it's the only way I found
that works....

Oh, and using a mutex to single thread cache reclaim rather than
spinlocks is usually a good idea from a scalability point of view
because your shrinker can be called simultaneously on every CPU.
Spinlocks really, really hurt when that happens, and performance
will plummet when it happens because you burn CPU on locks rather
than reclaimıng objects. Single threaded object reclaim is still the
fastest way to do reclaim if you have global lists and locks.

What I'm trying to say is that how you solve the shrinker balance
problem for you subsystem will be specific to how you need to hold
pages under memory pressure to maintain performance. Sorry I can't
give you a better answer than that, but that's what my experience
with caches and tuning shrinker behaviour indicates.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-13 10:42               ` Dave Chinner
@ 2011-07-14  2:48                 ` KOSAKI Motohiro
  2011-07-14  3:47                   ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: KOSAKI Motohiro @ 2011-07-14  2:48 UTC (permalink / raw)
  To: david; +Cc: chris, keithp, linux-kernel, airlied, dri-devel

>> If you are thinking the shrinker protocol is too complicated, doc update
>> patch is really welcome.
> 
> Slab shrinkers have a nasty, crappy interface and the shrink_slab()
> code is full of bugs.  Rather that telling people to "update the
> documentation" because it's too complex, how about we fix the
> interface and the bugs?

If you can take your time for that, I don't hesitate to help you. ;)



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH] i915: slab shrinker have to return -1 if it cant shrink any objects
  2011-07-14  2:48                 ` KOSAKI Motohiro
@ 2011-07-14  3:47                   ` Dave Chinner
  0 siblings, 0 replies; 15+ messages in thread
From: Dave Chinner @ 2011-07-14  3:47 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: chris, keithp, linux-kernel, airlied, dri-devel

On Thu, Jul 14, 2011 at 11:48:08AM +0900, KOSAKI Motohiro wrote:
> >> If you are thinking the shrinker protocol is too complicated, doc update
> >> patch is really welcome.
> > 
> > Slab shrinkers have a nasty, crappy interface and the shrink_slab()
> > code is full of bugs.  Rather that telling people to "update the
> > documentation" because it's too complex, how about we fix the
> > interface and the bugs?
> 
> If you can take your time for that, I don't hesitate to help you. ;)

Its on my list of things to do if nobody else steps before I get
there ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2011-07-14  3:47 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-06-24  8:03 [PATCH] i915: slab shrinker have to return -1 if it can't shrink any objects KOSAKI Motohiro
2011-06-30  3:53 ` Keith Packard
2011-06-30  3:53   ` Keith Packard
2011-06-30  8:55   ` [PATCH] i915: slab shrinker have to return -1 if it cant " Chris Wilson
2011-06-30  8:55     ` Chris Wilson
2011-07-12  9:36     ` KOSAKI Motohiro
2011-07-12 10:06       ` Chris Wilson
2011-07-13  0:19         ` KOSAKI Motohiro
2011-07-13  7:41           ` Chris Wilson
2011-07-13  8:19             ` KOSAKI Motohiro
2011-07-13  8:40               ` Chris Wilson
2011-07-13 11:34                 ` Dave Chinner
2011-07-13 10:42               ` Dave Chinner
2011-07-14  2:48                 ` KOSAKI Motohiro
2011-07-14  3:47                   ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.