On Thu, Nov 18, 2010 at 12:59 AM, Nick Piggin <npiggin@kernel.dk> wrote:

> On Wed, Nov 17, 2010 at 08:34:51PM -0800, Ying Han wrote:
> > Pass the reclaim priority down to shrink_slab(), which passes it on to
> > shrink_icache_memory() for the inode cache. This helps when shrink_slab()
> > is too aggressive: it removes the inode along with all the pages
> > associated with it, which hurts most when a single inode has lots of
> > pages pointing to it. The application takes a performance hit when
> > that happens.
> >
> > The problem was observed on a workload we run that has a small number
> > of large files. Page reclaim won't blow away an inode that is pinned by
> > a dentry, which in turn is pinned by an open file descriptor. But if the
> > application is opening and closing the fds, it has a chance to trigger
> > the issue.
> >
> > I have a script which reproduces the issue. The test creates 1500 empty
> > files and one big file in a cgroup, then starts adding memory pressure
> > in the cgroup. Both before and after the patch we see the inode slab
> > drop in slabinfo, but the big file's clean pages are preserved only
> > after the change.
>
> I was going to do this as a flag when nearing OOM. Is there a reason
> to have it priority based? That seems a little arbitrary to me...
>

We pass the priority down from page reclaim as a hint to the shrinker. Unless
the page reclaim path is really having a hard time freeing pages, which
brings the priority down to zero, we probably don't want to throw out tons
of page cache pages in order to free a single inode. So the priority here is
really a hint of how badly we want to shrink the inode, no matter what.
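
To make the idea concrete, here is a rough user-space sketch of the policy
described above. It is not the patch itself: struct demo_inode,
demo_can_evict() and demo_shrink_icache() are made-up names for illustration,
and the only thing borrowed from the kernel is the convention that reclaim
priority starts at DEF_PRIORITY (12) and drops towards 0 as reclaim gets more
desperate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an unused inode; not the kernel's struct inode. */
struct demo_inode {
	unsigned long nr_cached_pages;	/* page cache pages attached to it */
	bool evicted;
};

/* Same value as the kernel's DEF_PRIORITY: reclaim counts down from here. */
#define DEMO_DEF_PRIORITY 12

/*
 * Policy sketch: an inode with no page cache is always fair game. An inode
 * that still holds page cache is only evicted once reclaim has reached
 * priority 0, i.e. it is really struggling to free pages.
 */
static bool demo_can_evict(const struct demo_inode *inode, int priority)
{
	assert(priority >= 0 && priority <= DEMO_DEF_PRIORITY);

	if (inode->nr_cached_pages == 0)
		return true;
	return priority == 0;
}

/* Walk the inodes, evict what the policy allows, return how many were freed. */
static size_t demo_shrink_icache(struct demo_inode *inodes, size_t n,
				 int priority)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (inodes[i].evicted || !demo_can_evict(&inodes[i], priority))
			continue;
		inodes[i].evicted = true;
		inodes[i].nr_cached_pages = 0;	/* pages go away with the inode */
		freed++;
	}
	return freed;
}
```

With two empty inodes and one inode holding 1000 pages, a pass at
DEMO_DEF_PRIORITY frees only the two empty ones. The big file's pages go
only if a later pass runs at priority 0.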

So what would the flag be set based on? How do we determine the nearing-OOM
condition in the shrinker?

--Ying


> FWIW, we can just add this to the new shrinker API, and convert over
> the users who care about it, so it doesn't have to be done in a big
> patch.
>