From: Ed Tomlinson <tomlins@cam.org>
To: linux-kernel@vger.kernel.org
Subject: Re: inode highmem imbalance fix [Re: Bug with shared memory.]
Date: Fri, 24 May 2002 07:47:19 -0400 [thread overview]
Message-ID: <20020524114720.1F41B9E50@oscar.casa.dyndns.org> (raw)
In-Reply-To: <OF6D316E56.12B1A4B0-ONC1256BB9.004B5DB0@de.ibm.com> <3CE16683.29A888F8@zip.com.au> <20020520043040.GA21806@dualathlon.random> <20020524073341.GJ21164@dualathlon.random>
I Wonder if the changes I posted putting slab pages in the lru
would help? For now the patch is against rmap (pre8-ac5). I
would be very interested in hearing about results testing it.
See subject:
Re: [RFC][PATCH] using page aging to shrink caches (pre8-ac5)
Comments?
Ed Tomlinson
Andrea Arcangeli wrote:
> On Mon, May 20, 2002 at 06:30:40AM +0200, Andrea Arcangeli wrote:
>> As next thing I'll go ahead on the inode/highmem imbalance repored by
>> Alexey in the weekend. Then the only pending thing before next -aa is
>
> Here it is, you should apply it together with vm-35 that you need too
> for the bh/highmem balance (or on top of 2.4.19pre8aa3). I tested it
> slightly on uml and it didn't broke so far, so be careful because it's not
> very well tested yet. On the lines of what Alexey suggested originally,
> if goal isn't reached, in a second pass we shrink the cache too, but
> only if the cache is the only reason for the "pinning" beahiour of the
> inode. If for example there are dirty blocks of metadata or of data
> belonging to the inode we wakeup_bdflush instead and we never shrink the
> cache in such case. If the inode itself is dirty as well we let the two
> passes fail so we will schedule the work for keventd. This logic should
> ensure we never fall into shrinking the cache for no good reason and
> that we free the cache only for the inodes that we actually go ahead and
> free. (basically only dirty pages set with SetPageDirty aren't trapped
> by the logic before calling the invalidate, like ramfs, but that's
> expected of course, those pages cannot be damaged by the non destructive
> invalidate anyways)
>
> Comments?
>
> --- 2.4.19pre8aa4/fs/inode.c.~1~ Fri May 24 03:17:10 2002
> +++ 2.4.19pre8aa4/fs/inode.c Fri May 24 05:03:54 2002
> @@ -672,35 +672,87 @@ void prune_icache(int goal)
> {
> LIST_HEAD(list);
> struct list_head *entry, *freeable = &list;
> - int count;
> + int count, pass;
> struct inode * inode;
>
> - spin_lock(&inode_lock);
> + count = pass = 0;
>
> - count = 0;
> - entry = inode_unused.prev;
> - while (entry != &inode_unused)
> - {
> - struct list_head *tmp = entry;
> + spin_lock(&inode_lock);
> + while (goal && pass++ < 2) {
> + entry = inode_unused.prev;
> + while (entry != &inode_unused)
> + {
> + struct list_head *tmp = entry;
>
> - entry = entry->prev;
> - inode = INODE(tmp);
> - if (inode->i_state & (I_FREEING|I_CLEAR|I_LOCK))
> - continue;
> - if (!CAN_UNUSE(inode))
> - continue;
> - if (atomic_read(&inode->i_count))
> - continue;
> - list_del(tmp);
> - list_del(&inode->i_hash);
> - INIT_LIST_HEAD(&inode->i_hash);
> - list_add(tmp, freeable);
> - inode->i_state |= I_FREEING;
> - count++;
> - if (!--goal)
> - break;
> + entry = entry->prev;
> + inode = INODE(tmp);
> + if (inode->i_state & (I_FREEING|I_CLEAR|I_LOCK))
> + continue;
> + if (atomic_read(&inode->i_count))
> + continue;
> + if (pass == 2 && !inode->i_state && !CAN_UNUSE(inode)) {
> + if (inode_has_buffers(inode))
> + /*
> + * If the inode has dirty buffers
> + * pending, start flushing out bdflush.ndirty
> + * worth of data even if there's no dirty-memory
> + * pressure. Do nothing else in this
> + * case, until all dirty buffers are gone
> + * we can do nothing about the inode other than
> + * to keep flushing dirty stuff. We could also
> + * flush only the dirty buffers in the inode
> + * but there's no API to do it asynchronously
> + * and this simpler approch to deal with the
> + * dirty payload shouldn't make much difference
> + * in practice. Also keep in mind if somebody
> + * keeps overwriting data in a flood we'd
> + * never manage to drop the inode anyways,
> + * and we really shouldn't do that because
> + * it's an heavily used one.
> + */
> + wakeup_bdflush();
> + else if (inode->i_data.nrpages)
> + /*
> + * If we're here it means the only reason
> + * we cannot drop the inode is that its
> + * due its pagecache so go ahead and trim it
> + * hard. If it doesn't go away it means
> + * they're dirty or dirty/pinned pages ala
> + * ramfs.
> + *
> + * invalidate_inode_pages() is a non
> + * blocking operation but we introduce
> + * a dependency order between the
> + * inode_lock and the pagemap_lru_lock,
> + * the inode_lock must always be taken
> + * first from now on.
> + */
> + invalidate_inode_pages(inode);
> + }
> + if (!CAN_UNUSE(inode))
> + continue;
> + list_del(tmp);
> + list_del(&inode->i_hash);
> + INIT_LIST_HEAD(&inode->i_hash);
> + list_add(tmp, freeable);
> + inode->i_state |= I_FREEING;
> + count++;
> + if (!--goal)
> + break;
> + }
> }
> inodes_stat.nr_unused -= count;
> +
> + /*
> + * the unused list is hardly an LRU so it makes
> + * more sense to rotate it so we don't bang
> + * always on the same inodes in case they're
> + * unfreeable for whatever reason.
> + */
> + if (entry != &inode_unused) {
> + list_del(&inode_unused);
> + list_add(&inode_unused, entry);
> + }
> spin_unlock(&inode_lock);
>
> dispose_list(freeable);
>
> Andrea
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
next prev parent reply other threads:[~2002-05-24 11:47 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-05-14 15:13 Bug with shared memory Martin Schwidefsky
2002-05-14 19:33 ` Andrew Morton
2002-05-15 22:42 ` Mike Kravetz
2002-05-15 23:07 ` Andrew Morton
2002-05-17 17:53 ` Bill Davidsen
2002-05-17 20:07 ` Mike Kravetz
2002-05-17 20:29 ` Anton Blanchard
2002-05-20 4:30 ` Andrea Arcangeli
2002-05-20 5:21 ` Andrew Morton
2002-05-20 11:34 ` Andrey Savochkin
2002-05-20 14:15 ` Andrea Arcangeli
2002-05-20 19:24 ` Rik van Riel
2002-05-20 23:46 ` Andrea Arcangeli
2002-05-21 0:14 ` Martin J. Bligh
2002-05-21 1:40 ` Andrea Arcangeli
2002-05-20 16:22 ` Martin J. Bligh
2002-05-20 19:38 ` Rik van Riel
2002-05-20 20:06 ` William Lee Irwin III
2002-05-20 16:13 ` Martin J. Bligh
2002-05-20 16:37 ` Andrea Arcangeli
2002-05-20 17:23 ` Martin J. Bligh
2002-05-20 17:32 ` William Lee Irwin III
2002-05-24 7:33 ` inode highmem imbalance fix [Re: Bug with shared memory.] Andrea Arcangeli
2002-05-24 7:51 ` William Lee Irwin III
2002-05-24 8:04 ` Andrew Morton
2002-05-24 15:20 ` Andrea Arcangeli
2002-05-24 11:47 ` Ed Tomlinson [this message]
2002-05-30 11:25 ` Denis Lunev
2002-05-30 17:59 ` Andrea Arcangeli
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20020524114720.1F41B9E50@oscar.casa.dyndns.org \
--to=tomlins@cam.org \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).