From: Mel Gorman <mgorman@suse.de>
To: Shaohua Li <shli@kernel.org>
Cc: Dan Streetman <ddstreet@ieee.org>,
Hugh Dickins <hughd@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@suse.cz>,
Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
Shaohua Li <shli@fusionio.com>, Weijie Yang <weijieut@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] swap: change swap_info singly-linked list to list_head
Date: Thu, 24 Apr 2014 09:30:18 +0100 [thread overview]
Message-ID: <20140424083018.GO23991@suse.de> (raw)
In-Reply-To: <20140424001705.GA8066@kernel.org>
On Thu, Apr 24, 2014 at 08:17:05AM +0800, Shaohua Li wrote:
> On Wed, Apr 23, 2014 at 11:34:00AM +0100, Mel Gorman wrote:
> > > @@ -366,7 +361,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
> > > }
> > > vm_unacct_memory(pages);
> > > *unused = pages_to_unuse;
> > > - *swapid = type;
> > > + *swapid = si->type;
> > > ret = 0;
> > > break;
> > > }
> > > @@ -413,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
> > > /*
> > > * we don't want to hold swap_lock while doing a very
> > > * lengthy try_to_unuse, but swap_list may change
> > > - * so restart scan from swap_list.head each time
> > > + * so restart scan from swap_list_head each time
> > > */
> > > spin_lock(&swap_lock);
> > > ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> > > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > > index 4a7f7e6..b958645 100644
> > > --- a/mm/swapfile.c
> > > +++ b/mm/swapfile.c
> > > @@ -51,14 +51,14 @@ atomic_long_t nr_swap_pages;
> > > /* protected with swap_lock. reading in vm_swap_full() doesn't need lock */
> > > long total_swap_pages;
> > > static int least_priority;
> > > -static atomic_t highest_priority_index = ATOMIC_INIT(-1);
> > >
> > > static const char Bad_file[] = "Bad swap file entry ";
> > > static const char Unused_file[] = "Unused swap file entry ";
> > > static const char Bad_offset[] = "Bad swap offset entry ";
> > > static const char Unused_offset[] = "Unused swap offset entry ";
> > >
> > > -struct swap_list_t swap_list = {-1, -1};
> > > +/* all active swap_info */
> > > +LIST_HEAD(swap_list_head);
> > >
> > > struct swap_info_struct *swap_info[MAX_SWAPFILES];
> > >
> > > @@ -640,66 +640,50 @@ no_page:
> > >
> > > swp_entry_t get_swap_page(void)
> > > {
> > > - struct swap_info_struct *si;
> > > + struct swap_info_struct *si, *next;
> > > pgoff_t offset;
> > > - int type, next;
> > > - int wrapped = 0;
> > > - int hp_index;
> > > + struct list_head *tmp;
> > >
> > > spin_lock(&swap_lock);
> > > if (atomic_long_read(&nr_swap_pages) <= 0)
> > > goto noswap;
> > > atomic_long_dec(&nr_swap_pages);
> > >
> > > - for (type = swap_list.next; type >= 0 && wrapped < 2; type = next) {
> > > - hp_index = atomic_xchg(&highest_priority_index, -1);
> > > - /*
> > > - * highest_priority_index records current highest priority swap
> > > - * type which just frees swap entries. If its priority is
> > > - * higher than that of swap_list.next swap type, we use it. It
> > > - * isn't protected by swap_lock, so it can be an invalid value
> > > - * if the corresponding swap type is swapoff. We double check
> > > - * the flags here. It's even possible the swap type is swapoff
> > > - * and swapon again and its priority is changed. In such rare
> > > - * case, low prority swap type might be used, but eventually
> > > - * high priority swap will be used after several rounds of
> > > - * swap.
> > > - */
> > > - if (hp_index != -1 && hp_index != type &&
> > > - swap_info[type]->prio < swap_info[hp_index]->prio &&
> > > - (swap_info[hp_index]->flags & SWP_WRITEOK)) {
> > > - type = hp_index;
> > > - swap_list.next = type;
> > > - }
> > > -
> > > - si = swap_info[type];
> > > - next = si->next;
> > > - if (next < 0 ||
> > > - (!wrapped && si->prio != swap_info[next]->prio)) {
> > > - next = swap_list.head;
> > > - wrapped++;
> > > - }
> > > -
> > > + list_for_each(tmp, &swap_list_head) {
> > > + si = list_entry(tmp, typeof(*si), list);
> > > spin_lock(&si->lock);
> > > - if (!si->highest_bit) {
> > > - spin_unlock(&si->lock);
> > > - continue;
> > > - }
> > > - if (!(si->flags & SWP_WRITEOK)) {
> > > + if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> > > spin_unlock(&si->lock);
> > > continue;
> > > }
> > >
> > > - swap_list.next = next;
> > > + /*
> > > + * rotate the current swap_info that we're going to use
> > > + * to after any other swap_info that have the same prio,
> > > + * so that all equal-priority swap_info get used equally
> > > + */
> > > + next = si;
> > > + list_for_each_entry_continue(next, &swap_list_head, list) {
> > > + if (si->prio != next->prio)
> > > + break;
> > > + list_rotate_left(&si->list);
> > > + next = si;
> > > + }
> > >
> >
> > The list manipulations will be a lot of cache writes as the list is shuffled
> > around. On slow storage I do not think this will be noticable but it may
> > be noticable on faster swap devices that are SSD based. I've added Shaohua
> > Li to the cc as he has been concerned with the performance of swap in the
> > past. Shaohua, can you run this patchset through any of your test cases
> > with the addition that multiple swap files are used to see if the cache
> > writes are noticable? You'll need multiple swap files, some of which are
> > at equal priority so the list shuffling logic is triggered.
>
> get_swap_page isn't hot so far (and we hold the swap_lock, which isn't
> contended), guess it's because other problems hide it, for example tlb flush
> overhead.
>
The old method was not free either but I wanted to be sure you were
aware of the series just in case. Thanks.
--
Mel Gorman
SUSE Labs
next prev parent reply other threads:[~2014-04-24 8:30 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-14 10:17 ` Mel Gorman
2014-02-14 13:33 ` Weijie Yang
[not found] ` <loom.20140214T135753-812@post.gmane.org>
[not found] ` <CABdxLJHS5kw0rpD=+77iQtc6PMeRXoWnh-nh5VzjjfGHJ5wLGQ@mail.gmail.com>
2014-02-24 8:28 ` Hugh Dickins
2014-04-12 21:00 ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00 ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-23 10:34 ` Mel Gorman
2014-04-24 0:17 ` Shaohua Li
2014-04-24 8:30 ` Mel Gorman [this message]
2014-04-24 18:48 ` Dan Streetman
2014-04-25 4:15 ` Weijie Yang
2014-05-02 20:00 ` Dan Streetman
2014-05-04 9:39 ` Bob Liu
2014-05-04 20:16 ` Dan Streetman
2014-04-25 8:38 ` Mel Gorman
2014-04-12 21:00 ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-23 13:14 ` Mel Gorman
2014-04-24 17:52 ` Dan Streetman
2014-04-25 8:49 ` Mel Gorman
2014-05-02 19:02 ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 10:35 ` Mel Gorman
2014-05-02 19:02 ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-06 2:18 ` Steven Rostedt
2014-05-06 20:12 ` Dan Streetman
2014-05-06 20:39 ` Steven Rostedt
2014-05-06 21:47 ` Dan Streetman
2014-05-06 22:43 ` Steven Rostedt
2014-05-02 19:02 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-05 15:51 ` Dan Streetman
2014-05-05 19:13 ` Steven Rostedt
2014-05-05 19:38 ` Peter Zijlstra
2014-05-09 20:42 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 21:17 ` Steven Rostedt
2014-05-12 11:11 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Mel Gorman
2014-05-12 13:00 ` Dan Streetman
2014-05-12 16:38 ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38 ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38 ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38 ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-13 10:33 ` Mel Gorman
2014-05-12 16:38 ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-13 10:34 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140424083018.GO23991@suse.de \
--to=mgorman@suse.de \
--cc=akpm@linux-foundation.org \
--cc=ddstreet@ieee.org \
--cc=ehrhardt@linux.vnet.ibm.com \
--cc=hughd@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.cz \
--cc=shli@fusionio.com \
--cc=shli@kernel.org \
--cc=weijieut@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).