linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Dan Streetman <ddstreet@ieee.org>
Cc: Hugh Dickins <hughd@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>,
	Rik van Riel <riel@redhat.com>, Weijie Yang <weijieut@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Shaohua Li <shli@fusionio.com>
Subject: Re: [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head
Date: Mon, 12 May 2014 12:11:55 +0100	[thread overview]
Message-ID: <20140512111155.GM23991@suse.de> (raw)
In-Reply-To: <1399057350-16300-5-git-send-email-ddstreet@ieee.org>

On Fri, May 02, 2014 at 03:02:30PM -0400, Dan Streetman wrote:
> Originally get_swap_page() started iterating through the singly-linked
> list of swap_info_structs using swap_list.next or highest_priority_index,
> which both were intended to point to the highest priority active swap
> target that was not full.  The first patch in this series changed the
> singly-linked list to a doubly-linked list, and removed the logic to start
> at the highest priority non-full entry; it starts scanning at the highest
> priority entry each time, even if the entry is full.
> 
> Replace the manually ordered swap_list_head with a plist, renamed to
> swap_active_head for clarity.  Add a new plist, swap_avail_head.
> The original swap_active_head plist contains all active swap_info_structs,
> as before, while the new swap_avail_head plist contains only
> swap_info_structs that are active and available, i.e. not full.
> Add a new spinlock, swap_avail_lock, to protect the swap_avail_head list.
> 
> Mel Gorman suggested using plists since they internally handle ordering
> the list entries based on priority, which is exactly what swap was doing
> manually.  All the ordering code is now removed, and swap_info_struct
> entries and simply added to their corresponding plist and automatically
> ordered correctly.
> 
> Using a new plist for available swap_info_structs simplifies and
> optimizes get_swap_page(), which no longer has to iterate over full
> swap_info_structs.  Using a new spinlock for swap_avail_head plist
> allows each swap_info_struct to add or remove themselves from the
> plist when they become full or not-full; previously they could not
> do so because the swap_info_struct->lock is held when they change
> from full<->not-full, and the swap_lock protecting the main
> swap_active_head must be ordered before any swap_info_struct->lock.
> 
> Signed-off-by: Dan Streetman <ddstreet@ieee.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Cc: Shaohua Li <shli@fusionio.com>
> 
> ---
> 
> Mel, I tried moving the ordering and rotating code into common list functions
> and I also tried plists, and you were right, using plists is much simpler and
> more maintainable.  The only required update to plist is the plist_rotate()
> function, which is even simpler to use in get_swap_page() than the
> list_rotate_left() function.
> 
> After looking more closely at plists, I don't see how they would reduce
> performance, so I don't think there is any concern there, although Shaohua if
> you have time it might be nice to check this updated patch set's performance.
> I will note that if CONFIG_DEBUG_PI_LIST is set, there's quite a lot of list
> checking going on for each list modification including rotate; that config is
> set if "RT Mutex debugging, deadlock detection" is set, so I assume in that
> case overall system performance is expected to be less than optimal.
> 
> Also, I might have over-commented in this patch; if so I can remove/reduce
> some of it. :)
> 
> Changelog since v1 https://lkml.org/lkml/2014/4/12/73
>   -use plists instead of regular lists
>   -update/add comments
> 
>  include/linux/swap.h     |   3 +-
>  include/linux/swapfile.h |   2 +-
>  mm/frontswap.c           |   6 +-
>  mm/swapfile.c            | 142 +++++++++++++++++++++++++++++------------------
>  4 files changed, 94 insertions(+), 59 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 8bb85d6..9155bcd 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -214,7 +214,8 @@ struct percpu_cluster {
>  struct swap_info_struct {
>  	unsigned long	flags;		/* SWP_USED etc: see above */
>  	signed short	prio;		/* swap priority of this type */
> -	struct list_head list;		/* entry in swap list */
> +	struct plist_node list;		/* entry in swap_active_head */
> +	struct plist_node avail_list;	/* entry in swap_avail_head */
>  	signed char	type;		/* strange name for an index */
>  	unsigned int	max;		/* extent of the swap_map */
>  	unsigned char *swap_map;	/* vmalloc'ed array of usage counts */
> diff --git a/include/linux/swapfile.h b/include/linux/swapfile.h
> index 2eab382..388293a 100644
> --- a/include/linux/swapfile.h
> +++ b/include/linux/swapfile.h
> @@ -6,7 +6,7 @@
>   * want to expose them to the dozens of source files that include swap.h
>   */
>  extern spinlock_t swap_lock;
> -extern struct list_head swap_list_head;
> +extern struct plist_head swap_active_head;
>  extern struct swap_info_struct *swap_info[];
>  extern int try_to_unuse(unsigned int, bool, unsigned long);
>  
> diff --git a/mm/frontswap.c b/mm/frontswap.c
> index fae1160..c30eec5 100644
> --- a/mm/frontswap.c
> +++ b/mm/frontswap.c
> @@ -331,7 +331,7 @@ static unsigned long __frontswap_curr_pages(void)
>  	struct swap_info_struct *si = NULL;
>  
>  	assert_spin_locked(&swap_lock);
> -	list_for_each_entry(si, &swap_list_head, list)
> +	plist_for_each_entry(si, &swap_active_head, list)
>  		totalpages += atomic_read(&si->frontswap_pages);
>  	return totalpages;
>  }
> @@ -346,7 +346,7 @@ static int __frontswap_unuse_pages(unsigned long total, unsigned long *unused,
>  	unsigned long pages = 0, pages_to_unuse = 0;
>  
>  	assert_spin_locked(&swap_lock);
> -	list_for_each_entry(si, &swap_list_head, list) {
> +	plist_for_each_entry(si, &swap_active_head, list) {
>  		si_frontswap_pages = atomic_read(&si->frontswap_pages);
>  		if (total_pages_to_unuse < si_frontswap_pages) {
>  			pages = pages_to_unuse = total_pages_to_unuse;
> @@ -408,7 +408,7 @@ void frontswap_shrink(unsigned long target_pages)
>  	/*
>  	 * we don't want to hold swap_lock while doing a very
>  	 * lengthy try_to_unuse, but swap_list may change
> -	 * so restart scan from swap_list_head each time
> +	 * so restart scan from swap_active_head each time
>  	 */
>  	spin_lock(&swap_lock);
>  	ret = __frontswap_shrink(target_pages, &pages_to_unuse, &type);
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 6c95a8c..ec230e3 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -61,7 +61,22 @@ static const char Unused_offset[] = "Unused swap offset entry ";
>   * all active swap_info_structs
>   * protected with swap_lock, and ordered by priority.
>   */
> -LIST_HEAD(swap_list_head);
> +PLIST_HEAD(swap_active_head);
> +
> +/*
> + * all available (active, not full) swap_info_structs
> + * protected with swap_avail_lock, ordered by priority.
> + * This is used by get_swap_page() instead of swap_active_head
> + * because swap_active_head includes all swap_info_structs,
> + * but get_swap_page() doesn't need to look at full ones.
> + * This uses its own lock instead of swap_lock because when a
> + * swap_info_struct changes between not-full/full, it needs to
> + * add/remove itself to/from this list, but the swap_info_struct->lock
> + * is held and the locking order requires swap_lock to be taken
> + * before any swap_info_struct->lock.
> + */
> +static PLIST_HEAD(swap_avail_head);
> +static DEFINE_SPINLOCK(swap_avail_lock);
>  
>  struct swap_info_struct *swap_info[MAX_SWAPFILES];
>  
> @@ -594,6 +609,9 @@ checks:
>  	if (si->inuse_pages == si->pages) {
>  		si->lowest_bit = si->max;
>  		si->highest_bit = 0;
> +		spin_lock(&swap_avail_lock);
> +		plist_del(&si->avail_list, &swap_avail_head);
> +		spin_unlock(&swap_avail_lock);
>  	}
>  	si->swap_map[offset] = usage;
>  	inc_cluster_info_page(si, si->cluster_info, offset);
> @@ -645,57 +663,60 @@ swp_entry_t get_swap_page(void)
>  {
>  	struct swap_info_struct *si, *next;
>  	pgoff_t offset;
> -	struct list_head *tmp;
>  
> -	spin_lock(&swap_lock);
>  	if (atomic_long_read(&nr_swap_pages) <= 0)
>  		goto noswap;
>  	atomic_long_dec(&nr_swap_pages);
>  
> -	list_for_each(tmp, &swap_list_head) {
> -		si = list_entry(tmp, typeof(*si), list);
> +	spin_lock(&swap_avail_lock);
> +start_over:
> +	plist_for_each_entry_safe(si, next, &swap_avail_head, avail_list) {
> +		/* rotate si to tail of same-priority siblings */
> +		plist_rotate(&si->avail_list, &swap_avail_head);
> +		spin_unlock(&swap_avail_lock);
>  		spin_lock(&si->lock);
>  		if (!si->highest_bit || !(si->flags & SWP_WRITEOK)) {
> +			spin_lock(&swap_avail_lock);
> +			if (plist_node_empty(&si->avail_list)) {
> +				spin_unlock(&si->lock);
> +				goto nextsi;
> +			}

It's a corner case but rather than dropping the swap_avail_lock early and
retaking it to remove an entry from avail_list, you could just drop it
after this check but before the scan_swap_map. It's not a big deal so
whether you change this or not

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

  parent reply	other threads:[~2014-05-12 11:12 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-02-13 10:42 [PATCH] mm: swap: Use swapfiles in priority order Mel Gorman
2014-02-13 15:58 ` Weijie Yang
2014-02-14 10:17   ` Mel Gorman
2014-02-14 13:33     ` Weijie Yang
     [not found]   ` <loom.20140214T135753-812@post.gmane.org>
     [not found]     ` <CABdxLJHS5kw0rpD=+77iQtc6PMeRXoWnh-nh5VzjjfGHJ5wLGQ@mail.gmail.com>
2014-02-24  8:28       ` Hugh Dickins
2014-04-12 21:00         ` [PATCH 0/2] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-04-12 21:00           ` [PATCH 1/2] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-04-23 10:34             ` Mel Gorman
2014-04-24  0:17               ` Shaohua Li
2014-04-24  8:30                 ` Mel Gorman
2014-04-24 18:48               ` Dan Streetman
2014-04-25  4:15                 ` Weijie Yang
2014-05-02 20:00                   ` Dan Streetman
2014-05-04  9:39                     ` Bob Liu
2014-05-04 20:16                       ` Dan Streetman
2014-04-25  8:38                 ` Mel Gorman
2014-04-12 21:00           ` [PATCH 2/2] swap: use separate priority list for available swap_infos Dan Streetman
2014-04-23 13:14             ` Mel Gorman
2014-04-24 17:52               ` Dan Streetman
2014-04-25  8:49                 ` Mel Gorman
2014-05-02 19:02           ` [PATCHv2 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-02 19:02             ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-02 19:02             ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 10:35               ` Mel Gorman
2014-05-02 19:02             ` [PATCH 3/4] plist: add plist_rotate Dan Streetman
2014-05-06  2:18               ` Steven Rostedt
2014-05-06 20:12                 ` Dan Streetman
2014-05-06 20:39                   ` Steven Rostedt
2014-05-06 21:47                     ` Dan Streetman
2014-05-06 22:43                       ` Steven Rostedt
2014-05-02 19:02             ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-05 15:51               ` Dan Streetman
2014-05-05 19:13               ` Steven Rostedt
2014-05-05 19:38                 ` Peter Zijlstra
2014-05-09 20:42                 ` [PATCH] plist: make CONFIG_DEBUG_PI_LIST selectable Dan Streetman
2014-05-09 21:17                   ` Steven Rostedt
2014-05-12 11:11               ` Mel Gorman [this message]
2014-05-12 13:00                 ` [PATCH 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-12 16:38             ` [PATCHv3 0/4] swap: simplify/fix swap_list handling and iteration Dan Streetman
2014-05-12 16:38               ` [PATCHv2 1/4] swap: change swap_info singly-linked list to list_head Dan Streetman
2014-05-12 16:38               ` [PATCH 2/4] plist: add helper functions Dan Streetman
2014-05-12 16:38               ` [PATCHv2 3/4] plist: add plist_requeue Dan Streetman
2014-05-13 10:33                 ` Mel Gorman
2014-05-12 16:38               ` [PATCHv2 4/4] swap: change swap_list_head to plist, add swap_avail_head Dan Streetman
2014-05-13 10:34                 ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140512111155.GM23991@suse.de \
    --to=mgorman@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=ddstreet@ieee.org \
    --cc=ehrhardt@linux.vnet.ibm.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=riel@redhat.com \
    --cc=shli@fusionio.com \
    --cc=weijieut@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).