From: Sahitya Tummala <stummala@codeaurora.org>
To: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Alexander Polakov <apolyakov@beget.ru>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	viro@zeniv.linux.org.uk, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock
Date: Thu, 22 Jun 2017 22:01:39 +0530
Message-ID: <8d82c32d-6cbb-c39d-2f0e-0af23925b3c1@codeaurora.org>
In-Reply-To: <20170621163134.GA3273@esperanza>



On 6/21/2017 10:01 PM, Vladimir Davydov wrote:
>
>> index cddf397..c8ca150 100644
>> --- a/fs/dcache.c
>> +++ b/fs/dcache.c
>> @@ -1133,10 +1133,11 @@ void shrink_dcache_sb(struct super_block *sb)
>>   		LIST_HEAD(dispose);
>>   
>>   		freed = list_lru_walk(&sb->s_dentry_lru,
>> -			dentry_lru_isolate_shrink, &dispose, UINT_MAX);
>> +			dentry_lru_isolate_shrink, &dispose, 1024);
>>   
>>   		this_cpu_sub(nr_dentry_unused, freed);
>>   		shrink_dentry_list(&dispose);
>> +		cond_resched();
>>   	} while (freed > 0);
> In an extreme case, a single invocation of list_lru_walk() can skip all
> 1024 dentries, in which case 'freed' will be 0 forcing us to break the
> loop prematurely. I think we should loop until there's at least one
> dentry left on the LRU, i.e.
>
> 	while (list_lru_count(&sb->s_dentry_lru) > 0)
>
> However, even that wouldn't be quite correct, because list_lru_count()
> iterates over all memory cgroups to sum list_lru_one->nr_items, which
> can race with memcg offlining code migrating dentries off a dead cgroup
> (see memcg_drain_all_list_lrus()). So it looks like to make this check
> race-free, we need to account the number of entries on the LRU not only
> per memcg, but also per node, i.e. add list_lru_node->nr_items.
> Fortunately, list_lru entries can't be migrated between NUMA nodes.
It looks like list_lru_count() already iterates per node before iterating
over all memory cgroups, as below -

unsigned long list_lru_count_node(struct list_lru *lru, int nid)
{
         long count = 0;
         int memcg_idx;

         count += __list_lru_count_one(lru, nid, -1);
         if (list_lru_memcg_aware(lru)) {
                 for_each_memcg_cache_index(memcg_idx)
                         count += __list_lru_count_one(lru, nid, memcg_idx);
         }
         return count;
}

The first call to __list_lru_count_one() iterates over all the items on
that node, i.e., nlru->lru.nr_items.
Is my understanding correct? If not, could you please clarify how to
get the LRU items per node?
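
In case my reading is wrong, here is a rough, untested sketch of the
per-node accounting I understand you are suggesting (the nr_items field,
where it is updated, and the simplified struct layout are my assumptions,
not an actual patch):

/* untested sketch: maintain one counter per node in addition to the
 * per-memcg counters, updated under nlru->lock */
struct list_lru_node {
        spinlock_t              lock;
        /* global list, used for the root cgroup in cgroup aware lrus */
        struct list_lru_one     lru;
        /* for cgroup aware lrus points to per cgroup lists, otherwise NULL */
        struct list_lru_memcg   *memcg_lrus;
        long                    nr_items;       /* total items on this node */
};

/* list_lru_add()/list_lru_del() already hold nlru->lock, so they could
 * simply do nlru->nr_items++ / nlru->nr_items-- next to the existing
 * l->nr_items update. */

unsigned long list_lru_count_node(struct list_lru *lru, int nid)
{
        struct list_lru_node *nlru = &lru->node[nid];

        /* a single per-node counter stays correct even while
         * memcg_drain_all_list_lrus() moves entries between the
         * per-memcg lists of this node */
        return nlru->nr_items;
}

With something like that, shrink_dcache_sb() could loop on
list_lru_count() without racing with memcg offlining.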

-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.

Thread overview: 19 messages

2017-06-12  0:47 [PATCH] mm/list_lru.c: use cond_resched_lock() for nlru->lock Sahitya Tummala
2017-06-12 13:11 ` Jan Kara
2017-06-15 21:05 ` Andrew Morton
2017-06-16 14:44   ` Sahitya Tummala
2017-06-17 11:14   ` Vladimir Davydov
2017-06-20  2:52     ` Sahitya Tummala
2017-06-21  6:39       ` [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala
2017-06-21 16:31         ` Vladimir Davydov
2017-06-22 16:31           ` Sahitya Tummala [this message]
2017-06-22 17:49             ` Vladimir Davydov
2017-06-28  6:07               ` [PATCH v3 1/2] mm/list_lru.c: fix list_lru_count_node() to be race free Sahitya Tummala
2017-06-28  6:07                 ` [PATCH v3 2/2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala
2017-06-28 17:18                 ` [PATCH v3 1/2] mm/list_lru.c: fix list_lru_count_node() to be race free Vladimir Davydov
2017-06-29  3:39                   ` [PATCH v4 " Sahitya Tummala
2017-07-01 16:28                     ` Vladimir Davydov
2017-06-29  3:39                   ` [PATCH v4 2/2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala
2017-06-29 22:48                     ` Andrew Morton
2017-06-30  3:16                       ` Sahitya Tummala
2017-07-01 16:28                     ` Vladimir Davydov