From: Vinayak Menon <vinmenon@codeaurora.org>
To: Michal Hocko <mhocko@suse.cz>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, hannes@cmpxchg.org,
	vdavydov@parallels.com, mgorman@suse.de, minchan@kernel.org,
	Christoph Lameter <cl@gentwo.org>
Subject: Re: [PATCH v2] mm: vmscan: fix the page state calculation in too_many_isolated
Date: Sat, 17 Jan 2015 20:48:18 +0530
Message-ID: <54BA7D3A.40100@codeaurora.org>
In-Reply-To: <20150116154922.GB4650@dhcp22.suse.cz>

On 01/16/2015 09:19 PM, Michal Hocko wrote:
> On Thu 15-01-15 22:54:20, Vinayak Menon wrote:
>> On 01/14/2015 10:20 PM, Michal Hocko wrote:
>>> On Wed 14-01-15 17:06:59, Vinayak Menon wrote:
>>> [...]
>>>> In one such instance, zone_page_state(zone, NR_ISOLATED_FILE)
>>>> had returned 14, zone_page_state(zone, NR_INACTIVE_FILE)
>>>> returned 92, and GFP_IOFS was set, and this resulted
>>>> in too_many_isolated returning true. But one of the CPU's
>>>> pageset vm_stat_diff had NR_ISOLATED_FILE as "-14". So the
>>>> actual isolated count was zero. As there weren't any more
>>>> updates to NR_ISOLATED_FILE and vmstat_update deferred work
>>>> had not been scheduled yet, 7 tasks were spinning in the
>>>> congestion wait loop for around 4 seconds, in the direct
>>>> reclaim path.
>>>
>>> Not syncing for such a long time doesn't sound right. I am not familiar
>>> with the vmstat syncing but sysctl_stat_interval is HZ so it should
>>> happen much more often than every 4 seconds.
>>>
>>
>> Though the interval is HZ, since the vmstat_work is declared as a
>> deferrable work, IIUC the timer trigger can be deferred to the next
>> non-deferrable timer expiry on the CPU which is in idle. This results
>> in the vmstat syncing on an idle CPU delayed by seconds. Maybe in
>> most cases this behavior is fine, except in cases like this.
>
> I am not sure I understand the above because CPU being idle doesn't
> seem important AFAICS. Anyway I have checked the current code which has
> changed quite recently by 7cc36bbddde5 (vmstat: on-demand vmstat workers
> V8). Let's CC Christoph (the thread starts here:
> http://thread.gmane.org/gmane.linux.kernel.mm/127229).
>

I will try to explain the exact observations. All the cases I
encountered had similar symptoms. In one of them, CPU3 alone had not
synced its vm_stat_diff. This CPU had been idle for around 30 secs.
When I looked at the tvec base for this CPU, the timer associated with
vmstat_update had an expiry time earlier than the current jiffies, i.e.
it was already overdue. This timer had its deferrable flag set, and was
tied to the next non-deferrable timer in the list. Since deferrable
timers can't wake up an idle CPU, the vmstat sync for this CPU was
deferred for a long time, i.e. until the expiry of the next
non-deferrable timer. The issue was caught because one of the tasks in
the reclaim path, looping in congestion_wait, had an associated
watchdog, which resulted in a panic after 4 secs. So 4 secs is actually
the watchdog expiry, and the time we can stay blocked in the congestion
loop can be even longer.



-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
