From: Ralf-Peter Rohbeck <Ralf-Peter.Rohbeck@quantum.com>
To: Michal Hocko <mhocko@suse.cz>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Vlastimil Babka <vbabka@suse.cz>
Subject: Re: OOM killer changes
Date: Sun, 14 Aug 2016 21:48:08 -0700
Message-ID: <6cb37d4a-d2dd-6c2f-a65d-51474103bf86@Quantum.com>
In-Reply-To: <ccad54a2-be1e-44cf-b9c8-d6b34af4901d@quantum.com>

On 02.08.2016 12:25, Ralf-Peter Rohbeck wrote:
> I can do that but it'll be later this week.
>
> Ralf-Peter
> On 08/02/2016 12:10 AM, Michal Hocko wrote:
>> On Mon 01-08-16 14:27:51, Ralf-Peter Rohbeck wrote:
>>> On 01.08.2016 14:14, Ralf-Peter Rohbeck wrote:
>>>> On 01.08.2016 13:26, Michal Hocko wrote:
>>>>>> sdc, sdd and sde each at max speed, with a little bit of garden
>>>>>> variety IO on sda and sdb.
>>>>> So do I get it right that the majority of the IO is to those slower
>>>>> USB disks?  If yes then does lowering the dirty_bytes to something
>>>>> smaller help?
>>>> Yes, the vast majority.
>>>>
>>>> I set dirty_bytes to 128MiB and started a fairly IO and memory
>>>> intensive process and the OOM killer kicked in within a few seconds.
>>>>
>>>> Same with 16MiB dirty_bytes and 1MiB.
>>>>
>>>> Some additional IO load from my fast subsystem is enough:
>>>>
>>>> At 1MiB dirty_bytes,
>>>>
>>>> find /btrfs0/ -type f -exec md5sum {} \;
>>>>
>>>> was enough (where /btrfs0 is on a LVM2 LV and the PV is on sda.) It
>>>> read a few dozen files (random stuff with very mixed file sizes, none
>>>> very big) until the OOM killer kicked in.
>>>>
>>>> I'll try 4.6.
>>> With Debian 4.6.0.1 (4.6.4-1) it works: Writing to 3 USB drives and
>>> running each of the 3 tests that triggered the OOM killer in parallel,
>>> with default dirty settings.
>> Thanks for retesting! Now that it seems you are able to reproduce this,
>> could you do some experiments, please? First of all it would be great to
>> find out why we do not retry the compaction and whether it could make
>> some progress. The patch below will tell us the first part. Tracepoints
>> can tell us the other part. Vlastimil, could you recommend some which
>> would give us some hints without generating way too much output?
>> ---
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 8b3e1341b754..a10b29a918d4 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -3274,6 +3274,7 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>>              *migrate_mode = MIGRATE_SYNC_LIGHT;
>>              return true;
>>          }
>> +        pr_info("XXX: compaction_failed\n");
>>          return false;
>>      }
>>  
>> @@ -3283,8 +3284,12 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>>       * But do not retry if the given zonelist is not suitable for
>>       * compaction.
>>       */
>> -    if (compaction_withdrawn(compact_result))
>> -        return compaction_zonelist_suitable(ac, order, alloc_flags);
>> +    if (compaction_withdrawn(compact_result)) {
>> +        int ret = compaction_zonelist_suitable(ac, order, alloc_flags);
>> +        if (!ret)
>> +            pr_info("XXX: no zone suitable for compaction\n");
>> +        return ret;
>> +    }
>>  
>>      /*
>>       * !costly requests are much more important than __GFP_REPEAT
>> @@ -3299,6 +3304,7 @@ should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
>>      if (compaction_retries <= max_retries)
>>          return true;
>>  
>> +    pr_info("XXX: compaction retries fail after %d\n", compaction_retries);
>>      return false;
>>  }
>>  #else
>>
>
Took me a little longer than expected due to work. The failure wouldn't
happen for a while, so I started a couple of scripts and let them run.
When I checked today the server didn't respond on the network and, sure
enough, the OOM killer had killed everything. This is with 4.7.0 with the
config based on Debian 4.7-rc7.

trace_pipe got a little big (5GB), so I uploaded the logs to
https://filebin.net/box0wycfouvhl6sr/OOM_4.7.0.tar.bz2 . before_btrfs was
captured before the btrfs filesystems were mounted.
I did run a btrfs balance because it creates IO load and I needed to
balance anyway. Maybe that's what caused it?
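
For anyone repeating this, a rough sketch of how the trace could be captured compressed so it does not grow to several GB on disk. The tracefs path is the usual debugfs location, the output file name is hypothetical, and enabling the whole "compaction" event group is only an assumption about which tracepoints are useful; nothing here was specified in this thread.

```shell
#!/bin/sh
# Sketch only: the paths and the choice of event group are assumptions.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"   # usual tracefs location
OUT="${OUT:-/var/tmp/trace_pipe.gz}"              # hypothetical output file

# Turn one tracepoint group on (1) or off (0), e.g. "compaction".
set_event_group() {
    echo "$2" > "$TRACEFS/events/$1/enable"
}

# Stream trace_pipe through gzip so the capture stays small on disk.
# This blocks until interrupted, because trace_pipe never reaches EOF.
capture() {
    set_event_group compaction 1
    gzip -c < "$TRACEFS/trace_pipe" > "$OUT"
}

# Only touch the system when explicitly asked to: sh capture.sh run
if [ "${1:-}" = "run" ]; then
    capture
fi
```

Run as root with the "run" argument; without it the script only defines the helpers.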

I'll make the changes requested by Michal and try again.
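
The dirty_bytes values tried earlier in this thread (128MiB, 16MiB, 1MiB) can be set with a small helper like the one below. This is a minimal sketch, assuming the standard /proc/sys/vm/dirty_bytes sysctl file; the function names are made up for illustration, and the read load is the same find/md5sum command quoted above.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical; writing the sysctl needs root.
PROC_VM="${PROC_VM:-/proc/sys/vm}"

# Convert MiB to bytes, since vm.dirty_bytes takes a byte count.
mib_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Set vm.dirty_bytes to the given number of MiB.
set_dirty_mib() {
    mib_to_bytes "$1" > "$PROC_VM/dirty_bytes"
}

# Only change the system when explicitly asked to: sh dirty.sh run 16
if [ "${1:-}" = "run" ]; then
    set_dirty_mib "${2:-128}"
    # Read load from the earlier test; /btrfs0 is the mount point used there.
    find /btrfs0/ -type f -exec md5sum {} \;
fi
```

Note that writing dirty_bytes replaces any dirty_ratio setting, so the default has to be restored by hand (or by rebooting) afterwards.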

Thanks,
Ralf-Peter



