From: Mikulas Patocka <mpatocka@redhat.com>
To: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@kernel.org>, Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>, Ondrej Kozina <okozina@redhat.com>, Jerome Marchand <jmarchan@redhat.com>, Stanislav Kozina <skozina@redhat.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, dm-devel@redhat.com
Subject: Re: System freezes after OOM
Date: Thu, 14 Jul 2016 08:27:12 -0400 (EDT)
Message-ID: <alpine.LRH.2.02.1607140818250.15554@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <alpine.DEB.2.10.1607131644590.92037@chino.kir.corp.google.com>

On Wed, 13 Jul 2016, David Rientjes wrote:

> On Wed, 13 Jul 2016, Mikulas Patocka wrote:
>
> > What are the real problems that f9054c70d28bc214b2857cf8db8269f4f45a5e23
> > tries to fix?
>
> It prevents the whole system from livelocking due to an oom-killed process
> stalling forever waiting for mempool_alloc() to return. No other threads
> may be oom killed while waiting for it to exit.
>
> > Do you have a stack trace where it deadlocked, or was it just a
> > theoretical consideration?
>
> schedule
> schedule_timeout
> io_schedule_timeout
> mempool_alloc
> __split_and_process_bio
> dm_request
> generic_make_request
> submit_bio
> mpage_readpages
> ext4_readpages
> __do_page_cache_readahead
> ra_submit
> filemap_fault
> handle_mm_fault
> __do_page_fault
> do_page_fault
> page_fault

Device mapper should be able to make forward progress even when no memory
is available. If it doesn't, there is a bug in it.

I'd like to ask: which device mapper targets were in use in this case?
Were there other deadlocked processes? (Please show the sysrq-t and
sysrq-w output from when this happened.) Did the machine lock up
completely with that stack trace, or was it only slowed down?

> > Mempool users generally (except for some flawed cases like fs_bio_set)
> > do not require memory to proceed. So if you just loop in mempool_alloc,
> > the processes that exhausted the mempool reserve will eventually return
> > objects to the mempool and you will be able to proceed.
>
> That's obviously not the case if we have hundreds of machines timing out
> after two hours waiting for that fault to succeed. The mempool interface
> cannot require that users return elements to the pool synchronous with
> all allocators so that we can happily loop forever, the only requirement
> on

Mempool users must return objects to the mempool.

> the interface is that mempool_alloc() must succeed. If the context of
> the thread doing mempool_alloc() allows access to memory reserves, this
> will always be allowed by the page allocator.

This is not a mempool problem.

Mikulas