From: Nagachandra P <nagachandra@gmail.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Vikram MP <mp.vikram@gmail.com>, linux-ext4@vger.kernel.org
Subject: Re: Memory allocation can cause ext4 filesystem to be remounted r/o
Date: Thu, 27 Jun 2013 18:28:21 +0530
Message-ID: <CAFy9=U5bPbDqfU=NVdaxvK5gNX9O33LD6mYJYqoTQLq7UTjqFw@mail.gmail.com>
In-Reply-To: <20130626180345.GA4128@thunk.org>

Hi Theodore,

Could you point me to the code where ext4_std_error is not triggered
because of LMK? As I see it, if a memory allocation returns an error,
ext4_std_error would invariably be called in some cases. Please
consider the following call stack:

send sigkill to 5648 (id.app.sbrowser), score_adj 1000,adj 15, size 13257 with ofree -2010 20287, cfree 18597 902 msa 1000 ma 15
id.app.sbrowser: page allocation failure: order:0, mode:0x50
[<c0013aa8>] (unwind_backtrace+0x0/0x11c) from [<c00d6530>] (warn_alloc_failed+0xe8/0x110)
[<c00d6530>] (warn_alloc_failed+0xe8/0x110) from [<c00d9308>] (__alloc_pages_nodemask+0x6d4/0x804)
[<c00d9308>] (__alloc_pages_nodemask+0x6d4/0x804) from [<c00d2b34>] (find_or_create_page+0x40/0x84)
[<c00d2b34>] (find_or_create_page+0x40/0x84) from [<c0188858>] (ext4_mb_load_buddy+0xd4/0x2b4)
[<c0188858>] (ext4_mb_load_buddy+0xd4/0x2b4) from [<c018c69c>] (ext4_free_blocks+0x5d4/0xa08)
[<c018c69c>] (ext4_free_blocks+0x5d4/0xa08) from [<c0181218>] (ext4_ext_remove_space+0x690/0xd9c)
[<c0181218>] (ext4_ext_remove_space+0x690/0xd9c) from [<c0183654>] (ext4_ext_truncate+0x100/0x1c8)
[<c0183654>] (ext4_ext_truncate+0x100/0x1c8) from [<c015e2ec>] (ext4_truncate+0xf4/0x194)
[<c015e2ec>] (ext4_truncate+0xf4/0x194) from [<c01629dc>] (ext4_evict_inode+0x3b4/0x4ac)
[<c01629dc>] (ext4_evict_inode+0x3b4/0x4ac) from [<c011871c>] (evict+0x8c/0x150)
[<c011871c>] (evict+0x8c/0x150) from [<c010f030>] (do_unlinkat+0xdc/0x134)
[<c010f030>] (do_unlinkat+0xdc/0x134) from [<c000e100>] (ret_fast_syscall+0x0/0x30)

The failure to allocate memory in the above case is because of the kill
signal the task received.

__alloc_pages_slowpath would return NULL if the task has received a KILL
signal. (I don't see any code in 3.4.5 that checks for something
similar to TIF_MEMDIE to decide whether or not to call
ext4_std_error. Was this added recently?)
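
To make the sequence described above concrete, here is a toy userspace
model in Python. It is only a sketch: Task, Filesystem, alloc_page,
free_blocks and the check_killed switch are hypothetical names invented
for illustration, not kernel APIs, and the model assumes that any
reported error flips the filesystem read-only. It shows that an
allocation failing for a task that is being killed is reported as a
filesystem error unless the caller checks a TIF_MEMDIE-like flag first.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    being_killed: bool = False  # stands in for a TIF_MEMDIE-like flag


@dataclass
class Filesystem:
    read_only: bool = False
    errors: list = field(default_factory=list)

    def std_error(self, where, errno_name):
        # Assumption: every reported error remounts the fs read-only.
        self.errors.append((where, errno_name))
        self.read_only = True


def alloc_page(task, free_pages):
    # The allocator gives up for a task that is being killed,
    # even though free memory is not zero.
    if task.being_killed or free_pages == 0:
        return None
    return object()


def free_blocks(fs, task, free_pages, check_killed=False):
    page = alloc_page(task, free_pages)
    if page is None:
        if check_killed and task.being_killed:
            # Hypothetical check: fail the operation quietly.
            return "ENOMEM (not reported)"
        fs.std_error("free_blocks", "ENOMEM")
        return "ENOMEM (reported)"
    return "ok"
```

With check_killed left at False, a killed task with plenty of free
pages still ends up marking the model filesystem read-only; with
check_killed=True the same failure is returned to the caller without
being reported.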

Thanks
Naga

On Wed, Jun 26, 2013 at 11:33 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> On Wed, Jun 26, 2013 at 10:35:22PM +0530, Nagachandra P wrote:
>>
>> These issues are not easy to reproduce! We run multiple
>> applications (of different memory sizes) over a period of 24 to
>> 36 hrs and we hit this once. We have seen these issues be easier to
>> reproduce with around 512MB of memory (in about 16 - 20 hrs),
>> and harder to reproduce with 1GB of memory.
>>
>> Most of the time we get into this situation when an application
>> (typically AsyncTasks in Android) that is doing ext4 fs ops has a low
>> adj value (> 9, typically 10 - 12) and hence is fairly susceptible
>> to being killed (and there would be no way to distinguish this from
>> the application's perspective); this is one of the challenges we are
>> facing. Also, we don't have to be completely out of memory here (just
>> within the LMK band for the process's adj value).
>
> To be clear, if the application is killed by the low memory killer,
> we're not going to trigger the ext4_std_error() codepath.  The
> ext4_std_error() is getting called because free memory has fallen to
> _zero_ and so kmem_cache_alloc() returns an error.  Should ext4 do a
> better job with handling this?  Yes, absolutely.  I do consider this a
> fs bug that we should try to fix.  The reality though is if that free
> memory has gone to zero, it's going to put multiple kernel subsystems
> under stress.
>
> It is good to hear that this is only happening on highly memory
> constrained devices --- speaking as an owner of a Nexus 4 with 2GB of
> memory.  :-P
>
> That's why the bigger issue is why did free memory go to zero in the
> first place?  That means the LMK was probably not being aggressive
> enough, or something started consuming a lot of memory too quickly,
> before the page cleaner and write throttling algorithms could kick in
> and try to deal with it.
>
>> But, on rethinking your idea on retrying may work if we have some
>> tweaks in LMK as well (like killing multiple tasks instead of just
>> one).
>
> You might also consider looking at tweaking the mm low watermark and
> minimum watermark.  See the tunable /proc/sys/vm/min_free_kbytes.
>
> You might want to simply try monitoring the free memory levels
> on a continuous basis, and see how often they drop below some
> minimum level.  This will give you a figure of merit by
> which you can try tuning your system, without needing to wait for a
> file system error.
>
> Cheers,
>
>                                         - Ted
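
The monitoring suggestion above could be sketched roughly as follows.
This is a minimal illustration, not a tested tool: it assumes a Linux
system that exposes /proc/meminfo and /proc/sys/vm/min_free_kbytes,
and the function names, the sampling interval and the threshold are
all hypothetical choices. It samples MemFree periodically and counts
how many samples fall below a chosen level, which gives a crude figure
of merit for tuning.

```python
import time


def parse_meminfo(text):
    """Return {field: value} parsed from /proc/meminfo-style text."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def read_free_kb(path="/proc/meminfo"):
    """Read the current MemFree value in kB."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemFree"]


def read_min_free_kb(path="/proc/sys/vm/min_free_kbytes"):
    """Read the vm.min_free_kbytes tunable."""
    with open(path) as f:
        return int(f.read().strip())


def count_dips(samples_kb, threshold_kb):
    """Return (number of samples below threshold, lowest sample)."""
    dips = sum(1 for kb in samples_kb if kb < threshold_kb)
    return dips, min(samples_kb, default=None)


def monitor(threshold_kb, interval_s=1.0, samples=60):
    """Sample MemFree `samples` times and summarize the dips."""
    readings = []
    for _ in range(samples):
        readings.append(read_free_kb())
        time.sleep(interval_s)
    return count_dips(readings, threshold_kb)
```

One possible use is to call monitor() with a threshold a few times
larger than read_min_free_kb() and compare the dip counts before and
after changing the LMK or watermark settings. The multiplier is a
guess to be tuned per device.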

