linux-kernel.vger.kernel.org archive mirror
* backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
@ 2012-11-25 15:03 Dimitrios Apostolou
  2012-12-02 12:44 ` Dimitrios Apostolou
  0 siblings, 1 reply; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-11-25 15:03 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1016 bytes --]

Hello list,

on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
Even though system load was minimal beforehand and plenty of memory was
free, the system is now unresponsive and is thrashing the disk, yet the
swapfile is rarely touched.

I managed to get some information by renicing a root console to -20, but
even then each keypress showed with a minimum 10s delay! In the attached
file you can see the dmesg output, SysRq+{W,T}, ps, vmstat, slabs,
meminfo.

I think I'm seeing paging of executable pages because ext4_inode_cache
is aggressively using all memory, evicting other pages. However under no
condition should realtime processes be unresponsive. What do you think?
Please note that I've set vm.swappiness to 0 and gradually increased
vm.vfs_cache_pressure to 1000000, but see no difference.
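
For anyone trying to reproduce this, the two tunables mentioned above can be read back before and after changing them. A minimal sketch, assuming a POSIX shell; the function name show_vm_knobs and its optional directory argument are made up for illustration and are not part of any existing tool:

```shell
# Print the two VM tunables discussed above as "name=value" lines.
# The optional directory argument exists only so the function can be
# pointed at a copy of the files; it defaults to the real /proc/sys/vm.
show_vm_knobs() {
    dir=${1:-/proc/sys/vm}
    for knob in swappiness vfs_cache_pressure; do
        printf 'vm.%s=%s\n' "$knob" "$(cat "$dir/$knob")"
    done
}
```

Calling show_vm_knobs with no argument reads the live values; changing them (for example with "sysctl -w vm.swappiness=0") requires root.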


Thanks in advance,
Dimitris


P.S. Please CC me in all replies



[-- Attachment #2: HANG.log.gz --]
[-- Type: application/x-gzip, Size: 34880 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-11-25 15:03 backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused Dimitrios Apostolou
@ 2012-12-02 12:44 ` Dimitrios Apostolou
  2012-12-02 22:50   ` Roland Eggner
  0 siblings, 1 reply; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-02 12:44 UTC (permalink / raw)
  To: linux-kernel; +Cc: Catalin Marinas, Theodore Ts'o

Hi, the problem in the quoted message still happens. Shouldn't all of
the ext4_inode_cache slab be emptied after
"echo 3 > /proc/sys/vm/drop_caches"? In my case slab uses too much memory
even after all processes finish, and the system is in bad shape due to lack
of physical RAM. CC'ing tytso and Catalin Marinas since I've not been able
to track any leak with kmemleak.

The kernel is booted with slub_debug=,ext4_inode_cache, as this is the only
way to avoid the following message for some time. Nevertheless, kmemleak
has not been able to show any leak.

kmemleak: Cannot allocate a kmemleak_object structure
kmemleak: Automatic memory scanning thread ended
kmemleak: Kernel memory leak detector disabled

Please have a look at the log (archived at [1]).

[1] http://lkml.indiana.edu/hypermail/linux/kernel/1211.3/00183.html


Thanks,
Dimitris


On Sun, 2012-11-25 at 17:03 +0200, Dimitrios Apostolou wrote:
> Hello list,
> 
> on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> Even though earlier system load was minimal, free memory was plenty, the
> system now is unresponsive and is thrashing the disk, but the swapfile
> is rarely touched.
> 
> I managed to get some information by renicing a root console to -20, but
> even then each keypress showed with a minimum 10s delay! In the attached
> file you can see the dmesg output, SysRq+{W,T}, ps, vmstat, slabs,
> meminfo.
> 
> I think I'm seeing paging of executable pages because ext4_inode_cache
> is aggressively using all memory, evicting other pages. However under no
> condition should realtime processes be unresponsive. What do you think?
> Please note that I've set vm.swappiness to 0 and gradually increased
> vm.vfs_cache_pressure to 1000000, but see no difference.
> 
> 
> Thanks in advance,
> Dimitris
> 
> 
> P.S. Please CC me in all replies
> 
> 




* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-02 12:44 ` Dimitrios Apostolou
@ 2012-12-02 22:50   ` Roland Eggner
  2012-12-02 23:56     ` Dimitrios Apostolou
  0 siblings, 1 reply; 18+ messages in thread
From: Roland Eggner @ 2012-12-02 22:50 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: linux-kernel, Catalin Marinas, Theodore Ts'o

[-- Attachment #1: Type: text/plain, Size: 4930 bytes --]

On 2012-11-25 Sun 23:59:55 +0100 Roland Eggner wrote:
>On 2012-11-25 Sunday at 21:30 +0200 Dimitrios Apostolou wrote:
> > On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
> > > on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> > > backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> > > ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> > > Even though earlier system load was minimal, free memory was plenty, the
> > > system now is unresponsive and is thrashing the disk, but the swapfile
> > > is rarely touched.
> >
> > I'm now having the same experience even though I replaced xz (which
> > needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
> > is a bit more responsive than before, the OOM killer is out killing
> > small processes like syslog-ng and systemd-logind... The
> > ext4_inode_cache slab is taking almost all my memory (117MB). Please
> > advise!
> 
> Hello Dimitrios,
> 
> I would try a 2.6.27.* kernel, for following reasons:
> 
> (1)  Kernel development since 2.6.27 achieved significant performance
> improvements at the cost of exploding memory consumption by the kernel for
> _internal_  data structures.  I am currently using a 3.2.34 kernel on a Notebook
> with 4 G RAM.  0,5 … 1 G RAM is usually occupied just by kernel slab [1];  this
> memory cannot be swapped, it cannot be released by other means than rebooting,
> and there seems to be  _no_  adjustment to memory pressure.  I am surprised,
> that you have managed to boot a 3.6.* kernel at all with only 128 M RAM.
> 
> (2)  At the time when I used a PIII notebook, kernels 2.6.27 to 2.6.29 were
> current.  Thus chances are good that a 2.6.27.* kernel will support the chipset,
> PCI bus and devices of your notebook.  2.6.27 got longterm maintenance; the
> latest release in the linux-stable git repository is 2.6.27.62.
> So Q @ LKML community:
> Does anybody know an x86 distribution or live-CD using a 2.6.27.* kernel?
> 
> 
> [1]  Picture described in my LKML message
> Date:  Fri, 20 Jan 2012 01:08:00 +0100
> Subject:  Re: [kmemleak report 1/2] kernel 3.1.6, x86_64: mm, xfs ?, vfs ?
> remained the same with  _every_  3.1.* and 3.2.* kernel tried so far.


On 2012-12-02 Sunday at 14:44 +0200 Dimitrios Apostolou wrote:
> Hi, the problem in the quoted message still happens, shouldn't all of
> ext4_inode_cache slab be emptied after "echo 3
> > /proc/sys/vm/drop_caches"? In my case slab uses too much memory even
> after all processes finish and system is in bad shape due to lack of
> physical RAM. CC'ing tytso and Catalin Marinas since I've not been able
> to track any leak with kmemleak.
> 
> Kernel is booted with slub_debug=,ext4_inode_cache, as this is the only
> way to avoid for some time the following message. Nevertheless it has
> not been able to show any leak.
> 
> kmemleak: Cannot allocate a kmemleak_object structure
> kmemleak: Automatic memory scanning thread ended
> kmemleak: Kernel memory leak detector disabled
> 
> Please have a look at the log (archived at [1]).
> 
> [1] http://lkml.indiana.edu/hypermail/linux/kernel/1211.3/00183.html


Hello Dimitrios!


Which part of …

“0,5 … 1 G RAM is usually occupied just by kernel slab [1];  this
memory cannot be swapped, it cannot be released by other means than rebooting,
and there seems to be  _no_  adjustment to memory pressure.” 

… should I explain?


When tar or gzip writes to ext4 filesystem on your external disk, the kernel 
keeps all inode data in slab memory  _by design_ , not by memory leaks.
Several 100 M slab data cannot be stored within 128 M total RAM.
This cannot be solved by using /proc/sys/vm/drop_caches.
This cannot be solved by oom-killer actions.
This cannot be solved by zram tricks.
Unmounting the external disk would release part of the slab memory, but then you
cannot back up.
Reformatting the ext4 filesystem on your external disk with 128 byte inode size, 
largest possible blocksize and with extents enabled might mitigate the memory 
pressure slightly … probably not sufficient to get a working system with 3.6 
kernel.


One advantage of Linux compared to other OS is much more support for old 
hardware, if a  _proper_  kernel version is selected.  Many years ago I used
a notebook with 40 M total RAM, with a 2.4 kernel, Blackbox window manager and
Opera web browser … it worked flawlessly … just rather slowly due to swapping.  
Your notebook has much more RAM than 40 M, thus there is surely a Linux solution 
for you.  Try a 2.6.27.62 kernel, it supports ext4 (“ext4dev”), probably it 
supports all devices of your notebook, and with a slim window manager e.g.  
WindowMaker or OpenBox your notebook will probably “fly”.


PS:  Please type your reply below the text you are replying to.

-- 
Roland

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-02 22:50   ` Roland Eggner
@ 2012-12-02 23:56     ` Dimitrios Apostolou
  2012-12-03 17:43       ` Theodore Ts'o
  2012-12-03 18:03       ` Roland Eggner
  0 siblings, 2 replies; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-02 23:56 UTC (permalink / raw)
  To: Roland Eggner; +Cc: linux-kernel, Catalin Marinas, Theodore Ts'o

[-- Attachment #1: Type: TEXT/PLAIN, Size: 3451 bytes --]

Dear Ronald,

sorry for not replying to your first message, but I didn't consider 
changing the kernel as a resolution to my problem.

On Sun, 2 Dec 2012, Roland Eggner wrote:
> Hello Dimitrios!
>
>
> Which part of …
>
> “0,5 … 1 G RAM is usually occupied just by kernel slab [1];  this
> memory cannot be swapped, it cannot be released by other means than rebooting,
> and there seems to be  _no_  adjustment to memory pressure.”
>
> … should I explain?

Check my initial message. If you look at the attached /proc/meminfo, you'll 
see that most of the slab memory is reclaimable; however, it is not being 
reclaimed despite the memory pressure:

Slab:              64880 kB
SReclaimable:      60496 kB
SUnreclaim:         4384 kB
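
The reclaimable share can be computed directly from the three lines above. A minimal sketch, assuming awk is available; slab_reclaimable_pct is a hypothetical helper name, not an existing tool:

```shell
# Print the percentage of slab memory that the kernel reports as
# reclaimable. Reads /proc/meminfo-formatted text on stdin (values in kB).
slab_reclaimable_pct() {
    awk '
        $1 == "Slab:"         { slab = $2 }
        $1 == "SReclaimable:" { recl = $2 }
        END {
            if (slab > 0)
                printf "%d\n", recl * 100 / slab   # truncated to an integer
            else
                exit 1                             # no Slab: line seen
        }
    '
}
```

On a live system this would be run as "slab_reclaimable_pct < /proc/meminfo"; with the figures above it prints 93.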

Also I mentioned that a nice'd -20 (Realtime!) process was unresponsive 
because of two nice'd + ionice'd processes. This is a bug no matter the 
memory pressure, IMHO. If not I'd rather hear it from someone more 
authoritative.

> When tar or gzip writes to ext4 filesystem on your external disk, the kernel
> keeps all inode data in slab memory  _by design_ , not by memory leaks.

ext4_inode_cache is a *cache*. So the kernel does not keep /all/ inode 
data there, but only what was needed recently. And drops it afterwards, 
which didn't happen after the backup ended.

> Several 100 M slab data cannot be stored within 128 M total RAM.
> This cannot by solved by usage of /proc/sys/vm/drop_caches.

Please take a look at [2]; it's what is called in that case. Via this call, 
all slabs that have registered a shrinker actually get shrunk. 
Other filesystems do, but I can't find whether ext4 actually registers a 
shrinker for its slabs.

[2] http://lxr.linux.no/#linux+v3.6.8/mm/vmscan.c#L207


> This cannot by solved by oom-killer-actions.
> This cannot be solved by zram tricks.
> Unmounting the external disk would release part of slab memory, but then you
> cannot backup.

Unmounting the backed-up fs didn't really help in my case. Of course all 
my filesystems are ext[234], mounted by ext4 code, so they use the same 
slab for caches. But anyway, the slab didn't shrink as significantly as I 
had expected.

> Reformatting the ext4 filesystem on your external disk with 128 byte inode size,
> largest possible blocksize and with extents enabled might mitigate the memory
> pressure slightly … probably not sufficient to get a working system with 3.6
> kernel.
>
>
> One advantage of Linux compared to other OS is much more support for old
> hardware, if a  _proper_  kernel version is selected.  Many years ago I used
> a notebook with 40 M total RAM, with a 2.4 kernel, Blackbox window manager and
> Opera web browser … it worked flawlessly … just rather slowly due to swapping.
> Your notebook has much more RAM than 40 M, thus there is surely a Linux solution
> for you.  Try a 2.6.27.62 kernel, it supports ext4 (“ext4dev”), probably it
> supports all devices of your notebook, and with a slim window manager e.g.
> WindowMaker or OpenBox your notebook will probably “fly”.

I appreciate your advice. I remember when we were running 2.4 much less 
memory was needed, but still I consider the kernel /fairly/ lean 
considering the time passed and the bloat in userspace applications. I 
choose to run latest kernels on old hardware, hopefully this will continue 
to be the case, and if things deteriorate much then let's hope we'll have 
enough time to help and fix them! :-)


Thanks,
Dimitris


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-02 23:56     ` Dimitrios Apostolou
@ 2012-12-03 17:43       ` Theodore Ts'o
  2012-12-03 18:47         ` Eric Paris
  2012-12-03 18:03       ` Roland Eggner
  1 sibling, 1 reply; 18+ messages in thread
From: Theodore Ts'o @ 2012-12-03 17:43 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: Roland Eggner, linux-kernel, Catalin Marinas

On Mon, Dec 03, 2012 at 01:56:27AM +0200, Dimitrios Apostolou wrote:
> 
> Please take a look at [2], it's what is called in that case. Via
> this call all slabs that have registered a shrinker, get actually
> shrinked. Other fs do, but I can't find whether ext4 actually
> registers a shrinker for its slabs.

The ext4 inode cache gets shrunk via the VFS layer, in
prune_icache_sb() in fs/inode.c.  This is true for all file systems'
inode caches.

If you are under heavy memory pressure, or if you run echo 3 >
/proc/sys/vm/drop_caches, and the inodes aren't getting dropped, then
it's because the inodes are getting pinned for some reason --- i.e.,
they are referenced via some entry in the dentry cache, perhaps there
are files open, or processes are cd'ed into a directory, etc.

So an example of what happens with the ext4_inode_cache before and
after running "echo 3 > /proc/sys/vm/drop_caches":

ext4_inode_cache  183379 183379   1872   17    8 : tunables    0    0    0 : slabdata  10787  10787      0

ext4_inode_cache    1595   6120   1872   17    8 : tunables    0    0    0 : slabdata    360    360      0

(What's left is due to all of the executable files, shared libraries,
current directories, and open files for all my processes running on my
system --- which for context is Xfce plus Chrome, emacs, mutt, and a
bunch of terminal windows plus the usual assortment of system daemons.)
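
To put those two lines in terms of memory, multiply num_slabs by pagesperslab and by the page size. A rough sketch, assuming the slabinfo 2.x column order shown above and 4 KiB pages (both are assumptions about the machine); slab_footprint_kib is a hypothetical helper name:

```shell
# Print "<cache> <KiB>" for one cache, reading /proc/slabinfo-formatted
# text on stdin. Assumes the slabinfo 2.x layout, where field 6 is
# pagesperslab and field 15 is num_slabs, and assumes 4 KiB pages.
slab_footprint_kib() {
    awk -v cache="$1" -v page_kib=4 '
        $1 == cache { print $1, $15 * $6 * page_kib }
    '
}
```

For the two lines above this gives 345184 KiB (about 337 MiB) before and 11520 KiB (about 11 MiB) after drop_caches. On a live system the input would be /proc/slabinfo, which normally requires root to read.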

If you are seeing a large number of inodes still in the ext4 inode
cache after using drop_caches, then I'd look to see whether you have
something like SELinux or auditing enabled which is pinning a bunch of
dentries or inodes, or whether your backup program (or some other
program running on your system) is keeping lots of directories or
inodes open for some reason.
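
One way to check the last possibility is to count how many open file descriptors point below a given mount point. A rough sketch, assuming Linux /proc and a POSIX shell; count_open_under is a hypothetical helper, it only sees processes the caller is allowed to inspect (so run it as root), and it does not cover current working directories or memory-mapped files:

```shell
# Count file descriptors, across all inspectable processes, whose target
# lies below the given directory. Prints the count on stdout.
count_open_under() {
    dir=$(cd "$1" && pwd -P) || return 1
    n=0
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors can vanish or be unreadable; skip those silently.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*) n=$((n + 1)) ;;
        esac
    done
    echo "$n"
}
```

A count in the hundreds of thousands would point at a program holding files open; a small count suggests the inodes are being kept in the cache for some other reason.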

Hope this helps,

					- Ted


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-02 23:56     ` Dimitrios Apostolou
  2012-12-03 17:43       ` Theodore Ts'o
@ 2012-12-03 18:03       ` Roland Eggner
  2012-12-03 19:25         ` Dimitrios Apostolou
  1 sibling, 1 reply; 18+ messages in thread
From: Roland Eggner @ 2012-12-03 18:03 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: linux-kernel, Catalin Marinas, Theodore Ts'o

[-- Attachment #1: Type: text/plain, Size: 1721 bytes --]

On 2012-12-03 Monday at 01:56 +0200 Dimitrios Apostolou wrote:
> Dear Ronald,
Excuse me, my name is Roland.

> sorry for not replying at your first message but I didn't consider changing 
> kernel as a resolution to my problem.
>
> … …
> > … …
> > One advantage of Linux compared to other OS is much more support for old
> > hardware, if a  _proper_  kernel version is selected.  Many years ago I used
> > a notebook with 40 M total RAM, with a 2.4 kernel, Blackbox window manager and
> > Opera web browser … it worked flawlessly … just rather slowly due to swapping.
> > Your notebook has much more RAM than 40 M, thus there is surely a Linux solution
> > for you.  Try a 2.6.27.62 kernel, it supports ext4 (“ext4dev”), probably it
> > supports all devices of your notebook, and with a slim window manager e.g.
> > WindowMaker or OpenBox your notebook will probably “fly”.
> 
> I appreciate your advice. I remember when we were running 2.4 much less 
> memory was needed, but still I consider the kernel /fairly/ lean 
> considering the time passed and the bloat in userspace applications. I 
> choose to run latest kernels on old hardware, hopefully this will continue 
> to be the case, and if things deteriorate much then let's hope we'll have 
> enough time to help and fix them! :-)

Ok, this is another story:  You are not looking for a least-effort solution just 
for your notebook.  Instead you want to contribute to the LKML community 
solutions for running current kernel versions under extremely tight memory 
limitations like those of your notebook, right?  If so:  This is highly appreciated,
thank you!  I will read your findings with interest :)

-- 
Roland

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-03 17:43       ` Theodore Ts'o
@ 2012-12-03 18:47         ` Eric Paris
  2012-12-03 19:35           ` Dimitrios Apostolou
  0 siblings, 1 reply; 18+ messages in thread
From: Eric Paris @ 2012-12-03 18:47 UTC (permalink / raw)
  To: Theodore Ts'o, Dimitrios Apostolou, Roland Eggner,
	Linux Kernel Mailing List, Catalin Marinas

On Mon, Dec 3, 2012 at 12:43 PM, Theodore Ts'o <tytso@mit.edu> wrote:

> If you are seeing a large number of inodes still in the ext4 inode
> cache after using drop_caches, then I'd look to see whether you have
> something like SELinux or auditing enabled which is pinning a bunch of
> dentries or inodes

You can safely ignore this suggestion, as it does not make sense.  SELinux
only grabs a reference to dentries during its call to
fs_ops->getxattr, which can't last a meaningful length of time (unless
the filesystem is busted).  It only grabs references to inodes during
system initialization, when you couldn't have many in core.

Audit, likewise, only grabs a reference to a dentry during execve()
and only long enough to run getxattr and does not grab any reference
directly to an inode at all.

> or whether your backup program (or some other
> program running on your system) is keeping lots of directories or
> inodes open for some reason.

Certainly could be this suggestion though..


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-03 18:03       ` Roland Eggner
@ 2012-12-03 19:25         ` Dimitrios Apostolou
  0 siblings, 0 replies; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-03 19:25 UTC (permalink / raw)
  To: Roland Eggner; +Cc: linux-kernel

On Mon, 3 Dec 2012, Roland Eggner wrote:
> On 2012-12-03 Monday at 01:56 +0200 Dimitrios Apostolou wrote:
>> Dear Ronald,
> Excuse me, my name is Roland.

Sorry, this was not intentional.


Dimitris



* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-03 18:47         ` Eric Paris
@ 2012-12-03 19:35           ` Dimitrios Apostolou
  2012-12-03 20:00             ` Dimitrios Apostolou
  0 siblings, 1 reply; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-03 19:35 UTC (permalink / raw)
  To: Eric Paris, Theodore Ts'o
  Cc: Roland Eggner, Linux Kernel Mailing List, Catalin Marinas

On Mon, 3 Dec 2012, Eric Paris wrote:

> On Mon, Dec 3, 2012 at 12:43 PM, Theodore Ts'o <tytso@mit.edu> wrote:
>
>> If you are seeing a large number of inodes still in the ext4 inode
>> cache after using drop_caches, then I'd look to see whether you have
>> something like SELinux or auditing enabled which is pinning a bunch of
>> dentries or inodes
>
> You can safely ignore this suggestion, as it does not make sense.  SELinux
> only grabs a reference to dentries during its call to
> fs_ops->getxattr, which can't last a meaningful length of time (unless
> the filesystem is busted).  It only grabs references to inodes during
> system initialization, when you couldn't have many in core.
>
> Audit, likewise, only grabs a reference to a dentry during execve()
> and only long enough to run getxattr and does not grab any reference
> directly to an inode at all.
>

AFAICT I use neither SELinux nor audit.

>> or whether your backup program (or some other
>> program running on your system) is keeping lots of directories or
>> inodes open for some reason.
>
> Certainly could be this suggestion though..
>

I've managed to reproduce the scenario with concurrent "du" commands 
running on the filesystems. I'll try doing it once more, but it may take a 
while to get the dmesg/slabinfo etc output, since even a realtime root 
shell is non-responsive for many minutes.

What other debug output do you suggest to get, to find out why ext4 
inodes are pinned?


Thanks,
Dimitris



* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-03 19:35           ` Dimitrios Apostolou
@ 2012-12-03 20:00             ` Dimitrios Apostolou
  0 siblings, 0 replies; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-03 20:00 UTC (permalink / raw)
  To: Eric Paris, Theodore Ts'o
  Cc: Roland Eggner, Linux Kernel Mailing List, Catalin Marinas

On Mon, 3 Dec 2012, Dimitrios Apostolou wrote:
>
> I've managed to reproduce the scenario with concurrent "du" commands running 
> on the filesystems. I'll try doing it once more, but it may take a while to 
> get the dmesg/slabinfo etc output, since even a realtime root shell is 
> non-responsive for many minutes.
>
> What other debug output do you suggest to get, to find out why ext4 inodes 
> are pinned?

Important detail: one of the filesystems contains *huge* maildirs, 
containing hundreds of thousands of messages.


Dimitris



* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-06 16:43       ` Jan Kara
@ 2012-12-07 15:26         ` Dimitrios Apostolou
  0 siblings, 0 replies; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-07 15:26 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel, Roland Eggner, Catalin Marinas, Theodore Ts'o

Hi Jan, thanks for your help. With this commit I can't reproduce the 
problem. The problematic workload no longer inflates the 
ext4_inode_cache; in fact I've been able to run even heavier workloads 
with the ext4_inode_cache never surpassing 2-4MB.

Thanks everyone for helping,
Dimitris


On Thu, 6 Dec 2012, Jan Kara wrote:
> On Thu 06-12-12 17:15:37, Dimitrios Apostolou wrote:
>> On Thu, 6 Dec 2012, Jan Kara wrote:
>>> On Sun 25-11-12 21:30:00, Dimitrios Apostolou wrote:
>>>> On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
>>>>> on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
>>>>> backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
>>>>> ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
>>>>> Even though earlier system load was minimal, free memory was plenty, the
>>>>> system now is unresponsive and is thrashing the disk, but the swapfile
>>>>> is rarely touched.
>>>>
>>>> I'm now having the same experience even though I replaced xz (which
>>>> needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
>>>> is a bit more responsive than before, the OOM killer is out killing
>>>> small processes like syslog-ng and systemd-logind... The
>>>> ext4_inode_cache slab is taking almost all my memory (117MB). Please
>>>> advise!
>>> Hmm, it seems commit 4eff96dd5283a102e0c1cac95247090be74a38ed might be
>>> interesting for you. It landed in -stable kernels recently as well if I
>>> remember right...
>>
>> Thanks, I appreciate your help as I'm stuck in a dead end now, and
>> I've been trying to write some debug hook that prints all
>> ext4_inodes and the reason they are pinned (is there an easy way to
>> find this out?).
>>
>> So maybe there is a typo in the SHA1 sum you provided? Gitweb can't
>> find it in Linus' tree.
>  Strange. You are right gitweb doesn't show the SHA1 but I can see it in
> my git repo I pulled from Linus. Anyway, I've attached the fix for your
> convenience.
>
> 								Honza
>
> -- 
> Jan Kara <jack@suse.cz>
> SUSE Labs, CR
>


* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-06 15:15     ` Dimitrios Apostolou
@ 2012-12-06 16:43       ` Jan Kara
  2012-12-07 15:26         ` Dimitrios Apostolou
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Kara @ 2012-12-06 16:43 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: Jan Kara, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1692 bytes --]

On Thu 06-12-12 17:15:37, Dimitrios Apostolou wrote:
> On Thu, 6 Dec 2012, Jan Kara wrote:
> >On Sun 25-11-12 21:30:00, Dimitrios Apostolou wrote:
> >>On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
> >>>on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> >>>backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> >>>ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> >>>Even though earlier system load was minimal, free memory was plenty, the
> >>>system now is unresponsive and is thrashing the disk, but the swapfile
> >>>is rarely touched.
> >>
> >>I'm now having the same experience even though I replaced xz (which
> >>needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
> >>is a bit more responsive than before, the OOM killer is out killing
> >>small processes like syslog-ng and systemd-logind... The
> >>ext4_inode_cache slab is taking almost all my memory (117MB). Please
> >>advise!
> > Hmm, it seems commit 4eff96dd5283a102e0c1cac95247090be74a38ed might be
> >interesting for you. It landed in -stable kernels recently as well if I
> >remember right...
> 
> Thanks, I appreciate your help as I'm stuck in a dead end now, and
> I've been trying to write some debug hook that prints all
> ext4_inodes and the reason they are pinned (is there an easy way to
> find this out?).
> 
> So maybe there is a typo in the SHA1 sum you provided? Gitweb can't
> find it in Linus' tree.
  Strange. You are right, gitweb doesn't show the SHA1, but I can see it in
my git repo, which I pulled from Linus. Anyway, I've attached the fix for your
convenience.

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

[-- Attachment #2: 0001-writeback-Put-unused-inodes-to-LRU-after-writeback-c.patch --]
[-- Type: text/x-patch, Size: 3356 bytes --]

>From 9501fee10d8594ab8ee7deb749fb48c1d3a7985e Mon Sep 17 00:00:00 2001
From: Jan Kara <jack@suse.cz>
Date: Mon, 19 Nov 2012 20:01:16 +0100
Subject: [PATCH v3] writeback: Put unused inodes to LRU after writeback completion

Commit 169ebd90 removed iget-iput pair from inode writeback. As a side effect,
inodes that are dirty during iput_final() call won't be ever added to inode LRU
(iput_final() doesn't add dirty inodes to LRU and later when the inode is
cleaned there's noone to add the inode there). Thus inodes are effectively
unreclaimable until someone looks them up again.

Practical effect of this bug is limited by the fact that inodes are
pinned by a dentry for long enough that the inode gets cleaned. But still
the bug can have nasty consequences leading up to OOM conditions under
certain circumstances. Following can easily reproduce the problem:

for (( i = 0; i < 1000; i++ )); do
  mkdir $i
  for (( j = 0; j < 1000; j++ )); do
    touch $i/$j
    echo 2 > /proc/sys/vm/drop_caches
  done
done

then one needs to run 'sync; ls -lR' to make inodes reclaimable again.

We fix the issue by inserting unused clean inodes into the LRU after writeback
finishes in inode_sync_complete().

CC: Al Viro <viro@zeniv.linux.org.uk>
CC: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
CC: stable@vger.kernel.org # >= 3.5
Reported-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/fs-writeback.c |    2 ++
 fs/inode.c        |   16 ++++++++++++++--
 fs/internal.h     |    1 +
 3 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 51ea267..3e3422f 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -228,6 +228,8 @@ static void requeue_io(struct inode *inode, struct bdi_writeback *wb)
 static void inode_sync_complete(struct inode *inode)
 {
 	inode->i_state &= ~I_SYNC;
+	/* If inode is clean an unused, put it into LRU now... */
+	inode_add_lru(inode);
 	/* Waiters must see I_SYNC cleared before being woken up */
 	smp_mb();
 	wake_up_bit(&inode->i_state, __I_SYNC);
diff --git a/fs/inode.c b/fs/inode.c
index b03c719..64999f1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -408,6 +408,19 @@ static void inode_lru_list_add(struct inode *inode)
 	spin_unlock(&inode->i_sb->s_inode_lru_lock);
 }
 
+/*
+ * Add inode to LRU if needed (inode is unused and clean).
+ *
+ * Needs inode->i_lock held.
+ */
+void inode_add_lru(struct inode *inode)
+{
+	if (!(inode->i_state & (I_DIRTY | I_SYNC | I_FREEING | I_WILL_FREE)) &&
+	    !atomic_read(&inode->i_count) && inode->i_sb->s_flags & MS_ACTIVE)
+		inode_lru_list_add(inode);
+}
+
+
 static void inode_lru_list_del(struct inode *inode)
 {
 	spin_lock(&inode->i_sb->s_inode_lru_lock);
@@ -1390,8 +1403,7 @@ static void iput_final(struct inode *inode)
 
 	if (!drop && (sb->s_flags & MS_ACTIVE)) {
 		inode->i_state |= I_REFERENCED;
-		if (!(inode->i_state & (I_DIRTY|I_SYNC)))
-			inode_lru_list_add(inode);
+		inode_add_lru(inode);
 		spin_unlock(&inode->i_lock);
 		return;
 	}
diff --git a/fs/internal.h b/fs/internal.h
index 916b7cb..2f6af7f 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -110,6 +110,7 @@ extern int open_check_o_direct(struct file *f);
  * inode.c
  */
 extern spinlock_t inode_sb_list_lock;
+extern void inode_add_lru(struct inode *inode);
 
 /*
  * fs-writeback.c
-- 
1.7.1



* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-12-06 14:20   ` Jan Kara
@ 2012-12-06 15:15     ` Dimitrios Apostolou
  2012-12-06 16:43       ` Jan Kara
  0 siblings, 1 reply; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-12-06 15:15 UTC (permalink / raw)
  To: Jan Kara; +Cc: linux-kernel

On Thu, 6 Dec 2012, Jan Kara wrote:
> On Sun 25-11-12 21:30:00, Dimitrios Apostolou wrote:
>> On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
>>> on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
>>> backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
>>> ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
>>> Even though earlier system load was minimal, free memory was plenty, the
>>> system now is unresponsive and is thrashing the disk, but the swapfile
>>> is rarely touched.
>>
>> I'm now having the same experience even though I replaced xz (which
>> needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
>> is a bit more responsive than before, the OOM killer is out killing
>> small processes like syslog-ng and systemd-logind... The
>> ext4_inode_cache slab is taking almost all my memory (117MB). Please
>> advise!
>  Hmm, it seems commit 4eff96dd5283a102e0c1cac95247090be74a38ed might be
> interesting for you. It landed in -stable kernels recently as well if I
> remember right...

Thanks, I appreciate your help as I'm stuck in a dead end now, and I've 
been trying to write some debug hook that prints all ext4_inodes and the 
reason they are pinned (is there an easy way to find this out?).

So maybe there is a typo in the SHA1 sum you provided? Gitweb can't find 
it in Linus' tree.


Thanks,
Dimitris


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-11-25 19:30 ` Dimitrios Apostolou
  2012-11-25 22:59   ` Roland Eggner
@ 2012-12-06 14:20   ` Jan Kara
  2012-12-06 15:15     ` Dimitrios Apostolou
  1 sibling, 1 reply; 18+ messages in thread
From: Jan Kara @ 2012-12-06 14:20 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: linux-kernel

On Sun 25-11-12 21:30:00, Dimitrios Apostolou wrote:
> On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
> > on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> > backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> > ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> > Even though earlier system load was minimal, free memory was plenty, the
> > system now is unresponsive and is thrashing the disk, but the swapfile
> > is rarely touched.
> 
> I'm now having the same experience even though I replaced xz (which
> needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
> is a bit more responsive than before, the OOM killer is out killing 
> small processes like syslog-ng and systemd-logind... The
> ext4_inode_cache slab is taking almost all my memory (117MB). Please
> advise!
  Hmm, it seems commit 4eff96dd5283a102e0c1cac95247090be74a38ed might be
interesting for you. It landed in -stable kernels recently as well if I
remember right...

								Honza

-- 
Jan Kara <jack@suse.cz>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-11-25 23:56     ` Alan Cox
@ 2012-11-26  3:11       ` Roland Eggner
  0 siblings, 0 replies; 18+ messages in thread
From: Roland Eggner @ 2012-11-26  3:11 UTC (permalink / raw)
  To: Alan Cox; +Cc: Dimitrios Apostolou, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1085 bytes --]

On 2012-11-25 Sunday at 23:56 +0000 Alan Cox wrote:
> > Does anybody know an x86 distribution or live-CD using a 2.6.27.* kernel?
> 
> Probably not a good idea, there are known exploitable holes in 2.6.27 era
> kernels and nobody maintains anything that old.

“old” is relative …

cd git/linux-stable  && git log -1 --date=iso v2.6.27.62
........................................................
commit bc4e1a77b06519a01e7aed1125695598e27ddeb2
Author: Willy Tarreau <w@1wt.eu>
Date:   2012-03-17 14:03:53 +0100

    Linux 2.6.27.62

    Signed-off-by: Willy Tarreau <w@1wt.eu>


… and surely less exploitable than a system suffering oom-killer actions.

Google "linux-2.6.27 download" gives some Ubuntu hits …


> I guess RHEL/CentOS might work for you.
>
> I've not had any problems with leaks in 3.6 but there have been a few
> reports and it's clear that some obscure configurations trigger something
> bad.
> 
> Gnome 3 on the other hand leaks like a sieve.

For a system with 128 M total RAM, Gnome is obviously off topic.


-- 
Roland

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-11-25 22:59   ` Roland Eggner
@ 2012-11-25 23:56     ` Alan Cox
  2012-11-26  3:11       ` Roland Eggner
  0 siblings, 1 reply; 18+ messages in thread
From: Alan Cox @ 2012-11-25 23:56 UTC (permalink / raw)
  To: Roland Eggner; +Cc: Dimitrios Apostolou, linux-kernel

> Does anybody know an x86 distribution or live-CD using a 2.6.27.* kernel?

Probably not a good idea, there are known exploitable holes in 2.6.27 era
kernels and nobody maintains anything that old.

I guess RHEL/CentOS might work for you.

I've not had any problems with leaks in 3.6 but there have been a few
reports and it's clear that some obscure configurations trigger something
bad.

Gnome 3 on the other hand leaks like a sieve.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
  2012-11-25 19:30 ` Dimitrios Apostolou
@ 2012-11-25 22:59   ` Roland Eggner
  2012-11-25 23:56     ` Alan Cox
  2012-12-06 14:20   ` Jan Kara
  1 sibling, 1 reply; 18+ messages in thread
From: Roland Eggner @ 2012-11-25 22:59 UTC (permalink / raw)
  To: Dimitrios Apostolou; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2178 bytes --]

On 2012-11-25 Sunday at 21:30 +0200 Dimitrios Apostolou wrote:
> On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
> > on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> > backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> > ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> > Even though earlier system load was minimal, free memory was plenty, the
> > system now is unresponsive and is thrashing the disk, but the swapfile
> > is rarely touched.
> 
> I'm now having the same experience even though I replaced xz (which
> needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
> is a bit more responsive than before, the OOM killer is out killing 
> small processes like syslog-ng and systemd-logind... The
> ext4_inode_cache slab is taking almost all my memory (117MB). Please
> advise!

Hello Dimitrios,

I would try a 2.6.27.* kernel, for the following reasons:

(1)  Kernel development since 2.6.27 achieved significant performance
improvements at the cost of exploding memory consumption by the kernel for
_internal_  data structures.  I am currently using a 3.2.34 kernel on a notebook
with 4 G RAM.  0.5 to 1 G RAM is usually occupied just by kernel slab [1];  this
memory cannot be swapped, it cannot be released by means other than rebooting,
and there seems to be  _no_  adjustment to memory pressure.  I am surprised
that you have managed to boot a 3.6.* kernel at all with only 128 M RAM.

(2)  At the time when I used a PIII notebook, kernels 2.6.27 to 2.6.29 were
current.  Thus chances are good that a 2.6.27.* kernel will support the chipset,
PCI bus and devices of your notebook.  2.6.27 got long-term maintenance; the
latest release in the linux-stable git repository is 2.6.27.62.
So, a question to the LKML community:
Does anybody know an x86 distribution or live-CD using a 2.6.27.* kernel?


[1]  Picture described in my LKML message
Date:  Fri, 20 Jan 2012 01:08:00 +0100
Subject:  Re: [kmemleak report 1/2] kernel 3.1.6, x86_64: mm, xfs ?, vfs ?
remained the same with  _every_  3.1.* and 3.2.* kernel tried so far.


-- 
Roland

[-- Attachment #2: Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused
       [not found] <1353851735.22969.18.camel@soupermouf>
@ 2012-11-25 19:30 ` Dimitrios Apostolou
  2012-11-25 22:59   ` Roland Eggner
  2012-12-06 14:20   ` Jan Kara
  0 siblings, 2 replies; 18+ messages in thread
From: Dimitrios Apostolou @ 2012-11-25 19:30 UTC (permalink / raw)
  To: linux-kernel

On Sun, 2012-11-25 at 15:55 +0200, Dimitrios Apostolou wrote:
> on an old PIII-500MHz laptop, 128MB RAM, kernel 3.6.6, I started a
> backup process (tar|xz -4, nice'd and ionice'd -c3) from ext4 on local
> ATA disk to ext3 on external USB disk (USB-2.0 port on PCMCIA card).
> Even though earlier system load was minimal, free memory was plenty, the
> system now is unresponsive and is thrashing the disk, but the swapfile
> is rarely touched.

I'm now having the same experience even though I replaced xz (which
needed ~50MB RAM) with gzip. Even though I feel the realtime root shell
is a bit more responsive than before, the OOM killer is out killing 
small processes like syslog-ng and systemd-logind... The
ext4_inode_cache slab is taking almost all my memory (117MB). Please
advise!


Thanks,
Dimitris
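
A rough way to put a number on a slab cache such as ext4_inode_cache is to
multiply the object count by the object size from its /proc/slabinfo line.
The sketch below assumes the slabinfo 2.1 column order (name, active_objs,
num_objs, objsize, objperslab, pagesperslab, ...); the function name is
hypothetical and the result ignores per-slab overhead, so treat it as a
lower-bound estimate rather than an exact figure.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Estimate the bytes held by one slab cache from a single
 * /proc/slabinfo (version 2.1) line, as num_objs * objsize.
 * Returns -1 if the line does not start with a name followed by
 * five unsigned numbers (for example, the header line).
 */
static long long slab_object_bytes(const char *line)
{
	char name[64];
	unsigned long active_objs, num_objs, objsize;
	unsigned long objperslab, pagesperslab;

	assert(line != NULL);
	if (sscanf(line, "%63s %lu %lu %lu %lu %lu", name, &active_objs,
		   &num_objs, &objsize, &objperslab, &pagesperslab) != 6)
		return -1;

	return (long long)num_objs * (long long)objsize;
}
```

For a made-up sample line such as
"ext4_inode_cache 1000 1200 880 9 2 : tunables 0 0 0 : slabdata 134 134 0"
the function returns 1200 * 880 = 1056000 bytes. Reading /proc/slabinfo
itself usually requires root, which is why the sketch takes the line as a
string.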




^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2012-12-07 15:26 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-11-25 15:03 backing up ext4 fs, system unresponsive, thrashing like crazy even though swap is unused Dimitrios Apostolou
2012-12-02 12:44 ` Dimitrios Apostolou
2012-12-02 22:50   ` Roland Eggner
2012-12-02 23:56     ` Dimitrios Apostolou
2012-12-03 17:43       ` Theodore Ts'o
2012-12-03 18:47         ` Eric Paris
2012-12-03 19:35           ` Dimitrios Apostolou
2012-12-03 20:00             ` Dimitrios Apostolou
2012-12-03 18:03       ` Roland Eggner
2012-12-03 19:25         ` Dimitrios Apostolou
     [not found] <1353851735.22969.18.camel@soupermouf>
2012-11-25 19:30 ` Dimitrios Apostolou
2012-11-25 22:59   ` Roland Eggner
2012-11-25 23:56     ` Alan Cox
2012-11-26  3:11       ` Roland Eggner
2012-12-06 14:20   ` Jan Kara
2012-12-06 15:15     ` Dimitrios Apostolou
2012-12-06 16:43       ` Jan Kara
2012-12-07 15:26         ` Dimitrios Apostolou
