linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
       [not found] <bug-196157-27@https.bugzilla.kernel.org/>
@ 2017-06-22 19:37 ` Andrew Morton
  2017-06-22 20:58   ` Alkis Georgopoulos
  2017-06-23  7:13   ` Michal Hocko
  0 siblings, 2 replies; 18+ messages in thread
From: Andrew Morton @ 2017-06-22 19:37 UTC (permalink / raw)
  To: linux-mm
  Cc: bugzilla-daemon, alkisg, Michal Hocko, Mel Gorman, Johannes Weiner


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

hm, that's news to me.

Does anyone have access to a large i386 setup?  Interested in
reproducing this and figuring out what's going wrong?


On Thu, 22 Jun 2017 06:25:49 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=196157
> 
>             Bug ID: 196157
>            Summary: 100+ times slower disk writes on 4.x+/i386/16+RAM,
>                     compared to 3.x
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 4.x
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: alkisg@gmail.com
>         Regression: No
> 
> A lot of other users and I have an issue where disk writes start fast (e.g.
> 200 MB/sec), but after intensive disk usage, they end up 100+ times slower
> (e.g. 2 MB/sec), and never get fast again until we run "echo 3 >
> /proc/sys/vm/drop_caches".
> 
> This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
> It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any 64bit
> kernels (i.e. it only affects i386).
> 
> My initial bug report was in Ubuntu:
> https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118
> 
> I included a test case there, which mostly says "Copy /lib around 100 times.
> You'll see that the first copy happens in 5 seconds, and the 30th copy may need
> more than 800 seconds".
> 
> Here is my latest version of the script (basically, the (3) step below):
> 1) . /etc/os-release; echo -n "$VERSION, $(uname -r), $(dpkg --print-architecture), RAM="; awk '/MemTotal:/ { print $2 }' /proc/meminfo
> 2) mount /dev/sdb2 /mnt && rm -rf /mnt/tmp/lib && mkdir -p /mnt/tmp/lib && sync && echo 3 > /proc/sys/vm/drop_caches && chroot /mnt
> 3) mkdir -p /tmp/lib; cd /tmp/lib; s=/lib; d=1; echo -n "Copying $s to $d: "; while /usr/bin/time -f %e sh -c "cp -a '$s' '$d'; sync"; do s=$d; d=$((($d+1)%100)); echo -n "Copying $s to $d: "; done
> 
> And here are some results, where you can see that all 4.x+ i386 kernels are
> affected:
> -----------------------------------------------------------------------------
> 14.04, Trusty Tahr, 3.13.0-24-generic, i386, RAM=16076400 [Live CD]
> 8-13 secs
> 
> 15.04 (Vivid Vervet), 3.19.0-15-generic, i386, RAM=16083080 [Live CD]
> 5-7 secs
> 
> 15.10 (Wily Werewolf), 4.2.0-16-generic, i386, RAM=16082536 [Live CD]
> 4-350 secs
> 
> 16.04.2 LTS (Xenial Xerus), 3.19.0-80-generic, i386, RAM=16294832 [HD install]
> 10-25 secs
> 
> 16.04.2 LTS (Xenial Xerus), 4.2.0-42-generic, i386, RAM=16294392 [HD install]
> 14-89 secs
> 
> 16.04.2 LTS (Xenial Xerus), 4.4.0-79-generic, i386, RAM=16293556 [HD install]
> 15-605 secs
> 
> 16.04.2 LTS (Xenial Xerus), 4.8.0-54-generic, i386, RAM=16292708 [HD install]
> 6-160 secs
> 
> 16.04.2 LTS (Xenial Xerus), 4.12.0-041200rc5-generic, i386, RAM=16292588 [HD
> install]
> 46-805 secs
> 
> 16.04.2 LTS (Xenial Xerus), 4.8.0-36-generic, amd64, RAM=16131028 [Live CD]
> 4-11 secs
> 
> An example single run of the script:
> -----------------------------------------------------------------------------
> 16.04.2 LTS (Xenial Xerus), 4.8.0-54-generic, i386, RAM=16292708 [HD install]
> -----------------------------------------------------------------------------
> Copying /lib to 1: 37.23
> Copying 1 to 2: 6.74
> Copying 2 to 3: 6.88
> Copying 3 to 4: 7.89
> Copying 4 to 5: 7.91
> Copying 5 to 6: 9.03
> Copying 6 to 7: 8.46
> Copying 7 to 8: 8.10
> Copying 8 to 9: 8.93
> Copying 9 to 10: 10.51
> Copying 10 to 11: 10.33
> Copying 11 to 12: 11.08
> Copying 12 to 13: 11.78
> Copying 13 to 14: 14.18
> Copying 14 to 15: 18.42
> Copying 15 to 16: 23.19
> Copying 16 to 17: 61.08
> Copying 17 to 18: 155.88
> Copying 18 to 19: 141.96
> Copying 19 to 20: 152.98
> Copying 20 to 21: 163.03
> Copying 21 to 22: 154.85
> Copying 22 to 23: 137.13
> Copying 23 to 24: 146.08
> Copying 24 to 25:
> 
> Thank you!
> 
> -- 
> You are receiving this mail because:
> You are the assignee for the bug.
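The three-step recipe quoted above can be folded into one self-contained sketch (the `copy_bench` helper name and the `date`-based timing are ours; the report itself used `/usr/bin/time -f %e`):

```shell
#!/bin/sh
# copy_bench SRC PASSES WORKDIR: repeatedly copy a tree, timing each pass,
# mirroring step (3) of the report. On an affected kernel the per-pass time
# grows from seconds to hundreds of seconds; on a healthy one it stays flat.
copy_bench() {
    src=$1; passes=$2; work=$3
    mkdir -p "$work" && cd "$work" || return 1
    s=$src; d=1; i=0
    while [ "$i" -lt "$passes" ]; do
        t0=$(date +%s)
        cp -a "$s" "$d" && sync
        t1=$(date +%s)
        echo "Copying $s to $d: $(( t1 - t0 )) s"
        s=$d
        d=$(( (d + 1) % 100 ))
        i=$(( i + 1 ))
    done
}
```

The report's run corresponds to `copy_bench /lib 100 /tmp/lib` inside the chroot from step (2).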

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-22 19:37 ` [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x Andrew Morton
@ 2017-06-22 20:58   ` Alkis Georgopoulos
  2017-06-23  7:13   ` Michal Hocko
  1 sibling, 0 replies; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-22 20:58 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: bugzilla-daemon, Michal Hocko, Mel Gorman, Johannes Weiner

On 22/06/2017 10:37 pm, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> hm, that's news to me.
> 
> Does anyone have access to a large i386 setup?  Interested in
> reproducing this and figuring out what's going wrong?
> 


I can arrange ssh/vnc access to an i386 box with 16 GB RAM that has the 
issue, if some kernel dev wants to work on that. Please PM me for 
details - also tell me your preferred distro.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-22 19:37 ` [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x Andrew Morton
  2017-06-22 20:58   ` Alkis Georgopoulos
@ 2017-06-23  7:13   ` Michal Hocko
  2017-06-23  7:44     ` Alkis Georgopoulos
  1 sibling, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2017-06-23  7:13 UTC (permalink / raw)
  To: Andrew Morton, Alkis Georgopoulos
  Cc: linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On Thu 22-06-17 12:37:36, Andrew Morton wrote:
[...]
> > A lot of other users and I have an issue where disk writes start fast (e.g.
> > 200 MB/sec), but after intensive disk usage, they end up 100+ times slower
> > (e.g. 2 MB/sec), and never get fast again until we run "echo 3 >
> > /proc/sys/vm/drop_caches".

What is your dirty limit configuration? Is your highmem dirtyable
(highmem_is_dirtyable)?
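These knobs can be read straight from procfs (a quick sketch; `highmem_is_dirtyable` only exists on highmem, i.e. 32-bit, kernels, so missing files are skipped):

```shell
# Dump the dirty-throttling knobs in question.
for f in dirty_ratio dirty_background_ratio dirty_bytes dirty_background_bytes \
         highmem_is_dirtyable; do
    p=/proc/sys/vm/$f
    if [ -e "$p" ]; then
        echo "$f = $(cat "$p")"
    fi
done
```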

> > This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
> > It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any 64bit
> > kernels (i.e. it only affects i386).

I remember we've had some changes in the way dirty memory is
throttled, and 32b would be more sensitive to those changes. Anyway, I
would _strongly_ discourage you from using 32b kernels with that much
memory. You are going to hit walls constantly and many of those issues
will be inherent. Some of them less so, but rather non-trivial to fix
without regressing somewhere else. You can tune your system somewhat, but
this will be fragile no matter what.

Sorry to say that, but 32b systems with tons of memory are far from a
priority for most mm people. Just use a 64b kernel. There are more pressing
problems to deal with.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-23  7:13   ` Michal Hocko
@ 2017-06-23  7:44     ` Alkis Georgopoulos
  2017-06-23 11:38       ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-23  7:44 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton
  Cc: linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On 23/06/2017 10:13 am, Michal Hocko wrote:
> On Thu 22-06-17 12:37:36, Andrew Morton wrote:
> 
> What is your dirty limit configuration? Is your highmem dirtyable
> (highmem_is_dirtyable)?
> 
>>> This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
>>> It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any 64bit
>>> kernels (i.e. it only affects i386).
> 
> I remember we've had some changes in the way how the dirty memory is
> throttled and 32b would be more sensitive to those changes. Anyway, I
> would _strongly_ discourage you from using 32b kernels with that much of
> memory. You are going to hit walls constantly and many of those issues
> will be inherent. Some of them less so but rather non-trivial to fix
> without regressing somewhere else. You can tune your system somehow but
> this will be fragile no matter what.
> 
> Sorry to say that but 32b systems with tons of memory are far from
> priority of most mm people. Just use 64b kernel. There are more pressing
> problems to deal with.
> 



Hi, I'm attaching below all my settings from /proc/sys/vm.

I think that the regression also affects 4 GB and 8 GB RAM i386 systems, 
but not in an exponential manner; i.e. copies there appear only 2-3 
times slower than they used to be in 3.x kernels.

Now I don't know the kernel internals, but if disk copies show up to be 
2-3 times slower, and the regression is in memory management, wouldn't 
that mean that the memory management is *hundreds* of times slower, to 
show up in disk writing benchmarks?

I.e. I'm afraid that this regression doesn't affect 16+ GB RAM systems 
only; it just happens that it's clearly visible there.

And it might even affect 64bit systems with even more RAM; but I don't 
have any such system to test with.

Kind regards,
Alkis


root@pc:/proc/sys/vm# grep . *
admin_reserve_kbytes:8192
block_dump:0
compact_unevictable_allowed:1
dirty_background_bytes:0
dirty_background_ratio:10
dirty_bytes:0
dirty_expire_centisecs:1500
dirty_ratio:20
dirtytime_expire_seconds:43200
dirty_writeback_centisecs:1500
drop_caches:3
extfrag_threshold:500
highmem_is_dirtyable:0
hugepages_treat_as_movable:0
hugetlb_shm_group:0
laptop_mode:0
legacy_va_layout:0
lowmem_reserve_ratio:256	32	32
max_map_count:65530
min_free_kbytes:34420
mmap_min_addr:65536
mmap_rnd_bits:8
nr_hugepages:0
nr_overcommit_hugepages:0
nr_pdflush_threads:0
oom_dump_tasks:1
oom_kill_allocating_task:0
overcommit_kbytes:0
overcommit_memory:0
overcommit_ratio:50
page-cluster:3
panic_on_oom:0
percpu_pagelist_fraction:0
stat_interval:1
swappiness:60
user_reserve_kbytes:131072
vdso_enabled:1
vfs_cache_pressure:100
watermark_scale_factor:10

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-23  7:44     ` Alkis Georgopoulos
@ 2017-06-23 11:38       ` Michal Hocko
  2017-06-26  5:28         ` Alkis Georgopoulos
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2017-06-23 11:38 UTC (permalink / raw)
  To: Alkis Georgopoulos
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On Fri 23-06-17 10:44:36, Alkis Georgopoulos wrote:
> On 23/06/2017 10:13 am, Michal Hocko wrote:
> >On Thu 22-06-17 12:37:36, Andrew Morton wrote:
> >
> >What is your dirty limit configuration? Is your highmem dirtyable
> >(highmem_is_dirtyable)?
> >
> >>>This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
> >>>It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any 64bit
> >>>kernels (i.e. it only affects i386).
> >
> >I remember we've had some changes in the way how the dirty memory is
> >throttled and 32b would be more sensitive to those changes. Anyway, I
> >would _strongly_ discourage you from using 32b kernels with that much of
> >memory. You are going to hit walls constantly and many of those issues
> >will be inherent. Some of them less so but rather non-trivial to fix
> >without regressing somewhere else. You can tune your system somehow but
> >this will be fragile no matter what.
> >
> >Sorry to say that but 32b systems with tons of memory are far from
> >priority of most mm people. Just use 64b kernel. There are more pressing
> >problems to deal with.
> >
> 
> 
> 
> Hi, I'm attaching below all my settings from /proc/sys/vm.
> 
> I think that the regression also affects 4 GB and 8 GB RAM i386 systems, but
> not in an exponential manner; i.e. copies there appear only 2-3 times
> slower than they used to be in 3.x kernels.

If the regression shows with 4-8GB 32b systems then the priority for
fixing would certainly be much higher.

> Now I don't know the kernel internals, but if disk copies show up to be 2-3
> times slower, and the regression is in memory management, wouldn't that mean
> that the memory management is *hundreds* of times slower, to show up in disk
> writing benchmarks?

Well, it is hard to judge what the real problem is here, but you have
to realize that a 32b system has some fundamental issues which come from
how the memory is split between kernel (lowmem - 896MB at maximum) and
highmem. The more memory you have, the more lowmem is consumed by kernel
data structures. Just consider that ~160MB of this space is eaten by
struct pages to describe 16GB of memory. There are other data structures
which can only live in the low memory.
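As a sanity check on that figure (4 KiB pages and ~40 bytes per struct page are assumed here; the exact size depends on kernel version and config):

```shell
# One struct page per 4 KiB page frame of the 16 GB machine.
pages=$(( 16 * 1024 * 1024 * 1024 / 4096 ))   # page frames in 16 GB
overhead_mb=$(( pages * 40 / 1024 / 1024 ))   # ~40 bytes per struct page
echo "struct page overhead for 16 GB: ~${overhead_mb} MB of lowmem"
```

That reproduces the ~160 MB quoted above, i.e. nearly a fifth of the 896 MB lowmem window gone before anything else is allocated.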

> I.e. I'm afraid that this regression doesn't affect 16+ GB RAM systems only;
> it just happens that it's clearly visible there.
> 
> And it might even affect 64bit systems with even more RAM; but I don't have
> any such system to test with.

Not really. 64b systems do not need the kernel/userspace split because the
address space is large enough. If there are any regressions since 3.0 then
we are certainly interested in hearing about them.
 
> root@pc:/proc/sys/vm# grep . *
> dirty_ratio:20
> highmem_is_dirtyable:0

this means that the highmem is not dirtyable, and so only 20% of the free
lowmem (+ page cache in that region) is considered, and writers might
get throttled quite early (this might be a really low number when the
lowmem is already congested). Do you see the same problem when enabling
highmem_is_dirtyable = 1?
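A rough comparison of the two regimes (the 896 MB i386 lowmem ceiling and the reporter's 16 GB are assumed; the kernel's real calculation also subtracts reserved pages and adds the dirtyable zones' page cache, so these are only upper bounds):

```shell
dirty_ratio=20   # from the reporter's /proc/sys/vm dump
lowmem_mb=896    # maximum lowmem on i386
total_mb=16384   # ~16 GB total RAM
echo "lowmem-only dirty threshold: ~$(( lowmem_mb * dirty_ratio / 100 )) MB"
echo "all-memory dirty threshold:  ~$(( total_mb * dirty_ratio / 100 )) MB"
```

Roughly 179 MB versus 3276 MB of allowed dirty data, which is consistent with writers being throttled much earlier when only lowmem counts.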
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-23 11:38       ` Michal Hocko
@ 2017-06-26  5:28         ` Alkis Georgopoulos
  2017-06-26  5:46           ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-26  5:28 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On 23/06/2017 02:38 pm, Michal Hocko wrote:
> this means that the highmem is not dirtyable and so only 20% of the free
> lowmem (+ page cache in that region) is considered and writers might
> get throttled quite early (this might be a really low number when the
> lowmem is congested already). Do you see the same problem when enabling
> highmem_is_dirtyable = 1?
> 

Excellent advice! :)
Indeed, setting highmem_is_dirtyable=1 completely eliminates the issue!
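For anyone reproducing this, the toggle follows standard sysctl mechanics (a sketch; the suggested drop-in file name is ours):

```shell
# Runtime toggle (root required; reverts on reboot):
#   echo 1 > /proc/sys/vm/highmem_is_dirtyable
# or equivalently:
#   sysctl -w vm.highmem_is_dirtyable=1
#
# To persist across reboots, e.g. in /etc/sysctl.d/99-highmem.conf:
#   vm.highmem_is_dirtyable = 1
```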

Is that something that should be =1 by default, i.e. should I notify the 
Ubuntu developers that the defaults they ship aren't appropriate,
or is it something that only owners of 16+ GB RAM systems should adjust in 
their local configuration?

Thanks a lot!
Results of 2 test runs, with highmem_is_dirtyable=0 and 1:

1) echo 0 > highmem_is_dirtyable:
-----------------------------------------------------------------------------
16.04.2 LTS (Xenial Xerus), 4.8.0-56-generic, i386, RAM=16292548
-----------------------------------------------------------------------------
Copying /lib to 1: 18.60
Copying 1 to 2: 6.09
Copying 2 to 3: 6.04
Copying 3 to 4: 7.04
Copying 4 to 5: 6.28
Copying 5 to 6: 5.03
Copying 6 to 7: 6.50
Copying 7 to 8: 4.82
Copying 8 to 9: 5.49
Copying 9 to 10: 5.88
Copying 10 to 11: 5.09
Copying 11 to 12: 5.70
Copying 12 to 13: 5.19
Copying 13 to 14: 4.55
Copying 14 to 15: 4.69
Copying 15 to 16: 4.76
Copying 16 to 17: 5.38
Copying 17 to 18: 4.59
Copying 18 to 19: 4.26
Copying 19 to 20: 4.47
Copying 20 to 21: 4.32
Copying 21 to 22: 4.33
Copying 22 to 23: 5.55
Copying 23 to 24: 4.73
Copying 24 to 25: 4.80
Copying 25 to 26: 5.06
Copying 26 to 27: 16.84
Copying 27 to 28: 5.28
Copying 28 to 29: 5.45
Copying 29 to 30: 12.35
Copying 30 to 31: 5.90
Copying 31 to 32: 4.90
Copying 32 to 33: 4.76
Copying 33 to 34: 4.37
Copying 34 to 35: 5.82
Copying 35 to 36: 4.55
Copying 36 to 37: 8.80
Copying 37 to 38: 5.07
Copying 38 to 39: 5.69
Copying 39 to 40: 4.88
Copying 40 to 41: 5.26
Copying 41 to 42: 4.69
Copying 42 to 43: 5.10
Copying 43 to 44: 4.79
Copying 44 to 45: 4.54
Copying 45 to 46: 7.46
Copying 46 to 47: 5.54
Copying 47 to 48: 4.86
Copying 48 to 49: 6.12
Copying 49 to 50: 5.37
Copying 50 to 51: 7.63
Copying 51 to 52: 6.37
Copying 52 to 53: 5.81
...

2) echo 1 > highmem_is_dirtyable:
-----------------------------------------------------------------------------
16.04.2 LTS (Xenial Xerus), 4.8.0-56-generic, i386, RAM=16292548
-----------------------------------------------------------------------------
Copying /lib to 1: 21.47
Copying 1 to 2: 5.54
Copying 2 to 3: 6.63
Copying 3 to 4: 4.69
Copying 4 to 5: 5.38
Copying 5 to 6: 8.50
Copying 6 to 7: 9.34
Copying 7 to 8: 8.78
Copying 8 to 9: 9.48
Copying 9 to 10: 10.89
Copying 10 to 11: 10.52
Copying 11 to 12: 11.28
Copying 12 to 13: 14.70
Copying 13 to 14: 17.71
Copying 14 to 15: 52.43
Copying 15 to 16: 92.52
...

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-26  5:28         ` Alkis Georgopoulos
@ 2017-06-26  5:46           ` Michal Hocko
  2017-06-26  7:02             ` Alkis Georgopoulos
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2017-06-26  5:46 UTC (permalink / raw)
  To: Alkis Georgopoulos
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On Mon 26-06-17 08:28:07, Alkis Georgopoulos wrote:
> On 23/06/2017 02:38 pm, Michal Hocko wrote:
> >this means that the highmem is not dirtyable and so only 20% of the free
> >lowmem (+ page cache in that region) is considered and writers might
> >get throttled quite early (this might be a really low number when the
> >lowmem is congested already). Do you see the same problem when enabling
> >highmem_is_dirtyable = 1?
> >
> 
> Excellent advice! :)
> Indeed, setting highmem_is_dirtyable=1 completely eliminates the issue!
> 
> Is that something that should be =1 by default,

Unfortunately, this is not something that can be applied in general.
It can lead to premature OOM killer invocations. E.g. a direct write
to a block device cannot use highmem, yet there won't be anything to
throttle those writes properly. Unfortunately, our documentation is
silent about this setting. I will post a patch later.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-26  5:46           ` Michal Hocko
@ 2017-06-26  7:02             ` Alkis Georgopoulos
  2017-06-26  9:12               ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-26  7:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On 26/06/2017 08:46 am, Michal Hocko wrote:
> Unfortunately, this is not something that can be applied in general.
> It can lead to premature OOM killer invocations. E.g. a direct write
> to the block device cannot use highmem, yet there won't be anything to
> throttle those writes properly. Unfortunately, our documentation is
> silent about this setting. I will post a patch later.


I should also note that highmem_is_dirtyable was 0 in all the 3.x kernel 
tests that I did; yet they didn't have the "slow disk writes" issue.

I.e. I think that setting highmem_is_dirtyable=1 works around the issue, 
but is not the exact point which caused the regression that we see in 
4.x kernels...

--
Kind regards,
Alkis Georgopoulos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-26  7:02             ` Alkis Georgopoulos
@ 2017-06-26  9:12               ` Michal Hocko
  2017-06-29  6:14                 ` Alkis Georgopoulos
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2017-06-26  9:12 UTC (permalink / raw)
  To: Alkis Georgopoulos
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On Mon 26-06-17 10:02:23, Alkis Georgopoulos wrote:
> On 26/06/2017 08:46 am, Michal Hocko wrote:
> >Unfortunately, this is not something that can be applied in general.
> >It can lead to premature OOM killer invocations. E.g. a direct write
> >to the block device cannot use highmem, yet there won't be anything to
> >throttle those writes properly. Unfortunately, our documentation is
> >silent about this setting. I will post a patch later.
> 
> 
> I should also note that highmem_is_dirtyable was 0 in all the 3.x kernel
> tests that I did; yet they didn't have the "slow disk writes" issue.

Yes, this is possible. There were some changes in the dirty memory
throttling that could lead to visible behavior changes. I remember that
ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory") had a
noticeable effect. The patch is something that we really want, and it is
unfortunate that it has eaten some more of the dirtyable lowmem.

> I.e. I think that setting highmem_is_dirtyable=1 works around the issue, but
> is not the exact point which caused the regression that we see in 4.x
> kernels...

Yes, as I've said, this is a workaround for something that is an
inherent 32b lowmem/highmem issue.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-26  9:12               ` Michal Hocko
@ 2017-06-29  6:14                 ` Alkis Georgopoulos
  2017-06-29  7:16                   ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-29  6:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

I've been working on a system with highmem_is_dirtyable=1 for a couple
of hours.

While the disk benchmark showed no performance hit on intense disk
activity, there are other serious problems that make this workaround
unusable.

I.e. when there's intense disk activity, the mouse cursor moves with
extreme lag, like 1-2 fps. Switching with alt+tab from e.g. thunderbird
to pidgin needs 10 seconds. kswapd hits 100% cpu usage. Etc etc, the
system becomes unusable until the disk activity settles down.
I was testing via SSH so I hadn't noticed the extreme lag.

All those symptoms go away when resetting highmem_is_dirtyable=0.

So currently 32bit installations with 16 GB RAM have no option but to
remove the extra RAM...


About ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory"),
would it make sense for me to compile a kernel and test if everything
works fine without it? I.e. if we see that this caused all those
regressions, would it be revisited?

And an unrelated idea, is there any way to tell linux to use a limited
amount of RAM for page cache, e.g. only 1 GB?

Kind regards,
Alkis Georgopoulos

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-29  6:14                 ` Alkis Georgopoulos
@ 2017-06-29  7:16                   ` Michal Hocko
  2017-06-29  8:02                     ` Alkis Georgopoulos
  0 siblings, 1 reply; 18+ messages in thread
From: Michal Hocko @ 2017-06-29  7:16 UTC (permalink / raw)
  To: Alkis Georgopoulos
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On Thu 29-06-17 09:14:55, Alkis Georgopoulos wrote:
> I've been working on a system with highmem_is_dirtyable=1 for a couple
> of hours.
> 
> While the disk benchmark showed no performance hit on intense disk
> activity, there are other serious problems that make this workaround
> unusable.
> 
> I.e. when there's intense disk activity, the mouse cursor moves with
> extreme lag, like 1-2 fps. Switching with alt+tab from e.g. thunderbird
> to pidgin needs 10 seconds. kswapd hits 100% cpu usage. Etc etc, the
> system becomes unusable until the disk activity settles down.
> I was testing via SSH so I hadn't noticed the extreme lag.
> 
> All those symptoms go away when resetting highmem_is_dirtyable=0.
> 
> So currently 32bit installations with 16 GB RAM have no option but to
> remove the extra RAM...

Or simply install a 64b kernel. You can keep 32b userspace if you need
it, but running a 32b kernel will always be a fight.
 
> About ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory"),
> would it make sense for me to compile a kernel and test if everything
> works fine without it? I.e. if we see that this caused all those
> regressions, would it be revisited?

The patch makes a lot of sense in general. I do not think we will revert
it based on a configuration which is rare. We might come up with some
tweaks in the dirty memory throttling, but that area is quite tricky
already. You can of course try to test without this commit applied (I
believe you would have to check out ab8fabd46f81 and revert the
commit, because a later revert sounds more complicated to me. I might be
wrong here because I haven't tried it myself though).

> And an unrelated idea, is there any way to tell linux to use a limited
> amount of RAM for page cache, e.g. only 1 GB?

No.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-29  7:16                   ` Michal Hocko
@ 2017-06-29  8:02                     ` Alkis Georgopoulos
  2018-04-19 20:36                       ` Andrew Morton
  0 siblings, 1 reply; 18+ messages in thread
From: Alkis Georgopoulos @ 2017-06-29  8:02 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, linux-mm, bugzilla-daemon, Mel Gorman, Johannes Weiner

On 29/06/2017 10:16 am, Michal Hocko wrote:
> 
> Or simply install a 64b kernel. You can keep 32b userspace if you need
> it, but running a 32b kernel will always be a fight.

Results with 64bit kernel on 32bit userspace:
16.04.2 LTS (Xenial Xerus), 4.4.0-83-generic, i386, RAM=16131400
Copying /lib to 1: 27.00
Copying 1 to 2: 9.37
Copying 2 to 3: 8.80
Copying 3 to 4: 9.13
Copying 4 to 5: 9.25
Copying 5 to 6: 8.08
Copying 6 to 7: 8.00
Copying 7 to 8: 8.85
Copying 8 to 9: 8.67
Copying 9 to 10: 8.55
Copying 10 to 11: 8.67
Copying 11 to 12: 8.15
Copying 12 to 13: 7.57
Copying 13 to 14: 8.05
Copying 14 to 15: 8.22
Copying 15 to 16: 8.35
Copying 16 to 17: 8.50
Copying 17 to 18: 8.30
Copying 18 to 19: 7.97
Copying 19 to 20: 7.81
Copying 20 to 21: 7.11
Copying 21 to 22: 8.20
Copying 22 to 23: 7.54
Copying 23 to 24: 7.96
Copying 24 to 25: 8.04
Copying 25 to 26: 7.87
Copying 26 to 27: 7.70
Copying 27 to 28: 8.33
Copying 28 to 29: 6.88
Copying 29 to 30: 7.18

It doesn't have the 32bit slowness issue, and it's "only" 2 times slower
than the full 64bit installation (so maybe there's an additional delay
involved somewhere in userspace)...
...but it's also hard to set up (e.g. Ubuntu doesn't allow the 4.8 32bit
kernel to coexist with 4.8 64bit because they have the same file names;
so the 64 bit kernel needs to be 4.4),
and it doesn't run some applications, e.g. VirtualBox or proprietary
nvidia drivers...


Thank you very much for your continuous input on this, we'll see what we
can do to locally avoid the issue, probably just tell sysadmins to avoid
using -pae with more than 8 GB RAM.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2017-06-29  8:02                     ` Alkis Georgopoulos
@ 2018-04-19 20:36                       ` Andrew Morton
  0 siblings, 0 replies; 18+ messages in thread
From: Andrew Morton @ 2018-04-19 20:36 UTC (permalink / raw)
  To: Alkis Georgopoulos
  Cc: Michal Hocko, linux-mm, bugzilla-daemon, Mel Gorman,
	Johannes Weiner, reserv0


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

https://bugzilla.kernel.org/show_bug.cgi?id=196157

People are still hurting from this.  It does seem a pretty major
regression for highmem machines.

I'm surprised that we aren't hearing about this from distros.  Maybe it
only affects a subset of highmem machines?

Anyway, can we please take another look at it?  Seems that we messed up
highmem dirty pagecache handling in the 4.2 timeframe.

Thanks.


* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2018-08-17 11:29 ` Thierry
@ 2018-08-17 14:46   ` Michal Hocko
  0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2018-08-17 14:46 UTC (permalink / raw)
  To: Thierry
  Cc: Alkis Georgopoulos, Andrew Morton, linux-mm, bugzilla-daemon,
	Mel Gorman, Johannes Weiner

On Fri 17-08-18 11:29:45, Thierry wrote:
> On Fri, 8/17/18, Michal Hocko <mhocko@kernel.org> wrote:
> 
> > Have you tried to set highmem_is_dirtyable as suggested elsewhere?
> 
> I tried everything, and yes, that too, to no avail. The only solution is to limit
> the available RAM to less than 12 GB, which is just unacceptable for me.
>  
> > I would like to stress that 16GB with 32b kernels doesn't play really nice.
> 
> I would like to stress that 32 GB of RAM played totally nice and very smoothly
> with v4.1 and older kernels... This got broken in v4.2 and has never been repaired
> since. This is a very nasty regression, and my suggestion to keep v4.1 maintained
> until that regression was finally worked around fell on deaf ears...
> 
> > Even small changes (larger kernel memory footprint) can lead to all sorts of
> > problems. I would really recommend using 64b kernels instead. There shouldn't be
> > any real reason to stick with a 32b highmem-based kernel for such a large beast.
> > I strongly doubt the cpu itself would be 32b only.
> 
> The reasons are many (one of them being the ability to run old 32-bit
> Linux distros, but without the bugs and security flaws of old, unmaintained kernels).

You can easily run 32b distribution on top of 64b kernels.

> But the reasons are not the problem here. The problem is that v4.2 introduced a
> bug (*) that was never fixed since.
> 
> A shame, really. :-(

Well. I guess nobody is disputing that this is really annoying. I do
agree! On the other hand, nobody came up with an acceptable solution. I
would love to dive into solving this, but there are so many other things
to work on with much higher priority. Really, my todo list is huge and
growing. 32b kernels with that much memory are simply not all that high
on that list, because there is a clear possibility of running a 64b
kernel on the hardware which supports it.

I fully understand your frustration and feel sorry about that, but there
are only so many of us working on this subsystem. If you are willing to
dive into this, then by all means do. I am pretty sure you will find words
of help and support, but be warned this is not really trivial.

good luck
-- 
Michal Hocko
SUSE Labs


* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
       [not found] <1978465524.8206495.1534505385491.ref@mail.yahoo.com>
@ 2018-08-17 11:29 ` Thierry
  2018-08-17 14:46   ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Thierry @ 2018-08-17 11:29 UTC (permalink / raw)
  To: Thierry, Michal Hocko
  Cc: Alkis Georgopoulos, Andrew Morton, linux-mm, bugzilla-daemon,
	Mel Gorman, Johannes Weiner

On Fri, 8/17/18, Michal Hocko <mhocko@kernel.org> wrote:

> Have you tried to set highmem_is_dirtyable as suggested elsewhere?

I tried everything, and yes, that too, to no avail. The only solution is to limit
the available RAM to less than 12 GB, which is just unacceptable for me.
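For reference, one common way to cap usable RAM like this is the `mem=`
kernel boot parameter; a minimal GRUB sketch (the 12G value just mirrors
the threshold mentioned above, it is not a recommendation):

```shell
# Sketch: cap usable RAM via the mem= boot parameter.
# In /etc/default/grub:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash mem=12G"
# Then regenerate the GRUB config and reboot:
#   sudo update-grub && sudo reboot
```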
 
> I would like to stress that 16GB with 32b kernels doesn't play really nice.

I would like to stress that 32 GB of RAM played totally nice and very smoothly
with v4.1 and older kernels... This got broken in v4.2 and has never been repaired
since. This is a very nasty regression, and my suggestion to keep v4.1 maintained
until that regression was finally worked around fell on deaf ears...

> Even small changes (larger kernel memory footprint) can lead to all sorts of
> problems. I would really recommend using 64b kernels instead. There shouldn't be
> any real reason to stick with a 32b highmem-based kernel for such a large beast.
> I strongly doubt the cpu itself would be 32b only.

The reasons are many (one of them being the ability to run old 32-bit
Linux distros, but without the bugs and security flaws of old, unmaintained kernels).

But the reasons are not the problem here. The problem is that v4.2 introduced a
bug (*) that was never fixed since.

A shame, really. :-(

(*) and that bug also affected 64-bit kernels at first, mind you, until v4.8.4
was released; see my comment in my initial report here:
https://bugzilla.kernel.org/show_bug.cgi?id=110031#c14


* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
  2018-08-17  9:01 ` Thierry
@ 2018-08-17  9:29   ` Michal Hocko
  0 siblings, 0 replies; 18+ messages in thread
From: Michal Hocko @ 2018-08-17  9:29 UTC (permalink / raw)
  To: Thierry
  Cc: Alkis Georgopoulos, Andrew Morton, linux-mm, bugzilla-daemon,
	Mel Gorman, Johannes Weiner

On Fri 17-08-18 09:01:41, Thierry wrote:
> Bug still present for 32-bit kernels in v4.18.1, and now v4.1 (the last
> working Linux kernel for 32-bit machines with 16 GB or more RAM) has
> gone unmaintained...

Have you tried to set highmem_is_dirtyable as suggested elsewhere?

I would like to stress that 16GB with 32b kernels doesn't play really
nice. Even small changes (larger kernel memory footprint) can lead to
all sorts of problems. I would really recommend using 64b kernels
instead. There shouldn't be any real reason to stick with a 32b
highmem-based kernel for such a large beast. I strongly doubt the cpu
itself would be 32b only.
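For anyone wanting to try the highmem_is_dirtyable suggestion: it is a
vm sysctl that only exists on highmem (32-bit) kernels. A minimal sketch
of setting it (file names are the standard sysctl locations):

```shell
# Sketch: allow highmem to be counted as dirtyable memory
# (32-bit highmem kernels only; the knob is absent on 64-bit kernels).
sysctl -w vm.highmem_is_dirtyable=1

# To persist across reboots, drop it into sysctl.d:
echo 'vm.highmem_is_dirtyable = 1' | sudo tee /etc/sysctl.d/99-highmem.conf
```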

-- 
Michal Hocko
SUSE Labs


* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
       [not found] <328204943.8183321.1534496501208.ref@mail.yahoo.com>
@ 2018-08-17  9:01 ` Thierry
  2018-08-17  9:29   ` Michal Hocko
  0 siblings, 1 reply; 18+ messages in thread
From: Thierry @ 2018-08-17  9:01 UTC (permalink / raw)
  To: Alkis Georgopoulos, Andrew Morton
  Cc: Michal Hocko, linux-mm, bugzilla-daemon, Mel Gorman,
	Johannes Weiner, reserv0

Bug still present for 32-bit kernels in v4.18.1, and now v4.1 (the last working Linux kernel for 32-bit machines with 16 GB or more RAM) has gone unmaintained...


* Re: [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
       [not found] <234273956.1792577.1524210929280.ref@mail.yahoo.com>
@ 2018-04-20  7:55 ` Thierry
  0 siblings, 0 replies; 18+ messages in thread
From: Thierry @ 2018-04-20  7:55 UTC (permalink / raw)
  To: Alkis Georgopoulos, Andrew Morton
  Cc: Michal Hocko, linux-mm, bugzilla-daemon, Mel Gorman,
	Johannes Weiner, reserv0

On Thu, 4/19/18, Andrew Morton <akpm@linux-foundation.org> wrote:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> https://bugzilla.kernel.org/show_bug.cgi?id=196157
>
> People are still hurting from this.  It does seem a pretty major
> regression for highmem machines.
>
> I'm surprised that we aren't hearing about this from distros.  Maybe it
> only affects a subset of highmem machines?

A supposition: could it only affect distros with a given glibc version (my
affected machines run glibc v2.13)?

Please also take note that I encountered this bug on the 64-bit flavor of the
same distro (Rosa 2012), on 64-bit capable machines, with Linux v4.2+ and
until Linux v4.8.4 was released (and another interesting fact is that another
64-bit distro on the same machines was not affected at all by that bug,
which would reinforce my suspicion of a glibc-triggered and
glibc-version-dependent bug).

> Anyway, can we please take another look at it?  Seems that we messed up
> highmem dirty pagecache handling in the 4.2 timeframe.

Oh, yes, please, do have a look! :-D

In the meantime, could you guys also consider extending the lifetime of the
v4.1 kernel until this ***showstopper*** bug is resolved in the mainline
kernel?

Many (many, many, many) thanks in advance!


end of thread, other threads:[~2018-08-17 14:46 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <bug-196157-27@https.bugzilla.kernel.org/>
2017-06-22 19:37 ` [Bug 196157] New: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x Andrew Morton
2017-06-22 20:58   ` Alkis Georgopoulos
2017-06-23  7:13   ` Michal Hocko
2017-06-23  7:44     ` Alkis Georgopoulos
2017-06-23 11:38       ` Michal Hocko
2017-06-26  5:28         ` Alkis Georgopoulos
2017-06-26  5:46           ` Michal Hocko
2017-06-26  7:02             ` Alkis Georgopoulos
2017-06-26  9:12               ` Michal Hocko
2017-06-29  6:14                 ` Alkis Georgopoulos
2017-06-29  7:16                   ` Michal Hocko
2017-06-29  8:02                     ` Alkis Georgopoulos
2018-04-19 20:36                       ` Andrew Morton
     [not found] <234273956.1792577.1524210929280.ref@mail.yahoo.com>
2018-04-20  7:55 ` Thierry
     [not found] <328204943.8183321.1534496501208.ref@mail.yahoo.com>
2018-08-17  9:01 ` Thierry
2018-08-17  9:29   ` Michal Hocko
     [not found] <1978465524.8206495.1534505385491.ref@mail.yahoo.com>
2018-08-17 11:29 ` Thierry
2018-08-17 14:46   ` Michal Hocko
