All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm: print a warning once the vm dirtiness settings is illogical
@ 2017-09-28  9:54 Yafang Shao
  2017-11-25 16:05 ` Tetsuo Handa
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-09-28  9:54 UTC (permalink / raw)
  To: akpm; +Cc: jack, mhocko, linux-mm, Yafang Shao

The vm direct limit setting must be set greater than vm background
limit setting.
Otherwise we will print a warning to help the operator to figure
out that the vm dirtiness settings is in illogical state.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 Documentation/sysctl/vm.txt | 7 +++++++
 mm/page-writeback.c         | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index 9baf66a..30fd16b 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -157,6 +157,10 @@ Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
 value lower than this limit will be ignored and the old configuration will be
 retained.
 
+Note: the value of dirty_bytes also must be set greater than
+dirty_background_bytes or the amount of memory corresponding to
+dirty_background_ratio.
+
 ==============================================================
 
 dirty_expire_centisecs
@@ -176,6 +180,9 @@ generating disk writes will itself start writing out dirty data.
 
 The total available memory is not equal to total system memory.
 
+Note: dirty_ratio must be set greater than dirty_background_ratio or
+ratio corresponding to dirty_background_bytes.
+
 ==============================================================
 
 dirty_writeback_centisecs
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 0b9c5cb..8b747dd 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -433,8 +433,11 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
 	else
 		bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
 
-	if (bg_thresh >= thresh)
+	if (unlikely(bg_thresh >= thresh)) {
+		pr_warn("vm direct limit must be set greater than background limit.\n");
 		bg_thresh = thresh / 2;
+	}
+
 	tsk = current;
 	if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
 		bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
-- 
1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-09-28  9:54 [PATCH] mm: print a warning once the vm dirtiness settings is illogical Yafang Shao
@ 2017-11-25 16:05 ` Tetsuo Handa
  2017-11-26  2:24   ` Yafang Shao
  2017-11-27  9:19   ` [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical) Michal Hocko
  0 siblings, 2 replies; 37+ messages in thread
From: Tetsuo Handa @ 2017-11-25 16:05 UTC (permalink / raw)
  To: Yafang Shao, akpm; +Cc: jack, mhocko, linux-mm

On 2017/09/28 18:54, Yafang Shao wrote:
> The vm direct limit setting must be set greater than vm background
> limit setting.
> Otherwise we will print a warning to help the operator to figure
> out that the vm dirtiness settings is in illogical state.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

I got this warning by simple OOM killer flooding. Is this what you meant?

----------
[  621.747994] vmtoolsd invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
[  621.751357] vmtoolsd cpuset=/ mems_allowed=0
[  621.753124] CPU: 1 PID: 684 Comm: vmtoolsd Not tainted 4.14.0-next-20171124+ #681
[  621.755392] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  621.758535] Call Trace:
[  621.760324]  dump_stack+0x5f/0x86
[  621.761981]  dump_header+0x69/0x431
[  621.763459]  ? trace_hardirqs_on_caller+0xe7/0x180
[  621.765200]  oom_kill_process+0x294/0x670
[  621.766800]  out_of_memory+0x423/0x5c0
[  621.768432]  __alloc_pages_nodemask+0x11e2/0x1450
[  621.770135]  filemap_fault+0x4b7/0x710
[  621.771656]  __xfs_filemap_fault.constprop.0+0x68/0x210
[  621.773463]  __do_fault+0x15/0xc0
[  621.775252]  __handle_mm_fault+0xd7c/0x1390
[  621.777204]  ? __audit_syscall_exit+0x195/0x290
[  621.779318]  handle_mm_fault+0x173/0x330
[  621.781047]  __do_page_fault+0x2a7/0x510
[  621.784331]  do_page_fault+0x2c/0x2f0
[  621.786126]  page_fault+0x22/0x30
[  621.787581] RIP: 0033:0x7f397f268dd0
[  621.789102] RSP: 002b:00007fffa8b225f8 EFLAGS: 00010202
[  621.790842] RAX: 00005635d5c715b0 RBX: 0000000000000001 RCX: 0000000000000001
[  621.793421] RDX: 0000000000000001 RSI: 00007f3981f16350 RDI: 00005635d5c715b0
[  621.795610] RBP: 00005635d5c715b0 R08: 0000000000000000 R09: 0000000000000004
[  621.797732] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
[  621.800252] R13: 00007f3981f16350 R14: 0000000000000000 R15: 0000000000000000
[  621.802406] Mem-Info:
[  621.803682] active_anon:881485 inactive_anon:2093 isolated_anon:0
 active_file:54 inactive_file:468 isolated_file:0
 unevictable:0 dirty:13 writeback:0 unstable:0
 slab_reclaimable:6684 slab_unreclaimable:16061
 mapped:626 shmem:2165 pagetables:3418 bounce:0
 free:21384 free_pcp:74 free_cma:0
[  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  621.829035] lowmem_reserve[]: 0 2687 3645 3645
[  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  621.839945] lowmem_reserve[]: 0 0 958 958
[  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
[  621.852473] lowmem_reserve[]: 0 0 0 0
[  621.854094] Node 0 DMA: 2*4kB (M) 1*8kB (M) 1*16kB (M) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14848kB
[  621.857755] Node 0 DMA32: 65*4kB (UM) 5*8kB (UME) 118*16kB (UME) 66*32kB (UME) 25*64kB (UME) 12*128kB (UE) 8*256kB (UE) 3*512kB (ME) 5*1024kB (E) 10*2048kB (E) 4*4096kB (ME) = 53004kB
[  621.864205] Node 0 Normal: 619*4kB (UME) 506*8kB (UME) 207*16kB (UME) 121*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16972kB
[  621.869661] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  621.872283] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  621.875600] 2823 total pagecache pages
[  621.878108] 0 pages in swap cache
[  621.879818] Swap cache stats: add 0, delete 0, find 0/0
[  621.864205] Node 0 Normal: 619*4kB (UME) 506*8kB (UME) 207*16kB (UME) 121*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16972kB
[  621.869661] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  621.872283] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  621.875600] 2823 total pagecache pages
[  621.878108] 0 pages in swap cache
[  621.879818] Swap cache stats: add 0, delete 0, find 0/0
[  621.881796] Free swap  = 0kB
[  621.883389] Total swap = 0kB
[  621.884940] 1048445 pages RAM
[  621.886442] 0 pages HighMem/MovableOnly
[  621.888197] 105880 pages reserved
[  621.889964] 0 pages hwpoisoned
[  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
[  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
[  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
[  622.424664] vm direct limit must be set greater than background limit.
[  622.427810] vm direct limit must be set greater than background limit.
[  622.631937] vm direct limit must be set greater than background limit.
[  622.634316] vm direct limit must be set greater than background limit.
[  622.789310] a.out invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
[  622.792603] a.out cpuset=/ mems_allowed=0
[  622.794797] CPU: 2 PID: 8455 Comm: a.out Not tainted 4.14.0-next-20171124+ #681
[  622.797307] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  622.800901] Call Trace:
[  622.802264]  dump_stack+0x5f/0x86
[  622.803733]  dump_header+0x69/0x431
[  622.805198]  ? trace_hardirqs_on_caller+0xe7/0x180
[  622.807001]  oom_kill_process+0x294/0x670
[  622.808655]  out_of_memory+0x423/0x5c0
[  622.810771]  __alloc_pages_nodemask+0x11e2/0x1450
[  622.812632]  ? __lock_acquire+0x306/0x1420
[  622.814315]  alloc_pages_vma+0x7b/0x1e0
[  622.816091]  __handle_mm_fault+0x1001/0x1390
[  622.817754]  ? retint_kernel+0x2d/0x2d
[  622.819785]  handle_mm_fault+0x173/0x330
[  622.821331]  __do_page_fault+0x2a7/0x510
[  622.822886]  do_page_fault+0x2c/0x2f0
[  622.824307]  page_fault+0x22/0x30
[  622.828293] RIP: 0033:0x4006f0
[  622.829930] RSP: 002b:00007fffa638c440 EFLAGS: 00010206
[  622.831701] RAX: 00000000d380e000 RBX: 0000000100000000 RCX: 00007f9ed5110190
[  622.834199] RDX: 0000000000000000 RSI: 00007fffa638c260 RDI: 00007fffa638c260
[  622.836253] RBP: 00007f9cd5245010 R08: 00007fffa638c370 R09: 00007fffa638c1b0
[  622.838271] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006
[  622.840430] R13: 00007f9cd5245010 R14: 0000000000000000 R15: 0000000000000000
[  622.854308] Mem-Info:
[  622.855493] active_anon:881658 inactive_anon:2093 isolated_anon:0
 active_file:1 inactive_file:120 isolated_file:0
 unevictable:0 dirty:3 writeback:0 unstable:0
 slab_reclaimable:6676 slab_unreclaimable:16057
 mapped:585 shmem:2165 pagetables:3412 bounce:0
 free:21273 free_pcp:124 free_cma:0
[  622.867896] Node 0 active_anon:3526676kB inactive_anon:8372kB active_file:8kB inactive_file:828kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2344kB dirty:20kB writeback:0kB shmem:8660kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  622.875151] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  622.882244] lowmem_reserve[]: 0 2687 3645 3645
[  622.883861] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  622.891029] lowmem_reserve[]: 0 0 958 958
[  622.892814] Node 0 Normal free:18000kB min:17684kB low:22104kB high:26524kB active_anon:813036kB inactive_anon:8372kB active_file:8kB inactive_file:1300kB unevictable:0kB writepending:20kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3488kB pagetables:8532kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
[  622.901963] lowmem_reserve[]: 0 0 0 0
[  622.903501] Node 0 DMA: 2*4kB (M) 1*8kB (M) 1*16kB (M) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14848kB
[  622.907115] Node 0 DMA32: 65*4kB (UM) 5*8kB (UME) 118*16kB (UME) 66*32kB (UME) 25*64kB (UME) 12*128kB (UE) 8*256kB (UE) 3*512kB (ME) 5*1024kB (E) 10*2048kB (E) 4*4096kB (ME) = 53004kB
[  622.912651] Node 0 Normal: 823*4kB (UME) 505*8kB (UME) 214*16kB (UME) 127*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18084kB
[  622.917512] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  622.920171] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  622.922742] 2503 total pagecache pages
[  622.924417] 0 pages in swap cache
[  622.928889] Swap cache stats: add 0, delete 0, find 0/0
[  622.931040] Free swap  = 0kB
[  622.932560] Total swap = 0kB
[  622.934031] 1048445 pages RAM
[  622.935503] 0 pages HighMem/MovableOnly
[  622.937440] 105880 pages reserved
[  622.939066] 0 pages hwpoisoned
[  622.940578] Out of memory: Kill process 8460 (a.out) score 999 or sacrifice child
[  622.943093] Killed process 8460 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
[  622.946331] oom_reaper: reaped process 8460 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
----------

----------
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
        static char buffer[4096] = { };
        char *buf = NULL;
        unsigned long size;
        unsigned long i;
        for (i = 0; i < 16; i++) {
                if (fork() == 0) {
                        int fd = open("/proc/self/oom_score_adj", O_WRONLY);
                        write(fd, "1000", 4);
                        close(fd);
                        snprintf(buffer, sizeof(buffer), "/tmp/file.%u", getpid());
                        fd = open(buffer, O_WRONLY | O_CREAT | O_APPEND, 0600);
                        sleep(1);
                        while (write(fd, buffer, sizeof(buffer)) == sizeof(buffer));
                        _exit(0);
                }
        }
        for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
                char *cp = realloc(buf, size);
                if (!cp) {
                        size >>= 1;
                        break;
                }
                buf = cp;
        }
        sleep(2);
        /* Will cause OOM due to overcommit */
        for (i = 0; i < size; i += 4096)
                buf[i] = 0;
        return 0;
}
----------

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-25 16:05 ` Tetsuo Handa
@ 2017-11-26  2:24   ` Yafang Shao
  2017-11-26  2:42     ` Tetsuo Handa
  2017-11-27  9:19   ` [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical) Michal Hocko
  1 sibling, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-26  2:24 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, Jan Kara, Michal Hocko, Linux MM

2017-11-26 0:05 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> On 2017/09/28 18:54, Yafang Shao wrote:
>> The vm direct limit setting must be set greater than vm background
>> limit setting.
>> Otherwise we will print a warning to help the operator to figure
>> out that the vm dirtiness settings is in illogical state.
>>
>> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>
> I got this warning by simple OOM killer flooding. Is this what you meant?
>

This message is only printed when the vm dirtiness settings are illogic.
Pls. get the bellow four knobs and check whether they are set proper or not.

vm.dirty_ratio
vm.dirty_bytes
vm.dirty_background_ratio
vm.dirty_background_bytes

If these four valuses are set properly, this message should not be
printed when OOM happens.

I have also verified your test code on my machine, but can not find
this message.


> ----------
> [  621.747994] vmtoolsd invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
> [  621.751357] vmtoolsd cpuset=/ mems_allowed=0
> [  621.753124] CPU: 1 PID: 684 Comm: vmtoolsd Not tainted 4.14.0-next-20171124+ #681
> [  621.755392] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [  621.758535] Call Trace:
> [  621.760324]  dump_stack+0x5f/0x86
> [  621.761981]  dump_header+0x69/0x431
> [  621.763459]  ? trace_hardirqs_on_caller+0xe7/0x180
> [  621.765200]  oom_kill_process+0x294/0x670
> [  621.766800]  out_of_memory+0x423/0x5c0
> [  621.768432]  __alloc_pages_nodemask+0x11e2/0x1450
> [  621.770135]  filemap_fault+0x4b7/0x710
> [  621.771656]  __xfs_filemap_fault.constprop.0+0x68/0x210
> [  621.773463]  __do_fault+0x15/0xc0
> [  621.775252]  __handle_mm_fault+0xd7c/0x1390
> [  621.777204]  ? __audit_syscall_exit+0x195/0x290
> [  621.779318]  handle_mm_fault+0x173/0x330
> [  621.781047]  __do_page_fault+0x2a7/0x510
> [  621.784331]  do_page_fault+0x2c/0x2f0
> [  621.786126]  page_fault+0x22/0x30
> [  621.787581] RIP: 0033:0x7f397f268dd0
> [  621.789102] RSP: 002b:00007fffa8b225f8 EFLAGS: 00010202
> [  621.790842] RAX: 00005635d5c715b0 RBX: 0000000000000001 RCX: 0000000000000001
> [  621.793421] RDX: 0000000000000001 RSI: 00007f3981f16350 RDI: 00005635d5c715b0
> [  621.795610] RBP: 00005635d5c715b0 R08: 0000000000000000 R09: 0000000000000004
> [  621.797732] R10: 0000000000000005 R11: 0000000000000246 R12: 0000000000000000
> [  621.800252] R13: 00007f3981f16350 R14: 0000000000000000 R15: 0000000000000000
> [  621.802406] Mem-Info:
> [  621.803682] active_anon:881485 inactive_anon:2093 isolated_anon:0
>  active_file:54 inactive_file:468 isolated_file:0
>  unevictable:0 dirty:13 writeback:0 unstable:0
>  slab_reclaimable:6684 slab_unreclaimable:16061
>  mapped:626 shmem:2165 pagetables:3418 bounce:0
>  free:21384 free_pcp:74 free_cma:0
> [  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> [  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.829035] lowmem_reserve[]: 0 2687 3645 3645
> [  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.839945] lowmem_reserve[]: 0 0 958 958
> [  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
> [  621.852473] lowmem_reserve[]: 0 0 0 0
> [  621.854094] Node 0 DMA: 2*4kB (M) 1*8kB (M) 1*16kB (M) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14848kB
> [  621.857755] Node 0 DMA32: 65*4kB (UM) 5*8kB (UME) 118*16kB (UME) 66*32kB (UME) 25*64kB (UME) 12*128kB (UE) 8*256kB (UE) 3*512kB (ME) 5*1024kB (E) 10*2048kB (E) 4*4096kB (ME) = 53004kB
> [  621.864205] Node 0 Normal: 619*4kB (UME) 506*8kB (UME) 207*16kB (UME) 121*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16972kB
> [  621.869661] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [  621.872283] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [  621.875600] 2823 total pagecache pages
> [  621.878108] 0 pages in swap cache
> [  621.879818] Swap cache stats: add 0, delete 0, find 0/0
> [  621.864205] Node 0 Normal: 619*4kB (UME) 506*8kB (UME) 207*16kB (UME) 121*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 16972kB
> [  621.869661] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [  621.872283] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [  621.875600] 2823 total pagecache pages
> [  621.878108] 0 pages in swap cache
> [  621.879818] Swap cache stats: add 0, delete 0, find 0/0
> [  621.881796] Free swap  = 0kB
> [  621.883389] Total swap = 0kB
> [  621.884940] 1048445 pages RAM
> [  621.886442] 0 pages HighMem/MovableOnly
> [  621.888197] 105880 pages reserved
> [  621.889964] 0 pages hwpoisoned
> [  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
> [  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> [  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [  622.424664] vm direct limit must be set greater than background limit.
> [  622.427810] vm direct limit must be set greater than background limit.
> [  622.631937] vm direct limit must be set greater than background limit.
> [  622.634316] vm direct limit must be set greater than background limit.
> [  622.789310] a.out invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
> [  622.792603] a.out cpuset=/ mems_allowed=0
> [  622.794797] CPU: 2 PID: 8455 Comm: a.out Not tainted 4.14.0-next-20171124+ #681
> [  622.797307] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [  622.800901] Call Trace:
> [  622.802264]  dump_stack+0x5f/0x86
> [  622.803733]  dump_header+0x69/0x431
> [  622.805198]  ? trace_hardirqs_on_caller+0xe7/0x180
> [  622.807001]  oom_kill_process+0x294/0x670
> [  622.808655]  out_of_memory+0x423/0x5c0
> [  622.810771]  __alloc_pages_nodemask+0x11e2/0x1450
> [  622.812632]  ? __lock_acquire+0x306/0x1420
> [  622.814315]  alloc_pages_vma+0x7b/0x1e0
> [  622.816091]  __handle_mm_fault+0x1001/0x1390
> [  622.817754]  ? retint_kernel+0x2d/0x2d
> [  622.819785]  handle_mm_fault+0x173/0x330
> [  622.821331]  __do_page_fault+0x2a7/0x510
> [  622.822886]  do_page_fault+0x2c/0x2f0
> [  622.824307]  page_fault+0x22/0x30
> [  622.828293] RIP: 0033:0x4006f0
> [  622.829930] RSP: 002b:00007fffa638c440 EFLAGS: 00010206
> [  622.831701] RAX: 00000000d380e000 RBX: 0000000100000000 RCX: 00007f9ed5110190
> [  622.834199] RDX: 0000000000000000 RSI: 00007fffa638c260 RDI: 00007fffa638c260
> [  622.836253] RBP: 00007f9cd5245010 R08: 00007fffa638c370 R09: 00007fffa638c1b0
> [  622.838271] R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000000006
> [  622.840430] R13: 00007f9cd5245010 R14: 0000000000000000 R15: 0000000000000000
> [  622.854308] Mem-Info:
> [  622.855493] active_anon:881658 inactive_anon:2093 isolated_anon:0
>  active_file:1 inactive_file:120 isolated_file:0
>  unevictable:0 dirty:3 writeback:0 unstable:0
>  slab_reclaimable:6676 slab_unreclaimable:16057
>  mapped:585 shmem:2165 pagetables:3412 bounce:0
>  free:21273 free_pcp:124 free_cma:0
> [  622.867896] Node 0 active_anon:3526676kB inactive_anon:8372kB active_file:8kB inactive_file:828kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2344kB dirty:20kB writeback:0kB shmem:8660kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> [  622.875151] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  622.882244] lowmem_reserve[]: 0 2687 3645 3645
> [  622.883861] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  622.891029] lowmem_reserve[]: 0 0 958 958
> [  622.892814] Node 0 Normal free:18000kB min:17684kB low:22104kB high:26524kB active_anon:813036kB inactive_anon:8372kB active_file:8kB inactive_file:1300kB unevictable:0kB writepending:20kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3488kB pagetables:8532kB bounce:0kB free_pcp:120kB local_pcp:0kB free_cma:0kB
> [  622.901963] lowmem_reserve[]: 0 0 0 0
> [  622.903501] Node 0 DMA: 2*4kB (M) 1*8kB (M) 1*16kB (M) 1*32kB (U) 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14848kB
> [  622.907115] Node 0 DMA32: 65*4kB (UM) 5*8kB (UME) 118*16kB (UME) 66*32kB (UME) 25*64kB (UME) 12*128kB (UE) 8*256kB (UE) 3*512kB (ME) 5*1024kB (E) 10*2048kB (E) 4*4096kB (ME) = 53004kB
> [  622.912651] Node 0 Normal: 823*4kB (UME) 505*8kB (UME) 214*16kB (UME) 127*32kB (UME) 33*64kB (UME) 7*128kB (ME) 1*256kB (M) 0*512kB 0*1024kB 0*2048kB 0*4096kB = 18084kB
> [  622.917512] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [  622.920171] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [  622.922742] 2503 total pagecache pages
> [  622.924417] 0 pages in swap cache
> [  622.928889] Swap cache stats: add 0, delete 0, find 0/0
> [  622.931040] Free swap  = 0kB
> [  622.932560] Total swap = 0kB
> [  622.934031] 1048445 pages RAM
> [  622.935503] 0 pages HighMem/MovableOnly
> [  622.937440] 105880 pages reserved
> [  622.939066] 0 pages hwpoisoned
> [  622.940578] Out of memory: Kill process 8460 (a.out) score 999 or sacrifice child
> [  622.943093] Killed process 8460 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> [  622.946331] oom_reaper: reaped process 8460 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> ----------
>
> ----------
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <sys/types.h>
> #include <sys/stat.h>
> #include <fcntl.h>
>
> int main(int argc, char *argv[])
> {
>         static char buffer[4096] = { };
>         char *buf = NULL;
>         unsigned long size;
>         unsigned long i;
>         for (i = 0; i < 16; i++) {
>                 if (fork() == 0) {
>                         int fd = open("/proc/self/oom_score_adj", O_WRONLY);
>                         write(fd, "1000", 4);
>                         close(fd);
>                         snprintf(buffer, sizeof(buffer), "/tmp/file.%u", getpid());
>                         fd = open(buffer, O_WRONLY | O_CREAT | O_APPEND, 0600);
>                         sleep(1);
>                         while (write(fd, buffer, sizeof(buffer)) == sizeof(buffer));
>                         _exit(0);
>                 }
>         }
>         for (size = 1048576; size < 512UL * (1 << 30); size <<= 1) {
>                 char *cp = realloc(buf, size);
>                 if (!cp) {
>                         size >>= 1;
>                         break;
>                 }
>                 buf = cp;
>         }
>         sleep(2);
>         /* Will cause OOM due to overcommit */
>         for (i = 0; i < size; i += 4096)
>                 buf[i] = 0;
>         return 0;
> }
> ----------

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  2:24   ` Yafang Shao
@ 2017-11-26  2:42     ` Tetsuo Handa
  2017-11-26  4:32       ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Tetsuo Handa @ 2017-11-26  2:42 UTC (permalink / raw)
  To: laoar.shao; +Cc: akpm, jack, mhocko, linux-mm

Yafang Shao wrote:
> 2017-11-26 0:05 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> > On 2017/09/28 18:54, Yafang Shao wrote:
> >> The vm direct limit setting must be set greater than vm background
> >> limit setting.
> >> Otherwise we will print a warning to help the operator to figure
> >> out that the vm dirtiness settings is in illogical state.
> >>
> >> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> >
> > I got this warning by simple OOM killer flooding. Is this what you meant?
> >
> 
> This message is only printed when the vm dirtiness settings are illogic.
> Pls. get the bellow four knobs and check whether they are set proper or not.
> 
> vm.dirty_ratio
> vm.dirty_bytes
> vm.dirty_background_ratio
> vm.dirty_background_bytes
> 
> If these four valuses are set properly, this message should not be
> printed when OOM happens.

I didn't tune vm related values.

----------
vm.admin_reserve_kbytes = 8192
vm.block_dump = 0
vm.compact_unevictable_allowed = 1
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 30
vm.dirty_writeback_centisecs = 500
vm.dirtytime_expire_seconds = 43200
vm.drop_caches = 0
vm.extfrag_threshold = 500
vm.hugetlb_shm_group = 0
vm.laptop_mode = 0
vm.legacy_va_layout = 0
vm.lowmem_reserve_ratio = 256   256     32
vm.max_map_count = 65530
vm.memory_failure_early_kill = 0
vm.memory_failure_recovery = 1
vm.min_free_kbytes = 67584
vm.min_slab_ratio = 5
vm.min_unmapped_ratio = 1
vm.mmap_min_addr = 4096
vm.mmap_rnd_bits = 28
vm.mmap_rnd_compat_bits = 8
vm.nr_hugepages = 0
vm.nr_hugepages_mempolicy = 0
vm.nr_overcommit_hugepages = 0
vm.numa_stat = 1
vm.numa_zonelist_order = Node
vm.oom_dump_tasks = 0
vm.oom_kill_allocating_task = 0
vm.overcommit_kbytes = 0
vm.overcommit_memory = 0
vm.overcommit_ratio = 50
vm.page-cluster = 3
vm.panic_on_oom = 0
vm.percpu_pagelist_fraction = 0
vm.stat_interval = 1
vm.swappiness = 30
vm.user_reserve_kbytes = 116665
vm.vfs_cache_pressure = 100
vm.watermark_scale_factor = 10
vm.zone_reclaim_mode = 0
----------

> 
> I have also verified your test code on my machine, but can not find
> this message.
> 

Not always printed. It is timing dependent.

----------
[  343.783160] a.out invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=1000
[  343.793554] a.out cpuset=/ mems_allowed=0
[  343.795437] CPU: 2 PID: 2930 Comm: a.out Not tainted 4.14.0-next-20171124+ #681
[  343.798112] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
[  343.801348] Call Trace:
[  343.802844]  dump_stack+0x5f/0x86
[  343.804545]  dump_header+0x69/0x431
[  343.806268]  oom_kill_process+0x294/0x670
[  343.808042]  out_of_memory+0x423/0x5c0
[  343.809701]  __alloc_pages_nodemask+0x11e2/0x1450
[  343.811553]  filemap_fault+0x4b7/0x710
[  343.813184]  __xfs_filemap_fault.constprop.0+0x68/0x210
[  343.815129]  __do_fault+0x15/0xc0
[  343.816625]  __handle_mm_fault+0xd7c/0x1390
[  343.818315]  handle_mm_fault+0x173/0x330
[  343.820051]  __do_page_fault+0x2a7/0x510
[  343.821842]  do_page_fault+0x2c/0x2f0
[  343.823376]  page_fault+0x22/0x30
[  343.824834] RIP: 0033:0x7f12886f9840
[  343.826302] RSP: 002b:00007ffdfa961828 EFLAGS: 00010246
[  343.828075] RAX: 0000000000001000 RBX: 0000000000000003 RCX: 00007f12886f9840
[  343.830324] RDX: 0000000000001000 RSI: 00000000006010a0 RDI: 0000000000000003
[  343.832526] RBP: 0000000000000000 R08: 00007ffdfa961760 R09: 00007ffdfa9615a0
[  343.834693] R10: 0000000000000008 R11: 0000000000000246 R12: 000000000040079d
[  343.836825] R13: 00007ffdfa961930 R14: 0000000000000000 R15: 0000000000000000
[  343.839058] Mem-Info:
[  343.840557] active_anon:882238 inactive_anon:2091 isolated_anon:0
[  343.840557]  active_file:27 inactive_file:0 isolated_file:0
[  343.840557]  unevictable:0 dirty:2 writeback:0 unstable:0
[  343.840557]  slab_reclaimable:6412 slab_unreclaimable:15971
[  343.840557]  mapped:695 shmem:2162 pagetables:3290 bounce:0
[  343.840557]  free:21394 free_pcp:1 free_cma:0
[  343.851466] Node 0 active_anon:3528952kB inactive_anon:8364kB active_file:108kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2780kB dirty:8kB writeback:0kB shmem:8648kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2797568kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
[  343.859073] Node 0 DMA free:14944kB min:284kB low:352kB high:420kB active_anon:928kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  343.866543] lowmem_reserve[]: 0 2708 3666 3666
[  343.868345] Node 0 DMA32 free:53196kB min:49608kB low:62008kB high:74408kB active_anon:2713592kB inactive_anon:0kB active_file:100kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:16kB pagetables:3372kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  343.877512] lowmem_reserve[]: 0 0 958 958
[  343.878700] vm direct limit must be set greater than background limit.
[  343.878706] vm direct limit must be set greater than background limit.
[  343.878712] vm direct limit must be set greater than background limit.
[  343.878718] vm direct limit must be set greater than background limit.
[  343.878723] vm direct limit must be set greater than background limit.
[  343.878728] vm direct limit must be set greater than background limit.
[  343.878734] vm direct limit must be set greater than background limit.
[  343.878739] vm direct limit must be set greater than background limit.
[  343.878744] vm direct limit must be set greater than background limit.
[  343.878749] vm direct limit must be set greater than background limit.
[  343.878755] vm direct limit must be set greater than background limit.
[  343.878760] vm direct limit must be set greater than background limit.
[  343.878767] vm direct limit must be set greater than background limit.
[  343.878772] vm direct limit must be set greater than background limit.
[  343.878777] vm direct limit must be set greater than background limit.
[  343.878782] vm direct limit must be set greater than background limit.
[  343.878786] vm direct limit must be set greater than background limit.
[  343.878791] vm direct limit must be set greater than background limit.
[  343.878796] vm direct limit must be set greater than background limit.
[  343.878800] vm direct limit must be set greater than background limit.
[  343.878805] vm direct limit must be set greater than background limit.
[  343.878809] vm direct limit must be set greater than background limit.
[  343.878814] vm direct limit must be set greater than background limit.
[  343.878818] vm direct limit must be set greater than background limit.
[  343.878826] vm direct limit must be set greater than background limit.
[  343.878831] vm direct limit must be set greater than background limit.
[  343.878835] vm direct limit must be set greater than background limit.
[  343.878840] vm direct limit must be set greater than background limit.
[  343.878844] vm direct limit must be set greater than background limit.
[  343.878848] vm direct limit must be set greater than background limit.
[  343.878852] vm direct limit must be set greater than background limit.
[  343.878856] vm direct limit must be set greater than background limit.
[  343.878861] vm direct limit must be set greater than background limit.
[  343.878865] vm direct limit must be set greater than background limit.
[  343.878891] vm direct limit must be set greater than background limit.
[  343.878897] vm direct limit must be set greater than background limit.
[  343.878898] vm direct limit must be set greater than background limit.
[  343.878915] vm direct limit must be set greater than background limit.
[  343.956240] Node 0 Normal free:17796kB min:17684kB low:22104kB high:26524kB active_anon:815300kB inactive_anon:8364kB active_file:0kB inactive_file:4kB unevictable:0kB writepending:0kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3632kB pagetables:9792kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
[  343.963339] lowmem_reserve[]: 0 0 0 0
[  343.964644] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (M) 3*64kB (UM) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14944kB
[  343.968026] Node 0 DMA32: 12*4kB (UM) 15*8kB (UM) 28*16kB (UM) 28*32kB (UM) 12*64kB (UM) 21*128kB (UM) 14*256kB (UM) 12*512kB (UM) 24*1024kB (UM) 5*2048kB (E) 1*4096kB (E) = 53608kB
[  343.972148] Node 0 Normal: 513*4kB (UME) 282*8kB (UME) 215*16kB (UME) 128*32kB (UME) 41*64kB (UME) 12*128kB (UME) 3*256kB (M) 2*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 17796kB
[  343.976695] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
[  343.979084] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
[  343.981469] 2290 total pagecache pages
[  343.982709] 0 pages in swap cache
[  343.983989] Swap cache stats: add 0, delete 0, find 0/0
[  343.987118] Free swap  = 0kB
[  343.988182] Total swap = 0kB
[  343.990799] 1048445 pages RAM
[  343.991844] 0 pages HighMem/MovableOnly
[  343.994723] 105880 pages reserved
[  343.995878] 0 pages hwpoisoned
[  343.998745] Out of memory: Kill process 2929 (a.out) score 999 or sacrifice child
[  344.002681] Killed process 2929 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
----------

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  2:42     ` Tetsuo Handa
@ 2017-11-26  4:32       ` Yafang Shao
  2017-11-26  8:03         ` Tetsuo Handa
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-26  4:32 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, Jan Kara, Michal Hocko, Linux MM

2017-11-26 10:42 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> Yafang Shao wrote:
>> 2017-11-26 0:05 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
>> > On 2017/09/28 18:54, Yafang Shao wrote:
>> >> The vm direct limit setting must be set greater than vm background
>> >> limit setting.
>> >> Otherwise we will print a warning to help the operator to figure
>> >> out that the vm dirtiness settings is in illogical state.
>> >>
>> >> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>> >
>> > I got this warning by simple OOM killer flooding. Is this what you meant?
>> >
>>
>> This message is only printed when the vm dirtiness settings are illogic.
>> Pls. get the bellow four knobs and check whether they are set proper or not.
>>
>> vm.dirty_ratio
>> vm.dirty_bytes
>> vm.dirty_background_ratio
>> vm.dirty_background_bytes
>>
>> If these four valuses are set properly, this message should not be
>> printed when OOM happens.
>
> I didn't tune vm related values.
>
> ----------
> vm.admin_reserve_kbytes = 8192
> vm.block_dump = 0
> vm.compact_unevictable_allowed = 1
> vm.dirty_background_bytes = 0
> vm.dirty_background_ratio = 10
> vm.dirty_bytes = 0
> vm.dirty_expire_centisecs = 3000
> vm.dirty_ratio = 30
> vm.dirty_writeback_centisecs = 500
> vm.dirtytime_expire_seconds = 43200
> vm.drop_caches = 0
> vm.extfrag_threshold = 500
> vm.hugetlb_shm_group = 0
> vm.laptop_mode = 0
> vm.legacy_va_layout = 0
> vm.lowmem_reserve_ratio = 256   256     32
> vm.max_map_count = 65530
> vm.memory_failure_early_kill = 0
> vm.memory_failure_recovery = 1
> vm.min_free_kbytes = 67584
> vm.min_slab_ratio = 5
> vm.min_unmapped_ratio = 1
> vm.mmap_min_addr = 4096
> vm.mmap_rnd_bits = 28
> vm.mmap_rnd_compat_bits = 8
> vm.nr_hugepages = 0
> vm.nr_hugepages_mempolicy = 0
> vm.nr_overcommit_hugepages = 0
> vm.numa_stat = 1
> vm.numa_zonelist_order = Node
> vm.oom_dump_tasks = 0
> vm.oom_kill_allocating_task = 0
> vm.overcommit_kbytes = 0
> vm.overcommit_memory = 0
> vm.overcommit_ratio = 50
> vm.page-cluster = 3
> vm.panic_on_oom = 0
> vm.percpu_pagelist_fraction = 0
> vm.stat_interval = 1
> vm.swappiness = 30
> vm.user_reserve_kbytes = 116665
> vm.vfs_cache_pressure = 100
> vm.watermark_scale_factor = 10
> vm.zone_reclaim_mode = 0
> ----------
>

Then under these settings the message should not been printed.

>>
>> I have also verified your test code on my machine, but can not find
>> this message.
>>
>
> Not always printed. It is timing dependent.
>

I will try and analysis why this happen.

> ----------
> [  343.783160] a.out invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=1000
> [  343.793554] a.out cpuset=/ mems_allowed=0
> [  343.795437] CPU: 2 PID: 2930 Comm: a.out Not tainted 4.14.0-next-20171124+ #681
> [  343.798112] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 07/02/2015
> [  343.801348] Call Trace:
> [  343.802844]  dump_stack+0x5f/0x86
> [  343.804545]  dump_header+0x69/0x431
> [  343.806268]  oom_kill_process+0x294/0x670
> [  343.808042]  out_of_memory+0x423/0x5c0
> [  343.809701]  __alloc_pages_nodemask+0x11e2/0x1450
> [  343.811553]  filemap_fault+0x4b7/0x710
> [  343.813184]  __xfs_filemap_fault.constprop.0+0x68/0x210
> [  343.815129]  __do_fault+0x15/0xc0
> [  343.816625]  __handle_mm_fault+0xd7c/0x1390
> [  343.818315]  handle_mm_fault+0x173/0x330
> [  343.820051]  __do_page_fault+0x2a7/0x510
> [  343.821842]  do_page_fault+0x2c/0x2f0
> [  343.823376]  page_fault+0x22/0x30
> [  343.824834] RIP: 0033:0x7f12886f9840
> [  343.826302] RSP: 002b:00007ffdfa961828 EFLAGS: 00010246
> [  343.828075] RAX: 0000000000001000 RBX: 0000000000000003 RCX: 00007f12886f9840
> [  343.830324] RDX: 0000000000001000 RSI: 00000000006010a0 RDI: 0000000000000003
> [  343.832526] RBP: 0000000000000000 R08: 00007ffdfa961760 R09: 00007ffdfa9615a0
> [  343.834693] R10: 0000000000000008 R11: 0000000000000246 R12: 000000000040079d
> [  343.836825] R13: 00007ffdfa961930 R14: 0000000000000000 R15: 0000000000000000
> [  343.839058] Mem-Info:
> [  343.840557] active_anon:882238 inactive_anon:2091 isolated_anon:0
> [  343.840557]  active_file:27 inactive_file:0 isolated_file:0
> [  343.840557]  unevictable:0 dirty:2 writeback:0 unstable:0
> [  343.840557]  slab_reclaimable:6412 slab_unreclaimable:15971
> [  343.840557]  mapped:695 shmem:2162 pagetables:3290 bounce:0
> [  343.840557]  free:21394 free_pcp:1 free_cma:0
> [  343.851466] Node 0 active_anon:3528952kB inactive_anon:8364kB active_file:108kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2780kB dirty:8kB writeback:0kB shmem:8648kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2797568kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> [  343.859073] Node 0 DMA free:14944kB min:284kB low:352kB high:420kB active_anon:928kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  343.866543] lowmem_reserve[]: 0 2708 3666 3666
> [  343.868345] Node 0 DMA32 free:53196kB min:49608kB low:62008kB high:74408kB active_anon:2713592kB inactive_anon:0kB active_file:100kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:2773132kB mlocked:0kB kernel_stack:16kB pagetables:3372kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  343.877512] lowmem_reserve[]: 0 0 958 958
> [  343.878700] vm direct limit must be set greater than background limit.
> [  343.878706] vm direct limit must be set greater than background limit.
> [  343.878712] vm direct limit must be set greater than background limit.
> [  343.878718] vm direct limit must be set greater than background limit.
> [  343.878723] vm direct limit must be set greater than background limit.
> [  343.878728] vm direct limit must be set greater than background limit.
> [  343.878734] vm direct limit must be set greater than background limit.
> [  343.878739] vm direct limit must be set greater than background limit.
> [  343.878744] vm direct limit must be set greater than background limit.
> [  343.878749] vm direct limit must be set greater than background limit.
> [  343.878755] vm direct limit must be set greater than background limit.
> [  343.878760] vm direct limit must be set greater than background limit.
> [  343.878767] vm direct limit must be set greater than background limit.
> [  343.878772] vm direct limit must be set greater than background limit.
> [  343.878777] vm direct limit must be set greater than background limit.
> [  343.878782] vm direct limit must be set greater than background limit.
> [  343.878786] vm direct limit must be set greater than background limit.
> [  343.878791] vm direct limit must be set greater than background limit.
> [  343.878796] vm direct limit must be set greater than background limit.
> [  343.878800] vm direct limit must be set greater than background limit.
> [  343.878805] vm direct limit must be set greater than background limit.
> [  343.878809] vm direct limit must be set greater than background limit.
> [  343.878814] vm direct limit must be set greater than background limit.
> [  343.878818] vm direct limit must be set greater than background limit.
> [  343.878826] vm direct limit must be set greater than background limit.
> [  343.878831] vm direct limit must be set greater than background limit.
> [  343.878835] vm direct limit must be set greater than background limit.
> [  343.878840] vm direct limit must be set greater than background limit.
> [  343.878844] vm direct limit must be set greater than background limit.
> [  343.878848] vm direct limit must be set greater than background limit.
> [  343.878852] vm direct limit must be set greater than background limit.
> [  343.878856] vm direct limit must be set greater than background limit.
> [  343.878861] vm direct limit must be set greater than background limit.
> [  343.878865] vm direct limit must be set greater than background limit.
> [  343.878891] vm direct limit must be set greater than background limit.
> [  343.878897] vm direct limit must be set greater than background limit.
> [  343.878898] vm direct limit must be set greater than background limit.
> [  343.878915] vm direct limit must be set greater than background limit.
> [  343.956240] Node 0 Normal free:17796kB min:17684kB low:22104kB high:26524kB active_anon:815300kB inactive_anon:8364kB active_file:0kB inactive_file:4kB unevictable:0kB writepending:0kB present:1048576kB managed:981224kB mlocked:0kB kernel_stack:3632kB pagetables:9792kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  343.963339] lowmem_reserve[]: 0 0 0 0
> [  343.964644] Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB (M) 3*64kB (UM) 1*128kB (U) 1*256kB (U) 0*512kB 2*1024kB (UM) 0*2048kB 3*4096kB (ME) = 14944kB
> [  343.968026] Node 0 DMA32: 12*4kB (UM) 15*8kB (UM) 28*16kB (UM) 28*32kB (UM) 12*64kB (UM) 21*128kB (UM) 14*256kB (UM) 12*512kB (UM) 24*1024kB (UM) 5*2048kB (E) 1*4096kB (E) = 53608kB
> [  343.972148] Node 0 Normal: 513*4kB (UME) 282*8kB (UME) 215*16kB (UME) 128*32kB (UME) 41*64kB (UME) 12*128kB (UME) 3*256kB (M) 2*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 17796kB
> [  343.976695] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
> [  343.979084] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
> [  343.981469] 2290 total pagecache pages
> [  343.982709] 0 pages in swap cache
> [  343.983989] Swap cache stats: add 0, delete 0, find 0/0
> [  343.987118] Free swap  = 0kB
> [  343.988182] Total swap = 0kB
> [  343.990799] 1048445 pages RAM
> [  343.991844] 0 pages HighMem/MovableOnly
> [  343.994723] 105880 pages reserved
> [  343.995878] 0 pages hwpoisoned
> [  343.998745] Out of memory: Kill process 2929 (a.out) score 999 or sacrifice child
> [  344.002681] Killed process 2929 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> ----------

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  4:32       ` Yafang Shao
@ 2017-11-26  8:03         ` Tetsuo Handa
  2017-11-26  8:27           ` Yafang Shao
  2017-11-26  8:46           ` Yafang Shao
  0 siblings, 2 replies; 37+ messages in thread
From: Tetsuo Handa @ 2017-11-26  8:03 UTC (permalink / raw)
  To: laoar.shao; +Cc: akpm, jack, mhocko, linux-mm

Yafang Shao wrote:
> >> I have also verified your test code on my machine, but can not find
> >> this message.
> >>
> >
> > Not always printed. It is timing dependent.
> >
> 
> I will try and analysis why this happen.
> 
I see.

Here is dump of variables. Always mostly 0 when this happens.

--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -434,7 +434,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
                bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
 
        if (unlikely(bg_thresh >= thresh)) {
-               pr_warn("vm direct limit must be set greater than background limit.\n");
+               pr_warn("vm direct limit must be set greater than background limit. bg_thresh=%lu thresh=%lu bg_bytes=%lu bytes=%lu bg_ratio=%lu ratio=%lu gdtc=%p gdtc->vail=%lu vm_dirty_bytes=%lu dirty_background_bytes=%lu\n",
+                       bg_thresh, thresh, bg_bytes, bytes, bg_ratio, ratio, gdtc, gdtc ? gdtc->avail : 0UL, vm_dirty_bytes, dirty_background_bytes);
                bg_thresh = thresh / 2;
        }

[  259.641324] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  317.798913] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  317.798935] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  317.976210] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  417.781194] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  466.322615] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  466.322618] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  466.497893] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
[  466.504687] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  8:03         ` Tetsuo Handa
@ 2017-11-26  8:27           ` Yafang Shao
  2017-11-26  8:46           ` Yafang Shao
  1 sibling, 0 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-26  8:27 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, Jan Kara, Michal Hocko, Linux MM

2017-11-26 16:03 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> Yafang Shao wrote:
>> >> I have also verified your test code on my machine, but can not find
>> >> this message.
>> >>
>> >
>> > Not always printed. It is timing dependent.
>> >
>>
>> I will try and analysis why this happen.
>>
> I see.
>
> Here is dump of variables. Always mostly 0 when this happens.
>
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -434,7 +434,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>
>         if (unlikely(bg_thresh >= thresh)) {
> -               pr_warn("vm direct limit must be set greater than background limit.\n");
> +               pr_warn("vm direct limit must be set greater than background limit. bg_thresh=%lu thresh=%lu bg_bytes=%lu bytes=%lu bg_ratio=%lu ratio=%lu gdtc=%p gdtc->vail=%lu vm_dirty_bytes=%lu dirty_background_bytes=%lu\n",
> +                       bg_thresh, thresh, bg_bytes, bytes, bg_ratio, ratio, gdtc, gdtc ? gdtc->avail : 0UL, vm_dirty_bytes, dirty_background_bytes);
>                 bg_thresh = thresh / 2;
>         }
>
> [  259.641324] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.798913] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.798935] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.976210] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  417.781194] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.322615] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.322618] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.497893] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.504687] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0

Hi Tetsuo,

Bellow is the code analysis,
     // global_dirtyable_memory() will return number of globally dirtyable page
    gdtc.avail = global_dirtyable_memory();
    domain_dirty_limits(&gdtc);
        unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;   // 1228
        unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE)
/ 100;  // 409
        available_memory = dtc->avail;
        thresh = (ratio * available_memory) / PAGE_SIZE;
        bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;


So if available_memory is less than 4 pages, thresh and bg_thresh will
be both 0,
then the message will be printed.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  8:03         ` Tetsuo Handa
  2017-11-26  8:27           ` Yafang Shao
@ 2017-11-26  8:46           ` Yafang Shao
  2017-11-26 10:38             ` Tetsuo Handa
  1 sibling, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-26  8:46 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, Jan Kara, Michal Hocko, Linux MM

2017-11-26 16:03 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> Yafang Shao wrote:
>> >> I have also verified your test code on my machine, but can not find
>> >> this message.
>> >>
>> >
>> > Not always printed. It is timing dependent.
>> >
>>
>> I will try and analysis why this happen.
>>
> I see.
>
> Here is dump of variables. Always mostly 0 when this happens.
>
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -434,7 +434,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>
>         if (unlikely(bg_thresh >= thresh)) {
> -               pr_warn("vm direct limit must be set greater than background limit.\n");
> +               pr_warn("vm direct limit must be set greater than background limit. bg_thresh=%lu thresh=%lu bg_bytes=%lu bytes=%lu bg_ratio=%lu ratio=%lu gdtc=%p gdtc->vail=%lu vm_dirty_bytes=%lu dirty_background_bytes=%lu\n",
> +                       bg_thresh, thresh, bg_bytes, bytes, bg_ratio, ratio, gdtc, gdtc ? gdtc->avail : 0UL, vm_dirty_bytes, dirty_background_bytes);
>                 bg_thresh = thresh / 2;
>         }

You could print dtc->avail as well.
Seems bg_thresh and thresh are not so acurate as they are interger
other than float.

Should find a smart way to fix it.

>
> [  259.641324] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.798913] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.798935] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  317.976210] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  417.781194] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.322615] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.322618] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.497893] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0
> [  466.504687] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->vail=0 vm_dirty_bytes=0 dirty_background_bytes=0

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26  8:46           ` Yafang Shao
@ 2017-11-26 10:38             ` Tetsuo Handa
  2017-11-27  8:06               ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Tetsuo Handa @ 2017-11-26 10:38 UTC (permalink / raw)
  To: laoar.shao; +Cc: akpm, jack, mhocko, linux-mm

Yafang Shao wrote:
> 2017-11-26 16:03 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> > Yafang Shao wrote:
> >> >> I have also verified your test code on my machine, but can not find
> >> >> this message.
> >> >>
> >> >
> >> > Not always printed. It is timing dependent.
> >> >
> >>
> >> I will try and analysis why this happen.
> >>
> > I see.
> >
> > Here is dump of variables. Always mostly 0 when this happens.
> >
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -434,7 +434,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
> >                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
> >
> >         if (unlikely(bg_thresh >= thresh)) {
> > -               pr_warn("vm direct limit must be set greater than background limit.\n");
> > +               pr_warn("vm direct limit must be set greater than background limit. bg_thresh=%lu thresh=%lu bg_bytes=%lu bytes=%lu bg_ratio=%lu ratio=%lu gdtc=%p gdtc->vail=%lu vm_dirty_bytes=%lu dirty_background_bytes=%lu\n",
> > +                       bg_thresh, thresh, bg_bytes, bytes, bg_ratio, ratio, gdtc, gdtc ? gdtc->avail : 0UL, vm_dirty_bytes, dirty_background_bytes);
> >                 bg_thresh = thresh / 2;
> >         }
> 
> You could print dtc->avail as well.
> Seems bg_thresh and thresh are not so acurate as they are interger
> other than float.

Indeed, dtc->avail < 4 when this message is printed.

[  314.730541] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
[  315.864111] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
[  315.864126] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
[  315.993866] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
[  355.807392] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
[  406.819939] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
[  407.782790] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
[  416.939906] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
[  417.090872] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093164] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093176] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093183] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093191] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093198] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093206] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093213] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093223] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093232] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
[  417.093240] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-26 10:38             ` Tetsuo Handa
@ 2017-11-27  8:06               ` Yafang Shao
  2017-11-27  8:21                 ` Michal Hocko
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  8:06 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Andrew Morton, Jan Kara, Michal Hocko, Linux MM, fcicq

+cc fcicq

2017-11-26 18:38 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
> Yafang Shao wrote:
>> 2017-11-26 16:03 GMT+08:00 Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>:
>> > Yafang Shao wrote:
>> >> >> I have also verified your test code on my machine, but can not find
>> >> >> this message.
>> >> >>
>> >> >
>> >> > Not always printed. It is timing dependent.
>> >> >
>> >>
>> >> I will try and analysis why this happen.
>> >>
>> > I see.
>> >
>> > Here is dump of variables. Always mostly 0 when this happens.
>> >
>> > --- a/mm/page-writeback.c
>> > +++ b/mm/page-writeback.c
>> > @@ -434,7 +434,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>> >                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>> >
>> >         if (unlikely(bg_thresh >= thresh)) {
>> > -               pr_warn("vm direct limit must be set greater than background limit.\n");
>> > +               pr_warn("vm direct limit must be set greater than background limit. bg_thresh=%lu thresh=%lu bg_bytes=%lu bytes=%lu bg_ratio=%lu ratio=%lu gdtc=%p gdtc->vail=%lu vm_dirty_bytes=%lu dirty_background_bytes=%lu\n",
>> > +                       bg_thresh, thresh, bg_bytes, bytes, bg_ratio, ratio, gdtc, gdtc ? gdtc->avail : 0UL, vm_dirty_bytes, dirty_background_bytes);
>> >                 bg_thresh = thresh / 2;
>> >         }
>>
>> You could print dtc->avail as well.
>> Seems bg_thresh and thresh are not so acurate as they are interger
>> other than float.
>
> Indeed, dtc->avail < 4 when this message is printed.
>
> [  314.730541] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
> [  315.864111] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
> [  315.864126] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
> [  315.993866] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
> [  355.807392] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=2
> [  406.819939] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
> [  407.782790] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
> [  416.939906] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=1
> [  417.090872] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093164] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093176] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093183] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093191] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093198] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093206] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093213] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093223] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093232] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3
> [  417.093240] vm direct limit must be set greater than background limit. bg_thresh=0 thresh=0 bg_bytes=0 bytes=0 bg_ratio=409 ratio=1228 gdtc=          (null) gdtc->avail=0 vm_dirty_bytes=0 dirty_background_bytes=0 dtc->avail=3


What about bellow change ?

diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8a15511..6c5c018 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
    if (!vm_highmem_is_dirtyable)
        x -= highmem_dirtyable_memory(x);

-   return x + 1;   /* Ensure that we never return 0 */
+   /*
+    * - Why 100 ?
+    * - Because the return value will be used by dirty ratio and
+    *   dirty background ratio to calculate dirty thresh and bg thresh,
+    *   so if the return value is two small, the thresh value maybe
+    *   calculated to 0.
+    *   As the max value of ratio is 100, so the return value is added
+    *   by 100 here.
+    */
+   return x + 100;



Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:06               ` Yafang Shao
@ 2017-11-27  8:21                 ` Michal Hocko
  2017-11-27  8:29                   ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  8:21 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

On Mon 27-11-17 16:06:50, Yafang Shao wrote:
> +cc fcicq
[...]
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 8a15511..6c5c018 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
>     if (!vm_highmem_is_dirtyable)
>         x -= highmem_dirtyable_memory(x);
> 
> -   return x + 1;   /* Ensure that we never return 0 */
> +   /*
> +    * - Why 100 ?
> +    * - Because the return value will be used by dirty ratio and
> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
> +    *   so if the return value is two small, the thresh value maybe
> +    *   calculated to 0.
> +    *   As the max value of ratio is 100, so the return value is added
> +    *   by 100 here.
> +    */
> +   return x + 100;

No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
warning if the vm dirtiness settings are illogical") because it is of a
dubious value and it causes problems. I am not even sure why it got
merged. It doesn't have any ack or review and I remember objecting to
the patch previously as pointless.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:21                 ` Michal Hocko
@ 2017-11-27  8:29                   ` Yafang Shao
  2017-11-27  8:32                     ` Yafang Shao
  2017-11-27  8:34                     ` Michal Hocko
  0 siblings, 2 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  8:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

2017-11-27 16:21 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Mon 27-11-17 16:06:50, Yafang Shao wrote:
>> +cc fcicq
> [...]
>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index 8a15511..6c5c018 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
>>     if (!vm_highmem_is_dirtyable)
>>         x -= highmem_dirtyable_memory(x);
>>
>> -   return x + 1;   /* Ensure that we never return 0 */
>> +   /*
>> +    * - Why 100 ?
>> +    * - Because the return value will be used by dirty ratio and
>> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
>> +    *   so if the return value is two small, the thresh value maybe
>> +    *   calculated to 0.
>> +    *   As the max value of ratio is 100, so the return value is added
>> +    *   by 100 here.
>> +    */
>> +   return x + 100;
>
> No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
> warning if the vm dirtiness settings are illogical") because it is of a
> dubious value and it causes problems. I am not even sure why it got
> merged. It doesn't have any ack or review and I remember objecting to
> the patch previously as pointless.
> --

It is reviewed and merged by Andrew.

>From Andrew:
> I think this means that a script which alters both dirty_bytes and
dirty_background_bytes must alter dirty_background_bytes first if they
are being decreased and must alter dirty_bytes first if they are being
increased.  Or something like that.

It will help us to find the error if we don't change these values like this.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:29                   ` Yafang Shao
@ 2017-11-27  8:32                     ` Yafang Shao
  2017-11-27  8:37                       ` Michal Hocko
  2017-11-27  8:34                     ` Michal Hocko
  1 sibling, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  8:32 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> 2017-11-27 16:21 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> On Mon 27-11-17 16:06:50, Yafang Shao wrote:
>>> +cc fcicq
>> [...]
>>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>> index 8a15511..6c5c018 100644
>>> --- a/mm/page-writeback.c
>>> +++ b/mm/page-writeback.c
>>> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
>>>     if (!vm_highmem_is_dirtyable)
>>>         x -= highmem_dirtyable_memory(x);
>>>
>>> -   return x + 1;   /* Ensure that we never return 0 */
>>> +   /*
>>> +    * - Why 100 ?
>>> +    * - Because the return value will be used by dirty ratio and
>>> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
>>> +    *   so if the return value is two small, the thresh value maybe
>>> +    *   calculated to 0.
>>> +    *   As the max value of ratio is 100, so the return value is added
>>> +    *   by 100 here.
>>> +    */
>>> +   return x + 100;
>>
>> No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
>> warning if the vm dirtiness settings are illogical") because it is of a
>> dubious value and it causes problems. I am not even sure why it got
>> merged. It doesn't have any ack or review and I remember objecting to
>> the patch previously as pointless.
>> --
>
> It is reviewed and merged by Andrew.
>
> From Andrew:
>> I think this means that a script which alters both dirty_bytes and
> dirty_background_bytes must alter dirty_background_bytes first if they
> are being decreased and must alter dirty_bytes first if they are being
> increased.  Or something like that.
>
> It will help us to find the error if we don't change these values like this.
>

And actually it help us find another issue that when availble_memroy
is too small, the thresh and bg_thresh will be 0, that's absolutely
wrong.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:29                   ` Yafang Shao
  2017-11-27  8:32                     ` Yafang Shao
@ 2017-11-27  8:34                     ` Michal Hocko
  1 sibling, 0 replies; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  8:34 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

On Mon 27-11-17 16:29:13, Yafang Shao wrote:
> 2017-11-27 16:21 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Mon 27-11-17 16:06:50, Yafang Shao wrote:
> >> +cc fcicq
> > [...]
> >> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> >> index 8a15511..6c5c018 100644
> >> --- a/mm/page-writeback.c
> >> +++ b/mm/page-writeback.c
> >> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
> >>     if (!vm_highmem_is_dirtyable)
> >>         x -= highmem_dirtyable_memory(x);
> >>
> >> -   return x + 1;   /* Ensure that we never return 0 */
> >> +   /*
> >> +    * - Why 100 ?
> >> +    * - Because the return value will be used by dirty ratio and
> >> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
> >> +    *   so if the return value is two small, the thresh value maybe
> >> +    *   calculated to 0.
> >> +    *   As the max value of ratio is 100, so the return value is added
> >> +    *   by 100 here.
> >> +    */
> >> +   return x + 100;
> >
> > No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
> > warning if the vm dirtiness settings are illogical") because it is of a
> > dubious value and it causes problems. I am not even sure why it got
> > merged. It doesn't have any ack or review and I remember objecting to
> > the patch previously as pointless.
> > --
> 
> It is reviewed and merged by Andrew.
> 
> >From Andrew:
> I think this means that a script which alters both dirty_bytes and
> dirty_background_bytes must alter dirty_background_bytes first if they
> are being decreased and must alter dirty_bytes first if they are being
> increased.  Or something like that.

I still maintain my position that we _assume_ that admin knows what he
does and many things will break if that is not the case. Now you see
that your patch is dumping pointless warnings and you are trying to add
an ugly hack to silence them. No, this is not the way we should go. Nack
to the above. Really we should simply revert the patch. It's value is
dubious at best and it causes false positives. I do not see any reason
to not do so.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:32                     ` Yafang Shao
@ 2017-11-27  8:37                       ` Michal Hocko
  2017-11-27  8:49                         ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  8:37 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

On Mon 27-11-17 16:32:42, Yafang Shao wrote:
> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> > 2017-11-27 16:21 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> >> On Mon 27-11-17 16:06:50, Yafang Shao wrote:
> >>> +cc fcicq
> >> [...]
> >>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> >>> index 8a15511..6c5c018 100644
> >>> --- a/mm/page-writeback.c
> >>> +++ b/mm/page-writeback.c
> >>> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
> >>>     if (!vm_highmem_is_dirtyable)
> >>>         x -= highmem_dirtyable_memory(x);
> >>>
> >>> -   return x + 1;   /* Ensure that we never return 0 */
> >>> +   /*
> >>> +    * - Why 100 ?
> >>> +    * - Because the return value will be used by dirty ratio and
> >>> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
> >>> +    *   so if the return value is two small, the thresh value maybe
> >>> +    *   calculated to 0.
> >>> +    *   As the max value of ratio is 100, so the return value is added
> >>> +    *   by 100 here.
> >>> +    */
> >>> +   return x + 100;
> >>
> >> No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
> >> warning if the vm dirtiness settings are illogical") because it is of a
> >> dubious value and it causes problems. I am not even sure why it got
> >> merged. It doesn't have any ack or review and I remember objecting to
> >> the patch previously as pointless.
> >> --
> >
> > It is reviewed and merged by Andrew.
> >
> > From Andrew:
> >> I think this means that a script which alters both dirty_bytes and
> > dirty_background_bytes must alter dirty_background_bytes first if they
> > are being decreased and must alter dirty_bytes first if they are being
> > increased.  Or something like that.
> >
> > It will help us to find the error if we don't change these values like this.
> >
> 
> And actually it help us find another issue that when availble_memroy
> is too small, the thresh and bg_thresh will be 0, that's absolutely
> wrong.

Why is it wrong?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:37                       ` Michal Hocko
@ 2017-11-27  8:49                         ` Yafang Shao
  2017-11-27  8:52                           ` Michal Hocko
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  8:49 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

2017-11-27 16:37 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Mon 27-11-17 16:32:42, Yafang Shao wrote:
>> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
>> > 2017-11-27 16:21 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> >> On Mon 27-11-17 16:06:50, Yafang Shao wrote:
>> >>> +cc fcicq
>> >> [...]
>> >>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> >>> index 8a15511..6c5c018 100644
>> >>> --- a/mm/page-writeback.c
>> >>> +++ b/mm/page-writeback.c
>> >>> @@ -377,7 +377,16 @@ static unsigned long global_dirtyable_memory(void)
>> >>>     if (!vm_highmem_is_dirtyable)
>> >>>         x -= highmem_dirtyable_memory(x);
>> >>>
>> >>> -   return x + 1;   /* Ensure that we never return 0 */
>> >>> +   /*
>> >>> +    * - Why 100 ?
>> >>> +    * - Because the return value will be used by dirty ratio and
>> >>> +    *   dirty background ratio to calculate dirty thresh and bg thresh,
>> >>> +    *   so if the return value is two small, the thresh value maybe
>> >>> +    *   calculated to 0.
>> >>> +    *   As the max value of ratio is 100, so the return value is added
>> >>> +    *   by 100 here.
>> >>> +    */
>> >>> +   return x + 100;
>> >>
>> >> No. We should just revert 0f6d24f87856 ("mm/page-writeback.c: print a
>> >> warning if the vm dirtiness settings are illogical") because it is of a
>> >> dubious value and it causes problems. I am not even sure why it got
>> >> merged. It doesn't have any ack or review and I remember objecting to
>> >> the patch previously as pointless.
>> >> --
>> >
>> > It is reviewed and merged by Andrew.
>> >
>> > From Andrew:
>> >> I think this means that a script which alters both dirty_bytes and
>> > dirty_background_bytes must alter dirty_background_bytes first if they
>> > are being decreased and must alter dirty_bytes first if they are being
>> > increased.  Or something like that.
>> >
>> > It will help us to find the error if we don't change these values like this.
>> >
>>
>> And actually it help us find another issue that when availble_memroy
>> is too small, the thresh and bg_thresh will be 0, that's absolutely
>> wrong.
>
> Why is it wrong?
> --

For example, the writeback threads will be wakeup on every write.
I don't think it is meaningful to wakeup the writeback thread when the
dirty pages is very low.


Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:49                         ` Yafang Shao
@ 2017-11-27  8:52                           ` Michal Hocko
  2017-11-27  8:54                             ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  8:52 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

On Mon 27-11-17 16:49:34, Yafang Shao wrote:
> 2017-11-27 16:37 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Mon 27-11-17 16:32:42, Yafang Shao wrote:
> >> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
[...]
> >> > It will help us to find the error if we don't change these values like this.
> >> >
> >>
> >> And actually it help us find another issue that when availble_memroy
> >> is too small, the thresh and bg_thresh will be 0, that's absolutely
> >> wrong.
> >
> > Why is it wrong?
> > --
> 
> For example, the writeback threads will be wakeup on every write.
> I don't think it is meaningful to wakeup the writeback thread when the
> dirty pages is very low.

Well, this is a corner situation when we are out of memory basically.
Doing a wake up on the flusher is the least of your problem. So _why_
exactly is this is problem?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:52                           ` Michal Hocko
@ 2017-11-27  8:54                             ` Yafang Shao
  2017-11-27  9:04                               ` Michal Hocko
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  8:54 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

2017-11-27 16:52 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Mon 27-11-17 16:49:34, Yafang Shao wrote:
>> 2017-11-27 16:37 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> > On Mon 27-11-17 16:32:42, Yafang Shao wrote:
>> >> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> [...]
>> >> > It will help us to find the error if we don't change these values like this.
>> >> >
>> >>
>> >> And actually it help us find another issue that when availble_memroy
>> >> is too small, the thresh and bg_thresh will be 0, that's absolutely
>> >> wrong.
>> >
>> > Why is it wrong?
>> > --
>>
>> For example, the writeback threads will be wakeup on every write.
>> I don't think it is meaningful to wakeup the writeback thread when the
>> dirty pages is very low.
>
> Well, this is a corner situation when we are out of memory basically.
> Doing a wake up on the flusher is the least of your problem. So _why_
> exactly is this is problem?
> --

Are you _sure_ this is the least of  the problem on this corner situation ?
Why wakeup the bdi writeback ? why not just kill the program and flush
the date into the Disk when we do oom ?

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  8:54                             ` Yafang Shao
@ 2017-11-27  9:04                               ` Michal Hocko
  2017-11-27  9:08                                 ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  9:04 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

On Mon 27-11-17 16:54:39, Yafang Shao wrote:
> 2017-11-27 16:52 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Mon 27-11-17 16:49:34, Yafang Shao wrote:
> >> 2017-11-27 16:37 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> >> > On Mon 27-11-17 16:32:42, Yafang Shao wrote:
> >> >> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> > [...]
> >> >> > It will help us to find the error if we don't change these values like this.
> >> >> >
> >> >>
> >> >> And actually it help us find another issue that when availble_memroy
> >> >> is too small, the thresh and bg_thresh will be 0, that's absolutely
> >> >> wrong.
> >> >
> >> > Why is it wrong?
> >> > --
> >>
> >> For example, the writeback threads will be wakeup on every write.
> >> I don't think it is meaningful to wakeup the writeback thread when the
> >> dirty pages is very low.
> >
> > Well, this is a corner situation when we are out of memory basically.
> > Doing a wake up on the flusher is the least of your problem. So _why_
> > exactly is this is problem?
> > --
> 
> Are you _sure_ this is the least of the problem on this corner situation ?

I am not _sure_ and that is why I'm _asking_ _you_ and you seem to come
up with reasons which don't make me really convinced.

If we wake up flusher which have nothing to do they will simply back
off. That is what have to do anyway because dirty data can be truncated
at any time, right?

> Why wakeup the bdi writeback ? why not just kill the program and flush
> the date into the Disk when we do oom ?

I really do not see how this is related to the discussion here. OOM
killer doesn't flush anything. It kills a task and that is it. Moreover
we do not flush any data from direct context at all. We simply rely on
flushers to do the work.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical
  2017-11-27  9:04                               ` Michal Hocko
@ 2017-11-27  9:08                                 ` Yafang Shao
  0 siblings, 0 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-27  9:08 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM, fcicq

2017-11-27 17:04 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Mon 27-11-17 16:54:39, Yafang Shao wrote:
>> 2017-11-27 16:52 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> > On Mon 27-11-17 16:49:34, Yafang Shao wrote:
>> >> 2017-11-27 16:37 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> >> > On Mon 27-11-17 16:32:42, Yafang Shao wrote:
>> >> >> 2017-11-27 16:29 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
>> > [...]
>> >> >> > It will help us to find the error if we don't change these values like this.
>> >> >> >
>> >> >>
>> >> >> And actually it help us find another issue that when availble_memroy
>> >> >> is too small, the thresh and bg_thresh will be 0, that's absolutely
>> >> >> wrong.
>> >> >
>> >> > Why is it wrong?
>> >> > --
>> >>
>> >> For example, the writeback threads will be wakeup on every write.
>> >> I don't think it is meaningful to wakeup the writeback thread when the
>> >> dirty pages is very low.
>> >
>> > Well, this is a corner situation when we are out of memory basically.
>> > Doing a wake up on the flusher is the least of your problem. So _why_
>> > exactly is this is problem?
>> > --
>>
>> Are you _sure_ this is the least of the problem on this corner situation ?
>
> I am not _sure_ and that is why I'm _asking_ _you_ and you seem to come
> up with reasons which don't make me really convinced.
>
> If we wake up flusher which have nothing to do they will simply back
> off. That is what have to do anyway because dirty data can be truncated
> at any time, right?
>

Under this condition it is better not to wakeup the bdi writeback.
That's why I return the min value 100 to avoid doing this.
Maybe 100 is not enough. It should be geater IMHO.

>> Why wakeup the bdi writeback ? why not just kill the program and flush
>> the date into the Disk when we do oom ?
>
> I really do not see how this is related to the discussion here. OOM
> killer doesn't flush anything. It kills a task and that is it. Moreover
> we do not flush any data from direct context at all. We simply rely on
> flushers to do the work.
> --
That's my fault.
Just killing the program is enough.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-25 16:05 ` Tetsuo Handa
  2017-11-26  2:24   ` Yafang Shao
@ 2017-11-27  9:19   ` Michal Hocko
  2017-11-28  3:11     ` Yafang Shao
  2017-11-28 10:37     ` Jan Kara
  1 sibling, 2 replies; 37+ messages in thread
From: Michal Hocko @ 2017-11-27  9:19 UTC (permalink / raw)
  To: Tetsuo Handa; +Cc: Yafang Shao, akpm, jack, linux-mm

Andrew,
could you simply send this to Linus. If we _really_ need something to
prevent misconfiguration, which I doubt to be honest, then it should be
thought through much better.
---

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-27  9:19   ` [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical) Michal Hocko
@ 2017-11-28  3:11     ` Yafang Shao
  2017-11-28  6:12       ` Yafang Shao
  2017-11-28 10:25       ` Jan Kara
  2017-11-28 10:37     ` Jan Kara
  1 sibling, 2 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-28  3:11 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

Hi Michal,

What about bellow change ?
It makes the function  domain_dirty_limits() more clear.
And the result will have a higher precision.


diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index 8a15511..2b5e507 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
dirty_throttle_control *dtc)
    unsigned long bytes = vm_dirty_bytes;
    unsigned long bg_bytes = dirty_background_bytes;
    /* convert ratios to per-PAGE_SIZE for higher precision */
-   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
-   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
+   unsigned long ratio = vm_dirty_ratio;
+   unsigned long bg_ratio = dirty_background_ratio;
    unsigned long thresh;
    unsigned long bg_thresh;
    struct task_struct *tsk;
@@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
dirty_throttle_control *dtc)
         */
        if (bytes)
            ratio = min(DIV_ROUND_UP(bytes, global_avail),
-                   PAGE_SIZE);
+                   100);
        if (bg_bytes)
            bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
-                      PAGE_SIZE);
+                      99);   /* bg_ratio should less than ratio */
        bytes = bg_bytes = 0;
    }

+   /* bytes and bg_bytes must be PAGE_SIZE aligned */
    if (bytes)
-       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE);
+       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE) * 100;
    else
-       thresh = (ratio * available_memory) / PAGE_SIZE;
+       thresh = ratio * available_memory;

    if (bg_bytes)
-       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
+       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE) * 100;
    else
-       bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
+       bg_thresh = bg_ratio * available_memory;

    if (unlikely(bg_thresh >= thresh)) {
        pr_warn("vm direct limit must be set greater than background limit.\n");
        bg_thresh = thresh / 2;
    }

+   /* ensure bg_thresh and thresh never be 0 */
+   bg_thresh = DIV_ROUND_UP(bg_thresh, 100);
+   thresh = DIV_ROUND_UP(thresh, 100);
+
    tsk = current;
    if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {

2017-11-27 17:19 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> Andrew,
> could you simply send this to Linus. If we _really_ need something to
> prevent misconfiguration, which I doubt to be honest, then it should be
> thought through much better.
> ---
> From 4ef6b1cbf98ea5dae155ab3303c4ae1d93411b79 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Mon, 27 Nov 2017 10:12:15 +0100
> Subject: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm
>  dirtiness settings are illogical"
>
> This reverts commit 0f6d24f878568fac579a1962d0bf7cb9f01e0ceb because
> it causes false positive warnings during OOM situations as noticed by
> Tetsuo Handa:
> [  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB s
> hmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> [  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocke
> d:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.829035] lowmem_reserve[]: 0 2687 3645 3645
> [  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:
> 2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.839945] lowmem_reserve[]: 0 0 958 958
> [  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576k
> B managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
> [  621.852473] lowmem_reserve[]: 0 0 0 0
> [...]
> [  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
> [  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> [  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [  622.424664] vm direct limit must be set greater than background limit.
>
> The problem is that both thresh and bg_thresh will be 0 if available_memory
>  is less than 4 pages when evaluating global_dirtyable_memory. While
> this might be worked around the whole point of the warning is dubious at
> best. We do rely on admins to do sensible things when changing tunable
> knobs. Dirty memory writeback knobs are not any special in that regards
> so revert the warning rather than adding more hacks to work this around.
>
> Rerported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Debugged-by: Yafang Shao <laoar.shao@gmail.com>
> Fixes: 0f6d24f87856 ("mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/sysctl/vm.txt | 7 -------
>  mm/page-writeback.c         | 5 +----
>  2 files changed, 1 insertion(+), 11 deletions(-)
>
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index b920423f88cb..5025ff9307e6 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -158,10 +158,6 @@ Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
>  value lower than this limit will be ignored and the old configuration will be
>  retained.
>
> -Note: the value of dirty_bytes also must be set greater than
> -dirty_background_bytes or the amount of memory corresponding to
> -dirty_background_ratio.
> -
>  ==============================================================
>
>  dirty_expire_centisecs
> @@ -181,9 +177,6 @@ generating disk writes will itself start writing out dirty data.
>
>  The total available memory is not equal to total system memory.
>
> -Note: dirty_ratio must be set greater than dirty_background_ratio or
> -ratio corresponding to dirty_background_bytes.
> -
>  ==============================================================
>
>  dirty_writeback_centisecs
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index e7095030aa1f..586f31261c83 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -433,11 +433,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>         else
>                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>
> -       if (unlikely(bg_thresh >= thresh)) {
> -               pr_warn("vm direct limit must be set greater than background limit.\n");
> +       if (bg_thresh >= thresh)
>                 bg_thresh = thresh / 2;
> -       }
> -
>         tsk = current;
>         if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
>                 bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
> --
> 2.15.0
>
> --
> Michal Hocko
> SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  3:11     ` Yafang Shao
@ 2017-11-28  6:12       ` Yafang Shao
  2017-11-28  7:45         ` Michal Hocko
  2017-11-28 10:25       ` Jan Kara
  1 sibling, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-28  6:12 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> Hi Michal,
>
> What about bellow change ?
> It makes the function  domain_dirty_limits() more clear.
> And the result will have a higher precision.
>
>
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 8a15511..2b5e507 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
> dirty_throttle_control *dtc)
>     unsigned long bytes = vm_dirty_bytes;
>     unsigned long bg_bytes = dirty_background_bytes;
>     /* convert ratios to per-PAGE_SIZE for higher precision */
> -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
> -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
> +   unsigned long ratio = vm_dirty_ratio;
> +   unsigned long bg_ratio = dirty_background_ratio;
>     unsigned long thresh;
>     unsigned long bg_thresh;
>     struct task_struct *tsk;
> @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
> dirty_throttle_control *dtc)
>          */
>         if (bytes)
>             ratio = min(DIV_ROUND_UP(bytes, global_avail),
> -                   PAGE_SIZE);
> +                   100);
>         if (bg_bytes)
>             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> -                      PAGE_SIZE);
> +                      99);   /* bg_ratio should less than ratio */
>         bytes = bg_bytes = 0;
>     }


Errata:

        if (bytes)
-           ratio = min(DIV_ROUND_UP(bytes, global_avail),
-                   PAGE_SIZE);
+           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
+                   100);
        if (bg_bytes)
-           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
-                      PAGE_SIZE);
+           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
+                      100 - 1); /* bg_ratio should be less than ratio */
        bytes = bg_bytes = 0;

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  6:12       ` Yafang Shao
@ 2017-11-28  7:45         ` Michal Hocko
  2017-11-28  7:52           ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-28  7:45 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

On Tue 28-11-17 14:12:15, Yafang Shao wrote:
> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> > Hi Michal,
> >
> > What about bellow change ?
> > It makes the function  domain_dirty_limits() more clear.
> > And the result will have a higher precision.
> >
> >
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index 8a15511..2b5e507 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
> > dirty_throttle_control *dtc)
> >     unsigned long bytes = vm_dirty_bytes;
> >     unsigned long bg_bytes = dirty_background_bytes;
> >     /* convert ratios to per-PAGE_SIZE for higher precision */
> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
> > +   unsigned long ratio = vm_dirty_ratio;
> > +   unsigned long bg_ratio = dirty_background_ratio;
> >     unsigned long thresh;
> >     unsigned long bg_thresh;
> >     struct task_struct *tsk;
> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
> > dirty_throttle_control *dtc)
> >          */
> >         if (bytes)
> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
> > -                   PAGE_SIZE);
> > +                   100);
> >         if (bg_bytes)
> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> > -                      PAGE_SIZE);
> > +                      99);   /* bg_ratio should less than ratio */
> >         bytes = bg_bytes = 0;
> >     }
> 
> 
> Errata:
> 
>         if (bytes)
> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
> -                   PAGE_SIZE);
> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
> +                   100);
>         if (bg_bytes)
> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> -                      PAGE_SIZE);
> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
> +                      100 - 1); /* bg_ratio should be less than ratio */
>         bytes = bg_bytes = 0;

And you really think this makes code easier to follow? I am somehow not
conviced...

-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  7:45         ` Michal Hocko
@ 2017-11-28  7:52           ` Yafang Shao
  2017-11-28  9:43             ` Yafang Shao
                               ` (2 more replies)
  0 siblings, 3 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-28  7:52 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

2017-11-28 15:45 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Tue 28-11-17 14:12:15, Yafang Shao wrote:
>> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
>> > Hi Michal,
>> >
>> > What about bellow change ?
>> > It makes the function  domain_dirty_limits() more clear.
>> > And the result will have a higher precision.
>> >
>> >
>> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> > index 8a15511..2b5e507 100644
>> > --- a/mm/page-writeback.c
>> > +++ b/mm/page-writeback.c
>> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
>> > dirty_throttle_control *dtc)
>> >     unsigned long bytes = vm_dirty_bytes;
>> >     unsigned long bg_bytes = dirty_background_bytes;
>> >     /* convert ratios to per-PAGE_SIZE for higher precision */
>> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
>> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
>> > +   unsigned long ratio = vm_dirty_ratio;
>> > +   unsigned long bg_ratio = dirty_background_ratio;
>> >     unsigned long thresh;
>> >     unsigned long bg_thresh;
>> >     struct task_struct *tsk;
>> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
>> > dirty_throttle_control *dtc)
>> >          */
>> >         if (bytes)
>> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> > -                   PAGE_SIZE);
>> > +                   100);
>> >         if (bg_bytes)
>> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> > -                      PAGE_SIZE);
>> > +                      99);   /* bg_ratio should less than ratio */
>> >         bytes = bg_bytes = 0;
>> >     }
>>
>>
>> Errata:
>>
>>         if (bytes)
>> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> -                   PAGE_SIZE);
>> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
>> +                   100);
>>         if (bg_bytes)
>> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> -                      PAGE_SIZE);
>> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
>> +                      100 - 1); /* bg_ratio should be less than ratio */
>>         bytes = bg_bytes = 0;
>
> And you really think this makes code easier to follow? I am somehow not
> conviced...
>

There's hidden bug in the original code, because it is too complex to
clearly understand.
See bellow,

ratio = min(DIV_ROUND_UP(bytes, global_avail),
                    PAGE_SIZE)

Suppose the vm_dirty_bytes is set to 512M (this is a reasonable
value), and the global_avail is only 10000 pages (this is not low),
then DIV_ROUND_UP(bytes, global_avail) is 53688, which is bigger than
4096, so the ratio will be 4096.
That's unreasonable.


Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  7:52           ` Yafang Shao
@ 2017-11-28  9:43             ` Yafang Shao
  2017-11-28  9:45             ` Michal Hocko
  2017-11-28 10:09             ` Jan Kara
  2 siblings, 0 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-28  9:43 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

2017-11-28 15:52 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> 2017-11-28 15:45 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> On Tue 28-11-17 14:12:15, Yafang Shao wrote:
>>> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
>>> > Hi Michal,
>>> >
>>> > What about bellow change ?
>>> > It makes the function  domain_dirty_limits() more clear.
>>> > And the result will have a higher precision.
>>> >
>>> >
>>> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>>> > index 8a15511..2b5e507 100644
>>> > --- a/mm/page-writeback.c
>>> > +++ b/mm/page-writeback.c
>>> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
>>> > dirty_throttle_control *dtc)
>>> >     unsigned long bytes = vm_dirty_bytes;
>>> >     unsigned long bg_bytes = dirty_background_bytes;
>>> >     /* convert ratios to per-PAGE_SIZE for higher precision */
>>> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
>>> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
>>> > +   unsigned long ratio = vm_dirty_ratio;
>>> > +   unsigned long bg_ratio = dirty_background_ratio;
>>> >     unsigned long thresh;
>>> >     unsigned long bg_thresh;
>>> >     struct task_struct *tsk;
>>> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
>>> > dirty_throttle_control *dtc)
>>> >          */
>>> >         if (bytes)
>>> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
>>> > -                   PAGE_SIZE);
>>> > +                   100);
>>> >         if (bg_bytes)
>>> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>>> > -                      PAGE_SIZE);
>>> > +                      99);   /* bg_ratio should less than ratio */
>>> >         bytes = bg_bytes = 0;
>>> >     }
>>>
>>>
>>> Errata:
>>>
>>>         if (bytes)
>>> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
>>> -                   PAGE_SIZE);
>>> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
>>> +                   100);
>>>         if (bg_bytes)
>>> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>>> -                      PAGE_SIZE);
>>> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
>>> +                      100 - 1); /* bg_ratio should be less than ratio */
>>>         bytes = bg_bytes = 0;
>>
>> And you really think this makes code easier to follow? I am somehow not
>> conviced...
>>
>
> There's hidden bug in the original code, because it is too complex to
> clearly understand.
> See bellow,
>
> ratio = min(DIV_ROUND_UP(bytes, global_avail),
>                     PAGE_SIZE)
>
> Suppose the vm_dirty_bytes is set to 512M (this is a reasonable
> value), and the global_avail is only 10000 pages (this is not low),
> then DIV_ROUND_UP(bytes, global_avail) is 53688, which is bigger than
> 4096, so the ratio will be 4096.
> That's unreasonable.
>

Besides, when  gdtc is NULL(meaning not for  memcg),  bg_thresh and
thresh could both be bigger than available_memory when
available_memory is very low.
So what is your opinion on that confused code ?

My opinion is when available_memory is very low, don't wake up
for_background writeback, just let the  for_kupdate writeback flush
the dirty data.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  7:52           ` Yafang Shao
  2017-11-28  9:43             ` Yafang Shao
@ 2017-11-28  9:45             ` Michal Hocko
  2017-11-28 10:09             ` Jan Kara
  2 siblings, 0 replies; 37+ messages in thread
From: Michal Hocko @ 2017-11-28  9:45 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

On Tue 28-11-17 15:52:50, Yafang Shao wrote:
> 2017-11-28 15:45 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Tue 28-11-17 14:12:15, Yafang Shao wrote:
> >> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> >> > Hi Michal,
> >> >
> >> > What about bellow change ?
> >> > It makes the function  domain_dirty_limits() more clear.
> >> > And the result will have a higher precision.
> >> >
> >> >
> >> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> >> > index 8a15511..2b5e507 100644
> >> > --- a/mm/page-writeback.c
> >> > +++ b/mm/page-writeback.c
> >> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
> >> > dirty_throttle_control *dtc)
> >> >     unsigned long bytes = vm_dirty_bytes;
> >> >     unsigned long bg_bytes = dirty_background_bytes;
> >> >     /* convert ratios to per-PAGE_SIZE for higher precision */
> >> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
> >> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
> >> > +   unsigned long ratio = vm_dirty_ratio;
> >> > +   unsigned long bg_ratio = dirty_background_ratio;
> >> >     unsigned long thresh;
> >> >     unsigned long bg_thresh;
> >> >     struct task_struct *tsk;
> >> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
> >> > dirty_throttle_control *dtc)
> >> >          */
> >> >         if (bytes)
> >> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
> >> > -                   PAGE_SIZE);
> >> > +                   100);
> >> >         if (bg_bytes)
> >> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> >> > -                      PAGE_SIZE);
> >> > +                      99);   /* bg_ratio should less than ratio */
> >> >         bytes = bg_bytes = 0;
> >> >     }
> >>
> >>
> >> Errata:
> >>
> >>         if (bytes)
> >> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
> >> -                   PAGE_SIZE);
> >> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
> >> +                   100);
> >>         if (bg_bytes)
> >> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> >> -                      PAGE_SIZE);
> >> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
> >> +                      100 - 1); /* bg_ratio should be less than ratio */
> >>         bytes = bg_bytes = 0;
> >
> > And you really think this makes code easier to follow? I am somehow not
> > conviced...
> >
> 
> There's hidden bug in the original code, because it is too complex to
> clearly understand.
> See bellow,
> 
> ratio = min(DIV_ROUND_UP(bytes, global_avail),
>                     PAGE_SIZE)
> 
> Suppose the vm_dirty_bytes is set to 512M (this is a reasonable
> value), and the global_avail is only 10000 pages (this is not low),
> then DIV_ROUND_UP(bytes, global_avail) is 53688, which is bigger than
> 4096, so the ratio will be 4096.
> That's unreasonable.

You should discuss this with IO people (make sure to CC Tejun). I do not
see _why_ this is unreasonable to be honest. Anyway I still maintain
my position on the warning which is just calling for false positives
without any great advantage. So whatever you decide to change in the
ratio calculation I do not think we should add the warning back.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  7:52           ` Yafang Shao
  2017-11-28  9:43             ` Yafang Shao
  2017-11-28  9:45             ` Michal Hocko
@ 2017-11-28 10:09             ` Jan Kara
  2017-11-28 10:16               ` Yafang Shao
  2 siblings, 1 reply; 37+ messages in thread
From: Jan Kara @ 2017-11-28 10:09 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

On Tue 28-11-17 15:52:50, Yafang Shao wrote:
> 2017-11-28 15:45 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Tue 28-11-17 14:12:15, Yafang Shao wrote:
> >> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
> >> > Hi Michal,
> >> >
> >> > What about bellow change ?
> >> > It makes the function  domain_dirty_limits() more clear.
> >> > And the result will have a higher precision.
> >> >
> >> >
> >> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> >> > index 8a15511..2b5e507 100644
> >> > --- a/mm/page-writeback.c
> >> > +++ b/mm/page-writeback.c
> >> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
> >> > dirty_throttle_control *dtc)
> >> >     unsigned long bytes = vm_dirty_bytes;
> >> >     unsigned long bg_bytes = dirty_background_bytes;
> >> >     /* convert ratios to per-PAGE_SIZE for higher precision */
> >> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
> >> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
> >> > +   unsigned long ratio = vm_dirty_ratio;
> >> > +   unsigned long bg_ratio = dirty_background_ratio;
> >> >     unsigned long thresh;
> >> >     unsigned long bg_thresh;
> >> >     struct task_struct *tsk;
> >> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
> >> > dirty_throttle_control *dtc)
> >> >          */
> >> >         if (bytes)
> >> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
> >> > -                   PAGE_SIZE);
> >> > +                   100);
> >> >         if (bg_bytes)
> >> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> >> > -                      PAGE_SIZE);
> >> > +                      99);   /* bg_ratio should less than ratio */
> >> >         bytes = bg_bytes = 0;
> >> >     }
> >>
> >>
> >> Errata:
> >>
> >>         if (bytes)
> >> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
> >> -                   PAGE_SIZE);
> >> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
> >> +                   100);
> >>         if (bg_bytes)
> >> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> >> -                      PAGE_SIZE);
> >> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
> >> +                      100 - 1); /* bg_ratio should be less than ratio */
> >>         bytes = bg_bytes = 0;
> >
> > And you really think this makes code easier to follow? I am somehow not
> > conviced...
> >
> 
> There's hidden bug in the original code, because it is too complex to
> clearly understand.
> See bellow,
> 
> ratio = min(DIV_ROUND_UP(bytes, global_avail),
>                     PAGE_SIZE)
> 
> Suppose the vm_dirty_bytes is set to 512M (this is a reasonable
> value), and the global_avail is only 10000 pages (this is not low),
> then DIV_ROUND_UP(bytes, global_avail) is 53688, which is bigger than
> 4096, so the ratio will be 4096.
> That's unreasonable.

But that's not a bug in domain_dirty_limits(). It is more a design issue of
the dirty_bytes interface - i.e., if you tell the system that 512M of dirty
pages is fine, then it is fine even if you have only 400M of page cache -
i.e., 100% of page cache can be dirty and that's what the function
computes.  Bad luck if you don't like that but that's how the interface was
(mis)designed. We can talk about changes to what dirty_bytes mean under a
situation when there is low amount of page cache but that will be a
userspace visible change and we will have to be *very* careful not to break
current users.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:09             ` Jan Kara
@ 2017-11-28 10:16               ` Yafang Shao
  0 siblings, 0 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-28 10:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Linux MM

2017-11-28 18:09 GMT+08:00 Jan Kara <jack@suse.cz>:
> On Tue 28-11-17 15:52:50, Yafang Shao wrote:
>> 2017-11-28 15:45 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> > On Tue 28-11-17 14:12:15, Yafang Shao wrote:
>> >> 2017-11-28 11:11 GMT+08:00 Yafang Shao <laoar.shao@gmail.com>:
>> >> > Hi Michal,
>> >> >
>> >> > What about bellow change ?
>> >> > It makes the function  domain_dirty_limits() more clear.
>> >> > And the result will have a higher precision.
>> >> >
>> >> >
>> >> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> >> > index 8a15511..2b5e507 100644
>> >> > --- a/mm/page-writeback.c
>> >> > +++ b/mm/page-writeback.c
>> >> > @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
>> >> > dirty_throttle_control *dtc)
>> >> >     unsigned long bytes = vm_dirty_bytes;
>> >> >     unsigned long bg_bytes = dirty_background_bytes;
>> >> >     /* convert ratios to per-PAGE_SIZE for higher precision */
>> >> > -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
>> >> > -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
>> >> > +   unsigned long ratio = vm_dirty_ratio;
>> >> > +   unsigned long bg_ratio = dirty_background_ratio;
>> >> >     unsigned long thresh;
>> >> >     unsigned long bg_thresh;
>> >> >     struct task_struct *tsk;
>> >> > @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
>> >> > dirty_throttle_control *dtc)
>> >> >          */
>> >> >         if (bytes)
>> >> >             ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> >> > -                   PAGE_SIZE);
>> >> > +                   100);
>> >> >         if (bg_bytes)
>> >> >             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> >> > -                      PAGE_SIZE);
>> >> > +                      99);   /* bg_ratio should less than ratio */
>> >> >         bytes = bg_bytes = 0;
>> >> >     }
>> >>
>> >>
>> >> Errata:
>> >>
>> >>         if (bytes)
>> >> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> >> -                   PAGE_SIZE);
>> >> +           ratio = min(DIV_ROUND_UP(bytes / PAGE_SIZE, global_avail),
>> >> +                   100);
>> >>         if (bg_bytes)
>> >> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> >> -                      PAGE_SIZE);
>> >> +           bg_ratio = min(DIV_ROUND_UP(bg_bytes / PAGE_SIZE, global_avail),
>> >> +                      100 - 1); /* bg_ratio should be less than ratio */
>> >>         bytes = bg_bytes = 0;
>> >
>> > And you really think this makes code easier to follow? I am somehow not
>> > conviced...
>> >
>>
>> There's hidden bug in the original code, because it is too complex to
>> clearly understand.
>> See bellow,
>>
>> ratio = min(DIV_ROUND_UP(bytes, global_avail),
>>                     PAGE_SIZE)
>>
>> Suppose the vm_dirty_bytes is set to 512M (this is a reasonable
>> value), and the global_avail is only 10000 pages (this is not low),
>> then DIV_ROUND_UP(bytes, global_avail) is 53688, which is bigger than
>> 4096, so the ratio will be 4096.
>> That's unreasonable.
>
> But that's not a bug in domain_dirty_limits(). It is more a design issue of
> the dirty_bytes interface - i.e., if you tell the system that 512M of dirty
> pages is fine, then it is fine even if you have only 400M of page cache -
> i.e., 100% of page cache can be dirty and that's what the function
> computes.  Bad luck if you don't like that but that's how the interface was
> (mis)designed. We can talk about changes to what dirty_bytes mean under a
> situation when there is low amount of page cache but that will be a
> userspace visible change and we will have to be *very* careful not to break
> current users.
>

Thanks for your suggestion.
I will submit a patch for that.

Thanks
Yafang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28  3:11     ` Yafang Shao
  2017-11-28  6:12       ` Yafang Shao
@ 2017-11-28 10:25       ` Jan Kara
  2017-11-28 10:33         ` Yafang Shao
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Kara @ 2017-11-28 10:25 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Jan Kara, Linux MM

Hi Yafang,

On Tue 28-11-17 11:11:40, Yafang Shao wrote:
> What about bellow change ?
> It makes the function  domain_dirty_limits() more clear.
> And the result will have a higher precision.

Frankly, I don't find this any better and you've just lost the additional
precision of ratios computed in the "if (gdtc)" branch the multiplication by
PAGE_SIZE got us.

								Honza


> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 8a15511..2b5e507 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
> dirty_throttle_control *dtc)
>     unsigned long bytes = vm_dirty_bytes;
>     unsigned long bg_bytes = dirty_background_bytes;
>     /* convert ratios to per-PAGE_SIZE for higher precision */
> -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
> -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
> +   unsigned long ratio = vm_dirty_ratio;
> +   unsigned long bg_ratio = dirty_background_ratio;
>     unsigned long thresh;
>     unsigned long bg_thresh;
>     struct task_struct *tsk;
> @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
> dirty_throttle_control *dtc)
>          */
>         if (bytes)
>             ratio = min(DIV_ROUND_UP(bytes, global_avail),
> -                   PAGE_SIZE);
> +                   100);
>         if (bg_bytes)
>             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> -                      PAGE_SIZE);
> +                      99);   /* bg_ratio should less than ratio */
>         bytes = bg_bytes = 0;
>     }
> 
> +   /* bytes and bg_bytes must be PAGE_SIZE aligned */
>     if (bytes)
> -       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE);
> +       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE) * 100;
>     else
> -       thresh = (ratio * available_memory) / PAGE_SIZE;
> +       thresh = ratio * available_memory;
> 
>     if (bg_bytes)
> -       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
> +       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE) * 100;
>     else
> -       bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
> +       bg_thresh = bg_ratio * available_memory;
> 
>     if (unlikely(bg_thresh >= thresh)) {
>         pr_warn("vm direct limit must be set greater than background limit.\n");
>         bg_thresh = thresh / 2;
>     }
> 
> +   /* ensure bg_thresh and thresh never be 0 */
> +   bg_thresh = DIV_ROUND_UP(bg_thresh, 100);
> +   thresh = DIV_ROUND_UP(thresh, 100);
> +
>     tsk = current;
>     if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
> 
> 2017-11-27 17:19 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > Andrew,
> > could you simply send this to Linus. If we _really_ need something to
> > prevent misconfiguration, which I doubt to be honest, then it should be
> > thought through much better.
> > ---
> > From 4ef6b1cbf98ea5dae155ab3303c4ae1d93411b79 Mon Sep 17 00:00:00 2001
> > From: Michal Hocko <mhocko@suse.com>
> > Date: Mon, 27 Nov 2017 10:12:15 +0100
> > Subject: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm
> >  dirtiness settings are illogical"
> >
> > This reverts commit 0f6d24f878568fac579a1962d0bf7cb9f01e0ceb because
> > it causes false positive warnings during OOM situations as noticed by
> > Tetsuo Handa:
> > [  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB s
> > hmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> > [  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocke
> > d:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > [  621.829035] lowmem_reserve[]: 0 2687 3645 3645
> > [  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:
> > 2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > [  621.839945] lowmem_reserve[]: 0 0 958 958
> > [  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576k
> > B managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
> > [  621.852473] lowmem_reserve[]: 0 0 0 0
> > [...]
> > [  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
> > [  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> > [  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > [  622.424664] vm direct limit must be set greater than background limit.
> >
> > The problem is that both thresh and bg_thresh will be 0 if available_memory
> >  is less than 4 pages when evaluating global_dirtyable_memory. While
> > this might be worked around the whole point of the warning is dubious at
> > best. We do rely on admins to do sensible things when changing tunable
> > knobs. Dirty memory writeback knobs are not any special in that regards
> > so revert the warning rather than adding more hacks to work this around.
> >
> > Rerported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Debugged-by: Yafang Shao <laoar.shao@gmail.com>
> > Fixes: 0f6d24f87856 ("mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical")
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >  Documentation/sysctl/vm.txt | 7 -------
> >  mm/page-writeback.c         | 5 +----
> >  2 files changed, 1 insertion(+), 11 deletions(-)
> >
> > diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> > index b920423f88cb..5025ff9307e6 100644
> > --- a/Documentation/sysctl/vm.txt
> > +++ b/Documentation/sysctl/vm.txt
> > @@ -158,10 +158,6 @@ Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
> >  value lower than this limit will be ignored and the old configuration will be
> >  retained.
> >
> > -Note: the value of dirty_bytes also must be set greater than
> > -dirty_background_bytes or the amount of memory corresponding to
> > -dirty_background_ratio.
> > -
> >  ==============================================================
> >
> >  dirty_expire_centisecs
> > @@ -181,9 +177,6 @@ generating disk writes will itself start writing out dirty data.
> >
> >  The total available memory is not equal to total system memory.
> >
> > -Note: dirty_ratio must be set greater than dirty_background_ratio or
> > -ratio corresponding to dirty_background_bytes.
> > -
> >  ==============================================================
> >
> >  dirty_writeback_centisecs
> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> > index e7095030aa1f..586f31261c83 100644
> > --- a/mm/page-writeback.c
> > +++ b/mm/page-writeback.c
> > @@ -433,11 +433,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
> >         else
> >                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
> >
> > -       if (unlikely(bg_thresh >= thresh)) {
> > -               pr_warn("vm direct limit must be set greater than background limit.\n");
> > +       if (bg_thresh >= thresh)
> >                 bg_thresh = thresh / 2;
> > -       }
> > -
> >         tsk = current;
> >         if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
> >                 bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
> > --
> > 2.15.0
> >
> > --
> > Michal Hocko
> > SUSE Labs
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:25       ` Jan Kara
@ 2017-11-28 10:33         ` Yafang Shao
  2017-11-28 10:41           ` Jan Kara
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-28 10:33 UTC (permalink / raw)
  To: Jan Kara; +Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Linux MM

2017-11-28 18:25 GMT+08:00 Jan Kara <jack@suse.cz>:
> Hi Yafang,
>
> On Tue 28-11-17 11:11:40, Yafang Shao wrote:
>> What about bellow change ?
>> It makes the function  domain_dirty_limits() more clear.
>> And the result will have a higher precision.
>
> Frankly, I don't find this any better and you've just lost the additional
> precision of ratios computed in the "if (gdtc)" branch the multiplication by
> PAGE_SIZE got us.
>

What about bellow change? It won't be lost any more, becasue
bytes and bg_bytes are both PAGE_SIZE aligned.

-       if (bytes)
-           ratio = min(DIV_ROUND_UP(bytes, global_avail),
-                   PAGE_SIZE);
-       if (bg_bytes)
-           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
-                      PAGE_SIZE);
+       if (bytes) {
+           pages = DIV_ROUND_UP(bytes, PAGE_SIZE);
+           ratio = DIV_ROUND_UP(pages * 100, global_avail);
+
+       }
+
+       if (bg_bytes) {
+           pages = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
+           bg_ratio = DIV_ROUND_UP(pages * 100, global_avail);
+       }


>
>
>> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> index 8a15511..2b5e507 100644
>> --- a/mm/page-writeback.c
>> +++ b/mm/page-writeback.c
>> @@ -397,8 +397,8 @@ static void domain_dirty_limits(struct
>> dirty_throttle_control *dtc)
>>     unsigned long bytes = vm_dirty_bytes;
>>     unsigned long bg_bytes = dirty_background_bytes;
>>     /* convert ratios to per-PAGE_SIZE for higher precision */
>> -   unsigned long ratio = (vm_dirty_ratio * PAGE_SIZE) / 100;
>> -   unsigned long bg_ratio = (dirty_background_ratio * PAGE_SIZE) / 100;
>> +   unsigned long ratio = vm_dirty_ratio;
>> +   unsigned long bg_ratio = dirty_background_ratio;
>>     unsigned long thresh;
>>     unsigned long bg_thresh;
>>     struct task_struct *tsk;
>> @@ -416,28 +416,33 @@ static void domain_dirty_limits(struct
>> dirty_throttle_control *dtc)
>>          */
>>         if (bytes)
>>             ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> -                   PAGE_SIZE);
>> +                   100);
>>         if (bg_bytes)
>>             bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> -                      PAGE_SIZE);
>> +                      99);   /* bg_ratio should less than ratio */
>>         bytes = bg_bytes = 0;
>>     }
>>
>> +   /* bytes and bg_bytes must be PAGE_SIZE aligned */
>>     if (bytes)
>> -       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE);
>> +       thresh = DIV_ROUND_UP(bytes, PAGE_SIZE) * 100;
>>     else
>> -       thresh = (ratio * available_memory) / PAGE_SIZE;
>> +       thresh = ratio * available_memory;
>>
>>     if (bg_bytes)
>> -       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
>> +       bg_thresh = DIV_ROUND_UP(bg_bytes, PAGE_SIZE) * 100;
>>     else
>> -       bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>> +       bg_thresh = bg_ratio * available_memory;
>>
>>     if (unlikely(bg_thresh >= thresh)) {
>>         pr_warn("vm direct limit must be set greater than background limit.\n");
>>         bg_thresh = thresh / 2;
>>     }
>>
>> +   /* ensure bg_thresh and thresh never be 0 */
>> +   bg_thresh = DIV_ROUND_UP(bg_thresh, 100);
>> +   thresh = DIV_ROUND_UP(thresh, 100);
>> +
>>     tsk = current;
>>     if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
>>
>> 2017-11-27 17:19 GMT+08:00 Michal Hocko <mhocko@suse.com>:
>> > Andrew,
>> > could you simply send this to Linus. If we _really_ need something to
>> > prevent misconfiguration, which I doubt to be honest, then it should be
>> > thought through much better.
>> > ---
>> > From 4ef6b1cbf98ea5dae155ab3303c4ae1d93411b79 Mon Sep 17 00:00:00 2001
>> > From: Michal Hocko <mhocko@suse.com>
>> > Date: Mon, 27 Nov 2017 10:12:15 +0100
>> > Subject: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm
>> >  dirtiness settings are illogical"
>> >
>> > This reverts commit 0f6d24f878568fac579a1962d0bf7cb9f01e0ceb because
>> > it causes false positive warnings during OOM situations as noticed by
>> > Tetsuo Handa:
>> > [  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB s
>> > hmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
>> > [  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocke
>> > d:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> > [  621.829035] lowmem_reserve[]: 0 2687 3645 3645
>> > [  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:
>> > 2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> > [  621.839945] lowmem_reserve[]: 0 0 958 958
>> > [  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576k
>> > B managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
>> > [  621.852473] lowmem_reserve[]: 0 0 0 0
>> > [...]
>> > [  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
>> > [  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
>> > [  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>> > [  622.424664] vm direct limit must be set greater than background limit.
>> >
>> > The problem is that both thresh and bg_thresh will be 0 if available_memory
>> >  is less than 4 pages when evaluating global_dirtyable_memory. While
>> > this might be worked around the whole point of the warning is dubious at
>> > best. We do rely on admins to do sensible things when changing tunable
>> > knobs. Dirty memory writeback knobs are not any special in that regards
>> > so revert the warning rather than adding more hacks to work this around.
>> >
>> > Rerported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
>> > Debugged-by: Yafang Shao <laoar.shao@gmail.com>
>> > Fixes: 0f6d24f87856 ("mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical")
>> > Signed-off-by: Michal Hocko <mhocko@suse.com>
>> > ---
>> >  Documentation/sysctl/vm.txt | 7 -------
>> >  mm/page-writeback.c         | 5 +----
>> >  2 files changed, 1 insertion(+), 11 deletions(-)
>> >
>> > diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
>> > index b920423f88cb..5025ff9307e6 100644
>> > --- a/Documentation/sysctl/vm.txt
>> > +++ b/Documentation/sysctl/vm.txt
>> > @@ -158,10 +158,6 @@ Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
>> >  value lower than this limit will be ignored and the old configuration will be
>> >  retained.
>> >
>> > -Note: the value of dirty_bytes also must be set greater than
>> > -dirty_background_bytes or the amount of memory corresponding to
>> > -dirty_background_ratio.
>> > -
>> >  ==============================================================
>> >
>> >  dirty_expire_centisecs
>> > @@ -181,9 +177,6 @@ generating disk writes will itself start writing out dirty data.
>> >
>> >  The total available memory is not equal to total system memory.
>> >
>> > -Note: dirty_ratio must be set greater than dirty_background_ratio or
>> > -ratio corresponding to dirty_background_bytes.
>> > -
>> >  ==============================================================
>> >
>> >  dirty_writeback_centisecs
>> > diff --git a/mm/page-writeback.c b/mm/page-writeback.c
>> > index e7095030aa1f..586f31261c83 100644
>> > --- a/mm/page-writeback.c
>> > +++ b/mm/page-writeback.c
>> > @@ -433,11 +433,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>> >         else
>> >                 bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>> >
>> > -       if (unlikely(bg_thresh >= thresh)) {
>> > -               pr_warn("vm direct limit must be set greater than background limit.\n");
>> > +       if (bg_thresh >= thresh)
>> >                 bg_thresh = thresh / 2;
>> > -       }
>> > -
>> >         tsk = current;
>> >         if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
>> >                 bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
>> > --
>> > 2.15.0
>> >
>> > --
>> > Michal Hocko
>> > SUSE Labs
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-27  9:19   ` [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical) Michal Hocko
  2017-11-28  3:11     ` Yafang Shao
@ 2017-11-28 10:37     ` Jan Kara
  2017-11-28 10:48       ` Michal Hocko
  1 sibling, 1 reply; 37+ messages in thread
From: Jan Kara @ 2017-11-28 10:37 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Tetsuo Handa, Yafang Shao, akpm, jack, linux-mm

On Mon 27-11-17 10:19:39, Michal Hocko wrote:
> Andrew,
> could you simply send this to Linus. If we _really_ need something to
> prevent misconfiguration, which I doubt to be honest, then it should be
> thought through much better.

What's so bad about the warning? I think warning about such
misconfiguration is not a bad thing per se. Maybe it should be ratelimited
and certainly the condition is too loose as your example shows but in
principle I'm not against it and e.g. making the inequality in the condition
strict like:

	if (unlikely(bg_thresh > thresh))

or at least

	if (unlikely(bg_thresh >= thresh && thresh > 0))

would warn about cases where domain_dirty_limits() had to fixup bg_thresh
manually to make writeback throttling work and avoid reclaim stalls which
is IMHO a sane thing...

								Honza

> ---
> From 4ef6b1cbf98ea5dae155ab3303c4ae1d93411b79 Mon Sep 17 00:00:00 2001
> From: Michal Hocko <mhocko@suse.com>
> Date: Mon, 27 Nov 2017 10:12:15 +0100
> Subject: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm
>  dirtiness settings are illogical"
> 
> This reverts commit 0f6d24f878568fac579a1962d0bf7cb9f01e0ceb because
> it causes false positive warnings during OOM situations as noticed by
> Tetsuo Handa:
> [  621.814512] Node 0 active_anon:3525940kB inactive_anon:8372kB active_file:216kB inactive_file:1872kB unevictable:0kB isolated(anon):0kB isolated(file):0kB mapped:2504kB dirty:52kB writeback:0kB shmem:8660kB s
> hmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 636928kB writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
> [  621.821534] Node 0 DMA free:14848kB min:284kB low:352kB high:420kB active_anon:992kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocke
> d:0kB kernel_stack:0kB pagetables:24kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.829035] lowmem_reserve[]: 0 2687 3645 3645
> [  621.831655] Node 0 DMA32 free:53004kB min:49608kB low:62008kB high:74408kB active_anon:2712648kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:3129216kB managed:
> 2773132kB mlocked:0kB kernel_stack:96kB pagetables:5096kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [  621.839945] lowmem_reserve[]: 0 0 958 958
> [  621.842811] Node 0 Normal free:17140kB min:17684kB low:22104kB high:26524kB active_anon:812300kB inactive_anon:8372kB active_file:1228kB inactive_file:1868kB unevictable:0kB writepending:52kB present:1048576k
> B managed:981224kB mlocked:0kB kernel_stack:3520kB pagetables:8552kB bounce:0kB free_pcp:120kB local_pcp:120kB free_cma:0kB
> [  621.852473] lowmem_reserve[]: 0 0 0 0
> [...]
> [  621.891477] Out of memory: Kill process 8459 (a.out) score 999 or sacrifice child
> [  621.894363] Killed process 8459 (a.out) total-vm:4180kB, anon-rss:88kB, file-rss:0kB, shmem-rss:0kB
> [  621.897172] oom_reaper: reaped process 8459 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [  622.424664] vm direct limit must be set greater than background limit.
> 
> The problem is that both thresh and bg_thresh will be 0 if available_memory
>  is less than 4 pages when evaluating global_dirtyable_memory. While
> this might be worked around the whole point of the warning is dubious at
> best. We do rely on admins to do sensible things when changing tunable
> knobs. Dirty memory writeback knobs are not any special in that regards
> so revert the warning rather than adding more hacks to work this around.
> 
> Rerported-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Debugged-by: Yafang Shao <laoar.shao@gmail.com>
> Fixes: 0f6d24f87856 ("mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>  Documentation/sysctl/vm.txt | 7 -------
>  mm/page-writeback.c         | 5 +----
>  2 files changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
> index b920423f88cb..5025ff9307e6 100644
> --- a/Documentation/sysctl/vm.txt
> +++ b/Documentation/sysctl/vm.txt
> @@ -158,10 +158,6 @@ Note: the minimum value allowed for dirty_bytes is two pages (in bytes); any
>  value lower than this limit will be ignored and the old configuration will be
>  retained.
>  
> -Note: the value of dirty_bytes also must be set greater than
> -dirty_background_bytes or the amount of memory corresponding to
> -dirty_background_ratio.
> -
>  ==============================================================
>  
>  dirty_expire_centisecs
> @@ -181,9 +177,6 @@ generating disk writes will itself start writing out dirty data.
>  
>  The total available memory is not equal to total system memory.
>  
> -Note: dirty_ratio must be set greater than dirty_background_ratio or
> -ratio corresponding to dirty_background_bytes.
> -
>  ==============================================================
>  
>  dirty_writeback_centisecs
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index e7095030aa1f..586f31261c83 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -433,11 +433,8 @@ static void domain_dirty_limits(struct dirty_throttle_control *dtc)
>  	else
>  		bg_thresh = (bg_ratio * available_memory) / PAGE_SIZE;
>  
> -	if (unlikely(bg_thresh >= thresh)) {
> -		pr_warn("vm direct limit must be set greater than background limit.\n");
> +	if (bg_thresh >= thresh)
>  		bg_thresh = thresh / 2;
> -	}
> -
>  	tsk = current;
>  	if (tsk->flags & PF_LESS_THROTTLE || rt_task(tsk)) {
>  		bg_thresh += bg_thresh / 4 + global_wb_domain.dirty_limit / 32;
> -- 
> 2.15.0
> 
> -- 
> Michal Hocko
> SUSE Labs
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:33         ` Yafang Shao
@ 2017-11-28 10:41           ` Jan Kara
  2017-11-28 10:44             ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Jan Kara @ 2017-11-28 10:41 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Jan Kara, Michal Hocko, Tetsuo Handa, Andrew Morton, Linux MM

On Tue 28-11-17 18:33:02, Yafang Shao wrote:
> 2017-11-28 18:25 GMT+08:00 Jan Kara <jack@suse.cz>:
> > Hi Yafang,
> >
> > On Tue 28-11-17 11:11:40, Yafang Shao wrote:
> >> What about bellow change ?
> >> It makes the function  domain_dirty_limits() more clear.
> >> And the result will have a higher precision.
> >
> > Frankly, I don't find this any better and you've just lost the additional
> > precision of ratios computed in the "if (gdtc)" branch the multiplication by
> > PAGE_SIZE got us.
> >
> 
> What about bellow change? It won't be lost any more, becasue
> bytes and bg_bytes are both PAGE_SIZE aligned.
> 
> -       if (bytes)
> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
> -                   PAGE_SIZE);
> -       if (bg_bytes)
> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
> -                      PAGE_SIZE);
> +       if (bytes) {
> +           pages = DIV_ROUND_UP(bytes, PAGE_SIZE);
> +           ratio = DIV_ROUND_UP(pages * 100, global_avail);
> +
> +       }
> +
> +       if (bg_bytes) {
> +           pages = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
> +           bg_ratio = DIV_ROUND_UP(pages * 100, global_avail);
> +       }

Not better... Look, in the original code the 'ratio' and 'bg_ratio'
variables contain a number between 0 and 1 as fractions of 1/PAGE_SIZE. In
your code you have in these variables fractions of 1/100. That's certainly
less precise no matter how you get to those numbers.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:41           ` Jan Kara
@ 2017-11-28 10:44             ` Yafang Shao
  0 siblings, 0 replies; 37+ messages in thread
From: Yafang Shao @ 2017-11-28 10:44 UTC (permalink / raw)
  To: Jan Kara; +Cc: Michal Hocko, Tetsuo Handa, Andrew Morton, Linux MM

2017-11-28 18:41 GMT+08:00 Jan Kara <jack@suse.cz>:
> On Tue 28-11-17 18:33:02, Yafang Shao wrote:
>> 2017-11-28 18:25 GMT+08:00 Jan Kara <jack@suse.cz>:
>> > Hi Yafang,
>> >
>> > On Tue 28-11-17 11:11:40, Yafang Shao wrote:
>> >> What about bellow change ?
>> >> It makes the function  domain_dirty_limits() more clear.
>> >> And the result will have a higher precision.
>> >
>> > Frankly, I don't find this any better and you've just lost the additional
>> > precision of ratios computed in the "if (gdtc)" branch the multiplication by
>> > PAGE_SIZE got us.
>> >
>>
>> What about bellow change? It won't be lost any more, becasue
>> bytes and bg_bytes are both PAGE_SIZE aligned.
>>
>> -       if (bytes)
>> -           ratio = min(DIV_ROUND_UP(bytes, global_avail),
>> -                   PAGE_SIZE);
>> -       if (bg_bytes)
>> -           bg_ratio = min(DIV_ROUND_UP(bg_bytes, global_avail),
>> -                      PAGE_SIZE);
>> +       if (bytes) {
>> +           pages = DIV_ROUND_UP(bytes, PAGE_SIZE);
>> +           ratio = DIV_ROUND_UP(pages * 100, global_avail);
>> +
>> +       }
>> +
>> +       if (bg_bytes) {
>> +           pages = DIV_ROUND_UP(bg_bytes, PAGE_SIZE);
>> +           bg_ratio = DIV_ROUND_UP(pages * 100, global_avail);
>> +       }
>
> Not better... Look, in the original code the 'ratio' and 'bg_ratio'
> variables contain a number between 0 and 1 as fractions of 1/PAGE_SIZE. In
> your code you have in these variables fractions of 1/100. That's certainly
> less precise no matter how you get to those numbers.
>

Understood.
Thanks :)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:37     ` Jan Kara
@ 2017-11-28 10:48       ` Michal Hocko
  2017-11-28 11:05         ` Yafang Shao
  0 siblings, 1 reply; 37+ messages in thread
From: Michal Hocko @ 2017-11-28 10:48 UTC (permalink / raw)
  To: Jan Kara; +Cc: Tetsuo Handa, Yafang Shao, akpm, linux-mm

On Tue 28-11-17 11:37:23, Jan Kara wrote:
> On Mon 27-11-17 10:19:39, Michal Hocko wrote:
> > Andrew,
> > could you simply send this to Linus. If we _really_ need something to
> > prevent misconfiguration, which I doubt to be honest, then it should be
> > thought through much better.
> 
> What's so bad about the warning? I think warning about such
> misconfiguration is not a bad thing per se. Maybe it should be ratelimited
> and certainly the condition is too loose as your example shows but in
> principle I'm not against it and e.g. making the inequality in the condition
> strict like:
> 
> 	if (unlikely(bg_thresh > thresh))
> 
> or at least
> 
> 	if (unlikely(bg_thresh >= thresh && thresh > 0))
> 
> would warn about cases where domain_dirty_limits() had to fixup bg_thresh
> manually to make writeback throttling work and avoid reclaim stalls which
> is IMHO a sane thing...

If it generates false positives then it is more harmful than useful. And
even if it doesn't, what is the point? Do we check that other related
knobs are configured properly? I do not think so, we simply rely on
admins doing sane things. Otherwise we would have a lot of warnings like
that. They would be pain to maintain and I believe the additional value
is quite dubious.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 10:48       ` Michal Hocko
@ 2017-11-28 11:05         ` Yafang Shao
  2017-11-28 11:54           ` Michal Hocko
  0 siblings, 1 reply; 37+ messages in thread
From: Yafang Shao @ 2017-11-28 11:05 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Jan Kara, Tetsuo Handa, Andrew Morton, Linux MM

2017-11-28 18:48 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> On Tue 28-11-17 11:37:23, Jan Kara wrote:
>> On Mon 27-11-17 10:19:39, Michal Hocko wrote:
>> > Andrew,
>> > could you simply send this to Linus. If we _really_ need something to
>> > prevent misconfiguration, which I doubt to be honest, then it should be
>> > thought through much better.
>>
>> What's so bad about the warning? I think warning about such
>> misconfiguration is not a bad thing per se. Maybe it should be ratelimited
>> and certainly the condition is too loose as your example shows but in
>> principle I'm not against it and e.g. making the inequality in the condition
>> strict like:
>>
>>       if (unlikely(bg_thresh > thresh))
>>
>> or at least
>>
>>       if (unlikely(bg_thresh >= thresh && thresh > 0))
>>
>> would warn about cases where domain_dirty_limits() had to fixup bg_thresh
>> manually to make writeback throttling work and avoid reclaim stalls which
>> is IMHO a sane thing...
>
> If it generates false positives then it is more harmful than useful. And
> even if it doesn't, what is the point? Do we check that other related
> knobs are configured properly? I do not think so, we simply rely on
> admins doing sane things.

Not all admins are good at tuning this.
I don't think every SE knows how to tune vm.dirty_background_bytes and
vm.dirty_background_bytes. Only kernel experts could do that.

At least this warning could help them to learn what happend instead of
knowing nothing.



> Otherwise we would have a lot of warnings like
> that. They would be pain to maintain and I believe the additional value
> is quite dubious.
> --

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical)
  2017-11-28 11:05         ` Yafang Shao
@ 2017-11-28 11:54           ` Michal Hocko
  0 siblings, 0 replies; 37+ messages in thread
From: Michal Hocko @ 2017-11-28 11:54 UTC (permalink / raw)
  To: Yafang Shao; +Cc: Jan Kara, Tetsuo Handa, Andrew Morton, Linux MM

On Tue 28-11-17 19:05:54, Yafang Shao wrote:
> 2017-11-28 18:48 GMT+08:00 Michal Hocko <mhocko@suse.com>:
> > On Tue 28-11-17 11:37:23, Jan Kara wrote:
> >> On Mon 27-11-17 10:19:39, Michal Hocko wrote:
> >> > Andrew,
> >> > could you simply send this to Linus. If we _really_ need something to
> >> > prevent misconfiguration, which I doubt to be honest, then it should be
> >> > thought through much better.
> >>
> >> What's so bad about the warning? I think warning about such
> >> misconfiguration is not a bad thing per se. Maybe it should be ratelimited
> >> and certainly the condition is too loose as your example shows but in
> >> principle I'm not against it and e.g. making the inequality in the condition
> >> strict like:
> >>
> >>       if (unlikely(bg_thresh > thresh))
> >>
> >> or at least
> >>
> >>       if (unlikely(bg_thresh >= thresh && thresh > 0))
> >>
> >> would warn about cases where domain_dirty_limits() had to fixup bg_thresh
> >> manually to make writeback throttling work and avoid reclaim stalls which
> >> is IMHO a sane thing...
> >
> > If it generates false positives then it is more harmful than useful. And
> > even if it doesn't, what is the point? Do we check that other related
> > knobs are configured properly? I do not think so, we simply rely on
> > admins doing sane things.
> 
> Not all admins are good at tuning this.
> I don't think every SE knows how to tune vm.dirty_background_bytes and
> vm.dirty_background_bytes. Only kernel experts could do that.

So are you saying that people tend to configure system randomly without
reading documentation? Seriously, this whole thing has generated much
more discussion than it deserves. I really fail to see why you are
insisting without admiting that the thing is just broken and you do not
have a great idea to fix it without adding even more hacks on top of
what we have currently.

> At least this warning could help them to learn what happend instead of
> knowing nothing.

That is what we have a documentation for. If it needs improvements then
I am all for it.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2017-11-28 11:54 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-28  9:54 [PATCH] mm: print a warning once the vm dirtiness settings is illogical Yafang Shao
2017-11-25 16:05 ` Tetsuo Handa
2017-11-26  2:24   ` Yafang Shao
2017-11-26  2:42     ` Tetsuo Handa
2017-11-26  4:32       ` Yafang Shao
2017-11-26  8:03         ` Tetsuo Handa
2017-11-26  8:27           ` Yafang Shao
2017-11-26  8:46           ` Yafang Shao
2017-11-26 10:38             ` Tetsuo Handa
2017-11-27  8:06               ` Yafang Shao
2017-11-27  8:21                 ` Michal Hocko
2017-11-27  8:29                   ` Yafang Shao
2017-11-27  8:32                     ` Yafang Shao
2017-11-27  8:37                       ` Michal Hocko
2017-11-27  8:49                         ` Yafang Shao
2017-11-27  8:52                           ` Michal Hocko
2017-11-27  8:54                             ` Yafang Shao
2017-11-27  9:04                               ` Michal Hocko
2017-11-27  9:08                                 ` Yafang Shao
2017-11-27  8:34                     ` Michal Hocko
2017-11-27  9:19   ` [PATCH] Revert "mm/page-writeback.c: print a warning if the vm dirtiness settings are illogical" (was: Re: [PATCH] mm: print a warning once the vm dirtiness settings is illogical) Michal Hocko
2017-11-28  3:11     ` Yafang Shao
2017-11-28  6:12       ` Yafang Shao
2017-11-28  7:45         ` Michal Hocko
2017-11-28  7:52           ` Yafang Shao
2017-11-28  9:43             ` Yafang Shao
2017-11-28  9:45             ` Michal Hocko
2017-11-28 10:09             ` Jan Kara
2017-11-28 10:16               ` Yafang Shao
2017-11-28 10:25       ` Jan Kara
2017-11-28 10:33         ` Yafang Shao
2017-11-28 10:41           ` Jan Kara
2017-11-28 10:44             ` Yafang Shao
2017-11-28 10:37     ` Jan Kara
2017-11-28 10:48       ` Michal Hocko
2017-11-28 11:05         ` Yafang Shao
2017-11-28 11:54           ` Michal Hocko

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.