> It might be also interesting to do in the problematic state, instead of
> dropping caches:
>
> - save snapshot of /proc/vmstat and /proc/pagetypeinfo
> - echo 1 > /proc/sys/vm/compact_memory
> - save new snapshot of /proc/vmstat and /proc/pagetypeinfo

A worst case was in progress just now: about 100MB/10GB was in use, with
super-low performance. I could not see any improvement after the echo 1.
I watched this for about 3 minutes and the cache usage did not change.

pagetypeinfo before echo
https://pastebin.com/MjSgiMRL

pagetypeinfo 3min after echo
https://pastebin.com/uWM6xGDd

vmstat before echo
https://pastebin.com/TjYSKNdE

vmstat 3min after echo
https://pastebin.com/MqTibEKi

> Btw. vast majority of order-3 requests come from the network layer. Are
> you using a large MTU (jumbo packets)?

Not that I know of. How would I figure that out? I have not touched
sysctl net.* besides a few values that are not related to MTU, afaik.

> Btw. I was probably not specific enough. This data should be collected
> _during_ the time when the page cache is disappearing. I suspect you
> have started collecting after the fact.

Meh, I just messed up that output with the latest drop_caches, but I am
pretty sure that the one you see was taken while the usage was around
300MB/10GB, before the drop_caches.

I was thinking it might really help if one of you guys connects to the
hosts while they are in that state, so that you can see for yourself.
Due to privacy issues (GDPR and such) I'd like to monitor this, so the
ssh login would have to go over something like TeamViewer on my host or
whatever. Please let me know if anyone is willing, since nothing I have
tried over the past 3 months has helped. Surely any diagnosis would be
easier this way.

Thanks for the efforts.
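
In case it helps when reading the before/after vmstat snapshots requested at the top of this mail, here is a minimal sketch that prints only the counters that changed between two saved copies of /proc/vmstat. It assumes each snapshot is a plain copy of that file ("name value" per line). The script name and file names in the usage line are hypothetical, and nothing here is specific to this particular problem.

```python
import sys


def parse_vmstat(text):
    """Parse the 'name value' lines of a /proc/vmstat snapshot into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Skip anything that is not exactly "<name> <integer>".
        if len(parts) == 2 and parts[1].lstrip("-").isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def diff_vmstat(before, after):
    """Return {name: after - before} for counters present in both that changed."""
    return {
        name: after[name] - before[name]
        for name in sorted(before.keys() & after.keys())
        if after[name] != before[name]
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python3 vmstat_diff.py vmstat.before vmstat.after
    with open(sys.argv[1]) as f:
        before = parse_vmstat(f.read())
    with open(sys.argv[2]) as f:
        after = parse_vmstat(f.read())
    for name, delta in diff_vmstat(before, after).items():
        print(f"{name:40s} {delta:+d}")
```

Counters that stay flat across the echo to compact_memory are simply omitted, which makes the two snapshots quicker to compare by eye.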