On 18.02.21 06:21, Roman Shaposhnik wrote:
> On Wed, Feb 17, 2021 at 12:29 AM Jürgen Groß wrote:
>> On 17.02.21 09:12, Roman Shaposhnik wrote:
>>> Hi Jürgen, thanks for taking a look at this. A few comments below:
>>>
>>> On Tue, Feb 16, 2021 at 10:47 PM Jürgen Groß wrote:
>>>> On 16.02.21 21:34, Stefano Stabellini wrote:
>>>>> + x86 maintainers
>>>>>
>>>>> It looks like the tlbflush is getting stuck?
>>>>
>>>> I have seen this case multiple times on customer systems now, but
>>>> reproducing it reliably seems to be very hard.
>>>
>>> It is reliably reproducible under my workload, but it takes a long
>>> time (~3 days of the workload running in the lab).
>>
>> This is by far the best reproduction rate I have seen up to now.
>>
>> The next best reproducer seems to be a huge installation with several
>> hundred hosts and thousands of VMs with about 1 crash each week.
>>
>>>> I suspected fifo events to be blamed, but just yesterday I've been
>>>> informed of another case with fifo events disabled in the guest.
>>>>
>>>> One common pattern seems to be that up to now I have seen this
>>>> effect only on systems with Intel Gold CPUs. Can it be confirmed to
>>>> be true in this case, too?
>>>
>>> I am pretty sure mine isn't -- I can get you full CPU specs if
>>> that's useful.
>>
>> Just the output of "grep model /proc/cpuinfo" should be enough.
>
> processor: 3
> vendor_id: GenuineIntel
> cpu family: 6
> model: 77
> model name: Intel(R) Atom(TM) CPU C2550 @ 2.40GHz
> stepping: 8
> microcode: 0x12d
> cpu MHz: 1200.070
> cache size: 1024 KB
> physical id: 0
> siblings: 4
> core id: 3
> cpu cores: 4
> apicid: 6
> initial apicid: 6
> fpu: yes
> fpu_exception: yes
> cpuid level: 11
> wp: yes
> flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
> rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology
> nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx
> est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 movbe popcnt
> tsc_deadline_timer aes rdrand lahf_lm 3dnowprefetch cpuid_fault epb
> pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid tsc_adjust
> smep erms dtherm ida arat md_clear
> vmx flags: vnmi preemption_timer invvpid ept_x_only flexpriority
> tsc_offset vtpr mtf vapic ept vpid unrestricted_guest
> bugs: cpu_meltdown spectre_v1 spectre_v2 mds msbds_only
> bogomips: 4800.19
> clflush size: 64
> cache_alignment: 64
> address sizes: 36 bits physical, 48 bits virtual
> power management:
>
>>>> In case anybody has a reproducer (either in a guest or dom0) with a
>>>> setup where a diagnostic kernel can be used, I'd be _very_
>>>> interested!
>>>
>>> I can easily add things to Dom0 and DomU. Whether that will disrupt
>>> the experiment is, of course, another matter. Still, please let me
>>> know what would be helpful to do.
>>
>> Is there a chance to switch to an upstream kernel in the guest? I'd
>> like to add some diagnostic code to the kernel and creating the
>> patches will be easier this way.
>
> That's a bit tough -- the VM is based on stock Ubuntu and if I upgrade
> the kernel I'll have to fiddle with a lot of things to make the
> workload functional again.
>
> However, I can install a debug kernel (from Ubuntu, etc. etc.)
>
> Of course, if patching the kernel is the only way to make progress --
> let's try that -- please let me know.

I have found a nice upstream patch, which - with some modifications -
I plan to give our customer as a workaround.
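To give an idea of the approach (this is only a rough sketch, not the
actual patch -- the function name, the timeout value and the message
text are made up for illustration, and the csd field layout differs
between kernel versions): instead of waiting unconditionally for the
target cpu to unlock the csd, which is where a lost tlbflush IPI would
spin forever, wait with a timeout; on timeout report it on the console
and re-send the IPI.

    /*
     * Illustrative sketch only -- NOT the actual patch.  Meant as a
     * replacement for the unconditional csd lock wait in kernel/smp.c;
     * CSD_WAIT_TIMEOUT_NS and the message are invented for this sketch.
     */
    #include <linux/smp.h>
    #include <linux/sched/clock.h>
    #include <linux/printk.h>

    #define CSD_WAIT_TIMEOUT_NS	(5ULL * NSEC_PER_SEC)	/* arbitrary */

    static void csd_lock_wait_timeout(struct call_single_data *csd, int cpu)
    {
    	u64 start = sched_clock();

    	/* Spin until the target cpu releases the csd lock. */
    	while (smp_load_acquire(&csd->flags) & CSD_FLAG_LOCK) {
    		if (sched_clock() - start > CSD_WAIT_TIMEOUT_NS) {
    			/* Make the "fixed" hang visible on the console. */
    			pr_warn("csd: cpu %d stuck, re-sending IPI\n", cpu);
    			arch_send_call_function_single_ipi(cpu);
    			start = sched_clock();
    		}
    		cpu_relax();
    	}
    }

The console message matters as much as the re-sent IPI: it is how we
can tell afterwards that the hang actually occurred and was worked
around, rather than the bug simply not triggering.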
The patch is for kernel 4.12, but chances are good it will apply to a
4.15 kernel, too. Are you able to give it a try?

I hope it will fix the hangs, but in case it does kick in there should
be a message on the console.


Juergen