* All process has been hanged after a kernel WARNING in kernel 4.4.x
From: Feng Feng24 Liu @ 2017-08-23 12:40 UTC
  To: linux-kernel, linux-rt-users, mhocko, kirill.shutemov, gregkh, rostedt
  Cc: Tong Tong3 Li

Dear experts
	I installed kernel 4.4.70-rt83 in my environment, and I run QEMU-KVM and OVS-DPDK on my server.
	After a kernel warning, I found that all processes, such as sshd, stopped responding. The monitor displays nothing. All processes appear to be hung, but the server still responds to ping.
	The kernel warning log follows:
    >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
<3>Aug 18 11:40:36 node-15 kernel: [222633.430875] kvm [2042203]: vcpu0 unhandled rdmsr: 0x606
<3>Aug 18 11:40:36 node-15 kernel: [222633.494780] kvm [2042203]: vcpu0 unhandled rdmsr: 0x34
<3>Aug 18 11:41:22 node-15 kernel: [222679.084867] kvm [2042166]: vcpu0 unhandled rdmsr: 0x606
<3>Aug 18 11:41:22 node-15 kernel: [222679.148727] kvm [2042166]: vcpu0 unhandled rdmsr: 0x34
<4>Aug 22 13:44:21 node-15 kernel: [575621.666498] ------------[ cut here ]------------
<4>Aug 22 13:44:21 node-15 kernel: [575621.666518] WARNING: CPU: 34 PID: 1419064 at mm/page_counter.c:26 page_counter_cancel+0x34/0x40()
<4>Aug 22 13:44:21 node-15 kernel: [575621.666521] Modules linked in: xt_set ip_set_hash_net ip_set xt_mac xt_physdev ip6table_raw ip6table_mangle iptable_nat nf_nat_ipv4 nf_nat xt_connmark iptable_mangle 8021q garp mrp ebtable_filter ebtables ip6table_filter ip6_tables vhost_net vhost macvtap macvlan xt_tcpudp xt_conntrack iptable_raw xt_CT xt_comment iptable_filter xt_multiport igb_uio(O) uio openvswitch intel_rapl iosf_mbi intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw ablk_helper cryptd input_leds led_class joydev mei_me mei lpc_ich sb_edac mfd_core edac_core shpchp ipmi_devintf ipmi_si ipmi_msghandler tpm_tis acpi_pad nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables x_tables raid1 mpt3sas raid_class scsi_transport_sas
<4>Aug 22 13:44:21 node-15 kernel: [575621.666579] CPU: 34 PID: 1419064 Comm: ruby-mri Tainted: G           O    4.4.70-thinkcloud-nfv #1
<4>Aug 22 13:44:21 node-15 kernel: [575621.666581] Hardware name: ZTE R5300 G3/SGLMA, BIOS UBF09.01.09_SVN65700 12/14/2016
<4>Aug 22 13:44:21 node-15 kernel: [575621.666585]  0000000000000000 ffff8801341f3b90 ffffffff814093de 0000000000000000
<4>Aug 22 13:44:21 node-15 kernel: [575621.666587]  ffffffff81caec1c ffff8801341f3bc8 ffffffff810615d6 ffff8801897acce0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666589]  000000000000000a ffff8801897acc00 ffff883fc6fcb8e0 ffff883fc6fcb800
<4>Aug 22 13:44:21 node-15 kernel: [575621.666590] Call Trace:
<4>Aug 22 13:44:21 node-15 kernel: [575621.666601]  [<ffffffff814093de>] dump_stack+0x65/0x87
<4>Aug 22 13:44:21 node-15 kernel: [575621.666609]  [<ffffffff810615d6>] warn_slowpath_common+0x86/0xe0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666612]  [<ffffffff810616ea>] warn_slowpath_null+0x1a/0x30
<4>Aug 22 13:44:21 node-15 kernel: [575621.666616]  [<ffffffff811a15c4>] page_counter_cancel+0x34/0x40
<4>Aug 22 13:44:21 node-15 kernel: [575621.666619]  [<ffffffff811a16c2>] page_counter_uncharge+0x22/0x30
<4>Aug 22 13:44:21 node-15 kernel: [575621.666622]  [<ffffffff811a35db>] drain_stock.isra.39+0x3b/0xe0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666624]  [<ffffffff811a3bea>] try_charge+0x3ca/0x720
<4>Aug 22 13:44:21 node-15 kernel: [575621.666629]  [<ffffffff81085687>] ? preempt_count_add+0x47/0xc0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666634]  [<ffffffff811a7ba3>] mem_cgroup_try_charge+0x63/0x100
<4>Aug 22 13:44:21 node-15 kernel: [575621.666640]  [<ffffffff8117477b>] wp_page_copy.isra.63+0x14b/0x500
<4>Aug 22 13:44:21 node-15 kernel: [575621.666643]  [<ffffffff811760fe>] do_wp_page+0x8e/0x450
<4>Aug 22 13:44:21 node-15 kernel: [575621.666647]  [<ffffffff8117814b>] handle_mm_fault+0xd7b/0x1380
<4>Aug 22 13:44:21 node-15 kernel: [575621.666656]  [<ffffffff81a98c2a>] ? _raw_spin_lock_irqsave+0x2a/0x50
<4>Aug 22 13:44:21 node-15 kernel: [575621.666661]  [<ffffffff810a2d88>] ? __try_to_take_rt_mutex+0x108/0x160
<4>Aug 22 13:44:21 node-15 kernel: [575621.666664]  [<ffffffff81a98c70>] ? _raw_spin_unlock_irqrestore+0x20/0x60
<4>Aug 22 13:44:21 node-15 kernel: [575621.666667]  [<ffffffff81a975e0>] ? rt_mutex_trylock+0x80/0xc0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666673]  [<ffffffff8104efaf>] __do_page_fault+0x16f/0x4d0
<4>Aug 22 13:44:21 node-15 kernel: [575621.666676]  [<ffffffff8104f342>] do_page_fault+0x32/0x90
<4>Aug 22 13:44:21 node-15 kernel: [575621.666681]  [<ffffffff811463cd>] ? context_tracking_exit+0x1d/0x30
<4>Aug 22 13:44:21 node-15 kernel: [575621.666685]  [<ffffffff81a9b298>] page_fault+0x28/0x30
<4>Aug 22 13:44:21 node-15 kernel: [575621.666688] ---[ end trace 0000000000000002 ]---
<7>Aug 22 13:52:14 node-15 kernel: [576094.285955] kvm: zapping shadow pages for mmio generation wraparound
<7>Aug 22 13:52:14 node-15 kernel: [576094.362130] kvm: zapping shadow pages for mmio generation wraparound
<3>Aug 22 13:52:21 node-15 kernel: [576101.551233] kvm [1424015]: vcpu3 unhandled rdmsr: 0x606
	<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
	
	I found a discussion at:
	https://lkml.org/lkml/2015/12/3/460
	Is it the same problem as above? Is it a known issue that has not been fixed in kernel 4.4.x?


Thanks
Feng

* Re: All process has been hanged after a kernel WARNING in kernel 4.4.x
From: Michal Hocko @ 2017-08-23 12:57 UTC
  To: Feng Feng24 Liu
  Cc: linux-kernel, linux-rt-users, kirill.shutemov, gregkh, rostedt,
	Tong Tong3 Li

On Wed 23-08-17 12:40:36, Feng Feng24 Liu wrote:
> Dear experts
> 	I installed kernel 4.4.70-rt83 in my environment, and I run QEMU-KVM and OVS-DPDK on my server.

Is this reproducible? If so, could you try without the RT patches applied,
to see whether this applies to the vanilla kernel as well?

> 	After a kernel warning, I found that all processes, such as sshd, stopped responding. The monitor displays nothing. All processes appear to be hung, but the server still responds to ping.

The warning says that the counter has underflowed, and that can have a
variety of side effects.
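
To make "underflowed" concrete: in the trace above, drain_stock() calls
page_counter_uncharge(), which ends in page_counter_cancel(), where the
check at mm/page_counter.c:26 fires. What follows is a simplified
userspace sketch, not the kernel code. The struct layout and the names
counter_charge() and counter_cancel() are stand-ins invented for this
example, and it assumes the kernel check amounts to "the count went
negative after the subtraction".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a page counter (heavily simplified). */
struct page_counter {
	atomic_long count;	/* pages currently charged */
};

/* Charge nr_pages to the counter. */
static void counter_charge(struct page_counter *c, long nr_pages)
{
	assert(nr_pages >= 0);
	atomic_fetch_add(&c->count, nr_pages);
}

/*
 * Uncharge nr_pages. Returns true if the count went negative, i.e. more
 * pages were uncharged than had been charged. This sketch reports the
 * condition through the return value and a message on stderr; the kernel
 * emits a WARNING with a stack trace instead.
 */
static bool counter_cancel(struct page_counter *c, long nr_pages)
{
	long new;

	assert(nr_pages >= 0);
	/* atomic_fetch_sub() returns the old value, so compute the new one. */
	new = atomic_fetch_sub(&c->count, nr_pages) - nr_pages;
	if (new < 0) {
		fprintf(stderr, "WARNING: counter underflow (count=%ld)\n", new);
		return true;
	}
	return false;
}
```

With this sketch, charging 32 pages and then cancelling 32 leaves the
count at zero. Cancelling one more page drives it to -1 and takes the
warning path. The accounting is already inconsistent at that point, so
the warning is a symptom of an earlier unbalanced charge or uncharge
rather than the cause itself.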
[...]
> 	I found a discussion at:
> 	https://lkml.org/lkml/2015/12/3/460

From a quick glance, this doesn't seem related.

> 	Is it the same problem as above? Is it a known issue that has not been fixed in kernel 4.4.x?

I haven't seen any such reports.
-- 
Michal Hocko
SUSE Labs
