xen-devel.lists.xenproject.org archive mirror
From: Steven Haigh <netwiz@crc.id.au>
To: Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	linux-kernel@vger.kernel.org
Cc: "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>
Subject: Re: 4.4: INFO: rcu_sched self-detected stall on CPU
Date: Tue, 29 Mar 2016 19:56:22 +1100
Message-ID: <56FA4336.2030301__15742.7484788515$1459241883$gmane$org@crc.id.au>
In-Reply-To: <56F5A87A.8000903@crc.id.au>


On 26/03/2016 8:07 AM, Steven Haigh wrote:
> On 26/03/2016 3:20 AM, Boris Ostrovsky wrote:
>> On 03/25/2016 12:04 PM, Steven Haigh wrote:
>>> It may not actually be the full logs. Once the system gets really upset,
>>> you can't run anything - as such, grabbing anything from dmesg is not
>>> possible.
>>>
>>> The logs provided above is all that gets spat out to the syslog server.
>>>
>>> I'll try tinkering with a few things to see if I can get more output -
>>> but right now, that's all I've been able to achieve. So far, my only
>>> ideas are to remove the 'quiet' options from the kernel command line -
>>> but I'm not sure how much that would help.
>>>
>>> Suggestions gladly accepted on this front.
>>
>> You probably want to run connected to guest serial console ("
>> serial='pty' " in guest config file and something like 'loglevel=7
>> console=tty0 console=ttyS0,38400n8' on guest kernel commandline). And
>> start the guest with 'xl create -c <cfg>' or connect later with 'xl
>> console <domainID>'.
> 
> Ok thanks, I've booted the DomU with:
> 
> $ cat /proc/cmdline
> root=UUID=63ade949-ee67-4afb-8fe7-ecd96faa15e2 ro enforcemodulesig=1
> selinux=0 fsck.repair=yes loglevel=7 console=tty0 console=ttyS0,38400n8
> 
> I've left a screen session attached to the console (via xl console) and
> I'll see if that turns anything up. As this seems to be rather
> unpredictable when it happens, it may take a day or two to get anything.
> I just hope its more than the syslog output :)

Interestingly enough, this just happened again - but on a different
virtual machine. I'm starting to wonder whether it has something to do
with the uptime of the machine, as the system it happens to is always a
different one.

Destroying it and monitoring it again has so far turned up nothing.

I've thrown the latest lot of kernel messages here:
    http://paste.fedoraproject.org/346802/59241532

Interestingly, around the same time, /var/log/messages on the remote
syslog server shows:
Mar 29 17:00:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:00:01 zeus systemd: Starting user-0.slice.
Mar 29 17:00:01 zeus systemd: Started Session 1567 of user root.
Mar 29 17:00:01 zeus systemd: Starting Session 1567 of user root.
Mar 29 17:00:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:00:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:01:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:01:01 zeus systemd: Starting user-0.slice.
Mar 29 17:01:01 zeus systemd: Started Session 1568 of user root.
Mar 29 17:01:01 zeus systemd: Starting Session 1568 of user root.
Mar 29 17:08:34 zeus ntpdate[18569]: adjust time server 203.56.246.94 offset -0.002247 sec
Mar 29 17:08:34 zeus systemd: Removed slice user-0.slice.
Mar 29 17:08:34 zeus systemd: Stopping user-0.slice.
Mar 29 17:10:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:10:01 zeus systemd: Starting user-0.slice.
Mar 29 17:10:01 zeus systemd: Started Session 1569 of user root.
Mar 29 17:10:01 zeus systemd: Starting Session 1569 of user root.
Mar 29 17:10:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:10:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:20:01 zeus systemd: Created slice user-0.slice.
Mar 29 17:20:01 zeus systemd: Starting user-0.slice.
Mar 29 17:20:01 zeus systemd: Started Session 1570 of user root.
Mar 29 17:20:01 zeus systemd: Starting Session 1570 of user root.
Mar 29 17:20:01 zeus systemd: Removed slice user-0.slice.
Mar 29 17:20:01 zeus systemd: Stopping user-0.slice.
Mar 29 17:30:55 zeus systemd: systemd-logind.service watchdog timeout (limit 1min)!
Mar 29 17:32:25 zeus systemd: systemd-logind.service stop-sigabrt timed out. Terminating.
Mar 29 17:33:56 zeus systemd: systemd-logind.service stop-sigterm timed out. Killing.
Mar 29 17:35:26 zeus systemd: systemd-logind.service still around after SIGKILL. Ignoring.
Mar 29 17:36:56 zeus systemd: systemd-logind.service stop-final-sigterm timed out. Killing.
Mar 29 17:38:26 zeus systemd: systemd-logind.service still around after final SIGKILL. Entering failed mode.
Mar 29 17:38:26 zeus systemd: Unit systemd-logind.service entered failed state.
Mar 29 17:38:26 zeus systemd: systemd-logind.service failed.
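
The spacing between the systemd-logind escalation steps in the excerpt
above can be checked with a short script. This is only a minimal sketch:
the log text is copied from the excerpt with wrapped lines rejoined, the
year 2016 is an assumption taken from the mail date (syslog timestamps
omit it), and the helper names parse_timestamp and step_gaps are
hypothetical.

```python
from datetime import datetime

# systemd-logind lines from the syslog excerpt, wrapped lines rejoined.
LOG = """\
Mar 29 17:30:55 zeus systemd: systemd-logind.service watchdog timeout (limit 1min)!
Mar 29 17:32:25 zeus systemd: systemd-logind.service stop-sigabrt timed out. Terminating.
Mar 29 17:33:56 zeus systemd: systemd-logind.service stop-sigterm timed out. Killing.
Mar 29 17:35:26 zeus systemd: systemd-logind.service still around after SIGKILL. Ignoring.
Mar 29 17:36:56 zeus systemd: systemd-logind.service stop-final-sigterm timed out. Killing.
Mar 29 17:38:26 zeus systemd: systemd-logind.service still around after final SIGKILL. Entering failed mode.
"""


def parse_timestamp(line, year=2016):
    """Parse the leading 'Mon DD HH:MM:SS' syslog stamp (year is assumed)."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def step_gaps(text):
    """Return the seconds elapsed between each pair of consecutive lines."""
    stamps = [parse_timestamp(line) for line in text.splitlines() if line.strip()]
    return [int((b - a).total_seconds()) for a, b in zip(stamps, stamps[1:])]


print(step_gaps(LOG))  # [90, 91, 90, 90, 90]
```

Each step fires roughly 90 seconds after the previous one, which would
be consistent with a 90-second stop timeout in systemd, although the
unit's configured timeouts are not shown in this thread.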

-- 
Steven Haigh

Email: netwiz@crc.id.au
Web: https://www.crc.id.au
Phone: (03) 9001 6090 - 0412 935 897



Thread overview: 14+ messages
2016-03-25  2:53 4.4: INFO: rcu_sched self-detected stall on CPU Steven Haigh
2016-03-25 12:23 ` Boris Ostrovsky
     [not found] ` <56F52DBF.5080006@oracle.com>
2016-03-25 14:05   ` Steven Haigh
     [not found]   ` <56F545B1.8080609@crc.id.au>
2016-03-25 14:44     ` Boris Ostrovsky
     [not found]     ` <56F54EE0.6030004@oracle.com>
2016-03-25 16:04       ` Steven Haigh
     [not found]       ` <56F56172.9020805@crc.id.au>
2016-03-25 16:20         ` Boris Ostrovsky
     [not found]         ` <56F5653B.1090700@oracle.com>
2016-03-25 21:07           ` Steven Haigh
     [not found]           ` <56F5A87A.8000903@crc.id.au>
2016-03-29  8:56             ` Steven Haigh [this message]
     [not found]             ` <56FA4336.2030301@crc.id.au>
2016-03-29 14:14               ` Boris Ostrovsky
     [not found]               ` <56FA8DDD.7070406@oracle.com>
2016-03-29 17:44                 ` Steven Haigh
     [not found]                 ` <56FABF17.7090608@crc.id.au>
2016-03-29 18:04                   ` Steven Haigh
     [not found]                   ` <56FAC3AC.9050802@crc.id.au>
2016-03-30 13:44                     ` Boris Ostrovsky
2016-05-02 20:54                     ` gregkh
2016-04-02  1:50                 ` Steven Haigh
