From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: cpuidle and un-eoid interrupts at the local apic
Date: Mon, 12 Aug 2013 15:06:36 +0100
Message-ID: <5208EBEC.9000308@citrix.com>
References: <51A908CA.7050604@citrix.com><51F8CB15.1070608@digithi.de><51F8DD40.2090207@citrix.com><51FC37A9.9090809@digithi.de><51FC418D.8020708@citrix.com><51FFBA8502000078000E9462@nat28.tlf.novell.com><51FFBC08.6070804@citrix.com>
	<52055EC9.8030207@digithi.de><520561E1.8020809@citrix.com>
	<520562C8.8080703@citrix.com> <5207CE0C.1000502@digithi.de>
	<A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com>
	<5208E933.1020609@digithi.de>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="===============4123433626007719309=="
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <5208E933.1020609@digithi.de>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Thimo E <abc@digithi.de>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>, "Dong,
	Eddie" <eddie.dong@intel.com>, Xen-develList <xen-devel@lists.xen.org>, "Nakajima,
	Jun" <jun.nakajima@intel.com>, "Zhang,
	Yang Z" <yang.z.zhang@intel.com>, "Zhang,
	Xiantao" <xiantao.zhang@intel.com>
List-Id: xen-devel@lists.xenproject.org

--===============4123433626007719309==
Content-Type: multipart/alternative;
	boundary="------------030606040408030909030305"

--------------030606040408030909030305
Content-Type: text/plain; charset="windows-1252"
Content-Transfer-Encoding: 7bit

On 12/08/13 14:54, Thimo E wrote:
> Hello Yang,
>
> attached is the next crash dump, which occurred today, only a few
> minutes after I created the logfiles I sent in my previous mail.
> Perhaps together with the logfiles from that mail it gives you a
> better understanding of what is going on.
>
> I've disabled Interrupt remapping now.
>
> > 4.....
> > can you add some debug message in the guest EOI code path (like
> > _irq_guest_eoi()) to track the EOI?
> @Andrew: Is it possible for you to integrate the requested changes
> from Yang into your Xen debugging version?

I already have.  That would be "Marked {foo} ready" debugging in the
PEOI stack section.

~Andrew

>
> Best regards
>   Thimo
>
> Am 12.08.2013 10:49, schrieb Zhang, Yang Z:
>>
>> Hi Thimo,
>>
>> Your previous reports and logs show:
>>
>> 1.       The interrupt that triggers the issue is an MSI.
>>
>> 2.       MSIs are normally treated as edge-triggered interrupts, except
>> when there is no way to mask the device. In this case, your previous
>> log indicates the device is unmaskable (what special device are you
>> using? A modern PCI device should be maskable).
>>
>> 3.       IRQ 29 belongs to dom0, so this does not seem to be an
>> HVM-related issue.
>>
>> 4.       The status of IRQ 29 is 10, which means the guest has already
>> issued the EOI because the IRQ_GUEST_EOI_PENDING bit is clear, so
>> there should be no pending EOI in the EOI stack. If possible, can you
>> add some debug messages in the guest EOI code path (e.g.
>> _irq_guest_eoi()) to track the EOI?
>>
>> 5.       Both logs show that when the issue occurred, most of the
>> other interrupts owned by dom0 were in IRQ_MOVE_PENDING status.
>> Is that a coincidence? Or does it happen only under special conditions
>> such as heavy IRQ migration? Perhaps you can disable IRQ balancing in
>> dom0 and pin the IRQs manually.
>>
>> 6.       I guess interrupt remapping is enabled on your machine.
>> Can you try disabling IR to see whether the issue is still reproducible?
>>
>> Also, please provide the whole Xen log.
>>
>>  
>>
>> Best regards,
>>
>> Yang
>>
>
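
For anyone reading the status words in these dumps, here is a minimal
sketch of decoding one into flag names. The bit positions below are an
assumption made for illustration and are not taken from this thread;
check them against the irq.h of the Xen tree actually in use. Reading
the reported status "10" as hexadecimal 0x10 is likewise an assumption.
decode_irq_status() and guest_eoi_pending() are hypothetical helpers,
not Xen functions.

```c
#include <stddef.h>
#include <stdio.h>

/* ASSUMED bit layout, for illustration only; verify against your Xen tree. */
#define IRQ_INPROGRESS        (1u << 0)
#define IRQ_DISABLED          (1u << 1)
#define IRQ_PENDING           (1u << 2)
#define IRQ_REPLAY            (1u << 3)
#define IRQ_GUEST             (1u << 4)
#define IRQ_MOVE_PENDING      (1u << 5)
#define IRQ_PER_CPU           (1u << 6)
#define IRQ_GUEST_EOI_PENDING (1u << 7)

struct flag_name {
    unsigned int bit;
    const char *name;
};

/* Table is ordered by ascending bit so the output order is stable. */
static const struct flag_name irq_flags[] = {
    { IRQ_INPROGRESS,        "IRQ_INPROGRESS" },
    { IRQ_DISABLED,          "IRQ_DISABLED" },
    { IRQ_PENDING,           "IRQ_PENDING" },
    { IRQ_REPLAY,            "IRQ_REPLAY" },
    { IRQ_GUEST,             "IRQ_GUEST" },
    { IRQ_MOVE_PENDING,      "IRQ_MOVE_PENDING" },
    { IRQ_PER_CPU,           "IRQ_PER_CPU" },
    { IRQ_GUEST_EOI_PENDING, "IRQ_GUEST_EOI_PENDING" },
};

/* Hypothetical helper: write a '|'-separated list of the set flags into buf. */
static char *decode_irq_status(unsigned int status, char *buf, size_t len)
{
    size_t used = 0;

    if (len == 0)
        return buf;
    buf[0] = '\0';

    for (size_t i = 0; i < sizeof(irq_flags) / sizeof(irq_flags[0]); i++) {
        if (!(status & irq_flags[i].bit))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "|" : "", irq_flags[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* buffer too small: stop, output is truncated */
        used += (size_t)n;
    }
    return buf;
}

/* Hypothetical helper: non-zero if the guest still owes an EOI. */
static int guest_eoi_pending(unsigned int status)
{
    return (status & IRQ_GUEST_EOI_PENDING) != 0;
}
```

Under these assumed bit positions, a status of 0x10 has only the guest
bit set and the EOI-pending bit clear, which matches the reasoning in
point 4 of the quoted mail.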


--------------030606040408030909030305
Content-Type: text/html; charset="windows-1252"
Content-Transfer-Encoding: 8bit

<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">On 12/08/13 14:54, Thimo E wrote:<br>
    </div>
    <blockquote cite="mid:5208E933.1020609@digithi.de" type="cite">
      <meta http-equiv="Content-Type" content="text/html;
        charset=windows-1252">
      <div class="moz-cite-prefix">Hello Yang,<br>
        <br>
        attached is the next crash dump, which occurred today, only a few
        minutes after I created the logfiles I sent in my previous
        mail.<br>
        Perhaps together with the logfiles from that mail it gives
        you a better understanding of what is going on.<br>
        <br>
        I've disabled Interrupt remapping now.<br>
        <br>
        <span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
          lang="EN-US">&gt; 4.....<br>
          &gt; can you add some debug message in the guest EOI code
          path (like _irq_guest_eoi()) to track the EOI?</span><br>
        @Andrew: Is it possible for you to integrate the requested
        changes from Yang into your Xen debugging version?<br>
      </div>
    </blockquote>
    <br>
    I already have.  That would be "Marked {foo} ready" debugging in the
    PEOI stack section.<br>
    <br>
    ~Andrew<br>
    <br>
    <blockquote cite="mid:5208E933.1020609@digithi.de" type="cite">
      <div class="moz-cite-prefix"> <br>
        Best regards<br>
          Thimo<br>
        <br>
        Am 12.08.2013 10:49, schrieb Zhang, Yang Z:<br>
      </div>
      <blockquote
cite="mid:A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com"
        type="cite">
        <meta name="Generator" content="Microsoft Word 14 (filtered
          medium)">
        <style><!--
/* Font Definitions */
@font-face
	{font-family:SimSun;
	panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
	{font-family:Calibri;
	panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0cm;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";
	color:black;}
a:link, span.MsoHyperlink
	{mso-style-priority:99;
	color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{mso-style-priority:99;
	color:purple;
	text-decoration:underline;}
pre
	{mso-style-priority:99;
	mso-style-link:"HTML Preformatted Char";
	margin:0cm;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";
	color:black;}
p.MsoListParagraph, li.MsoListParagraph, div.MsoListParagraph
	{mso-style-priority:34;
	margin:0cm;
	margin-bottom:.0001pt;
	text-indent:21.0pt;
	font-size:12.0pt;
	font-family:"Times New Roman","serif";
	color:black;}
span.HTMLPreformattedChar
	{mso-style-name:"HTML Preformatted Char";
	mso-style-priority:99;
	mso-style-link:"HTML Preformatted";
	font-family:"Courier New";
	color:black;}
span.EmailStyle19
	{mso-style-type:personal-reply;
	font-family:"Calibri","sans-serif";
	color:#1F497D;}
.MsoChpDefault
	{mso-style-type:export-only;
	font-size:10.0pt;}
@page WordSection1
	{size:612.0pt 792.0pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
	{page:WordSection1;}
/* List Definitions */
@list l0
	{mso-list-id:1272785873;
	mso-list-type:hybrid;
	mso-list-template-ids:-1415154368 -1729587200 67698713 67698715 67698703 67698713 67698715 67698703 67698713 67698715;}
@list l0:level1
	{mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:18.0pt;
	text-indent:-18.0pt;}
@list l0:level2
	{mso-level-number-format:alpha-lower;
	mso-level-text:"%2\)";
	mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:42.0pt;
	text-indent:-21.0pt;}
@list l0:level3
	{mso-level-number-format:roman-lower;
	mso-level-tab-stop:none;
	mso-level-number-position:right;
	margin-left:63.0pt;
	text-indent:-21.0pt;}
@list l0:level4
	{mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:84.0pt;
	text-indent:-21.0pt;}
@list l0:level5
	{mso-level-number-format:alpha-lower;
	mso-level-text:"%5\)";
	mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:105.0pt;
	text-indent:-21.0pt;}
@list l0:level6
	{mso-level-number-format:roman-lower;
	mso-level-tab-stop:none;
	mso-level-number-position:right;
	margin-left:126.0pt;
	text-indent:-21.0pt;}
@list l0:level7
	{mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:147.0pt;
	text-indent:-21.0pt;}
@list l0:level8
	{mso-level-number-format:alpha-lower;
	mso-level-text:"%8\)";
	mso-level-tab-stop:none;
	mso-level-number-position:left;
	margin-left:168.0pt;
	text-indent:-21.0pt;}
@list l0:level9
	{mso-level-number-format:roman-lower;
	mso-level-tab-stop:none;
	mso-level-number-position:right;
	margin-left:189.0pt;
	text-indent:-21.0pt;}
ol
	{margin-bottom:0cm;}
ul
	{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
        <div class="WordSection1">
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><a
              moz-do-not-send="true" name="_MailEndCompose"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
                lang="EN-US">Hi Thimo,<o:p></o:p></span></a></p>
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">Your previous reports and logs
              show:<o:p></o:p></span></p>
          <p class="MsoListParagraph"
            style="margin-left:18.0pt;text-align:justify;text-justify:inter-ideograph;text-indent:-18.0pt;mso-list:l0


            level1 lfo1">
            <!--[if !supportLists]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><span style="mso-list:Ignore">1.<span
                  style="font:7.0pt &quot;Times New Roman&quot;">      
                </span></span></span><!--[endif]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">The interrupt that triggers the issue is an
              MSI.<o:p></o:p></span></p>
          <p class="MsoListParagraph"
            style="margin-left:18.0pt;text-align:justify;text-justify:inter-ideograph;text-indent:-18.0pt;mso-list:l0


            level1 lfo1">
            <!--[if !supportLists]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><span style="mso-list:Ignore">2.<span
                  style="font:7.0pt &quot;Times New Roman&quot;">      
                </span></span></span><!--[endif]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">MSIs are normally treated as edge-triggered
              interrupts, except when there is no way to mask the device.
              In this case, your previous log indicates the device is
              unmaskable (what special device are you using? A modern PCI
              device should be maskable). <o:p></o:p></span></p>
          <p class="MsoListParagraph"
            style="margin-left:18.0pt;text-align:justify;text-justify:inter-ideograph;text-indent:-18.0pt;mso-list:l0


            level1 lfo1">
            <!--[if !supportLists]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><span style="mso-list:Ignore">3.<span
                  style="font:7.0pt &quot;Times New Roman&quot;">      
                </span></span></span><!--[endif]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">IRQ 29 belongs to dom0, so this does not seem
              to be an HVM-related issue.</span></p>
        </div>
      </blockquote>
      <blockquote
cite="mid:A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com"
        type="cite">
        <div class="WordSection1">
          <p class="MsoListParagraph"
            style="margin-left:18.0pt;text-align:justify;text-justify:inter-ideograph;text-indent:-18.0pt;mso-list:l0


            level1 lfo1"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><span style="mso-list:Ignore">4.<span
                  style="font:7.0pt &quot;Times New Roman&quot;">      
                </span></span></span><!--[endif]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">The status of IRQ 29 is 10, which means the
              guest has already issued the EOI because the
              IRQ_GUEST_EOI_PENDING bit is clear, so there should be no
              pending EOI in the EOI stack. If possible, can you add
              some debug messages in the guest EOI code path (e.g.
              _irq_guest_eoi()) to track the EOI?</span></p>
        </div>
      </blockquote>
      <blockquote
cite="mid:A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com"
        type="cite">
        <div class="WordSection1">
          <p class="MsoListParagraph"
            style="margin-left:18.0pt;text-align:justify;text-justify:inter-ideograph;text-indent:-18.0pt;mso-list:l0


            level1 lfo1"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><span style="mso-list:Ignore">5.<span
                  style="font:7.0pt &quot;Times New Roman&quot;">      
                </span></span></span><!--[endif]--><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">Both logs show that when the issue occurred,
              most of the other interrupts owned by dom0 were in
              IRQ_MOVE_PENDING status. Is that a coincidence? Or does it
              happen only under special conditions such as heavy IRQ
              migration? Perhaps you can disable IRQ balancing in dom0 and
              pin the IRQs manually.</span></p>
        </div>
      </blockquote>
      <span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
        lang="EN-US"><span style="mso-list:Ignore">6.<span
            style="font:7.0pt &quot;Times New Roman&quot;">       </span></span></span><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
        lang="EN-US">I guess interrupt remapping is enabled on your
        machine. Can you try disabling IR to see whether the issue is
        still reproducible?<o:p></o:p></span>
      <blockquote
cite="mid:A9667DDFB95DB7438FA9D7D576C3D87E0A8E11A4@SHSMSX104.ccr.corp.intel.com"
        type="cite">
        <div class="WordSection1">
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">Also, please provide the whole Xen log.<o:p></o:p></span></p>
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US"><o:p> </o:p></span></p>
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">Best regards,<o:p></o:p></span></p>
          <p class="MsoNormal"
            style="text-align:justify;text-justify:inter-ideograph"><span
style="font-size:10.5pt;font-family:&quot;Calibri&quot;,&quot;sans-serif&quot;;color:#1F497D"
              lang="EN-US">Yang</span><br>
          </p>
        </div>
      </blockquote>
      <br>
    </blockquote>
    <br>
  </body>
</html>

--------------030606040408030909030305--


--===============4123433626007719309==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

--===============4123433626007719309==--