* Dom0 ACPI S3 patches
@ 2011-09-14  8:11 Adi Kriegisch
  2011-09-14  8:41 ` Jan Beulich
  2011-09-14 10:21 ` Konrad Rzeszutek Wilk
  0 siblings, 2 replies; 11+ messages in thread
From: Adi Kriegisch @ 2011-09-14  8:11 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel

Dear Konrad,

just to let you know: I am using your patches[1] on my notebook (Thinkpad
T61p) and they are working perfectly fine for me. I encountered three issues
which I could solve:
* Machine crashes some time after wakeup with "BUG: unable to handle kernel
  NULL pointer dereference at (null)". The crashing process was sshd as I
  am forwarding my window manager from a DomU to X with nouveau running on
  Dom0 with sdm.
  I fixed that by setting all interrupts in the BIOS to "auto-select"
  instead of the fixed default of "IRQ11". Since then I had no more crashes.
* The DomUs do not resync their clock after Dom0 wakes up. They
  basically continue to count the time as if the sleep never happened.
  I have to run 'ntpdate' on resume on all the DomUs. I am not sure if
  there are any side effects of this; perhaps there is a simpler way
  to tell a DomU to reread the clock from Dom0?
* vbetool hangs at 100% CPU on resume (i/o waiting, I guess, because
  neither strace nor ltrace shows any activity). Simply killing vbetool
  (no -9) kind of "fixes" the issue. Probably I do not even need to run
  vbetool on resume.

Anyway, thank you very much for your efforts in bringing decent Dom0
support to the upstream kernel! Your patches applied cleanly to the
Debian/testing package linux-image-3.0.0-1-amd64 (3.0.0-3) and work just
fine!

best regards,
    Adi Kriegisch

[1] http://lists.xensource.com/archives/html/xen-devel/2011-08/msg01358.html
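
The 'ntpdate' workaround from the second item above can be sketched as
follows. This is only a minimal illustration, not something from the
patches: the function names, the one-second tolerance and the default
server are assumptions, and how a DomU obtains the reference timestamp is
deliberately left open.

```python
import subprocess


def clock_offset(local_ts: float, reference_ts: float) -> float:
    """Seconds by which the local clock lags (+) or leads (-) the reference."""
    return reference_ts - local_ts


def needs_resync(local_ts: float, reference_ts: float,
                 tolerance: float = 1.0) -> bool:
    """True when the absolute offset exceeds the (assumed) tolerance."""
    return abs(clock_offset(local_ts, reference_ts)) > tolerance


def resync(server: str = "pool.ntp.org") -> int:
    """Step the clock once with ntpdate (needs root); returns its exit status.

    The server name is an assumption; use whatever the DomU normally syncs to.
    """
    return subprocess.run(["ntpdate", server]).returncode
```

A DomU that kept counting through an eight-hour Dom0 sleep would show an
offset of roughly 8 * 3600 seconds and would therefore be stepped on
resume.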


* Re: Dom0 ACPI S3 patches
  2011-09-14  8:11 Dom0 ACPI S3 patches Adi Kriegisch
@ 2011-09-14  8:41 ` Jan Beulich
  2011-09-14 17:44   ` Jeremy Fitzhardinge
  2011-09-14 10:21 ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 11+ messages in thread
From: Jan Beulich @ 2011-09-14  8:41 UTC (permalink / raw)
  To: Adi Kriegisch, Konrad Rzeszutek Wilk; +Cc: xen-devel

>>> On 14.09.11 at 10:11, Adi Kriegisch <adi@cg.tuwien.ac.at> wrote:
> * The DomUs do not resync their clock after Dom0 wakes up. They
>   basically continue to count the time as if the sleep never happened.
>   I have to run 'ntpdate' on resume on all the DomUs. I am not sure if
>   there are any side effects of this; perhaps there is a simpler way
>   to tell a DomU to reread the clock from Dom0?

This is a more fundamental problem - upstream pv-ops doesn't make
use of XENFP_settime (or its bogus alias DOM_SETTIME) at all; only
Jeremy's 2.6.32.x tree has this so far.

Jan


* Re: Dom0 ACPI S3 patches
  2011-09-14  8:11 Dom0 ACPI S3 patches Adi Kriegisch
  2011-09-14  8:41 ` Jan Beulich
@ 2011-09-14 10:21 ` Konrad Rzeszutek Wilk
  2011-09-14 13:17   ` Adi Kriegisch
       [not found]   ` <20110914112718.GG3079@vrvis.at>
  1 sibling, 2 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-14 10:21 UTC (permalink / raw)
  To: Adi Kriegisch; +Cc: xen-devel

On Wed, Sep 14, 2011 at 10:11:56AM +0200, Adi Kriegisch wrote:
> Dear Konrad,
> 
> just to let you know: I am using your patches[1] on my notebook (Thinkpad

Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?

> T61p) and they are working perfectly fine for me. I encountered three issues

Wait, T61p.. Can you actually do 64-bit on that laptop? Or are you using
a 32-bit hypervisor?

> which I could solve:
> * Machine crashes some time after wakeup with "BUG: unable to handle kernel
>   NULL pointer dereference at (null)". The crashing process was sshd as I
>   am forwarding my window manager from a DomU to X with nouveau running on
>   Dom0 with sdm.
>   I fixed that by setting all interrupts in the BIOS to "auto-select"
>   instead of the fixed default of "IRQ11". Since then I had no more crashes.

Ok, any other data? Stack trace?

> * The DomUs do not resync their clock after Dom0 wakes up. They
>   basically continue to count the time as if the sleep never happened.
>   I have to run 'ntpdate' on resume on all the DomUs. I am not sure if
>   there are any side effects of this; perhaps there is a simpler way
>   to tell a DomU to reread the clock from Dom0?

You know, I don't know. I just never thought about that - um. I wonder
if it is related to the RTC update patch that I've been meaning
to take a look at:

http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html

> * vbetool hangs at 100% CPU on resume (i/o waiting, I guess, because
>   neither strace nor ltrace shows any activity). Simply killing vbetool
>   (no -9) kind of "fixes" the issue. Probably I do not even need to run
>   vbetool on resume.

Why do you run it? Anyhow, there is a patch for vbetool to work
correctly with Nvidia drivers... somewhere. Ah, here:

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 1256454..3d91e46 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -316,9 +316,14 @@ static int mmap_mem(struct file *file, struct vm_area_struct *vma)
                        &vma->vm_page_prot))
        return -EINVAL;
 
-   vma->vm_page_prot = phys_mem_access_prot(file, vma->vm_pgoff,
-                        size,
-                        vma->vm_page_prot);
+   vma->vm_flags |= VM_RESERVED | VM_IO | VM_PFNMAP | VM_DONTEXPAND;
+   vma->vm_page_prot =  __pgprot(
+           pgprot_val(vm_get_page_prot(vma->vm_flags)) |
+           _PAGE_IOMAP |
+           pgprot_val(phys_mem_access_prot(file,
+               vma->vm_pgoff,
+               size,
+               vma->vm_page_prot)));
 
    vma->vm_ops = &mmap_mem_ops;
 


> 
> Anyway, thank you very much for your efforts in bringing decent Dom0
> support to the upstream kernel! Your patches applied cleanly to the
> Debian/testing package linux-image-3.0.0-1-amd64 (3.0.0-3) and work just
> fine!

Woot!

> 
> best regards,
>     Adi Kriegisch
> 
> [1] http://lists.xensource.com/archives/html/xen-devel/2011-08/msg01358.html


* Re: Dom0 ACPI S3 patches
  2011-09-14 10:21 ` Konrad Rzeszutek Wilk
@ 2011-09-14 13:17   ` Adi Kriegisch
       [not found]   ` <20110914112718.GG3079@vrvis.at>
  1 sibling, 0 replies; 11+ messages in thread
From: Adi Kriegisch @ 2011-09-14 13:17 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Adi Kriegisch

[-- Attachment #1: Type: text/plain, Size: 1620 bytes --]

Dear Konrad,

First off, I am really sorry for first posting that everything is just fine
and then -- while responding to your mail -- getting the same crash again. :-(

> > which I could solve:
> > * Machine crashes some time after wakeup with "BUG: unable to handle kernel
> >   NULL pointer dereference at (null)". The crashing process was sshd as I
> >   am forwarding my window manager from a DomU to X with nouveau running on
> >   Dom0 with sdm.
> >   I fixed that by setting all interrupts in the BIOS to "auto-select"
> >   instead of the fixed default of "IRQ11". Since then I had no more crashes.
> 
> Ok, any other data? Stack trace?
Find that stuff attached: the archive contains the kernel trace (I took a
photo, typed the stuff and checked twice... In case you want to have the
photo, just tell me) and two relevant parts of the syslog (9.6G
uncompressed):
The first part is the last two messages of the suspend process from yesterday
and the wakeup messages from this morning.
The other part is me plugging in my phone (which I used to take pictures of
the kernel trace), mounting it as a USB mass storage device and finally
copying the images from Dom0 to the DomU that runs my desktop.
Right after copying finished I was looking at the images for some minutes.

I noticed these WARNINGs on all other crashes too. From the first
appearance of these warnings it took 2 to 10 minutes to crash. Apps that
triggered the warnings were: apt-get, bash, cp, dpkg, gzip, swapd, evtchn,
xenfs, xm_wm, Xorg, vi, "rs:main", "kworker/u:30" and several more...

I hope this helps. I'd be more than happy to help out with testing!

-- Adi

[-- Attachment #2: 2011-09-14_-_Dom0_Crash_Backtraces.tar.gz --]
[-- Type: application/octet-stream, Size: 144140 bytes --]



* Re: Dom0 ACPI S3 patches
       [not found]   ` <20110914112718.GG3079@vrvis.at>
@ 2011-09-14 14:28     ` Konrad Rzeszutek Wilk
  2011-09-14 15:09       ` Adi Kriegisch
  0 siblings, 1 reply; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-14 14:28 UTC (permalink / raw)
  To: Adi Kriegisch; +Cc: xen-devel, Adi Kriegisch

> > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
> Sure, go ahead! ;-)
> Update: No, the system just crashed while writing this mail after about 4 days
> of uptime with many suspend-resume cycles in between... *sigh* :-(

Hmmm.. I wonder if you are hitting the writecombine issue I've seen sometimes.
Just to eliminate it, can you try 'nopat' on the Linux command line?
..
> > Ok, any other data? Stack trace?
> Yes. I will send them in a second mail... I hope I can find all relevant
> information.

<nods>
> > You know, I don't know. I just never thought about that - um. I wonder
> > if it is related to the RTC update patch that I've been meaning
> > to take a look at:
> > 
> > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
> Sounds like it could be related. Shall I apply that patch? If so, which
> hook takes care that the function is called?

It kind of automatically hooks up. If you can apply it cleanly - sure. But it
might not apply cleanly :-(
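
Before drawing conclusions from a 'nopat' test run it is worth confirming
that the running Dom0 kernel really was booted with the option. A minimal
sketch, assuming the kernel command line is readable from /proc/cmdline;
the helper names are made up for illustration:

```python
def has_kernel_param(cmdline: str, name: str) -> bool:
    """True if `name` appears as a whole token, either bare or as name=value."""
    for token in cmdline.split():
        if token == name or token.startswith(name + "="):
            return True
    return False


def running_with_nopat(path: str = "/proc/cmdline") -> bool:
    """Check the command line of the currently running kernel for 'nopat'."""
    with open(path) as f:
        return has_kernel_param(f.read(), "nopat")
```

Matching whole tokens avoids a false positive from an unrelated option
that merely contains the string "nopat".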


* Re: Dom0 ACPI S3 patches
  2011-09-14 14:28     ` Konrad Rzeszutek Wilk
@ 2011-09-14 15:09       ` Adi Kriegisch
  2011-09-14 15:47         ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 11+ messages in thread
From: Adi Kriegisch @ 2011-09-14 15:09 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Adi Kriegisch

On Wed, Sep 14, 2011 at 10:28:54AM -0400, Konrad Rzeszutek Wilk wrote:
> > > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
> > Sure, go ahead! ;-)
> > Update: No, the system just crashed while writing this mail after about 4 days
> > of uptime with many suspend-resume cycles in between... *sigh* :-(
> 
> Hmmm.. I wonder if you are hitting the writecombine issue I've seen sometimes.
> Just to eliminate it, can you try 'nopat' on the Linux command line?
Sure. Do you know any way to make sure I am hitting the writecombine issue
fast, so that I can make (kind of) sure everything is working?

> > > You know, I don't know. I just never thought about that - um. I wonder
> > > if it is related to the RTC update patch that I've been meaning
> > > to take a look at:
> > > 
> > > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
> > Sounds like it could be related. Shall I apply that patch? If so, which
> > hook takes care that the function is called?
> 
> It kind of automatically hooks up. If you can apply it cleanly - sure. But it
> might not apply cleanly :-(
It does not apply at all:
The first hunk fails because some other includes have been added.
The second hunk fails because there is no longer an
"#endif /* CONFIG_PARAVIRT_CLOCK_VSYSCALL */"... and this is the point
where I can't fix a bug because I do not know enough of the kernel/xen
internals to know what to touch...

-- Adi
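
Why both hunks are rejected can be illustrated with a simplified model of
hunk matching: a hunk only applies where its context and removed lines
appear in the target exactly and contiguously. The sketch below is an
assumption-laden simplification (real patch(1) also tries line offsets and
fuzz), and the function names and sample lines are made up:

```python
from typing import List, Optional


def hunk_applies_at(target: List[str], expected: List[str], start: int) -> bool:
    """True if the lines the hunk expects match the target starting at `start`."""
    return target[start:start + len(expected)] == expected


def find_hunk(target: List[str], expected: List[str]) -> Optional[int]:
    """Return the first 0-based offset where the expected lines match, else None."""
    for start in range(len(target) - len(expected) + 1):
        if hunk_applies_at(target, expected, start):
            return start
    return None
```

If a new include has been inserted between two lines the hunk expects to
be adjacent, or a line such as the "#endif" is gone entirely, find_hunk()
returns None and the hunk has to be rebased by hand.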


* Re: Re: Dom0 ACPI S3 patches
  2011-09-14 15:09       ` Adi Kriegisch
@ 2011-09-14 15:47         ` Konrad Rzeszutek Wilk
  2011-09-15 13:18           ` Adi Kriegisch
  2011-09-27 14:50           ` Adi Kriegisch
  0 siblings, 2 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-14 15:47 UTC (permalink / raw)
  To: Adi Kriegisch; +Cc: xen-devel

On Wed, Sep 14, 2011 at 05:09:34PM +0200, Adi Kriegisch wrote:
> On Wed, Sep 14, 2011 at 10:28:54AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
> > > Sure, go ahead! ;-)
> > > Update: No, the system just crashed while writing this mail after about 4 days
> > > of uptime with many suspend-resume cycles in between... *sigh* :-(
> > 
> > Hmmm.. I wonder if you are hitting the writecombine issue I've seen sometimes.
> > Just to eliminate it, can you try 'nopat' on the Linux command line?
> Sure. Do you know any way to make sure I am hitting the writecombine issue
> fast, so that I can make (kind of) sure everything is working?

Mysterious applications crashing left and right. On my box bash stopped
working right and such. Pretty obvious that something went wrong.

> 
> > > > You know, I don't know. I just never thought about that - um. I wonder
> > > > if it is related to the RTC update patch that I've been meaning
> > > > to take a look at:
> > > > 
> > > > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
> > > Sounds like it could be related. Shall I apply that patch? If so, which
> > > hook takes care that the function is called?
> > 
> > It kind of automatically hooks up. If you can apply it cleanly - sure. But it
> > might not apply cleanly :-(
> It does not apply at all:

Pfff.. well, I will try to rebase it in a couple of days. Can you ping in a week
if I haven't sent anything to you yet?

> first hunk fails because there have been some other includes added.
> second hunk fails because there is no more
> "#endif /* CONFIG_PARAVIRT_CLOCK_VSYSCALL */"... and this is the point
> where I can't fix a bug because I do not know enough of the kernel/xen
> internals to know what to touch...
> 
> -- Adi
> 


* Re: Dom0 ACPI S3 patches
  2011-09-14  8:41 ` Jan Beulich
@ 2011-09-14 17:44   ` Jeremy Fitzhardinge
  0 siblings, 0 replies; 11+ messages in thread
From: Jeremy Fitzhardinge @ 2011-09-14 17:44 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Adi Kriegisch, Konrad Rzeszutek Wilk

On 09/14/2011 01:41 AM, Jan Beulich wrote:
>>>> On 14.09.11 at 10:11, Adi Kriegisch <adi@cg.tuwien.ac.at> wrote:
>> * The DomUs do not resync their clock after Dom0 wakes up. They
>>   basically continue to count the time as if the sleep never happened.
>>   I have to run 'ntpdate' on resume on all the DomUs. I am not sure if
>>   there are any side effects of this; perhaps there is a simpler way
>>   to tell a DomU to reread the clock from Dom0?
> This is a more fundamental problem - upstream pv-ops doesn't make
> use of XENFP_settime (or its bogus alias DOM_SETTIME) at all; only
> Jeremy's 2.6.32.x tree has this so far.

I was confused grepping for those: XEN*PF*_settime, or DOM*0*_SETTIME.

Yeah, thanks for the reminder.  I've queued that up for the next merge
window.

    J


* Re: Re: Dom0 ACPI S3 patches
  2011-09-14 15:47         ` Konrad Rzeszutek Wilk
@ 2011-09-15 13:18           ` Adi Kriegisch
  2011-09-27 14:50           ` Adi Kriegisch
  1 sibling, 0 replies; 11+ messages in thread
From: Adi Kriegisch @ 2011-09-15 13:18 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Adi Kriegisch

On Wed, Sep 14, 2011 at 11:47:26AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Sep 14, 2011 at 05:09:34PM +0200, Adi Kriegisch wrote:
> > On Wed, Sep 14, 2011 at 10:28:54AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
> > > > Sure, go ahead! ;-)
Now, I think, you may really go ahead! ;-) 'nopat' did the trick for me.

> > > Hmmm.. I wonder if you are hitting the writecombine issue I've seen sometimes.
> > > Just to eliminate it, can you try 'nopat' on the Linux command line?
> > Sure. Do you know any way to make sure I am hitting the writecombine issue
> > fast, so that I can make (kind of) sure everything is working?
> Mysterious applications crashing left and right. On my box bash stopped
> working right and such. Pretty obvious that something went wrong.
Hmmm... I rethought the workloads I had and found a way to reproduce the
crashes -- more or less reliably:
As I did a complete reinstallation of my notebook, I had a lot of data to
copy. The first bunch of stuff was copied on Dom0 -- this was where I
experienced the crashes; not immediately while copying but a little later
(the warnings in the syslog always happened during copying, btw).
Then -- after the basic setup was done -- I used xm block-attach and copied
tons of stuff within a DomU. I did not experience a single crash while
doing so.

Do you want me to do something to further debug the issue? Just tell me
what I could/should try to do! ;-)

> > > > > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
> > > > Sounds like it could be related. Shall I apply that patch? If so, which
> > > > hook takes care that the function is called?
[SNIP]
> > It does not apply at all:
> 
> Pfff.. well, I will try to rebase it in a couple of days. Can you ping in a week
> if I haven't sent anything to you yet?
Yes, I will! ;-)

-- Adi


* Re: Re: Dom0 ACPI S3 patches
  2011-09-14 15:47         ` Konrad Rzeszutek Wilk
  2011-09-15 13:18           ` Adi Kriegisch
@ 2011-09-27 14:50           ` Adi Kriegisch
  2011-09-27 22:26             ` Konrad Rzeszutek Wilk
  1 sibling, 1 reply; 11+ messages in thread
From: Adi Kriegisch @ 2011-09-27 14:50 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk; +Cc: xen-devel, Adi Kriegisch

On Wed, Sep 14, 2011 at 11:47:26AM -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Sep 14, 2011 at 05:09:34PM +0200, Adi Kriegisch wrote:
> > On Wed, Sep 14, 2011 at 10:28:54AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
Works like a charm with 'nopat'. No crash ever since.

> > > > > You know, I don't know. I just never thought about that - um. I wonder
> > > > > if it is related to the RTC update patch that I've been meaning
> > > > > to take a look at:
> > > > > 
> > > > > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
[SNIP]
> Pfff.. well, I will try to rebase it in a couple of days. Can you ping in a week
> if I haven't sent anything to you yet?
Any news on this one? I still have to resync my clock after acpi sleep...

Thanks,
    Adi Kriegisch


* Re: Re: Dom0 ACPI S3 patches
  2011-09-27 14:50           ` Adi Kriegisch
@ 2011-09-27 22:26             ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 11+ messages in thread
From: Konrad Rzeszutek Wilk @ 2011-09-27 22:26 UTC (permalink / raw)
  To: Adi Kriegisch; +Cc: xen-devel

On Tue, Sep 27, 2011 at 04:50:08PM +0200, Adi Kriegisch wrote:
> On Wed, Sep 14, 2011 at 11:47:26AM -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Sep 14, 2011 at 05:09:34PM +0200, Adi Kriegisch wrote:
> > > On Wed, Sep 14, 2011 at 10:28:54AM -0400, Konrad Rzeszutek Wilk wrote:
> > > > > > Excellent. Is it OK if I put 'Tested-by: Adi Kriegisch' on them?
> Works like a charm with 'nopat'. No crash ever since.
> 
> > > > > > You know, I don't know. I just never thought about that - um. I wonder
> > > > > > if it is related to the RTC update patch that I've been meaning
> > > > > > to take a look at:
> > > > > > 
> > > > > > http://lists.xensource.com/archives/html/xen-devel/2010-02/msg00469.html
> [SNIP]
> > Pfff.. well, I will try to rebase it in a couple of days. Can you ping in a week
> > if I haven't sent anything to you yet?
> Any news on this one? I still have to resync my clock after acpi sleep...

Jeremy just sent out the patches for review.
http://lists.xensource.com/archives/html/xen-devel/2011-09/msg01452.html
Please test.

