From: Konrad Rzeszutek Wilk <konrad@kernel.org>
To: Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	Jan Beulich <JBeulich@suse.com>,
	xen-devel@lists.xen.org
Subject: Re: dom0 / hypervisor hang on dom0 boot
Date: Fri, 17 May 2013 18:28:16 -0400
Message-ID: <20130517222814.GA3255@localhost.localdomain>
In-Reply-To: <1630888.LbRauWP15S@amur.mch.fsc.net>

[-- Attachment #1: Type: text/plain, Size: 2581 bytes --]

On Thu, May 16, 2013 at 01:07:05PM +0200, Dietmar Hahn wrote:
> On Wednesday, 15 May 2013, 10:42:17, Jan Beulich wrote:
> > >>> On 15.05.13 at 11:12, Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> wrote:
> > > On Wednesday, 15 May 2013, 09:35:46, Jan Beulich wrote:
> > >> >>> On 15.05.13 at 08:53, Dietmar Hahn <dietmar.hahn@ts.fujitsu.com> wrote:
> > >> > I tried iommu=debug and I can't see any fault messages, but I am not
> > >> > familiar with this code.
> > >> > I attached the log; maybe someone can have a look at it.
> > 
> > Perhaps only (if at all) by instrumenting the hypervisor. The
> > question of course is how easily/quickly you can narrow down the
> > code region that it might be dying in. And whether it's a hypervisor
> > action at all that causes the hang (as opposed to something the
> > DRM code in Dom0 does).
> 
> I added some debug code to the linux kernel and could track down the
> point of the hang. I used openSuSE kernel 3.7.10-1.4 but I looked at newer
> kernels and found that the code is similar.
> 
> i915_gem_init_global_gtt(...)
>  ...
>  intel_gtt_clear_range(start / PAGE_SIZE, (end-start) / PAGE_SIZE);
>  ...
> 
> void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
> {
>         unsigned int i;
> 
>     ---> A printk(...) here is seen on serial line!
> 
>         for (i = first_entry; i < (first_entry + num_entries); i++) {
>                 intel_private.driver->write_entry(intel_private.base.scratch_page_dma,
>                                                   i, 0);
>         }
> 
>     ---> A printk(...) here is never seen!
> 
>         readl(intel_private.gtt+i-1);
> }
> 
> The function behind the pointer intel_private.driver->write_entry is
> i965_write_entry(). And the interesting instruction seems to be:
>   writel(addr | pte_flags, intel_private.gtt + entry);
> 
> I added another printk() at the start of the function i965_write_entry().
> Surprisingly, after printing a lot of messages, the kernel came up!
> But now I had other problems, like losing the audio device (maybe timeouts).
> So maybe the hang is a timing problem?
> 
> What I wanted to check is what the hypervisor is doing while the system hangs.
> Does anybody have an idea how to do that? Maybe a timer that, after 30s,
> prints a dump of the stacks of all CPUs?

Yes. Can you try the two attached patches, please?
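
For anyone following along, here is a rough user-space model of the loop
you instrumented. It is only a sketch: the GTT is a plain array instead of
an ioremap()ed region, all model_* names are made up for illustration, and
the "valid" bit value is an assumption rather than something taken from the
driver. It shows the intent of intel_gtt_clear_range(): point every entry
in the range at the scratch page, then read the last entry back (the
readl() in the real code presumably serves as a posting read).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this model only: bit 0 marks a PTE as valid. */
#define MODEL_PTE_VALID 0x1u

/* Hypothetical stand-in for the mapped GTT: a plain array of 32-bit slots. */
struct model_gtt {
	uint32_t *entries;
	size_t num_entries;
	uint32_t scratch_dma;	/* page-aligned bus address of the scratch page */
};

/* Models write_entry(): one 32-bit store per GTT slot. */
static void model_write_entry(struct model_gtt *gtt, uint32_t addr, size_t entry)
{
	assert(entry < gtt->num_entries);
	gtt->entries[entry] = addr | MODEL_PTE_VALID;
}

/*
 * Models intel_gtt_clear_range(): every slot in [first, first + num) is
 * pointed at the scratch page. Returns the value read back from the last
 * slot written, or 0 if the range is empty.
 */
static uint32_t model_clear_range(struct model_gtt *gtt, size_t first, size_t num)
{
	size_t i;

	for (i = first; i < first + num; i++)
		model_write_entry(gtt, gtt->scratch_dma, i);

	return num ? gtt->entries[i - 1] : 0;
}
```

In the real driver each of those stores is a writel() to device memory, so
the model says nothing about timing; it only pins down which slots are
touched and with what value.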

> Thanks.
> 
> Dietmar.
> 
> 
> -- 
> Company details: http://ts.fujitsu.com/imprint.html
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

[-- Attachment #2: 0001-drm-i915-Don-t-leak-a-page-in-case-of-DMA-error-mapp.patch --]
[-- Type: text/plain, Size: 1947 bytes --]

From 4201962b743a44325ff848ba6387d3710343c123 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 17 May 2013 18:13:35 -0400
Subject: [PATCH 1/2] drm/i915: Don't leak a page in case of DMA error mapping.

We don't free the allocated page if we fail to set up the DMA
mapping. This fixes it.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/char/agp/intel-gtt.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index dbd901e..701b328 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -294,9 +294,10 @@ static int intel_gtt_setup_scratch_page(void)
 	if (intel_private.base.needs_dmar) {
 		dma_addr = pci_map_page(intel_private.pcidev, page, 0,
 				    PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
-		if (pci_dma_mapping_error(intel_private.pcidev, dma_addr))
+		if (pci_dma_mapping_error(intel_private.pcidev, dma_addr)) {
+			__intel_gtt_teardown_scratch_page();
 			return -EINVAL;
-
+		}
 		intel_private.base.scratch_page_dma = dma_addr;
 	} else
 		intel_private.base.scratch_page_dma = page_to_phys(page);
@@ -542,15 +543,18 @@ static unsigned int intel_gtt_mappable_entries(void)
 
 	return aperture_size >> PAGE_SHIFT;
 }
-
-static void intel_gtt_teardown_scratch_page(void)
+static void __intel_gtt_teardown_scratch_page(void)
 {
 	set_pages_wb(intel_private.scratch_page, 1);
-	pci_unmap_page(intel_private.pcidev, intel_private.base.scratch_page_dma,
-		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 	put_page(intel_private.scratch_page);
 	__free_page(intel_private.scratch_page);
 }
+static void intel_gtt_teardown_scratch_page(void)
+{
+	pci_unmap_page(intel_private.pcidev, intel_private.base.scratch_page_dma,
+		       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+	__intel_gtt_teardown_scratch_page();
+}
 
 static void intel_gtt_cleanup(void)
 {
-- 
1.8.1.2
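
The idea behind this first patch, reduced to a user-space sketch. This is
not the driver code: calloc()/free() stand in for the page allocator, the
mapping function is a mock that can be told to fail, and every model_* name
is hypothetical. The point is only the shape of the error path: if the
mapping fails, the page allocated a few lines earlier has to be released
before returning. In this model the error path frees through the local
pointer, because nothing has been stored in the structure at that point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096

static int live_pages;		/* pages allocated and not yet freed */
static bool map_should_fail;	/* test knob: make the mock mapping fail */

struct model_scratch {
	void *page;
	uintptr_t dma;
};

static void *model_alloc_page(void)
{
	void *p = calloc(1, MODEL_PAGE_SIZE);

	if (p)
		live_pages++;
	return p;
}

static void model_free_page(void *p)
{
	assert(live_pages > 0);
	free(p);
	live_pages--;
}

/* Mock DMA mapping: returns 0 to signal a mapping error. */
static uintptr_t model_map_page(void *page)
{
	return map_should_fail ? 0 : (uintptr_t)page;
}

static int model_setup_scratch(struct model_scratch *s)
{
	void *page = model_alloc_page();
	uintptr_t dma;

	if (!page)
		return -1;

	dma = model_map_page(page);
	if (!dma) {
		/* Without this call the page would leak on the error path. */
		model_free_page(page);
		return -1;
	}

	s->page = page;
	s->dma = dma;
	return 0;
}

static void model_teardown_scratch(struct model_scratch *s)
{
	/* A real driver would unmap s->dma here before freeing the page. */
	model_free_page(s->page);
	s->page = NULL;
	s->dma = 0;
}
```

Counting live pages makes the leak observable: after a failed setup the
counter must be back at zero.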


[-- Attachment #3: 0002-drm-i915-Sync-the-scratch-page-after-writting-values.patch --]
[-- Type: text/plain, Size: 1221 bytes --]

From 51908f611fb00195d98f1a552106c6d1709720c0 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 17 May 2013 18:20:46 -0400
Subject: [PATCH 2/2] drm/i915: Sync the scratch page after writing values to
 it.

We don't sync the page after we have written to it, which is what
you are supposed to do when doing:

  pci_map_page
	.. write some values
  [ was missing a call to pci_dma_sync_single_for_device]
	.. read some values
  pci_unmap_page

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 drivers/char/agp/intel-gtt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/char/agp/intel-gtt.c b/drivers/char/agp/intel-gtt.c
index 701b328..89dd698 100644
--- a/drivers/char/agp/intel-gtt.c
+++ b/drivers/char/agp/intel-gtt.c
@@ -902,6 +902,9 @@ void intel_gtt_clear_range(unsigned int first_entry, unsigned int num_entries)
 		intel_private.driver->write_entry(intel_private.base.scratch_page_dma,
 						  i, 0);
 	}
+	pci_dma_sync_single_for_device(intel_private.pcidev,
+				       intel_private.base.scratch_page_dma,
+				       PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
 	readl(intel_private.gtt+i-1);
 }
 EXPORT_SYMBOL(intel_gtt_clear_range);
-- 
1.8.1.2
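
To illustrate why the map / write / sync / read ordering in the commit
message matters, here is a small user-space model. It assumes the worst
case for the CPU, namely that the device never sees the CPU's buffer
directly but only a separate copy (as with a bounce buffer). Whether
bouncing is actually involved in this hang is not established; the struct
and the model_* functions are invented for this sketch and are not kernel
APIs.

```c
#include <assert.h>
#include <string.h>

#define MODEL_BUF_SIZE 16

/* Hypothetical model: the device only ever sees 'bounce', never 'cpu'. */
struct model_mapping {
	unsigned char cpu[MODEL_BUF_SIZE];
	unsigned char bounce[MODEL_BUF_SIZE];
	int mapped;
};

/* Mapping copies the current CPU contents to the device-visible side. */
static void model_map(struct model_mapping *m)
{
	memcpy(m->bounce, m->cpu, MODEL_BUF_SIZE);
	m->mapped = 1;
}

/* Hands later CPU writes over to the device-visible side. */
static void model_sync_for_device(struct model_mapping *m)
{
	assert(m->mapped);
	memcpy(m->bounce, m->cpu, MODEL_BUF_SIZE);
}

/* What the device would read at a given offset. */
static unsigned char model_device_read(const struct model_mapping *m, int off)
{
	assert(m->mapped && off >= 0 && off < MODEL_BUF_SIZE);
	return m->bounce[off];
}

/* Bidirectional unmap: device-side contents are copied back to the CPU. */
static void model_unmap(struct model_mapping *m)
{
	assert(m->mapped);
	memcpy(m->cpu, m->bounce, MODEL_BUF_SIZE);
	m->mapped = 0;
}
```

In this model a CPU write made after model_map() stays invisible to the
device until model_sync_for_device() is called, which is the step the
commit message describes as missing.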


