* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
@ 2014-04-16 11:49 Konrad Rzeszutek Wilk
  2014-04-16 12:06 ` Ian Jackson
  0 siblings, 1 reply; 9+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-04-16 11:49 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Boris Ostrovsky, Xen Devel, Ian Campbell, David Vrabel


On Apr 16, 2014 6:12 AM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>
> Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"): 
> > xen.org writes ("[linux-linus test] 25677: regressions - FAIL"): 
> > > flight 25677 linux-linus real [real] 
> > > http://www.chiark.greenend.org.uk/~xensrcts/logs/25677/ 
> > > 
> > > Regressions :-( 
> > > 
> > > Tests which did not succeed and are blocking, 
> > > including tests which could not be run: 
> > >  test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 12557 
> > 
> > This is still failing with zillions of this message 
> >   Mar 27 12:25:48.036598 [  850.054546] mptsas 0000:03:00.0: swiotlb buffer is full 
> > 
> > Are we any closer to figuring out how to get a fix for this past the 
> > x86 maintainers ? 
> > 
> > Does the Linux kernel not have a "no regressions" policy ?  This is a 
> > regression, after all... 
>
> Another week has gone by and Linux tip still fails this test.

Did you try the recommendation that David offered - boot with 4GB to dom0?

>
> Ian. 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-16 11:49 [linux-linus test] 25478: regressions - FAIL [and 1 more messages] Konrad Rzeszutek Wilk
@ 2014-04-16 12:06 ` Ian Jackson
  2014-04-16 14:11   ` Konrad Rzeszutek Wilk
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Jackson @ 2014-04-16 12:06 UTC (permalink / raw)
  To: Konrad Rzeszutek Wilk
  Cc: Boris Ostrovsky, Xen Devel, Ian Campbell, David Vrabel

Konrad Rzeszutek Wilk writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
> On Apr 16, 2014 6:12 AM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> > Another week has gone by and Linux tip still fails this test.
> 
> Did you try the recommendation that David offered - boot with 4GB to dom0?

That would be a workaround.  I can do that (as it happens these
machines have 8G of RAM so it is actually possible), but:

The purpose of running tests is to discover bugs (so that they can be
fixed).  It is not to generate a nice clean report by sweeping things
under the carpet.

Here the tests have discovered a bug in Linux.  AIUI it's a bug which
is visible when using this particular driver, but which is actually a
problem with the Xen integration into the Linux VM system in general.
So it presumably affects other drivers too.  It should be fixed,
not worked around.

The downside of not working around this bug is that osstest's
failing-host-stickiness will cause an increasing proportion of the
tests to run on the affected hosts.  This might mask other bugs.

Conversely, working around this bug in the manner suggested will
presumably just make the bug disappear off our radar.

Ian.

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-16 12:06 ` Ian Jackson
@ 2014-04-16 14:11   ` Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 9+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-04-16 14:11 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Boris Ostrovsky, Xen Devel, Ian Campbell, David Vrabel

On Wed, Apr 16, 2014 at 01:06:06PM +0100, Ian Jackson wrote:
> Konrad Rzeszutek Wilk writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
> > On Apr 16, 2014 6:12 AM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
> > > Another week has gone by and Linux tip still fails this test.
> > 
> > Did you try the recommendation that David offered - boot with 4GB to dom0?
> 
> That would be a workaround.  I can do that (as it happens these
> machines have 8G of RAM so it is actually possible), but:

Right.
> 
> The purpose of running tests is to discover bugs (so that they can be
> fixed).  It is not to generate a nice clean report by sweeping things
> under the carpet.

Right.
> 
> Here the tests have discovered a bug in Linux.  AIUI it's a bug which
> is visible when using this particular driver, but which is actually a
> problem with the Xen integration into the Linux VM system in general.
> So it presumably affects other drivers too.  It should be fixed,
> not worked around.

I concur.
> 
> The downside of not working around this bug is that osstest's
> failing-host-stickiness will cause an increasing proportion of the
> tests to run on the affected hosts.  This might mask other bugs.
> 
> Conversely, working around this bug in the manner suggested will
> presumably just make the bug disappear off our radar.

There is a danger of that.

I am not going to be able to take a look at this bug in the next three 
weeks. If anybody else wants to take a stab at this - here is my hand-waving
idea of how it could be done:

1). Clean up the ia64 usage of the same code, 'ia64_dma_get_required_mask'
    (it actually duplicates what the drivers/base has). Perhaps move it to
    lib/iommu-helper.c
    Call it 'generic_get_required_mask' or such.
2). Look at other platforms that use similar (or the same) code
    and see if they can re-use it now that it is in lib/iommu-helper.c.
3). Make the x86 dma_ops start using the extra "get_required_mask"
    and point it to the generic_get_required_mask.
4). Make the xen-swiotlb use its own version. 
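
As an illustration of the four steps above, here is a minimal userspace sketch in C. It is not kernel code: the struct names, the `_sketch` suffixes, the two address fields and `mask_covering()` are all hypothetical and exist only to show the shape of the idea. Only the names `generic_get_required_mask` and "get_required_mask" come from the list above. The sketch assumes a per-platform ops table with an optional hook, a shared generic helper, and a Xen variant that looks at machine addresses instead of the RAM range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a device: just the two address limits we need. */
struct device_sketch {
    uint64_t highest_ram_addr;     /* top of (pseudo-)physical RAM */
    uint64_t highest_machine_addr; /* top of real machine memory under Xen */
};

/* Hypothetical ops table with the optional hook from step 3. */
struct dma_ops_sketch {
    uint64_t (*get_required_mask)(const struct device_sketch *dev);
};

/* Smallest mask of the form 2^n - 1 that covers addr. */
uint64_t mask_covering(uint64_t addr)
{
    uint64_t mask = 0;

    while (mask < addr)
        mask = (mask << 1) | 1;
    return mask;
}

/* Steps 1 and 2: one shared helper instead of per-arch copies. */
uint64_t generic_get_required_mask(const struct device_sketch *dev)
{
    return mask_covering(dev->highest_ram_addr);
}

/* Step 4: the Xen variant bases the mask on machine addresses. */
uint64_t xen_swiotlb_get_required_mask_sketch(const struct device_sketch *dev)
{
    return mask_covering(dev->highest_machine_addr);
}

/* Step 3: the x86 ops point at the generic helper. */
const struct dma_ops_sketch x86_dma_ops_sketch = {
    .get_required_mask = generic_get_required_mask,
};

const struct dma_ops_sketch xen_swiotlb_dma_ops_sketch = {
    .get_required_mask = xen_swiotlb_get_required_mask_sketch,
};

/* Dispatcher: use the hook if the ops table provides one, else the generic helper. */
uint64_t dma_get_required_mask_sketch(const struct dma_ops_sketch *ops,
                                      const struct device_sketch *dev)
{
    assert(dev != NULL);
    if (ops && ops->get_required_mask)
        return ops->get_required_mask(dev);
    return generic_get_required_mask(dev);
}
```

With this layout a driver keeps calling a single entry point, and only the ops table decides which address range the mask is derived from.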

> 
> Ian.

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-16 11:47                                 ` Ian Jackson
@ 2014-04-16 13:48                                   ` David Vrabel
  0 siblings, 0 replies; 9+ messages in thread
From: David Vrabel @ 2014-04-16 13:48 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Boris.Ostrovsky, xen-devel, Ian Campbell

On 16/04/14 12:47, Ian Jackson wrote:
> David Vrabel writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
>> On 16/04/14 11:12, Ian Jackson wrote:
>>> Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
>>>> Are we any closer to figuring out how to get a fix for this past the
>>>> x86 maintainers ?
>>>>
>>>> Does the Linux kernel not have a "no regressions" policy ?  This is a
>>>> regression, after all...
>>>
>>> Another week has gone by and Linux tip still fails this test.
>>
>> Yes, you've not fixed it yet.
> 
> My understanding from private emails is that the problem is mostly
> political rather than technical.
> 
> You wrote:
>>> mptsas is a driver that uses dma_get_required_mask() to determine a
>>> "suitable" DMA mask -- but under Xen dma_get_required_mask() may return
>>> the wrong mask since it only gets the physical RAM range and not machine
>>> addresses.

For this specific problem with the mptsas driver, yes.  But there are other
problems with swiotlb usage such as by skbs using compound pages.

We (XenServer) plan to fix all these by making use of the IOMMU. See
Malcolm's recent design doc.  This is still quite a ways off though.

> I'm not familiar with the Linux kernel's VM system.  However, if
> someone would write a patch which provides an arch override for this
> (which from the private emails seems to be what is required and would
> not be too hard for someone who knew what they were doing), then I can
> try to do the political work of negotiating with the Linux community.

This fix doesn't require any particular knowledge of the VM subsystem
and there's already some infrastructure for providing an arch specific
implementation of dma_get_required_mask().
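
To make the failure mode concrete, here is a small userspace sketch in C of the arithmetic involved. It is not the kernel's implementation: `required_mask_for()` and `needs_64bit_dma()` are hypothetical helpers, and the 512 MiB dom0 size used in the note below is an assumed figure chosen for illustration. The 8G host size is the one mentioned earlier in the thread.

```c
#include <stdint.h>

/* Smallest mask of the form 2^n - 1 that covers highest_addr. */
uint64_t required_mask_for(uint64_t highest_addr)
{
    uint64_t mask = 0;

    while (mask < highest_addr)
        mask = (mask << 1) | 1;
    return mask;
}

/* Nonzero if the mask does not fit in 32 bits, i.e. 64-bit DMA is needed. */
int needs_64bit_dma(uint64_t required_mask)
{
    return required_mask > 0xFFFFFFFFULL;
}
```

If dom0 is assumed to have 512 MiB, the RAM-range calculation gives `required_mask_for((512ULL << 20) - 1)`, which is 0x1fffffff and fits in 32 bits. The machine addresses on an 8G host give `required_mask_for((8ULL << 30) - 1)`, which is 0x1ffffffff and does not. A driver that trusts the first answer may settle on a 32-bit mask even though its buffers can sit above 4G in machine memory, so they have to be bounced through the swiotlb.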

David

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
@ 2014-04-16 11:59 Konrad Rzeszutek Wilk
  0 siblings, 0 replies; 9+ messages in thread
From: Konrad Rzeszutek Wilk @ 2014-04-16 11:59 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Boris Ostrovsky, Xen Devel, Ian Campbell, David Vrabel


On Apr 16, 2014 7:47 AM, Ian Jackson <Ian.Jackson@eu.citrix.com> wrote:
>
> David Vrabel writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"): 
> > On 16/04/14 11:12, Ian Jackson wrote: 
> > > Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"): 
> > >> Are we any closer to figuring out how to get a fix for this past the 
> > >> x86 maintainers ? 
> > >> 
> > >> Does the Linux kernel not have a "no regressions" policy ?  This is a 
> > >> regression, after all... 
> > > 
> > > Another week has gone by and Linux tip still fails this test. 
> > 
> > Yes, you've not fixed it yet.
>
> My understanding from private emails is that the problem is mostly 
> political rather than technical. 
>
> You wrote: 
> >> mptsas is a driver that uses dma_get_required_mask() to determine a 
> >> "suitable" DMA mask -- but under Xen dma_get_required_mask() may return 
> >> the wrong mask since it only gets the physical RAM range and not machine 
> >> addresses. 
>
> I'm not familiar with the Linux kernel's VM system.  However, if 
> someone would write a patch which provides an arch override for this 
> (which from the private emails seems to be what is required and would
> not be too hard for someone who knew what they were doing), then I can 
> try to do the political work of negotiating with the Linux community. 
>

Sure. But before we go that route - can you try the suggestion given by David to see if it does indeed 'fix' the issue?

> Ian. 

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-16 10:53                               ` David Vrabel
@ 2014-04-16 11:47                                 ` Ian Jackson
  2014-04-16 13:48                                   ` David Vrabel
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Jackson @ 2014-04-16 11:47 UTC (permalink / raw)
  To: David Vrabel; +Cc: Boris.Ostrovsky, xen-devel, Ian Campbell

David Vrabel writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
> On 16/04/14 11:12, Ian Jackson wrote:
> > Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
> >> Are we any closer to figuring out how to get a fix for this past the
> >> x86 maintainers ?
> >>
> >> Does the Linux kernel not have a "no regressions" policy ?  This is a
> >> regression, after all...
> > 
> > Another week has gone by and Linux tip still fails this test.
> 
> Yes, you've not fixed it yet.

My understanding from private emails is that the problem is mostly
political rather than technical.

You wrote:
>> mptsas is a driver that uses dma_get_required_mask() to determine a
>> "suitable" DMA mask -- but under Xen dma_get_required_mask() may return
>> the wrong mask since it only gets the physical RAM range and not machine
>> addresses.

I'm not familiar with the Linux kernel's VM system.  However, if
someone would write a patch which provides an arch override for this
(which from the private emails seems to be what is required and would
not be too hard for someone who knew what they were doing), then I can
try to do the political work of negotiating with the Linux community.

Ian.

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-16 10:12                             ` Ian Jackson
@ 2014-04-16 10:53                               ` David Vrabel
  2014-04-16 11:47                                 ` Ian Jackson
  0 siblings, 1 reply; 9+ messages in thread
From: David Vrabel @ 2014-04-16 10:53 UTC (permalink / raw)
  To: Ian Jackson; +Cc: Boris.Ostrovsky, xen-devel, Ian Campbell

On 16/04/14 11:12, Ian Jackson wrote:
> Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
>> xen.org writes ("[linux-linus test] 25677: regressions - FAIL"):
>>> flight 25677 linux-linus real [real]
>>> http://www.chiark.greenend.org.uk/~xensrcts/logs/25677/
>>>
>>> Regressions :-(
>>>
>>> Tests which did not succeed and are blocking,
>>> including tests which could not be run:
>>>  test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 12557
>>
>> This is still failing with zillions of this message
>>   Mar 27 12:25:48.036598 [  850.054546] mptsas 0000:03:00.0: swiotlb buffer is full
>>
>> Are we any closer to figuring out how to get a fix for this past the
>> x86 maintainers ?
>>
>> Does the Linux kernel not have a "no regressions" policy ?  This is a
>> regression, after all...
> 
> Another week has gone by and Linux tip still fails this test.

Yes, you've not fixed it yet.

David

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
  2014-04-10 15:37                           ` [linux-linus test] 25478: regressions - FAIL [and 1 more messages] Ian Jackson
@ 2014-04-16 10:12                             ` Ian Jackson
  2014-04-16 10:53                               ` David Vrabel
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Jackson @ 2014-04-16 10:12 UTC (permalink / raw)
  To: Ian Campbell, David Vrabel, Boris.Ostrovsky,
	Konrad Rzeszutek Wilk, xen-devel

Ian Jackson writes ("Re: [Xen-devel] [linux-linus test] 25478: regressions - FAIL [and 1 more messages]"):
> xen.org writes ("[linux-linus test] 25677: regressions - FAIL"):
> > flight 25677 linux-linus real [real]
> > http://www.chiark.greenend.org.uk/~xensrcts/logs/25677/
> > 
> > Regressions :-(
> > 
> > Tests which did not succeed and are blocking,
> > including tests which could not be run:
> >  test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 12557
> 
> This is still failing with zillions of this message
>   Mar 27 12:25:48.036598 [  850.054546] mptsas 0000:03:00.0: swiotlb buffer is full
> 
> Are we any closer to figuring out how to get a fix for this past the
> x86 maintainers ?
> 
> Does the Linux kernel not have a "no regressions" policy ?  This is a
> regression, after all...

Another week has gone by and Linux tip still fails this test.

Ian.

* Re: [linux-linus test] 25478: regressions - FAIL [and 1 more messages]
       [not found]                         ` <1395143026.12847.47.camel@kazak.uk.xensource.com>
@ 2014-04-10 15:37                           ` Ian Jackson
  2014-04-16 10:12                             ` Ian Jackson
  0 siblings, 1 reply; 9+ messages in thread
From: Ian Jackson @ 2014-04-10 15:37 UTC (permalink / raw)
  To: xen.org, Ian Campbell; +Cc: Boris.Ostrovsky, xen-devel, David Vrabel

xen.org writes ("[linux-linus test] 25677: regressions - FAIL"):
> flight 25677 linux-linus real [real]
> http://www.chiark.greenend.org.uk/~xensrcts/logs/25677/
> 
> Regressions :-(
> 
> Tests which did not succeed and are blocking,
> including tests which could not be run:
>  test-amd64-i386-pair 17 guest-migrate/src_host/dst_host fail REGR. vs. 12557

This is still failing with zillions of this message
  Mar 27 12:25:48.036598 [  850.054546] mptsas 0000:03:00.0: swiotlb buffer is full

Are we any closer to figuring out how to get a fix for this past the
x86 maintainers ?

Does the Linux kernel not have a "no regressions" policy ?  This is a
regression, after all...

Thanks,
Ian.

end of thread, other threads:[~2014-04-16 14:11 UTC | newest]

Thread overview: 9+ messages
2014-04-16 11:49 [linux-linus test] 25478: regressions - FAIL [and 1 more messages] Konrad Rzeszutek Wilk
2014-04-16 12:06 ` Ian Jackson
2014-04-16 14:11   ` Konrad Rzeszutek Wilk
  -- strict thread matches above, loose matches on Subject: below --
2014-04-16 11:59 Konrad Rzeszutek Wilk
2014-03-27 18:00 [linux-linus test] 25677: regressions - FAIL xen.org
2014-03-14 16:42 ` [linux-linus test] 25478: " xen.org
2014-03-14 17:07   ` Ian Campbell
2014-03-14 18:23     ` Konrad Rzeszutek Wilk
2014-03-17 11:08       ` Ian Campbell
2014-03-17 19:36         ` Konrad Rzeszutek Wilk
2014-03-18  9:29           ` Ian Campbell
2014-03-18 10:55             ` David Vrabel
2014-03-18 11:04               ` Ian Campbell
2014-03-18 11:10                 ` Ian Campbell
2014-03-18 11:25                   ` David Vrabel
2014-03-18 11:29                     ` Ian Campbell
     [not found]                       ` <53283118.5050706@citrix.com>
     [not found]                         ` <1395143026.12847.47.camel@kazak.uk.xensource.com>
2014-04-10 15:37                           ` [linux-linus test] 25478: regressions - FAIL [and 1 more messages] Ian Jackson
2014-04-16 10:12                             ` Ian Jackson
2014-04-16 10:53                               ` David Vrabel
2014-04-16 11:47                                 ` Ian Jackson
2014-04-16 13:48                                   ` David Vrabel
