* The whole round of i-g-t testing cost too long running time
@ 2014-04-15 15:46 Yang, Guang A
  2014-04-15 17:03 ` He, Shuang
  2014-04-15 17:17 ` Daniel Vetter
  0 siblings, 2 replies; 11+ messages in thread
From: Yang, Guang A @ 2014-04-15 15:46 UTC (permalink / raw)
  To: Vetter, Daniel, Barnes, Jesse, Widawsky, Benjamin, Wood, Thomas,
	Jin, Gordon, OTC GFX QA Extended, intel-gfx



Hi all,
I discussed the running time for each case with Daniel before, and we set the standard at 10 minutes: if a case can't finish within 10 minutes we treat it as a Timeout and report a bug on FDO (such as Bug 77474 <https://bugs.freedesktop.org/show_bug.cgi?id=77474> - [PNV/IVB/HSW] igt/gem_tiled_swapping is slow, and Bug 77475 <https://bugs.freedesktop.org/show_bug.cgi?id=77475> - [PNV/IVB/HSW] igt/kms_pipe_crc_basic/read-crc-pipe-A is slow).
The current status is that i-g-t has more than 650 subcases, and running a whole round of testing takes a very long time on the QA side (besides the Timeout cases); QA also needs to spend more time analysing the result changes on each platform.
You can find an example of how long one testing round takes on this page: http://tinderbox.sh.intel.com/PRTS_UI/prtsresult.php?task_id=2778
See the table for subtask 10831 on that page, which covers the i-g-t test cases on BDW. Testing started at 19:16 and finished at 03:25 the next day, taking about 8 hours to run 638 test cases.
Each case finished in under 10 minutes as we expect, but the total time is too large, and BDW is the most powerful machine on our side; ILK or PNV may take more than 10 hours. Besides i-g-t we also need to test piglit/performance/media, which already takes time.
Do we have any solution to reduce the running time of the whole i-g-t suite? It's a pressing problem for QA now that the i-g-t case count has grown from 50 to 600+.


Best Regards~~

Open Source Technology Center (OTC)
Terence Yang(杨光)
Tel: 86-021-61167360
iNet: 8821-7360



_______________________________________________
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/intel-gfx


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-15 15:46 The whole round of i-g-t testing cost too long running time Yang, Guang A
@ 2014-04-15 17:03 ` He, Shuang
  2014-04-15 17:17 ` Daniel Vetter
  1 sibling, 0 replies; 11+ messages in thread
From: He, Shuang @ 2014-04-15 17:03 UTC (permalink / raw)
  To: Yang, Guang A, Vetter, Daniel, Barnes, Jesse, Widawsky, Benjamin,
	Wood, Thomas, Jin, Gordon, OTC GFX QA Extended, intel-gfx



Chatted with Ben last week about this.
It may be feasible to have a fast.tests list for intel-gpu-tools as well.

Thanks
         --Shuang

From: Yang, Guang A
Sent: Tuesday, April 15, 2014 11:46 PM
To: Vetter, Daniel; Barnes, Jesse; Widawsky, Benjamin; Wood, Thomas; Jin, Gordon; OTC GFX QA Extended; intel-gfx@lists.freedesktop.org
Subject: The whole round of i-g-t testing cost too long running time

Hi all,
I discussed the running time for each case with Daniel before, and we set the standard at 10 minutes: if a case can't finish within 10 minutes we treat it as a Timeout and report a bug on FDO (such as Bug 77474 <https://bugs.freedesktop.org/show_bug.cgi?id=77474> - [PNV/IVB/HSW] igt/gem_tiled_swapping is slow, and Bug 77475 <https://bugs.freedesktop.org/show_bug.cgi?id=77475> - [PNV/IVB/HSW] igt/kms_pipe_crc_basic/read-crc-pipe-A is slow).
The current status is that i-g-t has more than 650 subcases, and running a whole round of testing takes a very long time on the QA side (besides the Timeout cases); QA also needs to spend more time analysing the result changes on each platform.
You can find an example of how long one testing round takes on this page: http://tinderbox.sh.intel.com/PRTS_UI/prtsresult.php?task_id=2778
See the table for subtask 10831 on that page, which covers the i-g-t test cases on BDW. Testing started at 19:16 and finished at 03:25 the next day, taking about 8 hours to run 638 test cases.
Each case finished in under 10 minutes as we expect, but the total time is too large, and BDW is the most powerful machine on our side; ILK or PNV may take more than 10 hours. Besides i-g-t we also need to test piglit/performance/media, which already takes time.
Do we have any solution to reduce the running time of the whole i-g-t suite? It's a pressing problem for QA now that the i-g-t case count has grown from 50 to 600+.


Best Regards~~

Open Source Technology Center (OTC)
Terence Yang(杨光)
Tel: 86-021-61167360
iNet: 8821-7360




* Re: The whole round of i-g-t testing cost too long running time
  2014-04-15 15:46 The whole round of i-g-t testing cost too long running time Yang, Guang A
  2014-04-15 17:03 ` He, Shuang
@ 2014-04-15 17:17 ` Daniel Vetter
  2014-04-15 21:07   ` He, Shuang
  2014-04-16 15:42   ` Jesse Barnes
  1 sibling, 2 replies; 11+ messages in thread
From: Daniel Vetter @ 2014-04-15 17:17 UTC (permalink / raw)
  To: Yang, Guang A, Barnes, Jesse, Widawsky, Benjamin, Wood, Thomas,
	Jin, Gordon, OTC GFX QA Extended, intel-gfx, Parenteau, Paul A,
	Nikkanen, Kimmo



On 15/04/2014 17:46, Yang, Guang A wrote:
>
> Hi all,
>
> I have discussed with Daniel about the running time for each cases
> before and we set the standard as 10M, if one can’t finish after
> running 10M we will see it as Timeout and report bug on FDO(such as :
> Bug 77474 <https://bugs.freedesktop.org/show_bug.cgi?id=77474> -
> [PNV/IVB/HSW]igt/gem_tiled_swapping is slow and Bug 77475
> <https://bugs.freedesktop.org/show_bug.cgi?id=77475> -
> [PNV/IVB/HSW]igt//kms_pipe_crc_basic/read-crc-pipe-A is slow)
>
> Now the true status is that i-g-t have more than 650+ subcases,
> running a whole round of testing will cost such a long time on QA
> side(*beside that Timeout cases*), QA also need to spend more time to
> analysis the result changing on each platforms.
>
> You can find an example with this
> page:http://tinderbox.sh.intel.com/PRTS_UI/prtsresult.php?task_id=2778
> for how long one testing round cost.
>
> With the table of subtask:10831 on the page which for i-g-t test cases
> on BDW. Testing start at 19:16 PM and finished at 03:25 AM the next
> day, cost about *8 hours* to run 638 test cases.
>
> Each cases finished less than 10M as we expect, but the full time it
> too large, especially the BDW is the powerful machine on our side, ILK
> or PNV may take more than *10 hours*. We not only run i-g-t but also
> need to test the piglit/performance/media which already need time.
>
> Do we have any solutions to reduce the running time for whole i-g-t?
> it’s a pressing problem for QA after seeing the i-g-t case count
> enhance from 50 ->600+.
>
Ok, there are a few cases where we can indeed make tests faster, but it
will be work for us. And it won't really speed things up much, since
we're adding piles more testcases at a pretty quick rate. And many of
these new testcases are CRC based, so they inherently take some time to run.

So I think longer-term we simply need to throw more machines at the
problem and run testcases in parallel on identical machines.
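The parallelism idea can be sketched as a simple round-robin shard of the test list across identical machines, so a full round finishes in roughly 1/N the time. A minimal sketch; the test names and machine count below are illustrative, not a real inventory:

```python
# Round-robin sharding of a test list across N identical machines.
def shard_tests(tests, num_machines):
    """Assign each test to a machine index in round-robin order."""
    shards = [[] for _ in range(num_machines)]
    for i, test in enumerate(tests):
        shards[i % num_machines].append(test)
    return shards

# Illustrative subset of i-g-t test names, split over two machines.
tests = ["gem_tiled_swapping", "kms_pipe_crc_basic", "kms_flip", "gem_exec_nop"]
shards = shard_tests(tests, 2)
# shards[0] == ["gem_tiled_swapping", "kms_flip"]
# shards[1] == ["kms_pipe_crc_basic", "gem_exec_nop"]
```

Longest-test-first binning would balance wall-clock time better than plain round-robin, but needs per-test runtime data.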

Wrt analyzing issues I think the right approach going forward is:
a) switch to piglit to run the tests, not just enumerate them. This will
allow QA and developers to share testcase analysis.
b) add automated analysis for time-consuming and error-prone cases like
dmesg warnings and backtraces. Thomas & I just discussed a few ideas
in this area in our 1:1 today.

Reducing the set of igt tests we run is imo pointless: The goal of igt
is to hit corner-cases, arbitrarily selecting which kinds of
corner-cases we test just means that we have a nice illusion about our
test coverage.

Adding more people to the discussion.

Cheers, Daniel
Intel Semiconductor AG
Registered No. 020.30.913.786-7
Registered Office: Badenerstrasse 549, 8048 Zurich, Switzerland



* Re: The whole round of i-g-t testing cost too long running time
  2014-04-15 17:17 ` Daniel Vetter
@ 2014-04-15 21:07   ` He, Shuang
  2014-04-16  5:47     ` Yang, Guang A
  2014-04-16 15:42   ` Jesse Barnes
  1 sibling, 1 reply; 11+ messages in thread
From: He, Shuang @ 2014-04-15 21:07 UTC (permalink / raw)
  To: Vetter, Daniel, Yang, Guang A, Barnes, Jesse, Widawsky, Benjamin,
	Wood, Thomas, Jin, Gordon, OTC GFX QA Extended, intel-gfx,
	Parenteau, Paul A, Nikkanen, Kimmo



From: Vetter, Daniel
Sent: Wednesday, April 16, 2014 1:18 AM
To: Yang, Guang A; Barnes, Jesse; Widawsky, Benjamin; Wood, Thomas; Jin, Gordon; OTC GFX QA Extended; intel-gfx@lists.freedesktop.org; Parenteau, Paul A; Nikkanen, Kimmo
Subject: Re: The whole round of i-g-t testing cost too long running time

On 15/04/2014 17:46, Yang, Guang A wrote:
Hi all,
I discussed the running time for each case with Daniel before, and we set the standard at 10 minutes: if a case can't finish within 10 minutes we treat it as a Timeout and report a bug on FDO (such as Bug 77474 <https://bugs.freedesktop.org/show_bug.cgi?id=77474> - [PNV/IVB/HSW] igt/gem_tiled_swapping is slow, and Bug 77475 <https://bugs.freedesktop.org/show_bug.cgi?id=77475> - [PNV/IVB/HSW] igt/kms_pipe_crc_basic/read-crc-pipe-A is slow).
The current status is that i-g-t has more than 650 subcases, and running a whole round of testing takes a very long time on the QA side (besides the Timeout cases); QA also needs to spend more time analysing the result changes on each platform.
You can find an example of how long one testing round takes on this page: http://tinderbox.sh.intel.com/PRTS_UI/prtsresult.php?task_id=2778
See the table for subtask 10831 on that page, which covers the i-g-t test cases on BDW. Testing started at 19:16 and finished at 03:25 the next day, taking about 8 hours to run 638 test cases.
Each case finished in under 10 minutes as we expect, but the total time is too large, and BDW is the most powerful machine on our side; ILK or PNV may take more than 10 hours. Besides i-g-t we also need to test piglit/performance/media, which already takes time.
Do we have any solution to reduce the running time of the whole i-g-t suite? It's a pressing problem for QA now that the i-g-t case count has grown from 50 to 600+.
Ok, there are a few cases where we can indeed make tests faster, but it will be work for us. And it won't really speed things up much, since we're adding piles more testcases at a pretty quick rate. And many of these new testcases are CRC based, so they inherently take some time to run.
[He, Shuang] OK, so in the usual case it takes at least n/60 seconds to get a result detected, plus additional execution time, depending on how many rounds of testing. We will be absolutely happy to see more useful tests coming.

So I think longer-term we simply need to throw more machines at the problem and run testcases in parallel on identical machines.
[He, Shuang] This would be the perfect way to go if all the tests really do need that long to run. If we get more identical test machines, then the problem is solved.

Wrt analyzing issues I think the right approach for moving forward is:
a) switch to piglit to run tests, not just enumerate them. This will allow QA and developers to share testcase analysis.
[He, Shuang] Yes, though this by itself won't actually speed up the testing. We could wrap piglit directly to run the tests (with a separate control process to monitor and collect test results).

b) add automated analysis for time-consuming and error prone cases like dmesg warnings and backtraces. Thomas&I have just discussed a few ideas in this are in our 1:1 today.

Reducing the set of igt tests we run is imo pointless: The goal of igt is to hit corner-cases, arbitrarily selecting which kinds of corner-cases we test just means that we have a nice illusion about our test coverage.
[He, Shuang] I don't think selecting a subset of test cases to run is pointless. It's a trade-off between speed and coverage. For our nightly testing it's not very useful to run only a small set of tests, but for fast sanity testing a smaller set should be fine; it is supposed to catch regressions in major/critical functionality (so other developers and QA can continue their work).


Adding more people to the discussion.

Cheers, Daniel



* Re: The whole round of i-g-t testing cost too long running time
  2014-04-15 21:07   ` He, Shuang
@ 2014-04-16  5:47     ` Yang, Guang A
  2014-04-16  8:24       ` Daniel Vetter
  0 siblings, 1 reply; 11+ messages in thread
From: Yang, Guang A @ 2014-04-16  5:47 UTC (permalink / raw)
  To: He, Shuang, Vetter, Daniel, Barnes, Jesse, Widawsky, Benjamin,
	Wood, Thomas, Jin, Gordon, OTC GFX QA Extended, intel-gfx,
	Parenteau, Paul A, Nikkanen, Kimmo



Ok, there are a few cases where we can indeed make tests faster, but it will be work for us. And it won't really speed things up much, since we're adding piles more testcases at a pretty quick rate. And many of these new testcases are CRC based, so they inherently take some time to run.
[He, Shuang] OK, so in the usual case it takes at least n/60 seconds to get a result detected, plus additional execution time, depending on how many rounds of testing. We will be absolutely happy to see more useful tests coming.
[Guang YANG] Besides these CRC cases, some stress cases may also take quite a bit of time, especially on some old platforms. Maybe we can reduce the loop count in that kind of stress case?

So I think longer-term we simply need to throw more machines at the problem and run testcases in parallel on identical machines.
[He, Shuang] This would be the perfect way to go if all the tests really do need that long to run. If we get more identical test machines, then the problem is solved.
[Guang YANG] Shuang's PRTS can cover some of the i-g-t testing work and catch some regressions. Most of the i-g-t bugs come from HSW+, so I hope to keep the focus on these new platforms, but right now we don't have enough free machine resources (such as BYT and BDW) to dedicate one machine to running only i-g-t in nightly.


Wrt analyzing issues I think the right approach for moving forward is:
a) switch to piglit to run tests, not just enumerate them. This will allow QA and developers to share testcase analysis.
[He, Shuang] Yes, though this by itself won't actually speed up the testing. We could wrap piglit directly to run the tests (with a separate control process to monitor and collect test results).
[Guang YANG] Yes, what Shuang described is what we do. Piglit has become more powerful, but our infrastructure has better remote control and result collection. If it would be more comfortable for developers to see the case results from running piglit, we can discuss how to fit the two frameworks together.

b) add automated analysis for time-consuming and error prone cases like dmesg warnings and backtraces. Thomas&I have just discussed a few ideas in this are in our 1:1 today.

Reducing the set of igt tests we run is imo pointless: The goal of igt is to hit corner-cases, arbitrarily selecting which kinds of corner-cases we test just means that we have a nice illusion about our test coverage.
[He, Shuang] I don't think selecting a subset of test cases to run is pointless. It's a trade-off between speed and coverage. For our nightly testing it's not very useful to run only a small set of tests, but for fast sanity testing a smaller set should be fine; it is supposed to catch regressions in major/critical functionality (so other developers and QA can continue their work).


Adding more people to the discussion.

Cheers, Daniel



* Re: The whole round of i-g-t testing cost too long running time
  2014-04-16  5:47     ` Yang, Guang A
@ 2014-04-16  8:24       ` Daniel Vetter
  2014-04-16  9:27         ` Yang, Guang A
  0 siblings, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2014-04-16  8:24 UTC (permalink / raw)
  To: Yang, Guang A
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Barnes, Jesse, Jin, Gordon, Vetter, Daniel, Parenteau,
	Paul A

On Wed, Apr 16, 2014 at 7:47 AM, Yang, Guang A <guang.a.yang@intel.com> wrote:
> Ok there are a few cases where we can indeed make tests faster, but it will
> be work for us. And that won't really speed up much since we're adding piles
> more testcases at a pretty quick rate. And many of these new testcases are
> CRC based, so inheritely take some time to run.
>
> [He, Shuang] OK, so it takes at least n/60 in usual case to have result
> detected plus additional execution time, depending on how many rounds of
> testing. We will be absolutely happy to see more tests coming that is useful
>
> [Guang YANG] Except these CRC case, some stress case may also cost a bit of
> time, especially on some old platforms. Maybe can reduce the loop in that
> kind of stress case?

I think stopping the tests after 10 minutes is ok, but in general the
point of stress tests is to beat on the kernel for corner cases. E.g.
even with today's extensive set of stress tests some spurious OOM bugs
can only be reproduced in 1 out of 5 runs. Reducing the test time
could severely impact the testing power of a test, so I'm wary of
doing that.
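Daniel's detection-power point can be put as a quick back-of-the-envelope calculation; the 1-in-5 reproduction rate is the figure from the paragraph above:

```python
# If a flaky bug reproduces with probability p per run, the chance that
# n runs ALL miss it is (1 - p) ** n.
p = 0.2  # "1 out of 5 runs" from the discussion above
for n in (1, 3, 5):
    print(f"{n} run(s): miss probability {(1 - p) ** n:.3f}")
# Even 5 full runs still miss such a bug about 33% of the time, which is
# why cutting stress-test iterations erodes detection power.
```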

But there are tricks to speed up some tests which shouldn't affect the
power of the testcase to find bugs, and we should definitely look into
those.

> So I think longer-term we simply need to throw more machines at the problem
> and run testcases in parallel on identical machines.
>
> [He, Shuang] This would be the perfect way to go if all tests are really
> feasible to take long time to run. If we get more identical test machines,
> then problem solved
>
> [Guang YANG] shuang’s PRTS can cover some work for i-g-t testing and catch
> some regressions. Most of the i-g-t bugs are from HSW+, so I hope keep focus
> on these new platforms.  but now we don’t have enough free machine resource
> (such as BYT,BDW)to support one machine only run i-g-t in nightly.

Does this mean that due to PRTS we now have fewer machines running
tests on drm-intel-nightly? I thought the idea was to share machines
on an as-needed basis, with -nightly testing getting priority?

> Wrt analyzing issues I think the right approach for moving forward is:
> a) switch to piglit to run tests, not just enumerate them. This will allow
> QA and developers to share testcase analysis.
>
> [He, Shuang] Yes, though this could not actually accelerate the test. We
> could directly wrap over piglit to run testing (have other control process
> to monitor and collecting test results)
>
> [Guang YANG] Yeah, Shuang said is what we did. Piglit have been improved
> more powerful, but our infrastructure have better remote control and result
> collecting. If it will be comfortable for Developers to see the case result
> from running piglit, we can discuss how to match these two framework
> together.

Yeah keeping your overall test-runner infrastructure makes sense. The
idea behind my proposal to use piglit to execute the individual tests
is to share analysis scripts. That won't make the tests run any
faster, but it should (in the long term at least) speed up the
triaging a lot. And the high amount of time required for bug triaging
also seems to be an issue for you guys.

> b) add automated analysis for time-consuming and error prone cases like
> dmesg warnings and backtraces. Thomas&I have just discussed a few ideas in
> this are in our 1:1 today.
>
> Reducing the set of igt tests we run is imo pointless: The goal of igt is to
> hit corner-cases, arbitrarily selecting which kinds of corner-cases we test
> just means that we have a nice illusion about our test coverage.
>
> [He, Shuang] I don’t think select a subset of test cases to run is
> pointless. It’s a trade-off between speed and correctness. For our nightly
> testing it’s not so useful to run only a small set of testing. But for fast
> sanity testing, it should be easier, which is supposed to catch regression
> in major/critical functionality (So other developers and QA could continue
> their work).

I agree that for a quick sanity test a reduced test set makes sense.
Which is why we have a testcase naming convention which can be used
together with the piglit -x and -t flags. I do that a lot when
developing things.
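The include/exclude idea behind those flags can be mimicked in a few lines. A sketch only: piglit's real matching lives in its own profile code, and the test names below are illustrative.

```python
import re

def filter_tests(names, include=None, exclude=None):
    """Keep names matching the include regex (if given) and not the exclude regex,
    mimicking piglit's -t (include) / -x (exclude) behaviour."""
    kept = []
    for name in names:
        if include and not re.search(include, name):
            continue
        if exclude and re.search(exclude, name):
            continue
        kept.append(name)
    return kept

names = ["kms_flip/basic-flip-vs-modeset", "kms_flip/flip-vs-panning",
         "gem_tiled_swapping", "kms_pipe_crc_basic/read-crc-pipe-A"]
quick = filter_tests(names, include=r"basic")      # quick sanity subset
full = filter_tests(names, exclude=r"swapping")    # drop a known-slow test
```

The naming convention matters precisely because it makes such regex subsets meaningful (e.g. everything containing "basic").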

But for regression testing imo only the full test suite makes sense,
otherwise we just have a false sense of security. I.e. if the full set
means we can only run it every 2 days then I prefer that over running
only a subset. Also very often there are other issues delaying the
time between when a buggy patch was committed and when the full bug
report is available, so imo the 10h runtime isn't too bad from my pov
really.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-16  8:24       ` Daniel Vetter
@ 2014-04-16  9:27         ` Yang, Guang A
  0 siblings, 0 replies; 11+ messages in thread
From: Yang, Guang A @ 2014-04-16  9:27 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Barnes, Jesse, Jin, Gordon, Vetter, Daniel, Parenteau,
	Paul A

> I think stopping the tests after 10 minutes is ok, but in general the point of
> stress tests is to beat on the kernel for corner cases. E.g.
> even with todays extensive set of stress tests some spurious OOM bugs can
> only be reproduced in 1 out of 5 runs. Reducing the test time could severely
> impact the testing power of a test, so I'm vary for doing that.

[Guang YANG] Agreed, but will stopping these stress tests after 10 minutes when they haven't finished hurt their effectiveness?

> But there are tricks to speed up some tests which shouldn't affect the power of
> the testcase to find bugs, and we should definitely look into those.
> 
> Does this mean that due to PRTS we now have fewer machines running tests
> on drm-intel-nightly? I've thought the idea is to share machines on an
> as-needed basis, with -nightly testing getting priority?

[Guang YANG] No, we haven't changed the scale of the existing nightly testing resources, and PRTS supplements our nightly testing well with more old platforms (ILK, PNV), but we still lack BDW/BYT resources in both PRTS and nightly.
These new platforms are important and most of the bugs are reported on them. Running tests, checking patches and doing bisect work keeps the machine resources busy all the time. Shuang is still working hard on sharing machines between PRTS and nightly.
> 
> 
> Yeah keeping your overall test-runner infrastructure makes sense. The idea
> behind my proposal to use piglit to execute the individual tests is to share
> analysis scripts. That won't make the tests run any faster, but it should (in the
> long term at least) speed up the triaging a lot. And the high amount of time
> required for bug triaging also seems to be an issue for you guys.

[Guang YANG] Great, improving the analysis scripts and tools will make things easier for QA; we could catch regressions more accurately, for example from backtraces. Do you have any plans? QA can also take over some of the tasks.

> I agree that for a quick sanity test a reduced test set makes sense.
> Which is why we have a testcase naming convention which can be used
> together with the piglit -x and -t flags. I do that a lot when developing things.
> 
> But for regression testing imo only the full test suite makes sense, otherwise
> we just have a false sense of security. I.e. if the full set means we can only run
> it every 2 days then I prefer that over running only a subset. Also very often
> there are other issues delaying the time between when a buggy patch was
> committed and when the full bug report is available, so imo the 10h runtime
> isn't too bad from my pov really.

> -Daniel
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> +41 (0) 79 365 57 48 - http://blog.ffwll.ch


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-15 17:17 ` Daniel Vetter
  2014-04-15 21:07   ` He, Shuang
@ 2014-04-16 15:42   ` Jesse Barnes
  2014-04-16 15:50     ` Daniel Vetter
  2014-04-16 15:54     ` Damien Lespiau
  1 sibling, 2 replies; 11+ messages in thread
From: Jesse Barnes @ 2014-04-16 15:42 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Jin, Gordon, Parenteau, Paul A

On Tue, 15 Apr 2014 19:17:59 +0200
Daniel Vetter <daniel.vetter@intel.com> wrote:
> Ok there are a few cases where we can indeed make tests faster, but it
> will be work for us. And that won't really speed up much since we're
> adding piles more testcases at a pretty quick rate. And many of these
> new testcases are CRC based, so inheritely take some time to run.

But each test should run very quickly in general; I think we have too
many tests that take much longer than they need to.  Adding some
hooks to the driver via debugfs may let us trigger specific cases
directly rather than trying to induce them through massive threading
and memory pressure for example.

And can you elaborate on the CRC tests?  It doesn't seem like those
should take more than a few frames to verify we're getting what we
expect...

> So I think longer-term we simply need to throw more machines at the
> problem and run testcases in parallel on identical machines.
> 
> Wrt analyzing issues I think the right approach for moving forward is:
> a) switch to piglit to run tests, not just enumerate them. This will
> allow QA and developers to share testcase analysis.
> b) add automated analysis for time-consuming and error prone cases like
> dmesg warnings and backtraces. Thomas&I have just discussed a few ideas
> in this are in our 1:1 today.
> 
> Reducing the set of igt tests we run is imo pointless: The goal of igt
> is to hit corner-cases, arbitrarily selecting which kinds of
> corner-cases we test just means that we have a nice illusion about our
> test coverage.

My goal is still to get full test coverage before patches get
committed, and that means having quick (<1hr) turnaround for testing
from the automated patch test system.  It seems like we'll need to
approach that from all angles: speeding up tests, parallelizing
execution, adding hooks to the driver, etc.

Jesse


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-16 15:42   ` Jesse Barnes
@ 2014-04-16 15:50     ` Daniel Vetter
  2014-04-16 16:08       ` Ville Syrjälä
  2014-04-16 15:54     ` Damien Lespiau
  1 sibling, 1 reply; 11+ messages in thread
From: Daniel Vetter @ 2014-04-16 15:50 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Jin, Gordon, Parenteau, Paul A

On 16/04/2014 17:42, Jesse Barnes wrote:
> On Tue, 15 Apr 2014 19:17:59 +0200
> Daniel Vetter <daniel.vetter@intel.com> wrote:
>> Ok there are a few cases where we can indeed make tests faster, but it
>> will be work for us. And that won't really speed up much since we're
>> adding piles more testcases at a pretty quick rate. And many of these
>> new testcases are CRC based, so inheritely take some time to run.
> But each test should run very quickly in general; I think we have too
> many tests that take much longer than they need to.  Adding some
> hooks to the driver via debugfs may let us trigger specific cases
> directly rather than trying to induce them through massive threading
> and memory pressure for example.
>
> And can you elaborate on the CRC tests?  It doesn't seem like those
> should take more than a few frames to verify we're getting what we
> expect...

Well, they don't take more than a few frames each, but we have a _lot_ of
them, and there are a lot of combinations to test. It adds up quickly. Iirc
we have over 150 kms_flip testcases alone ...

Like I've said, I agree that we could speed tests up, but besides me
doing the occasional tuning and improvement in that regard I have seen 0
patches from developers in this area. Which lets me conclude that
apparently it's not really that bad an issue ;-) If people _really_
care about this I have a list of things to knock down. But first someone
needs to find some time and resources for this.
>
>> So I think longer-term we simply need to throw more machines at the
>> problem and run testcases in parallel on identical machines.
>>
>> Wrt analyzing issues I think the right approach for moving forward is:
>> a) switch to piglit to run tests, not just enumerate them. This will
>> allow QA and developers to share testcase analysis.
>> b) add automated analysis for time-consuming and error prone cases like
>> dmesg warnings and backtraces. Thomas&I have just discussed a few ideas
>> in this are in our 1:1 today.
>>
>> Reducing the set of igt tests we run is imo pointless: The goal of igt
>> is to hit corner-cases, arbitrarily selecting which kinds of
>> corner-cases we test just means that we have a nice illusion about our
>> test coverage.
> My goal is still to get full test coverage before patches get
> committed, and that means having quick (<1hr) turnaround for testing
> from the automated patch test system.  It seems like we'll need to
> approach that from all angles: speeding up tests, parallelizing
> execution, adding hooks to the driver, etc.
>
>
Currently <1h and "full test coverage" are rather mutually exclusive 
unfortunately :( I agree that having it would be extremely useful for 
developers, but if this happens with any cost/service reduction for 
nightly testing on my branches I'm really opposed. Atm we already have a 
_really_ hard time keeping track of all the various regressions and bugs.
-Daniel


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-16 15:42   ` Jesse Barnes
  2014-04-16 15:50     ` Daniel Vetter
@ 2014-04-16 15:54     ` Damien Lespiau
  1 sibling, 0 replies; 11+ messages in thread
From: Damien Lespiau @ 2014-04-16 15:54 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Jin, Gordon, Daniel Vetter, Parenteau, Paul A

On Wed, Apr 16, 2014 at 08:42:27AM -0700, Jesse Barnes wrote:
> And can you elaborate on the CRC tests?  It doesn't seem like those
> should take more than a few frames to verify we're getting what we
> expect...

Indeed, if the CRC tests take a long time, that's a bug (for instance we
may never receive the CRC interrupt and at the moment this means the
test will be blocked into a read()). I'll fix this particular one (needs
to implement poll() support for the CRC result file in debugfs) when I'm
finished with more pressing tasks, if noone beats me to it, of course.

-- 
Damien


* Re: The whole round of i-g-t testing cost too long running time
  2014-04-16 15:50     ` Daniel Vetter
@ 2014-04-16 16:08       ` Ville Syrjälä
  0 siblings, 0 replies; 11+ messages in thread
From: Ville Syrjälä @ 2014-04-16 16:08 UTC (permalink / raw)
  To: Daniel Vetter
  Cc: OTC GFX QA Extended, Nikkanen, Kimmo, intel-gfx, Widawsky,
	Benjamin, Jesse Barnes, Jin, Gordon, Parenteau, Paul A

On Wed, Apr 16, 2014 at 05:50:20PM +0200, Daniel Vetter wrote:
> On 16/04/2014 17:42, Jesse Barnes wrote:
> > On Tue, 15 Apr 2014 19:17:59 +0200
> > Daniel Vetter <daniel.vetter@intel.com> wrote:
> >> Ok there are a few cases where we can indeed make tests faster, but it
> >> will be work for us. And it won't really speed things up much, since we're
> >> adding piles more testcases at a pretty quick rate. And many of these
> >> new testcases are CRC-based, so they inherently take some time to run.
> > But each test should run very quickly in general; I think we have too
> > many tests that take much longer than they need to.  Adding some
> > hooks to the driver via debugfs may let us trigger specific cases
> > directly rather than trying to induce them through massive threading
> > and memory pressure for example.
> >
> > And can you elaborate on the CRC tests?  It doesn't seem like those
> > should take more than a few frames to verify we're getting what we
> > expect...
> 
> Well they don't take more than a few frames, but we have a _lot_ of 
them, and there are a lot of combinations to test. It adds up quickly. Iirc 
> we have over 150 kms_flip testcases alone ...

Many kms_flip tests are more like "run this thing for a few (dozen)
seconds and see if there's something unexpected reported". They don't
really try to hit specific issues.

Some of the more targeted crc tests do run much quicker, but I'm
guessing there are still at least two factors that add up. The actual
modesets that the tests have to do cost time. And then there's the
num_crtcs * num_connectors factor that tends to lengthen the execution
time considerably. If we could run more tests for each pipe/connector
pair w/o doing intervening modesets, we could probably cut the time down
somewhat. But that sounds like a fairly massive undertaking, as all
the tests would have to be in a library and get called from some
generic test runner code that does the modeset part.

-- 
Ville Syrjälä
Intel OTC


2014-04-15 15:46 The whole round of i-g-t testing cost too long running time Yang, Guang A
2014-04-15 17:03 ` He, Shuang
2014-04-15 17:17 ` Daniel Vetter
2014-04-15 21:07   ` He, Shuang
2014-04-16  5:47     ` Yang, Guang A
2014-04-16  8:24       ` Daniel Vetter
2014-04-16  9:27         ` Yang, Guang A
2014-04-16 15:42   ` Jesse Barnes
2014-04-16 15:50     ` Daniel Vetter
2014-04-16 16:08       ` Ville Syrjälä
2014-04-16 15:54     ` Damien Lespiau
