* re-running teuthology jobs
@ 2015-02-28 10:28 Loic Dachary
  2015-02-28 15:01 ` Loic Dachary
  0 siblings, 1 reply; 5+ messages in thread
From: Loic Dachary @ 2015-02-28 10:28 UTC (permalink / raw)
  To: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 1498 bytes --]

Hi,

A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:

teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...

and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exists.

I will therefore manually do what such a command would do, for each failed job:

* download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
* git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
* cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
* remove the fields:
   job_id: '781444'
   last_in_suite: false
   worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
* replace the suite_path: field with suite_path: /srv/ceph-qa-suite
* teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
* turn the machine list into a consumable file for teuthology: teuthology-lock --list-targets > targets.yaml
* run teuthology orig.config.yaml targets.yaml
* wait for the result
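The per-job config cleanup in the steps above can be sketched as a small helper. This is only an illustration of the list, not a teuthology interface: it assumes job_id, last_in_suite, worker_log and suite_path are top-level keys in orig.config.yaml, so a line-based rewrite is enough.

```python
# Sketch of the orig.config.yaml cleanup described above: drop the
# scheduler bookkeeping fields and point suite_path at the local
# ceph-qa-suite checkout. Assumes all four keys are top-level.
DROP_KEYS = ("job_id:", "last_in_suite:", "worker_log:")

def clean_config(text, suite_path="/srv/ceph-qa-suite"):
    out = []
    for line in text.splitlines():
        key = line.strip()
        if key.startswith(DROP_KEYS):
            continue  # remove the per-job scheduler fields
        if key.startswith("suite_path:"):
            line = "suite_path: " + suite_path
        out.append(line)
    return "\n".join(out) + "\n"
```

The cleaned text can then be written back to orig.config.yaml before running teuthology against the locked targets.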

Is there a better way to do that?

Cheers

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: re-running teuthology jobs
  2015-02-28 10:28 re-running teuthology jobs Loic Dachary
@ 2015-02-28 15:01 ` Loic Dachary
  2015-02-28 15:47   ` Yuri Weinstein
  0 siblings, 1 reply; 5+ messages in thread
From: Loic Dachary @ 2015-02-28 15:01 UTC (permalink / raw)
  To: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 5411 bytes --]

The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in each job's config.yaml. For instance, re-running the failed rados jobs listed at http://tracker.ceph.com/issues/10641#rados :

$ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email loic@dachary.org --owner loic@dachary.org  --ceph firefly-backports
2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b
2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise
2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master
2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly
2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly
2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly
2015-02-28 15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered)
2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145
2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146
2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147
2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148
2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149
2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs.
2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out.
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150

Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs.
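The filter value in the command above is just the comma-separated list of job descriptions. A hypothetical helper to assemble it from the config.yaml contents of the failed jobs (the description: field is the one teuthology-suite matches on, per the run log above):

```python
# Hypothetical helper: pull the description: field out of each failed
# job's config.yaml text and join the values into the comma-separated
# string that teuthology-suite --filter expects.
def description_of(config_text):
    for line in config_text.splitlines():
        if line.startswith("description:"):
            return line.split(":", 1)[1].strip()
    return None

def build_filter(config_texts):
    descs = [description_of(t) for t in config_texts]
    return ",".join(d for d in descs if d)
```

Passing the result as --filter '...' reproduces the kind of command shown above.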

On 28/02/2015 11:28, Loic Dachary wrote:
> Hi,
> 
> A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:
> 
> teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...
> 
> and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exists.
> 
> I will therefore manually do what such a command would do, for each failed job:
> 
> * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
> * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
> * cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
> * remove the fields:
>    job_id: '781444'
>    last_in_suite: false
>    worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
> * replace the suite_path: field with suite_path: /srv/ceph-qa-suite
> * teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
> * turn the machine list into a consumable file for teuthology : teuthology-lock --list-targets > targets.yaml 
> * run teuthology orig.config.yaml targets.yaml
> * wait for the result
> 
> Is there a better way to do that?
> 
> Cheers
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: re-running teuthology jobs
  2015-02-28 15:01 ` Loic Dachary
@ 2015-02-28 15:47   ` Yuri Weinstein
  2015-02-28 16:17     ` Loic Dachary
  0 siblings, 1 reply; 5+ messages in thread
From: Yuri Weinstein @ 2015-02-28 15:47 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Ceph Development

Loic

In case you want to add some comments - http://tracker.ceph.com/issues/10945

Thx
YuriW

----- Original Message -----
From: "Loic Dachary" <loic@dachary.org>
To: "Ceph Development" <ceph-devel@vger.kernel.org>
Sent: Saturday, February 28, 2015 7:01:29 AM
Subject: Re: re-running teuthology jobs

The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in each job's config.yaml. For instance, re-running the failed rados jobs listed at http://tracker.ceph.com/issues/10641#rados :

$ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email loic@dachary.org --owner loic@dachary.org  --ceph firefly-backports
2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b
2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise
2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master
2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly
2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly
2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly
2015-02-28 15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered)
2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145
2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146
2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147
2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148
2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149
2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs.
2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out.
Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150

Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs.

On 28/02/2015 11:28, Loic Dachary wrote:
> Hi,
> 
> A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:
> 
> teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...
> 
> and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exists.
> 
> I will therefore manually do what such a command would do, for each failed job:
> 
> * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
> * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
> * cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
> * remove the fields:
>    job_id: '781444'
>    last_in_suite: false
>    worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
> * replace the suite_path: field with suite_path: /srv/ceph-qa-suite
> * teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
> * turn the machine list into a consumable file for teuthology : teuthology-lock --list-targets > targets.yaml 
> * run teuthology orig.config.yaml targets.yaml
> * wait for the result
> 
> Is there a better way to do that?
> 
> Cheers
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: re-running teuthology jobs
  2015-02-28 15:47   ` Yuri Weinstein
@ 2015-02-28 16:17     ` Loic Dachary
  2015-02-28 17:21       ` Sage Weil
  0 siblings, 1 reply; 5+ messages in thread
From: Loic Dachary @ 2015-02-28 16:17 UTC (permalink / raw)
  To: Yuri Weinstein; +Cc: Ceph Development

[-- Attachment #1: Type: text/plain, Size: 5896 bytes --]



On 28/02/2015 16:47, Yuri Weinstein wrote:
> Loic
> 
> In case you want to add some comments - http://tracker.ceph.com/issues/10945

Done thanks !

> 
> Thx
> YuriW
> 
> ----- Original Message -----
> From: "Loic Dachary" <loic@dachary.org>
> To: "Ceph Development" <ceph-devel@vger.kernel.org>
> Sent: Saturday, February 28, 2015 7:01:29 AM
> Subject: Re: re-running teuthology jobs
> 
> The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in each job's config.yaml. For instance, re-running the failed rados jobs listed at http://tracker.ceph.com/issues/10641#rados :
> 
> $ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email loic@dachary.org --owner loic@dachary.org  --ceph firefly-backports
> 2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b
> 2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise
> 2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master
> 2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly
> 2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly
> 2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly
> 2015-02-28 15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered)
> 2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml}
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145
> 2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146
> 2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml}
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147
> 2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml}
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148
> 2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149
> 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs.
> 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out.
> Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150
> 
> Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs.
> 
> On 28/02/2015 11:28, Loic Dachary wrote:
>> Hi,
>>
>> A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:
>>
>> teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...
>>
>> and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exists.
>>
>> I will therefore manually do what such a command would do, for each failed job:
>>
>> * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
>> * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
>> * cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
>> * remove the fields:
>>    job_id: '781444'
>>    last_in_suite: false
>>    worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
>> * replace the suite_path: field with suite_path: /srv/ceph-qa-suite
>> * teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
>> * turn the machine list into a consumable file for teuthology : teuthology-lock --list-targets > targets.yaml 
>> * run teuthology orig.config.yaml targets.yaml
>> * wait for the result
>>
>> Is there a better way to do that?
>>
>> Cheers
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]


* Re: re-running teuthology jobs
  2015-02-28 16:17     ` Loic Dachary
@ 2015-02-28 17:21       ` Sage Weil
  0 siblings, 0 replies; 5+ messages in thread
From: Sage Weil @ 2015-02-28 17:21 UTC (permalink / raw)
  To: Loic Dachary; +Cc: Yuri Weinstein, Ceph Development

[-- Attachment #1: Type: TEXT/PLAIN, Size: 6260 bytes --]

On Sat, 28 Feb 2015, Loic Dachary wrote:
> 
> 
> On 28/02/2015 16:47, Yuri Weinstein wrote:
> > Loic
> > 
> > In case you want to add some comments - http://tracker.ceph.com/issues/10945
> 
> Done thanks !

It would also be nice to just point it at the archive directory of the run 
that failed and have it figure the rest out from the orig.config.yaml (or 
whatever else) in that directory.  At least, that's how I would probably 
use it!

sage
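A sketch of what that could look like, under stated assumptions: the archive layout (<run>/<job_id>/orig.config.yaml, with a summary.yaml recording success: for completed jobs) is an assumption about the teuthworker archive here, not a confirmed interface.

```python
import os

# Sketch of scanning a run's archive directory, per the suggestion
# above: collect orig.config.yaml for every job that did not succeed.
# The directory layout and the summary.yaml success: field are
# assumptions, not a documented teuthology interface.
def failed_job_configs(run_dir):
    configs = []
    for job_id in sorted(os.listdir(run_dir)):
        job_dir = os.path.join(run_dir, job_id)
        config = os.path.join(job_dir, "orig.config.yaml")
        if not os.path.isfile(config):
            continue
        summary = os.path.join(job_dir, "summary.yaml")
        if os.path.isfile(summary):
            with open(summary) as f:
                if "success: true" in f.read():
                    continue  # job passed; skip it
        # no summary.yaml (dead job) or success is not true
        configs.append(config)
    return configs
```

From there the tool could rebuild the --filter value or re-run each config directly, whichever fits.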

> 
> > 
> > Thx
> > YuriW
> > 
> > ----- Original Message -----
> > From: "Loic Dachary" <loic@dachary.org>
> > To: "Ceph Development" <ceph-devel@vger.kernel.org>
> > Sent: Saturday, February 28, 2015 7:01:29 AM
> > Subject: Re: re-running teuthology jobs
> > 
> > The simpler way is to use the --filter argument of teuthology-suite with the value of the description: field found in each job's config.yaml. For instance, re-running the failed rados jobs listed at http://tracker.ceph.com/issues/10641#rados :
> > 
> > $ ./virtualenv/bin/teuthology-suite --priority 101 --suite rados --filter 'rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml},rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml},rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}' --suite-branch firefly --machine-type plana,burnupi,mira --distro ubuntu --email loic@dachary.org --owner loic@dachary.org  --ceph firefly-backports
> > 2015-02-28 15:58:08,474.474 INFO:teuthology.suite:ceph sha1: e54834bfac3c38562987730b317cb1944a96005b
> > 2015-02-28 15:58:08,969.969 INFO:teuthology.suite:ceph version: 0.80.8-75-ge54834b-1precise
> > 2015-02-28 15:58:09,606.606 INFO:teuthology.suite:teuthology branch: master
> > 2015-02-28 15:58:10,407.407 INFO:teuthology.suite:ceph-qa-suite branch: firefly
> > 2015-02-28 15:58:10,409.409 INFO:teuthology.repo_utils:Fetching from upstream into /home/loic/src/ceph-qa-suite_firefly
> > 2015-02-28 15:58:11,522.522 INFO:teuthology.repo_utils:Resetting repo at /home/loic/src/ceph-qa-suite_firefly to branch firefly
> > 2015-02-28 15:58:12,393.393 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados generated 693 jobs (not yet filtered)
> > 2015-02-28 15:58:12,419.419 INFO:teuthology.suite:Scheduling rados/multimon/{clusters/21.yaml msgr-failures/many.yaml tasks/mon_clock_with_skews.yaml}
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783145
> > 2015-02-28 15:58:14,199.199 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/default.yaml workloads/cache-agent-small.yaml}
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783146
> > 2015-02-28 15:58:15,650.650 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/morepggrow.yaml workloads/small-objects.yaml}
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783147
> > 2015-02-28 15:58:16,837.837 INFO:teuthology.suite:Scheduling rados/thrash/{clusters/fixed-2.yaml fs/xfs.yaml msgr-failures/osd-delay.yaml thrashers/pggrow.yaml workloads/ec-small-objects.yaml}
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783148
> > 2015-02-28 15:58:18,421.421 INFO:teuthology.suite:Scheduling rados/verify/{1thrash/none.yaml clusters/fixed-2.yaml fs/btrfs.yaml msgr-failures/few.yaml tasks/mon_recovery.yaml validater/valgrind.yaml}
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783149
> > 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados scheduled 5 jobs.
> > 2015-02-28 15:58:19,729.729 INFO:teuthology.suite:Suite rados in /home/loic/src/ceph-qa-suite_firefly/suites/rados -- 688 jobs were filtered out.
> > Job scheduled with name loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi and ID 783150
> > 
> > Creates the http://pulpito.ceph.com/loic-2015-02-28_15:58:07-rados-firefly-backports---basic-multi/ run with just 5 jobs.
> > 
> > On 28/02/2015 11:28, Loic Dachary wrote:
> >> Hi,
> >>
> >> A teuthology rados run ( https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados ) completed with five dead jobs out of 693. They failed because of DNS errors and I'd like to re-run them. Ideally I could do something like:
> >>
> >> teuthology-schedule --run loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi --job-id 781444 --job-id  781457 ...
> >>
> >> and it would re-schedule a run of the designated jobs from the designated run. But I don't think such a command exists.
> >>
> >> I will therefore manually do what such a command would do, for each failed job:
> >>
> >> * download http://qa-proxy.ceph.com/teuthology/loic-2015-02-27_20:22:09-rados-firefly-backports---basic-multi/781444/orig.config.yaml
> >> * git clone https://github.com/ceph/ceph-qa-suite /srv/ceph-qa-suite
> >> * cd /srv/ceph-qa-suite ; git checkout firefly (assuming that's the ceph-qa-suite branch I'm interested in)
> >> * remove the fields:
> >>    job_id: '781444'
> >>    last_in_suite: false
> >>    worker_log: /var/lib/teuthworker/archive/worker_logs/worker.multi.14588
> >> * replace the suite_path: field with suite_path: /srv/ceph-qa-suite
> >> * teuthology-lock --lock enough machines (i.e. one for each element in the roles: section of the orig.config.yaml)
> >> * turn the machine list into a consumable file for teuthology : teuthology-lock --list-targets > targets.yaml 
> >> * run teuthology orig.config.yaml targets.yaml
> >> * wait for the result
> >>
> >> Is there a better way to do that?
> >>
> >> Cheers
> >>
> > 
> 
> -- 
> Loïc Dachary, Artisan Logiciel Libre
> 
> 


end of thread, other threads:[~2015-02-28 17:22 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-28 10:28 re-running teuthology jobs Loic Dachary
2015-02-28 15:01 ` Loic Dachary
2015-02-28 15:47   ` Yuri Weinstein
2015-02-28 16:17     ` Loic Dachary
2015-02-28 17:21       ` Sage Weil
