* teuthology timeout error
@ 2015-05-20  2:20 Miyamae, Takeshi
  2015-05-20  7:48 ` Loic Dachary
  0 siblings, 1 reply; 7+ messages in thread
From: Miyamae, Takeshi @ 2015-05-20  2:20 UTC
  To: Loic Dachary, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke

Hi Loic,

When we fixed our own issue and restarted teuthology, we encountered another issue
(a timeout error) which occurs with LRC pools as well.
Do you have any information about that?

[error messages (in case of LRC pool)]

2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired

Traceback (most recent call last):
  File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
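
For reference, the assertion is raised by the thrasher while it waits for the cluster to return to a fully recovered state after a round of OSD thrashing. Below is a minimal sketch of that logic, assuming a manager object that can report whether all PGs are active+clean; it is an illustration only, not the actual ceph_manager.py code, which among other things resets its timer whenever it sees recovery progress (hence the "no progress seen, keeping timeout for now" log line above).

import time

def wait_for_recovery(manager, timeout=1200, interval=3):
    # Poll cluster state until recovery completes; give up after
    # `timeout` seconds without completion (1200 s in the config below).
    start = time.time()
    while not manager.is_recovered():
        if timeout is not None:
            assert time.time() - start < timeout, \
                'failed to recover before timeout expired'
        time.sleep(interval)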

[ceph version]
0.93-952-gfe28daa

[teuthology, ceph-qa-suite]
latest versions as of 2015-03-25

[configurations]
  check-locks: false
  overrides:
    ceph:
      conf:
        global:
          ms inject socket failures: 5000
        osd:
          osd heartbeat use min delay socket: true
          osd sloppy crc: true
      fs: xfs
  roles:
  - - mon.a
    - osd.0
    - osd.4
    - osd.8
    - osd.12
  - - mon.b
    - osd.1
    - osd.5
    - osd.9
    - osd.13
  - - mon.c
    - osd.2
    - osd.6
    - osd.10
    - osd.14
  - - osd.3
    - osd.7
    - osd.11
    - osd.15
    - client.0
  targets:
    ubuntu@RX35-1.primary.ceph-poc.fsc.net:
    ubuntu@RX35-2.primary.ceph-poc.fsc.net:
    ubuntu@RX35-3.primary.ceph-poc.fsc.net:
    ubuntu@RX35-4.primary.ceph-poc.fsc.net:
  tasks:
  - ceph:
      conf:
        osd:
          osd debug reject backfill probability: 0.3
          osd max backfills: 1
          osd scrub max interval: 120
          osd scrub min interval: 60
      log-whitelist:
      - wrongly marked me down
      - objects unfound and apparently lost
  - thrashosds:
      chance_pgnum_grow: 1
      chance_pgpnum_fix: 1
      min_in: 4
      timeout: 1200
  - rados:
      clients:
      - client.0
      ec_pool: true
      erasure_code_profile:
        k: 4
        l: 3
        m: 2
        name: lrcprofile
        plugin: lrc
        ruleset-failure-domain: osd
      objects: 50
      op_weights:
        append: 100
        copy_from: 50
        delete: 50
        read: 100
        rmattr: 25
        rollback: 50
        setattr: 25
        snap_create: 50
        snap_remove: 50
        write: 0
      ops: 190000
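
For reference, the erasure_code_profile above makes the rados task create the test pool with the lrc plugin. With k=4, m=2 and l=3 the plugin stores k + m + (k+m)/l = 4 + 2 + 2 = 8 chunks per object, so with ruleset-failure-domain=osd each PG needs 8 distinct OSDs up and in. The following sketch creates an equivalent profile and pool by hand; the pool name, PG counts and the Python wrapper are illustrative assumptions, while the profile values mirror the config above.

import subprocess

def run(cmd):
    # Echo and execute one ceph CLI command.
    print('+ ' + ' '.join(cmd))
    subprocess.check_call(cmd)

# Same parameters as lrcprofile above: 4 data chunks, 2 coding chunks,
# plus one local parity chunk per group of l=3, i.e. 8 chunks in total.
run(['ceph', 'osd', 'erasure-code-profile', 'set', 'lrcprofile',
     'plugin=lrc', 'k=4', 'm=2', 'l=3', 'ruleset-failure-domain=osd'])

# Pool name and PG counts are arbitrary choices for this illustration.
run(['ceph', 'osd', 'pool', 'create', 'lrcpool', '12', '12',
     'erasure', 'lrcprofile'])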

Best regards,
Takeshi Miyamae



* Re: teuthology timeout error
  2015-05-20  2:20 teuthology timeout error Miyamae, Takeshi
@ 2015-05-20  7:48 ` Loic Dachary
  2015-05-21  8:32   ` Miyamae, Takeshi
  0 siblings, 1 reply; 7+ messages in thread
From: Loic Dachary @ 2015-05-20  7:48 UTC
  To: Miyamae, Takeshi, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke


Hi,

On 20/05/2015 04:20, Miyamae, Takeshi wrote:
> Hi Loic,
> 
> When we fixed our own issue and restarted teuthology, 

Great!

> we encountered another issue (a timeout error) which occurs with LRC pools as well.
> Do you have any information about that?

Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?

Thanks

> 
> [error messages (in case of LRC pool)]
> 
> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> 
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
> 
> [ceph version]
> 0.93-952-gfe28daa
> 
> [teuthology, ceph-qa-suite]
> latest versions as of 2015-03-25
> 
> [configurations]
>   check-locks: false
>   overrides:
>     ceph:
>       conf:
>         global:
>           ms inject socket failures: 5000
>         osd:
>           osd heartbeat use min delay socket: true
>           osd sloppy crc: true
>       fs: xfs
>   roles:
>   - - mon.a
>     - osd.0
>     - osd.4
>     - osd.8
>     - osd.12
>   - - mon.b
>     - osd.1
>     - osd.5
>     - osd.9
>     - osd.13
>   - - mon.c
>     - osd.2
>     - osd.6
>     - osd.10
>     - osd.14
>   - - osd.3
>     - osd.7
>     - osd.11
>     - osd.15
>     - client.0
>   targets:
>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>   tasks:
>   - ceph:
>       conf:
>         osd:
>           osd debug reject backfill probability: 0.3
>           osd max backfills: 1
>           osd scrub max interval: 120
>           osd scrub min interval: 60
>       log-whitelist:
>       - wrongly marked me down
>       - objects unfound and apparently lost
>   - thrashosds:
>       chance_pgnum_grow: 1
>       chance_pgpnum_fix: 1
>       min_in: 4
>       timeout: 1200
>   - rados:
>       clients:
>       - client.0
>       ec_pool: true
>       erasure_code_profile:
>         k: 4
>         l: 3
>         m: 2
>         name: lrcprofile
>         plugin: lrc
>         ruleset-failure-domain: osd
>       objects: 50
>       op_weights:
>         append: 100
>         copy_from: 50
>         delete: 50
>         read: 100
>         rmattr: 25
>         rollback: 50
>         setattr: 25
>         snap_create: 50
>         snap_remove: 50
>         write: 0
>       ops: 190000
> 
> Best regards,
> Takeshi Miyamae
> 

-- 
Loïc Dachary, Artisan Logiciel Libre




* RE: teuthology timeout error
  2015-05-20  7:48 ` Loic Dachary
@ 2015-05-21  8:32   ` Miyamae, Takeshi
  2015-05-21  9:30     ` Loic Dachary
  2015-05-21  9:37     ` Loic Dachary
  0 siblings, 2 replies; 7+ messages in thread
From: Miyamae, Takeshi @ 2015-05-21  8:32 UTC
  To: Loic Dachary, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke

Hi Loic,

> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests
> so I can try to reproduce / diagnose the problem?

https://github.com/kawaguchi-s/teuthology/tree/wip-10886
https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886

Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.

Best regards,
Takeshi Miyamae

-----Original Message-----
From: Loic Dachary [mailto:loic@dachary.org] 
Sent: Wednesday, May 20, 2015 4:49 PM
To: Miyamae, Takeshi/宮前 剛; Ceph Development
Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
Subject: Re: teuthology timeout error

Hi,

On 20/05/2015 04:20, Miyamae, Takeshi wrote:
> Hi Loic,
> 
> When we fixed our own issue and restarted teuthology, 

Great!

> we encountered another issue (a timeout error) which occurs with LRC pools as well.
> Do you have any information about that?

Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?

Thanks

> 
> [error messages (in case of LRC pool)]
> 
> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> 
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
> 
> [ceph version]
> 0.93-952-gfe28daa
> 
> [teuthology, ceph-qa-suite]
> latest versions as of 2015-03-25
> 
> [configurations]
>   check-locks: false
>   overrides:
>     ceph:
>       conf:
>         global:
>           ms inject socket failures: 5000
>         osd:
>           osd heartbeat use min delay socket: true
>           osd sloppy crc: true
>       fs: xfs
>   roles:
>   - - mon.a
>     - osd.0
>     - osd.4
>     - osd.8
>     - osd.12
>   - - mon.b
>     - osd.1
>     - osd.5
>     - osd.9
>     - osd.13
>   - - mon.c
>     - osd.2
>     - osd.6
>     - osd.10
>     - osd.14
>   - - osd.3
>     - osd.7
>     - osd.11
>     - osd.15
>     - client.0
>   targets:
>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>   tasks:
>   - ceph:
>       conf:
>         osd:
>           osd debug reject backfill probability: 0.3
>           osd max backfills: 1
>           osd scrub max interval: 120
>           osd scrub min interval: 60
>       log-whitelist:
>       - wrongly marked me down
>       - objects unfound and apparently lost
>   - thrashosds:
>       chance_pgnum_grow: 1
>       chance_pgpnum_fix: 1
>       min_in: 4
>       timeout: 1200
>   - rados:
>       clients:
>       - client.0
>       ec_pool: true
>       erasure_code_profile:
>         k: 4
>         l: 3
>         m: 2
>         name: lrcprofile
>         plugin: lrc
>         ruleset-failure-domain: osd
>       objects: 50
>       op_weights:
>         append: 100
>         copy_from: 50
>         delete: 50
>         read: 100
>         rmattr: 25
>         rollback: 50
>         setattr: 25
>         snap_create: 50
>         snap_remove: 50
>         write: 0
>       ops: 190000
> 
> Best regards,
> Takeshi Miyamae
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: teuthology timeout error
  2015-05-21  8:32   ` Miyamae, Takeshi
@ 2015-05-21  9:30     ` Loic Dachary
  2015-05-21  9:37     ` Loic Dachary
  1 sibling, 0 replies; 7+ messages in thread
From: Loic Dachary @ 2015-05-21  9:30 UTC
  To: Miyamae, Takeshi, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke




On 21/05/2015 10:32, Miyamae, Takeshi wrote:
> Hi Loic,
> 
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests
>> so I can try to reproduce / diagnose the problem ?
> 
> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
> 

When compared against master, they show differences that indicate it would be good to rebase:

https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886

I think the teuthology commit on top of wip-10886 is a mistake.


> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org] 
> Sent: Wednesday, May 20, 2015 4:49 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>> When we fixed our own issue and restarted teuthology, 
> 
> Great!
> 
>> we encountered another issue (a timeout error) which occurs with LRC pools as well.
>> Do you have any information about that?
> 
> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?
> 
> Thanks
> 
>>
>> [error messages (in case of LRC pool)]
>>
>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>>
>> Traceback (most recent call last):
>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>     result = self._run(*self.args, **self.kwargs)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
>>
>> [ceph version]
>> 0.93-952-gfe28daa
>>
>> [teuthology, ceph-qa-suite]
>> latest versions as of 2015-03-25
>>
>> [configurations]
>>   check-locks: false
>>   overrides:
>>     ceph:
>>       conf:
>>         global:
>>           ms inject socket failures: 5000
>>         osd:
>>           osd heartbeat use min delay socket: true
>>           osd sloppy crc: true
>>       fs: xfs
>>   roles:
>>   - - mon.a
>>     - osd.0
>>     - osd.4
>>     - osd.8
>>     - osd.12
>>   - - mon.b
>>     - osd.1
>>     - osd.5
>>     - osd.9
>>     - osd.13
>>   - - mon.c
>>     - osd.2
>>     - osd.6
>>     - osd.10
>>     - osd.14
>>   - - osd.3
>>     - osd.7
>>     - osd.11
>>     - osd.15
>>     - client.0
>>   targets:
>>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>>   tasks:
>>   - ceph:
>>       conf:
>>         osd:
>>           osd debug reject backfill probability: 0.3
>>           osd max backfills: 1
>>           osd scrub max interval: 120
>>           osd scrub min interval: 60
>>       log-whitelist:
>>       - wrongly marked me down
>>       - objects unfound and apparently lost
>>   - thrashosds:
>>       chance_pgnum_grow: 1
>>       chance_pgpnum_fix: 1
>>       min_in: 4
>>       timeout: 1200
>>   - rados:
>>       clients:
>>       - client.0
>>       ec_pool: true
>>       erasure_code_profile:
>>         k: 4
>>         l: 3
>>         m: 2
>>         name: lrcprofile
>>         plugin: lrc
>>         ruleset-failure-domain: osd
>>       objects: 50
>>       op_weights:
>>         append: 100
>>         copy_from: 50
>>         delete: 50
>>         read: 100
>>         rmattr: 25
>>         rollback: 50
>>         setattr: 25
>>         snap_create: 50
>>         snap_remove: 50
>>         write: 0
>>       ops: 190000
>>
>> Best regards,
>> Takeshi Miyamae
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre




* Re: teuthology timeout error
  2015-05-21  8:32   ` Miyamae, Takeshi
  2015-05-21  9:30     ` Loic Dachary
@ 2015-05-21  9:37     ` Loic Dachary
  2015-05-26  2:39       ` Miyamae, Takeshi
  1 sibling, 1 reply; 7+ messages in thread
From: Loic Dachary @ 2015-05-21  9:37 UTC
  To: Miyamae, Takeshi, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke


Hi,

[sorry the previous mail was sent by accident, here is the full mail]

On 21/05/2015 10:32, Miyamae, Takeshi wrote:
> Hi Loic,
> 
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests
>> so I can try to reproduce / diagnose the problem?
> 
> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
> 

When compared against master, they show differences that indicate it would be good to rebase:

https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886

I think the teuthology commit on top of wip-10886 is a mistake:

https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5

Do you really need to modify teuthology? Using the latest master branch should be enough.

It looks like the

https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3

commit in your ceph-qa-suite is not what you intended. However

https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44

look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml, it would be better to use the same kind of naming you see at https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads. That is, a file name made of the distinctive parameters for the shec plugin (parameters that are left at their defaults can be omitted); see the sketch below.
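
For instance, a hypothetical workload file following that convention could be named after the profile values it sets (the file name and fragment below are illustrative, using the k=4, m=3, c=2 values common in shec examples, not an actual file from the suite):

ec-rados-plugin=shec-k=4-m=3-c=2.yaml:

    erasure_code_profile:
      name: shecprofile
      plugin: shec
      k: 4
      m: 3
      c: 2
      ruleset-failure-domain: osd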

Cheers

> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org] 
> Sent: Wednesday, May 20, 2015 4:49 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>> When we fixed our own issue and restarted teuthology, 
> 
> Great!
> 
>> we encountered another issue (a timeout error) which occurs with LRC pools as well.
>> Do you have any information about that?
> 
> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?
> 
> Thanks
> 
>>
>> [error messages (in case of LRC pool)]
>>
>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>>
>> Traceback (most recent call last):
>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>     result = self._run(*self.args, **self.kwargs)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired <Greenlet at 0x2a7d550: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with AssertionError
>>
>> [ceph version]
>> 0.93-952-gfe28daa
>>
>> [teuthology, ceph-qa-suite]
>> latest versions as of 2015-03-25
>>
>> [configurations]
>>   check-locks: false
>>   overrides:
>>     ceph:
>>       conf:
>>         global:
>>           ms inject socket failures: 5000
>>         osd:
>>           osd heartbeat use min delay socket: true
>>           osd sloppy crc: true
>>       fs: xfs
>>   roles:
>>   - - mon.a
>>     - osd.0
>>     - osd.4
>>     - osd.8
>>     - osd.12
>>   - - mon.b
>>     - osd.1
>>     - osd.5
>>     - osd.9
>>     - osd.13
>>   - - mon.c
>>     - osd.2
>>     - osd.6
>>     - osd.10
>>     - osd.14
>>   - - osd.3
>>     - osd.7
>>     - osd.11
>>     - osd.15
>>     - client.0
>>   targets:
>>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>>   tasks:
>>   - ceph:
>>       conf:
>>         osd:
>>           osd debug reject backfill probability: 0.3
>>           osd max backfills: 1
>>           osd scrub max interval: 120
>>           osd scrub min interval: 60
>>       log-whitelist:
>>       - wrongly marked me down
>>       - objects unfound and apparently lost
>>   - thrashosds:
>>       chance_pgnum_grow: 1
>>       chance_pgpnum_fix: 1
>>       min_in: 4
>>       timeout: 1200
>>   - rados:
>>       clients:
>>       - client.0
>>       ec_pool: true
>>       erasure_code_profile:
>>         k: 4
>>         l: 3
>>         m: 2
>>         name: lrcprofile
>>         plugin: lrc
>>         ruleset-failure-domain: osd
>>       objects: 50
>>       op_weights:
>>         append: 100
>>         copy_from: 50
>>         delete: 50
>>         read: 100
>>         rmattr: 25
>>         rollback: 50
>>         setattr: 25
>>         snap_create: 50
>>         snap_remove: 50
>>         write: 0
>>       ops: 190000
>>
>> Best regards,
>> Takeshi Miyamae
>>
> 

-- 
Loïc Dachary, Artisan Logiciel Libre






* RE: teuthology timeout error
  2015-05-21  9:37     ` Loic Dachary
@ 2015-05-26  2:39       ` Miyamae, Takeshi
  2015-05-26  8:59         ` Loic Dachary
  0 siblings, 1 reply; 7+ messages in thread
From: Miyamae, Takeshi @ 2015-05-26  2:39 UTC
  To: Loic Dachary, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke

Hi Loic,

We rebased our teuthology/ceph-qa-suite branches and retried the LRC test on the current master.
Unfortunately, we got the same result as before (a timeout error).

[test conditions]
Target: Ceph-9.0.0-971-gd49d816
https://github.com/kawaguchi-s/teuthology
https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886-lrc

[teuthology log]

2015-05-25 10:18:23	# start

2015-05-25 11:59:52,106.106 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
2015-05-25 11:59:52,564.564 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
2015-05-25 11:59:52,565.565 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired

Traceback (most recent call last):
  File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
    return func(self)
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
    timeout=self.config.get('timeout')
  File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired <Greenlet at 0x36cacd0: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x36df3f8>>> failed with AssertionError
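
Note: when recovery stalls like this, it helps to capture what the cluster thinks is blocking it before the run is torn down. One thing worth checking with this particular profile: the lrc pool (k=4, m=2, l=3) stores k + m + (k+m)/l = 8 chunks per object, so recovery cannot complete while fewer than 8 OSDs are in, and the thrashosds configuration quoted below allows the number of in OSDs to drop to min_in: 4. A sketch of the standard ceph CLI queries that help narrow this down (wrapping them in Python here is purely for illustration):

import subprocess

# Queries that help explain a stuck recovery: overall health detail,
# the PGs that are not clean, and the OSD up/in state.
for cmd in (
    ['ceph', 'health', 'detail'],
    ['ceph', 'pg', 'dump_stuck', 'unclean'],
    ['ceph', 'osd', 'tree'],
):
    print('+ ' + ' '.join(cmd))
    print(subprocess.check_output(cmd).decode())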

Best regards,
Takeshi Miyamae

-----Original Message-----
From: Loic Dachary [mailto:loic@dachary.org] 
Sent: Thursday, May 21, 2015 6:38 PM
To: Miyamae, Takeshi/宮前 剛; Ceph Development
Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
Subject: Re: teuthology timeout error

Hi,

[sorry the previous mail was sent by accident, here is the full mail]

On 21/05/2015 10:32, Miyamae, Takeshi wrote:
> Hi Loic,
> 
>> Could you please share the teuthology/ceph-qa-suite repository you 
> are using to run these tests so I can try to reproduce / diagnose the problem?
> 
> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
> 

When compared against master, they show differences that indicate it would be good to rebase:

https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886

I think the teuthology commit on top of wip-10886 is a mistake:

https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5

Do you really need to modify teuthology? Using the latest master branch should be enough.

It looks like the

https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3

commit in your ceph-qa-suite is not what you intended. However

https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44

look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml, it would be better to use the same kind of naming you see at https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads. That is, a file name made of the distinctive parameters for the shec plugin (parameters that are left at their defaults can be omitted).

Cheers

> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org]
> Sent: Wednesday, May 20, 2015 4:49 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 
> 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>> When we fixed our own issue and restarted teuthology,
> 
> Great!
> 
>> we encountered another issue (a timeout error) which occurs with LRC pools as well.
>> Do you have any information about that?
> 
> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?
> 
> Thanks
> 
>>
>> [error messages (in case of LRC pool)]
>>
>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress 
>> seen, keeping timeout for now
>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired
>>
>> Traceback (most recent call last):
>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>     result = self._run(*self.args, **self.kwargs)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>     return func(self)
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>     timeout=self.config.get('timeout')
>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>     'failed to recover before timeout expired'
>> AssertionError: failed to recover before timeout expired <Greenlet at 
>> 0x2a7d550: <bound method Thrasher.do_thrash of 
>> <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with 
>> AssertionError
>>
>> [ceph version]
>> 0.93-952-gfe28daa
>>
>> [teuthology, ceph-qa-suite]
>> latest versions as of 2015-03-25
>>
>> [configurations]
>>   check-locks: false
>>   overrides:
>>     ceph:
>>       conf:
>>         global:
>>           ms inject socket failures: 5000
>>         osd:
>>           osd heartbeat use min delay socket: true
>>           osd sloppy crc: true
>>       fs: xfs
>>   roles:
>>   - - mon.a
>>     - osd.0
>>     - osd.4
>>     - osd.8
>>     - osd.12
>>   - - mon.b
>>     - osd.1
>>     - osd.5
>>     - osd.9
>>     - osd.13
>>   - - mon.c
>>     - osd.2
>>     - osd.6
>>     - osd.10
>>     - osd.14
>>   - - osd.3
>>     - osd.7
>>     - osd.11
>>     - osd.15
>>     - client.0
>>   targets:
>>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>>   tasks:
>>   - ceph:
>>       conf:
>>         osd:
>>           osd debug reject backfill probability: 0.3
>>           osd max backfills: 1
>>           osd scrub max interval: 120
>>           osd scrub min interval: 60
>>       log-whitelist:
>>       - wrongly marked me down
>>       - objects unfound and apparently lost
>>   - thrashosds:
>>       chance_pgnum_grow: 1
>>       chance_pgpnum_fix: 1
>>       min_in: 4
>>       timeout: 1200
>>   - rados:
>>       clients:
>>       - client.0
>>       ec_pool: true
>>       erasure_code_profile:
>>         k: 4
>>         l: 3
>>         m: 2
>>         name: lrcprofile
>>         plugin: lrc
>>         ruleset-failure-domain: osd
>>       objects: 50
>>       op_weights:
>>         append: 100
>>         copy_from: 50
>>         delete: 50
>>         read: 100
>>         rmattr: 25
>>         rollback: 50
>>         setattr: 25
>>         snap_create: 50
>>         snap_remove: 50
>>         write: 0
>>       ops: 190000
>>
>> Best regards,
>> Takeshi Miyamae
>>
> 

--
Loïc Dachary, Artisan Logiciel Libre





* Re: teuthology timeout error
  2015-05-26  2:39       ` Miyamae, Takeshi
@ 2015-05-26  8:59         ` Loic Dachary
  0 siblings, 0 replies; 7+ messages in thread
From: Loic Dachary @ 2015-05-26  8:59 UTC
  To: Miyamae, Takeshi, Ceph Development
  Cc: Kawaguchi, Shotaro, Imai, Hiroki, Nakao, Takanori, Shiozawa, Kensuke


Hi Takeshi,

I'm trying to reproduce your problem at https://github.com/ceph/ceph-qa-suite/pull/445. To be continued :-)

Cheers

On 26/05/2015 04:39, Miyamae, Takeshi wrote:
> Hi Loic,
> 
> We rebased our teuthology/ceph-qa-suite branches and retried the LRC test on the current master.
> Unfortunately, we got the same result as before (a timeout error).
> 
> [test conditions]
> Target: Ceph-9.0.0-971-gd49d816
> https://github.com/kawaguchi-s/teuthology
> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886-lrc
> 
> [teuthology log]
> 
> 2015-05-25 10:18:23	# start
> 
> 2015-05-25 11:59:52,106.106 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
> 2015-05-25 11:59:52,564.564 INFO:tasks.ceph.ceph_manager:no progress seen, keeping timeout for now
> 2015-05-25 11:59:52,565.565 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired
> 
> Traceback (most recent call last):
>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>     result = self._run(*self.args, **self.kwargs)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 635, in wrapper
>     return func(self)
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 668, in do_thrash
>     timeout=self.config.get('timeout')
>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1569, in wait_for_recovery
>     'failed to recover before timeout expired'
> AssertionError: failed to recover before timeout expired <Greenlet at 0x36cacd0: <bound method Thrasher.do_thrash of <tasks.ceph_manager.Thrasher instance at 0x36df3f8>>> failed with AssertionError
> 
> Best regards,
> Takeshi Miyamae
> 
> -----Original Message-----
> From: Loic Dachary [mailto:loic@dachary.org] 
> Sent: Thursday, May 21, 2015 6:38 PM
> To: Miyamae, Takeshi/宮前 剛; Ceph Development
> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
> Subject: Re: teuthology timeout error
> 
> Hi,
> 
> [sorry the previous mail was sent by accident, here is the full mail]
> 
> On 21/05/2015 10:32, Miyamae, Takeshi wrote:
>> Hi Loic,
>>
>>> Could you please share the teuthology/ceph-qa-suite repository you 
>>> are using to run these tests so I can try to reproduce / diagnose the problem?
>>
>> https://github.com/kawaguchi-s/teuthology/tree/wip-10886
>> https://github.com/kawaguchi-s/ceph-qa-suite/tree/wip-10886
>>
> 
> When compared against master, they show differences that indicate it would be good to rebase:
> 
> https://github.com/ceph/teuthology/compare/master...kawaguchi-s:wip-10886
> https://github.com/ceph/ceph-qa-suite/compare/master...kawaguchi-s:wip-10886
> 
> I think the teuthology commit on top of wip-10886 is a mistake:
> 
> https://github.com/kawaguchi-s/teuthology/commit/348e54931f89c9b0ae7a84eb931576f8414017b5
> 
> Do you really need to modify teuthology? Using the latest master branch should be enough.
> 
> It looks like the
> 
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/f2e3ca5d12ceef742eae2a9cf4057c436e9040c3
> 
> commit in your ceph-qa-suite is not what you intended. However
> 
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/4b39d6d4862f9091a849d224e880795be406815d
> https://github.com/kawaguchi-s/ceph-qa-suite/commit/d16b4b058ae118931928541a2c8acd68f9703a44
> 
> look ok :-) Instead of naming the test 4nodes16osds3mons1client.yaml, it would be better to use the same kind of naming you see at https://github.com/ceph/ceph-qa-suite/tree/master/suites/rados/thrash-erasure-code/workloads. That is, a file name made of the distinctive parameters for the shec plugin (parameters that are left at their defaults can be omitted).
> 
> Cheers
> 
>> Here are our teuthology/ceph-qa-suite repositories. Thanks in advance.
>>
>> Best regards,
>> Takeshi Miyamae
>>
>> -----Original Message-----
>> From: Loic Dachary [mailto:loic@dachary.org]
>> Sent: Wednesday, May 20, 2015 4:49 PM
>> To: Miyamae, Takeshi/宮前 剛; Ceph Development
>> Cc: Kawaguchi, Shotaro/川口 翔太朗; Imai, Hiroki/今井 宏樹; Nakao, Takanori/中尾 
>> 鷹詔; Shiozawa, Kensuke/塩沢 賢輔
>> Subject: Re: teuthology timeout error
>>
>> Hi,
>>
>> On 20/05/2015 04:20, Miyamae, Takeshi wrote:
>>> Hi Loic,
>>>
>>> When we fixed our own issue and restarted teuthology,
>>
>> Great!
>>
>>> we encountered another issue (a timeout error) which occurs with LRC pools as well.
>>> Do you have any information about that?
>>
>> Could you please share the teuthology/ceph-qa-suite repository you are using to run these tests so I can try to reproduce / diagnose the problem?
>>
>> Thanks
>>
>>>
>>> [error messages (in case of LRC pool)]
>>>
>>> 2015-04-28 12:38:54,128.128 INFO:teuthology.orchestra.run.RX35-1:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph status --format=json-pretty'
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.ceph.ceph_manager:no progress 
>>> seen, keeping timeout for now
>>> 2015-04-28 12:38:54,516.516 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired
>>>
>>> Traceback (most recent call last):
>>>   File "/root/work/teuthology/virtualenv/lib/python2.7/site-packages/gevent/greenlet.py", line 390, in run
>>>     result = self._run(*self.args, **self.kwargs)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 632, in wrapper
>>>     return func(self)
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 665, in do_thrash
>>>     timeout=self.config.get('timeout')
>>>   File "/root/src/ceph-qa-suite_master/tasks/ceph_manager.py", line 1566, in wait_for_recovery
>>>     'failed to recover before timeout expired'
>>> AssertionError: failed to recover before timeout expired <Greenlet at 
>>> 0x2a7d550: <bound method Thrasher.do_thrash of 
>>> <tasks.ceph_manager.Thrasher instance at 0x2bd12d8>>> failed with 
>>> AssertionError
>>>
>>> [ceph version]
>>> 0.93-952-gfe28daa
>>>
>>> [teuthology, ceph-qa-suite]
>>> latest versions as of 2015-03-25
>>>
>>> [configurations]
>>>   check-locks: false
>>>   overrides:
>>>     ceph:
>>>       conf:
>>>         global:
>>>           ms inject socket failures: 5000
>>>         osd:
>>>           osd heartbeat use min delay socket: true
>>>           osd sloppy crc: true
>>>       fs: xfs
>>>   roles:
>>>   - - mon.a
>>>     - osd.0
>>>     - osd.4
>>>     - osd.8
>>>     - osd.12
>>>   - - mon.b
>>>     - osd.1
>>>     - osd.5
>>>     - osd.9
>>>     - osd.13
>>>   - - mon.c
>>>     - osd.2
>>>     - osd.6
>>>     - osd.10
>>>     - osd.14
>>>   - - osd.3
>>>     - osd.7
>>>     - osd.11
>>>     - osd.15
>>>     - client.0
>>>   targets:
>>>     ubuntu@RX35-1.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-2.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-3.primary.ceph-poc.fsc.net:
>>>     ubuntu@RX35-4.primary.ceph-poc.fsc.net:
>>>   tasks:
>>>   - ceph:
>>>       conf:
>>>         osd:
>>>           osd debug reject backfill probability: 0.3
>>>           osd max backfills: 1
>>>           osd scrub max interval: 120
>>>           osd scrub min interval: 60
>>>       log-whitelist:
>>>       - wrongly marked me down
>>>       - objects unfound and apparently lost
>>>   - thrashosds:
>>>       chance_pgnum_grow: 1
>>>       chance_pgpnum_fix: 1
>>>       min_in: 4
>>>       timeout: 1200
>>>   - rados:
>>>       clients:
>>>       - client.0
>>>       ec_pool: true
>>>       erasure_code_profile:
>>>         k: 4
>>>         l: 3
>>>         m: 2
>>>         name: lrcprofile
>>>         plugin: lrc
>>>         ruleset-failure-domain: osd
>>>       objects: 50
>>>       op_weights:
>>>         append: 100
>>>         copy_from: 50
>>>         delete: 50
>>>         read: 100
>>>         rmattr: 25
>>>         rollback: 50
>>>         setattr: 25
>>>         snap_create: 50
>>>         snap_remove: 50
>>>         write: 0
>>>       ops: 190000
>>>
>>> Best regards,
>>> Takeshi Miyamae
>>>
>>
> 
> --
> Loïc Dachary, Artisan Logiciel Libre
> 
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



