* Re: failed erasure code pool creation after client upgrade
       [not found] <CAJBxG-tw0qdSBF_fA7LOuFc-vg8Rq1AQsrqU0ic5Wux1Ozpbew@mail.gmail.com>
@ 2014-10-25 17:29 ` Loic Dachary
       [not found]   ` <CAJBxG-uLYfpoJoTGCFCT5+2n3uE2ucjtKSnnQTnK2gwdvKhukA@mail.gmail.com>
       [not found]   ` <CAFd2DpTEMSJAykdicJPXND4_W6mgUM-gR8jXc3qubS1GzLwqQA@mail.gmail.com>
  0 siblings, 2 replies; 3+ messages in thread
From: Loic Dachary @ 2014-10-25 17:29 UTC (permalink / raw)
  To: Yuri Weinstein; +Cc: Ceph Development

[cc'ing ceph-devel for archive]

Hi,

I see a lot of errors with

#define	EOPNOTSUPP	95	/* Operation not supported on transport endpoint */

2014-10-24T20:26:54.335 INFO:tasks.workunit.client.0.plana63.stdout:[ RUN      ] LibRadosAioEC.SimpleWrite
2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:test/librados/aio.cc:1634: Failure
2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:Value of: test_data.init()
2014-10-24T20:26:56.737 INFO:tasks.workunit.client.0.plana63.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana63-14645-33) failed: error rados_mon_command erasure-code-profile set name:testprofile failed with error -95"
2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:Expected: ""
2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:[  FAILED  ] LibRadosAioEC.SimpleWrite (2403 ms)
2014-10-24T20:26:56.738 INFO:tasks.workunit.client.0.plana63.stdout:[ RUN      ] LibRadosAioEC.SimpleWritePP
2014-10-24T20:26:59.141 INFO:tasks.workunit.client.0.plana63.stdout:test/librados/aio.cc:1669: Failure
2014-10-24T20:26:59.142 INFO:tasks.workunit.client.0.plana63.stdout:Value of: test_data.init()
2014-10-24T20:26:59.142 INFO:tasks.workunit.client.0.plana63.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana63-14645-

which indeed suggests that the client is trying to create an erasure coded pool in a cluster that does not support it. But since it looks like it's upgrading from firefly to a later version, I don't understand why that would be a problem.
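
As a rough illustration of how the -95 in the log above can be picked out and named, here is a minimal Python sketch. The helper name `extract_errno` and its regular expression are assumptions made for this example and are not part of teuthology or Ceph; the interpretation of 95 relies on the Linux errno numbering quoted above.

```python
import os
import re

# Hypothetical helper for this example only: pull the negative error code
# out of a "failed with error -N" message as printed by the failing test.
_ERROR_RE = re.compile(r"failed with error -(\d+)")


def extract_errno(line):
    """Return the positive errno found in a log line, or None if absent."""
    match = _ERROR_RE.search(line)
    return int(match.group(1)) if match else None


line = ('Actual: "create_one_ec_pool(test-rados-api-plana63-14645-33) failed: '
        'error rados_mon_command erasure-code-profile set name:testprofile '
        'failed with error -95"')

code = extract_errno(line)
# os.strerror gives the local platform's description of the code; on Linux,
# 95 is EOPNOTSUPP, matching the #define quoted above.
print(code, os.strerror(code))
```

The description printed by `os.strerror` depends on the platform running the script, so it should be read on a Linux host to match the cluster's meaning of 95.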

How did that get scheduled?

Cheers

On 25/10/2014 08:37, Yuri Weinstein wrote:
> Not sure what's going on with it, thx.
> 
> It's unusual in that it upgrades a client first.
> 
> http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-24_17:05:01-upgrade:firefly:singleton-firefly-distro-basic-multi/569532/teuthology.log

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: failed erasure code pool creation after client upgrade
       [not found]   ` <CAJBxG-uLYfpoJoTGCFCT5+2n3uE2ucjtKSnnQTnK2gwdvKhukA@mail.gmail.com>
@ 2014-10-25 17:36     ` Loic Dachary
  0 siblings, 0 replies; 3+ messages in thread
From: Loic Dachary @ 2014-10-25 17:36 UTC (permalink / raw)
  To: Yuri Weinstein; +Cc: Ceph Development

Hi Yuri,

In the logs I see

2014-10-24T20:08:46.258 INFO:teuthology.task.install:Package version is 0.67.11-26-g6a90775-1precise

which would explain why it fails: 0.67.x is dumpling. If the cluster is running dumpling and a client that has been upgraded to firefly tries to create an erasure coded pool, that will fail with -95, as expected.
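
The check described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the table covers just the two release series discussed in this thread (0.67.x is dumpling, 0.80.x is firefly, the first series with erasure coded pools).

```python
import re

# Major.minor prefixes of the two release series discussed in this thread.
RELEASES = {(0, 67): "dumpling", (0, 80): "firefly"}

# Erasure coded pools first appeared in the firefly series (0.80).
EC_MIN_VERSION = (0, 80)


def parse_version(package_version):
    """Return (major, minor) from a package version string."""
    match = re.match(r"(\d+)\.(\d+)", package_version)
    if match is None:
        raise ValueError("unrecognized version: %r" % package_version)
    return int(match.group(1)), int(match.group(2))


def release_name(package_version):
    """Map a package version to a release name, or 'unknown'."""
    return RELEASES.get(parse_version(package_version), "unknown")


def supports_erasure_coding(package_version):
    """True if the version is at least the first firefly release series."""
    return parse_version(package_version) >= EC_MIN_VERSION


installed = "0.67.11-26-g6a90775-1precise"  # version string from the log above
print(release_name(installed), supports_erasure_coding(installed))
```

Comparing (major, minor) tuples keeps the check independent of the build suffix (`-26-g6a90775-1precise`) that follows the point release.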

Cheers

On 25/10/2014 10:30, Yuri Weinstein wrote:
> Via crontab teuthology-suite cl
> 

-- 
Loïc Dachary, Artisan Logiciel Libre



* Re: failed erasure code pool creation after client upgrade
       [not found]     ` <CAJBxG-tyt9D4yFisb3J-vJeTq0_T5BjkWk5Kfi-X3gsZ97i2uw@mail.gmail.com>
@ 2014-10-26 15:47       ` Loic Dachary
  0 siblings, 0 replies; 3+ messages in thread
From: Loic Dachary @ 2014-10-26 15:47 UTC (permalink / raw)
  To: Yuri Weinstein, Tamil Muthamizhan; +Cc: Ceph Development

Hi Yuri,

https://github.com/ceph/ceph-qa-suite/commit/3b4442a2014200222764e7fce0cb9c343d97efde

points to 

https://github.com/ceph/ceph/blob/dumpling/qa/workunits/rados/test-upgrade-firefly.sh

however, because the client was upgraded, the ceph_test_rados_api_aio binary being run is the firefly one (only the workunit is pulled from the repository, if I'm not mistaken) and it tries to create the erasure coded pool.

2014-10-26T05:27:18.338 INFO:tasks.workunit:Running workunits matching rados/test-upgrade-firefly.sh on client.0...
2014-10-26T05:27:18.338 INFO:tasks.workunit:Running workunit rados/test-upgrade-firefly.sh...
2014-10-26T05:27:18.339 INFO:teuthology.orchestra.run.plana67:Running: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/rados/test-upgrade-firefly.sh'
2014-10-26T05:27:18.350 INFO:tasks.workunit.client.0.plana67.stderr:+ ceph_test_rados_api_aio --gtest_filter=-LibRadosAio.OmapPP
2014-10-26T05:27:18.356 INFO:tasks.workunit.client.0.plana67.stdout:Running main() from gtest_main.cc
2014-10-26T05:27:18.356 INFO:tasks.workunit.client.0.plana67.stdout:Note: Google Test filter = -LibRadosAio.OmapPP
...
2014-10-26T05:29:14.364 INFO:tasks.workunit.client.0.plana67.stdout:[ RUN      ] LibRadosAioEC.SimpleWrite
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:test/librados/aio.cc:1634: Failure
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:Value of: test_data.init()
2014-10-26T05:29:20.002 INFO:tasks.workunit.client.0.plana67.stdout:  Actual: "create_one_ec_pool(test-rados-api-plana67-12901-32) failed: error rados_mon_command erasure-code-profile set name:testprofile failed with error -95"
2014-10-26T05:29:20.003 INFO:tasks.workunit.client.0.plana67.stdout:Expected: ""

What would probably make sense is to make sure the firefly tests can run successfully against a dumpling cluster, or to silently skip the tests that cannot run on a cluster lacking the required features. In any case, I can't think of a solution that would run what you want just by juggling binaries from various branches. But someone else may have an idea; it is entirely possible that I'm missing something simple ;-)
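
To make the "skip instead of fail" idea concrete, here is a minimal sketch. The real tests are C++ gtest binaries; Python is used here only to keep the example short. Everything in it is an assumption made for illustration: `init_ec_pool_or_skip` and `fake_dumpling_create` are hypothetical names, and the pool creation call is passed in as a plain function rather than being a real librados call.

```python
import unittest

EOPNOTSUPP = 95  # Linux value, as in the #define quoted earlier in the thread


def init_ec_pool_or_skip(create_ec_pool, pool_name):
    """Create an erasure coded pool, skipping the test if it is unsupported.

    create_ec_pool is a hypothetical callable returning 0 on success or a
    negative errno on failure.
    """
    ret = create_ec_pool(pool_name)
    if ret == -EOPNOTSUPP:
        # The cluster lacks the feature: report a skip, not a failure.
        raise unittest.SkipTest("cluster does not support erasure coded pools")
    if ret < 0:
        # Any other error is still a genuine failure.
        raise RuntimeError("pool creation failed with error %d" % ret)
    return pool_name


def fake_dumpling_create(pool_name):
    """Stand-in for a dumpling cluster, which rejects the request."""
    return -EOPNOTSUPP


try:
    init_ec_pool_or_skip(fake_dumpling_create, "test-pool")
    outcome = "ran"
except unittest.SkipTest:
    outcome = "skipped"
print(outcome)
```

Only -95 is turned into a skip; every other negative return value still surfaces as an error, so real regressions would not be hidden.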

Cheers

On 26/10/2014 08:33, Yuri Weinstein wrote:
> So far the change did not help; we are still having issues in this run:
> http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-25_17:05:01-upgrade:firefly:singleton-firefly-distro-basic-multi/571576/teuthology.log
> 
> On Sat, Oct 25, 2014 at 3:49 PM, Tamil Muthamizhan <tamil.muthamizhan@inktank.com <mailto:tamil.muthamizhan@inktank.com>> wrote:
> 
>     ok, so it looks like we are running the wrong version of rados/test.sh in this test.
> 
>     we are actually upgrading from dumpling to firefly in this failing test and we should have used the dumpling version of rados/test-upgrade-firefly.sh [which is used exclusively when upgrading the cluster from dumpling].
> 
>     Yuri is working on fixing this in the suite.
> 
>     Thanks,
>     Tamil
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre

