All of lore.kernel.org
 help / color / mirror / Atom feed
* better doc (and build) validation
@ 2017-12-05 13:31 Alfredo Deza
  2017-12-05 17:44 ` Gregory Farnum
  0 siblings, 1 reply; 7+ messages in thread
From: Alfredo Deza @ 2017-12-05 13:31 UTC (permalink / raw)
  To: ceph-devel

It seems like we keep hitting catastrophic build errors with minor doc
changes. The last one was introduced by a change I made to the
ceph-disk man page that updated the title:

https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1

That change builds correctly (!) and will render just fine, and it was
merged after the docs build reported as successful.

However, this also made all subsequent binary builds break. This
disconnect has caused similar issues in the past. At some point the
docs build was completely broken for days, because there was no
visibility from pull requests.

To mitigate that, we now have the docs job building for every pull
request, regardless if it is only editing rst files (because, yes, one
can break the docs build with C++).

This has helped a lot, but it still falls short because the binary
build process for Ceph is different from how we build docs (including
man pages). This is why a successful docs build in a PR can still
break a binary build.

There are a couple of things that could help:

1) I don't entirely follow how CMake is building the man/doc pages,
but a job that tries to replicate that path could catch these
inconsistencies in PRs.

2) Build binaries on every PR, and report them as status.

I don't know enough of CMake to attempt #1, and #2 would cause
contributors a lot of pain because it would involve having to push to
ceph-ci.git always.

I am open to any other ideas or suggestions here.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-05 13:31 better doc (and build) validation Alfredo Deza
@ 2017-12-05 17:44 ` Gregory Farnum
  2017-12-05 20:27   ` Alfredo Deza
  0 siblings, 1 reply; 7+ messages in thread
From: Gregory Farnum @ 2017-12-05 17:44 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: ceph-devel

On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
> It seems like we keep hitting catastrophic build errors with minor doc
> changes. The last one was introduced by a change I made to the
> ceph-disk man page that updated the title:
>
> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1
>
> That change builds correctly (!) and will render just fine, and it was
> merged after the docs build reported as successful.
>
> However, this also made all subsequent binary builds break. This
> disconnect has caused similar issues in the past. At some point the
> docs build was completely broken for days, because there was no
> visibility from pull requests.
>
> To mitigate that, we now have the docs job building for every pull
> request, regardless if it is only editing rst files (because, yes, one
> can break the docs build with C++).
>
> This has helped a lot, but it still falls short because the binary
> build process for Ceph is different from how we build docs (including
> man pages). This is why a successful docs build in a PR can still
> break a binary build.

Do we understand why it failed on the binary build but not the docs build?

Seems like that would help us see if we can replicate any other issues.

> There are a couple of things that could help:
>
> 1) I don't entirely follow how CMake is building the man/doc pages,
> but a job that tries to replicate that path could catch these
> inconsistencies in PRs.
>
> 2) Build binaries on every PR, and report them as status.
>
> I don't know enough of CMake to attempt #1, and #2 would cause
> contributors a lot of pain because it would involve having to push to
> ceph-ci.git always.
>
> I am open to any other ideas or suggestions here.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-05 17:44 ` Gregory Farnum
@ 2017-12-05 20:27   ` Alfredo Deza
  2017-12-06  7:44     ` kefu chai
  2017-12-12  4:08     ` Dan Mick
  0 siblings, 2 replies; 7+ messages in thread
From: Alfredo Deza @ 2017-12-05 20:27 UTC (permalink / raw)
  To: Gregory Farnum; +Cc: ceph-devel

On Tue, Dec 5, 2017 at 12:44 PM, Gregory Farnum <gfarnum@redhat.com> wrote:
> On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
>> It seems like we keep hitting catastrophic build errors with minor doc
>> changes. The last one was introduced by a change I made to the
>> ceph-disk man page that updated the title:
>>
>> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1
>>
>> That change builds correctly (!) and will render just fine, and it was
>> merged after the docs build reported as successful.
>>
>> However, this also made all subsequent binary builds break. This
>> disconnect has caused similar issues in the past. At some point the
>> docs build was completely broken for days, because there was no
>> visibility from pull requests.
>>
>> To mitigate that, we now have the docs job building for every pull
>> request, regardless if it is only editing rst files (because, yes, one
>> can break the docs build with C++).
>>
>> This has helped a lot, but it still falls short because the binary
>> build process for Ceph is different from how we build docs (including
>> man pages). This is why a successful docs build in a PR can still
>> break a binary build.
>
> Do we understand why it failed on the binary build but not the docs build?
>
> Seems like that would help us see if we can replicate any other issues.

It failed there because the man pages are generated differently, using
a couple of ad-hoc helpers that try to ensure certain conditions. In
this case, it was a Python
utility (ceph/man/conf.py). It was trying to make sure that the title
of the man page matched the name of the man page.
I had added "[DEPRECATED]" to the ceph-disk title, which failed the assertion.

But because the utilities use plain asserts, you don't get any of that
rich error reporting, just that some value != some other value.

Regardless, the reason is because the code paths are not the same, so
we need to find a way to either make them the same, or try to report
whatever code paths
we need to touch for a PR.

>
>> There are a couple of things that could help:
>>
>> 1) I don't entirely follow how CMake is building the man/doc pages,
>> but a job that tries to replicate that path could catch these
>> inconsistencies in PRs.
>>
>> 2) Build binaries on every PR, and report them as status.
>>
>> I don't know enough of CMake to attempt #1, and #2 would cause
>> contributors a lot of pain because it would involve having to push to
>> ceph-ci.git always.
>>
>> I am open to any other ideas or suggestions here.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-05 20:27   ` Alfredo Deza
@ 2017-12-06  7:44     ` kefu chai
  2017-12-06 12:02       ` Alfredo Deza
  2017-12-12  4:08     ` Dan Mick
  1 sibling, 1 reply; 7+ messages in thread
From: kefu chai @ 2017-12-06  7:44 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: Gregory Farnum, ceph-devel

On Wed, Dec 6, 2017 at 4:27 AM, Alfredo Deza <adeza@redhat.com> wrote:
> On Tue, Dec 5, 2017 at 12:44 PM, Gregory Farnum <gfarnum@redhat.com> wrote:
>> On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
>>> It seems like we keep hitting catastrophic build errors with minor doc
>>> changes. The last one was introduced by a change I made to the
>>> ceph-disk man page that updated the title:
>>>
>>> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1
>>>
>>> That change builds correctly (!) and will render just fine, and it was
>>> merged after the docs build reported as successful.
>>>
>>> However, this also made all subsequent binary builds break. This
>>> disconnect has caused similar issues in the past. At some point the
>>> docs build was completely broken for days, because there was no
>>> visibility from pull requests.
>>>
>>> To mitigate that, we now have the docs job building for every pull
>>> request, regardless if it is only editing rst files (because, yes, one
>>> can break the docs build with C++).
>>>
>>> This has helped a lot, but it still falls short because the binary
>>> build process for Ceph is different from how we build docs (including

please note, the cmake building system builds all the artifacts we ship to
our enduser. in addition to the binaries, it also builds and intalls scripts,
documents, sample configurations, copyright, and more. it's more than
merely a "binary build process". i'd call it a "build process". =)

>>> man pages). This is why a successful docs build in a PR can still
>>> break a binary build.
>>
>> Do we understand why it failed on the binary build but not the docs build?
>>
>> Seems like that would help us see if we can replicate any other issues.
>
> It failed there because the man pages are generated differently, using
> a couple of ad-hoc helpers that try to ensure certain conditions. In
> this case, it was a Python
> utility (ceph/man/conf.py). It was trying to make sure that the title
> of the man page matched the name of the man page.
> I had added "[DEPRECATED]" to the ceph-disk title, which failed the assertion.
>
> But because the utilities use plain asserts, you don't get any of that
> rich error reporting, just that some value != some other value.
>
> Regardless, the reason is because the code paths are not the same, so
> we need to find a way to either make them the same, or try to report
> whatever code paths
> we need to touch for a PR.
>
>>
>>> There are a couple of things that could help:
>>>
>>> 1) I don't entirely follow how CMake is building the man/doc pages,
>>> but a job that tries to replicate that path could catch these
>>> inconsistencies in PRs.

could you define "a job"? we have a target for building the man pages.
"make manpages" will build all of them.

>>>
>>> 2) Build binaries on every PR, and report them as status.

yeah, i guess that's what "needs-qa" is for.

>>>
>>> I don't know enough of CMake to attempt #1, and #2 would cause
>>> contributors a lot of pain because it would involve having to push to
>>> ceph-ci.git always.
>>>
>>> I am open to any other ideas or suggestions here.
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Regards
Kefu Chai

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-06  7:44     ` kefu chai
@ 2017-12-06 12:02       ` Alfredo Deza
  2017-12-06 16:23         ` kefu chai
  0 siblings, 1 reply; 7+ messages in thread
From: Alfredo Deza @ 2017-12-06 12:02 UTC (permalink / raw)
  To: kefu chai; +Cc: Gregory Farnum, ceph-devel

On Wed, Dec 6, 2017 at 2:44 AM, kefu chai <tchaikov@gmail.com> wrote:
> On Wed, Dec 6, 2017 at 4:27 AM, Alfredo Deza <adeza@redhat.com> wrote:
>> On Tue, Dec 5, 2017 at 12:44 PM, Gregory Farnum <gfarnum@redhat.com> wrote:
>>> On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
>>>> It seems like we keep hitting catastrophic build errors with minor doc
>>>> changes. The last one was introduced by a change I made to the
>>>> ceph-disk man page that updated the title:
>>>>
>>>> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1
>>>>
>>>> That change builds correctly (!) and will render just fine, and it was
>>>> merged after the docs build reported as successful.
>>>>
>>>> However, this also made all subsequent binary builds break. This
>>>> disconnect has caused similar issues in the past. At some point the
>>>> docs build was completely broken for days, because there was no
>>>> visibility from pull requests.
>>>>
>>>> To mitigate that, we now have the docs job building for every pull
>>>> request, regardless if it is only editing rst files (because, yes, one
>>>> can break the docs build with C++).
>>>>
>>>> This has helped a lot, but it still falls short because the binary
>>>> build process for Ceph is different from how we build docs (including
>
> please note, the cmake building system builds all the artifacts we ship to
> our enduser. in addition to the binaries, it also builds and intalls scripts,
> documents, sample configurations, copyright, and more. it's more than
> merely a "binary build process". i'd call it a "build process". =)
>
>>>> man pages). This is why a successful docs build in a PR can still
>>>> break a binary build.
>>>
>>> Do we understand why it failed on the binary build but not the docs build?
>>>
>>> Seems like that would help us see if we can replicate any other issues.
>>
>> It failed there because the man pages are generated differently, using
>> a couple of ad-hoc helpers that try to ensure certain conditions. In
>> this case, it was a Python
>> utility (ceph/man/conf.py). It was trying to make sure that the title
>> of the man page matched the name of the man page.
>> I had added "[DEPRECATED]" to the ceph-disk title, which failed the assertion.
>>
>> But because the utilities use plain asserts, you don't get any of that
>> rich error reporting, just that some value != some other value.
>>
>> Regardless, the reason is because the code paths are not the same, so
>> we need to find a way to either make them the same, or try to report
>> whatever code paths
>> we need to touch for a PR.
>>
>>>
>>>> There are a couple of things that could help:
>>>>
>>>> 1) I don't entirely follow how CMake is building the man/doc pages,
>>>> but a job that tries to replicate that path could catch these
>>>> inconsistencies in PRs.
>
> could you define "a job"? we have a target for building the man pages.
> "make manpages" will build all of them.

See, I didn't know :)

However! That would only build the man pages. I've broken builds
before when a man page that was expected
by ceph.spec wasn't there. `make manpages` would get us closer though,
this is great. By 'job' I meant A jenkins job that could check this.

>
>>>>
>>>> 2) Build binaries on every PR, and report them as status.
>
> yeah, i guess that's what "needs-qa" is for.

That lacks visibility on reporting back to the PR, and it is a very
manual process that someone has to go through. My concern is to solve
this
in a programmatic way that has a way to notify the author and reviewers of a PR.

I realize that building binaries for every PR is just not practical
for us at the pace they get produced.
>
>>>>
>>>> I don't know enough of CMake to attempt #1, and #2 would cause
>>>> contributors a lot of pain because it would involve having to push to
>>>> ceph-ci.git always.
>>>>
>>>> I am open to any other ideas or suggestions here.
>> --
>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Regards
> Kefu Chai

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-06 12:02       ` Alfredo Deza
@ 2017-12-06 16:23         ` kefu chai
  0 siblings, 0 replies; 7+ messages in thread
From: kefu chai @ 2017-12-06 16:23 UTC (permalink / raw)
  To: Alfredo Deza; +Cc: Gregory Farnum, ceph-devel

On Wed, Dec 6, 2017 at 8:02 PM, Alfredo Deza <adeza@redhat.com> wrote:
> On Wed, Dec 6, 2017 at 2:44 AM, kefu chai <tchaikov@gmail.com> wrote:
>> On Wed, Dec 6, 2017 at 4:27 AM, Alfredo Deza <adeza@redhat.com> wrote:
>>> On Tue, Dec 5, 2017 at 12:44 PM, Gregory Farnum <gfarnum@redhat.com> wrote:
>>>> On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
>>>>> It seems like we keep hitting catastrophic build errors with minor doc
>>>>> changes. The last one was introduced by a change I made to the
>>>>> ceph-disk man page that updated the title:
>>>>>
>>>>> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1
>>>>>
>>>>> That change builds correctly (!) and will render just fine, and it was
>>>>> merged after the docs build reported as successful.
>>>>>
>>>>> However, this also made all subsequent binary builds break. This
>>>>> disconnect has caused similar issues in the past. At some point the
>>>>> docs build was completely broken for days, because there was no
>>>>> visibility from pull requests.
>>>>>
>>>>> To mitigate that, we now have the docs job building for every pull
>>>>> request, regardless if it is only editing rst files (because, yes, one
>>>>> can break the docs build with C++).
>>>>>
>>>>> This has helped a lot, but it still falls short because the binary
>>>>> build process for Ceph is different from how we build docs (including
>>
>> please note, the cmake building system builds all the artifacts we ship to
>> our enduser. in addition to the binaries, it also builds and intalls scripts,
>> documents, sample configurations, copyright, and more. it's more than
>> merely a "binary build process". i'd call it a "build process". =)
>>
>>>>> man pages). This is why a successful docs build in a PR can still
>>>>> break a binary build.
>>>>
>>>> Do we understand why it failed on the binary build but not the docs build?
>>>>
>>>> Seems like that would help us see if we can replicate any other issues.
>>>
>>> It failed there because the man pages are generated differently, using
>>> a couple of ad-hoc helpers that try to ensure certain conditions. In
>>> this case, it was a Python
>>> utility (ceph/man/conf.py). It was trying to make sure that the title
>>> of the man page matched the name of the man page.
>>> I had added "[DEPRECATED]" to the ceph-disk title, which failed the assertion.
>>>
>>> But because the utilities use plain asserts, you don't get any of that
>>> rich error reporting, just that some value != some other value.
>>>
>>> Regardless, the reason is because the code paths are not the same, so
>>> we need to find a way to either make them the same, or try to report
>>> whatever code paths
>>> we need to touch for a PR.
>>>
>>>>
>>>>> There are a couple of things that could help:
>>>>>
>>>>> 1) I don't entirely follow how CMake is building the man/doc pages,
>>>>> but a job that tries to replicate that path could catch these
>>>>> inconsistencies in PRs.
>>
>> could you define "a job"? we have a target for building the man pages.
>> "make manpages" will build all of them.
>
> See, I didn't know :)
>
> However! That would only build the man pages. I've broken builds
> before when a man page that was expected
> by ceph.spec wasn't there. `make manpages` would get us closer though,
> this is great. By 'job' I meant A jenkins job that could check this.
>
>>
>>>>>
>>>>> 2) Build binaries on every PR, and report them as status.
>>
>> yeah, i guess that's what "needs-qa" is for.
>
> That lacks visibility on reporting back to the PR, and it is a very
> manual process that someone has to go through. My concern is to solve
> this
> in a programmatic way that has a way to notify the author and reviewers of a PR.


i see. but probably you can use more reviewers next time. this is how
i do with my
PRs: check the recent changes of the files the PR is changing, and
find the authors
and reviewers of them, then add them as reviewer or @ them .

well, i know, it's still not a programmatic way. =(


>
> I realize that building binaries for every PR is just not practical
> for us at the pace they get produced.
>>
>>>>>
>>>>> I don't know enough of CMake to attempt #1, and #2 would cause
>>>>> contributors a lot of pain because it would involve having to push to
>>>>> ceph-ci.git always.
>>>>>
>>>>> I am open to any other ideas or suggestions here.
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
>>
>>
>> --
>> Regards
>> Kefu Chai



-- 
Regards
Kefu Chai

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: better doc (and build) validation
  2017-12-05 20:27   ` Alfredo Deza
  2017-12-06  7:44     ` kefu chai
@ 2017-12-12  4:08     ` Dan Mick
  1 sibling, 0 replies; 7+ messages in thread
From: Dan Mick @ 2017-12-12  4:08 UTC (permalink / raw)
  To: Alfredo Deza, Gregory Farnum; +Cc: ceph-devel

On 12/05/2017 12:27 PM, Alfredo Deza wrote:
> On Tue, Dec 5, 2017 at 12:44 PM, Gregory Farnum <gfarnum@redhat.com> wrote:
>> On Tue, Dec 5, 2017 at 5:31 AM, Alfredo Deza <adeza@redhat.com> wrote:
>>> It seems like we keep hitting catastrophic build errors with minor doc
>>> changes. The last one was introduced by a change I made to the
>>> ceph-disk man page that updated the title:
>>>
>>> https://github.com/ceph/ceph/pull/19241/commits/bd00560c20caeeb1ded211cb81533280338014d1

> It failed there because the man pages are generated differently, using
> a couple of ad-hoc helpers that try to ensure certain conditions. In
> this case, it was a Python
> utility (ceph/man/conf.py). It was trying to make sure that the title
> of the man page matched the name of the man page.
> I had added "[DEPRECATED]" to the ceph-disk title, which failed the assertion.

Ugh.  FWIW I agree that's horrid.  Seems to me the doc build should be
doing the same thing the docs-part-of-the-binary-build is doing (and
always should have) and yes I know it's not simple.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2017-12-12  4:08 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-12-05 13:31 better doc (and build) validation Alfredo Deza
2017-12-05 17:44 ` Gregory Farnum
2017-12-05 20:27   ` Alfredo Deza
2017-12-06  7:44     ` kefu chai
2017-12-06 12:02       ` Alfredo Deza
2017-12-06 16:23         ` kefu chai
2017-12-12  4:08     ` Dan Mick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.