* Can I use CRoaring library in Git?
@ 2022-07-16 13:50 Abhradeep Chakraborty
  2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
                   ` (4 more replies)
  0 siblings, 5 replies; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-16 13:50 UTC (permalink / raw)
  To: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam

Hello,

I need the CRoaring[1] library to use roaring bitmaps. But it has
Apache license v2 which is not compatible with GPLv2[2].

Is there a way to use the CRoaring library in Git? Taylor told me that
contrib/persistent-https tree is also licensed under Apache License
version 2.

[1] https://github.com/RoaringBitmap/CRoaring
[2] https://www.apache.org/licenses/GPL-compatibility.html

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: Can I use CRoaring library in Git?
  2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
@ 2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
  2022-07-16 16:26   ` Abhradeep Chakraborty
  2022-07-17 14:43 ` Derrick Stolee
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-16 14:16 UTC (permalink / raw)
  To: Abhradeep Chakraborty; +Cc: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam


On Sat, Jul 16 2022, Abhradeep Chakraborty wrote:

> Hello,
>
> I need the CRoaring[1] library to use roaring bitmaps. But it has
> Apache license v2 which is not compatible with GPLv2[2].
>
> Is there a way to use the CRoaring library in Git? Taylor told me that
> contrib/persistent-https tree is also licensed under Apache License
> version 2.
>
> [1] https://github.com/RoaringBitmap/CRoaring
> [2] https://www.apache.org/licenses/GPL-compatibility.html

As a replacement for git's own bitmap implementation?

It's one thing to have differently licensed code in-tree that's built as
a separate utility (like that persistent-https tool), but another if
this is going to be something linked to git itself.

My understanding is that such a thing could not be legally distributed
as a binary (e.g. by Debian et al), so the users will be limited to
those willing to build the two pieces from scratch locally, i.e. similar
to ZFS on Linux (which I think is still the state of that ...).

But I'm not a lawyer and all that.

Another possibility is to get the library to dual-license itself.
Running "git shortlog -sn" on it, it seems it's mainly written by one
contributor, with a relatively short tail of others; perhaps they'd be
willing to dual-license at the prospect of having git use it?


* Re: Can I use CRoaring library in Git?
  2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
@ 2022-07-16 16:26   ` Abhradeep Chakraborty
  2022-07-17 12:25     ` Kaartic Sivaraam
  0 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-16 16:26 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam

On Sat, Jul 16, 2022 at 7:50 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> As a replacement for git's own bitmap implementation?

Yeah, it can replace EWAH bitmaps if it gives better performance.

> Another possibility is to get the library to dual-license itself.
> Running "git shortlog -sn" on it, it seems it's mainly written by one
> contributor, with a relatively short tail of others; perhaps they'd be
> willing to dual-license at the prospect of having git use it?

Kaartic suggested the same and it seems better than the previous one.

As a side note, the current EWAH implementation was also relicensed[1]
to make it compatible with Git. As these two are mainly written by the
same person, I think he will help us this time also.

Thanks!

[1]  https://github.blog/2015-09-22-counting-objects/#footnote-1


* Re: Can I use CRoaring library in Git?
  2022-07-16 16:26   ` Abhradeep Chakraborty
@ 2022-07-17 12:25     ` Kaartic Sivaraam
  2022-07-17 22:00       ` Junio C Hamano
  0 siblings, 1 reply; 25+ messages in thread
From: Kaartic Sivaraam @ 2022-07-17 12:25 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Ævar Arnfjörð Bjarmason, git, Junio C Hamano, Taylor Blau

On 16-07-2022 21:56, Abhradeep Chakraborty wrote:
> On Sat, Jul 16, 2022 at 7:50 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> Another possibility is to get the library to dual-license itself.
>> Running "git shortlog -sn" on it, it seems it's mainly written by one
>> contributor, with a relatively short tail of others; perhaps they'd be
>> willing to dual-license at the prospect of having git use it?
> 
> Kaartic suggested the same and it seems better than the previous one.
> 
> As a side note, the current EWAH implementation was also relicensed[1]
> to make it compatible with Git. As these two are mainly written by the
> same person, I think he will help us this time also.
>

The EWAH case is a bit different. The original EWAH implementation
[ewah-cpp] was in C++. It was then ported to C [ewah-c] by Git
contributors [ewah-git]. The ported version has been relicensed under
GPLv2 with Daniel Lemire's permission.

The case with CRoaring is that the implementation already exists in C
[croaring] and that is the one which is licensed under Apache V2. I'm
not sure how relicensing works for already existing code.

I suppose we could enquire Daniel Lemire about using the Apache licensed
code for Git. Let's hope for the best.


[[ References ]]

[ewah-cpp]: https://github.com/lemire/EWAHBoolArray

[ewah-c]: https://github.com/vmg/libewok
[ewah-git]: The initial commit of 'ewah' directory mentions the
    relicensing:

    e1273106f6 (ewah: compressed bitmap implementation, 2013-11-14)

[croaring]: https://github.com/RoaringBitmap/CRoaring

> Thanks!
> 
> [1]  https://github.blog/2015-09-22-counting-objects/#footnote-1

--
Sivaraam


* Re: Can I use CRoaring library in Git?
  2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
  2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
@ 2022-07-17 14:43 ` Derrick Stolee
  2022-07-18 11:13 ` Jakub Narębski
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 25+ messages in thread
From: Derrick Stolee @ 2022-07-17 14:43 UTC (permalink / raw)
  To: Abhradeep Chakraborty, git, Junio C Hamano, Taylor Blau,
	Kaartic Sivaraam

On 7/16/22 9:50 AM, Abhradeep Chakraborty wrote:
> Hello,
> 
> I need the CRoaring[1] library to use roaring bitmaps. But it has
> Apache license v2 which is not compatible with GPLv2[2].
> 
> Is there a way to use the CRoaring library in Git? Taylor told me that
> contrib/persistent-https tree is also licensed under Apache License
> version 2.
> 
> [1] https://github.com/RoaringBitmap/CRoaring
> [2] https://www.apache.org/licenses/GPL-compatibility.html

I know that working around a license would be the optimal way to get a
battle-tested implementation. Its API should be close enough to the EWAH
bitmap implementation that we can transition between the formats easily.
Continue pursuing that for now.

However, we always have the option of implementing a version from scratch
based on the description in the paper [3]. The benefit there is that we
would only need to implement what we need from the format and logic, and
we could even get some benefits from exposing some of the internals to the
rest of Git's codebase.

[3] https://arxiv.org/pdf/1603.06549.pdf

I mention this because I made an independent C# implementation of
Roaring+Run for the Azure Repos back-end. The way that the bitmaps are
split into "chunks" of 65k positions was helpful with how the object order
was set up: older objects were in early chunks and so deltas only needed
the later chunks. When using a chunk, we could lazy-load and unload each
chunk as we went through the object order.
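
To make the chunking concrete, here is a minimal C sketch. It is not taken
from CRoaring or from the Azure Repos code. It assumes 32-bit bit positions
and chunks of 2^16 (65536) positions, as described in the Roaring paper; the
helper names (chunk_of, offset_in_chunk, position_of, nr_chunks) are made up
for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_BITS 16
#define CHUNK_SIZE (UINT32_C(1) << CHUNK_BITS)	/* 65536 positions per chunk */

/* Which chunk a bit position falls into (its high 16 bits). */
static inline uint32_t chunk_of(uint32_t pos)
{
	return pos >> CHUNK_BITS;
}

/* Offset of a bit position inside its chunk (its low 16 bits). */
static inline uint32_t offset_in_chunk(uint32_t pos)
{
	return pos & (CHUNK_SIZE - 1);
}

/* Rebuild a position from a chunk number and an in-chunk offset. */
static inline uint32_t position_of(uint32_t chunk, uint32_t offset)
{
	assert(chunk < CHUNK_SIZE && offset < CHUNK_SIZE);
	return (chunk << CHUNK_BITS) | offset;
}

/* Number of chunks needed to cover nr_objects positions, rounding up. */
static inline uint32_t nr_chunks(uint32_t nr_objects)
{
	return (nr_objects >> CHUNK_BITS) +
	       ((nr_objects & (CHUNK_SIZE - 1)) != 0);
}
```

With positions ordered oldest first, a walk that only cares about positions
at or after some `first` would only need chunks chunk_of(first) and later,
which is what makes lazy loading of individual chunks attractive.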

So, if we really want to try this, an independent implementation might be
the way to start, at least as a prototype while pursuing the licensing
angle.

Thanks,
-Stolee 


* Re: Can I use CRoaring library in Git?
  2022-07-17 12:25     ` Kaartic Sivaraam
@ 2022-07-17 22:00       ` Junio C Hamano
  2022-07-17 22:25         ` Taylor Blau
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2022-07-17 22:00 UTC (permalink / raw)
  To: Kaartic Sivaraam
  Cc: Abhradeep Chakraborty, Ævar Arnfjörð Bjarmason,
	git, Taylor Blau

Kaartic Sivaraam <kaartic.sivaraam@gmail.com> writes:

> The EWAH case is a bit different. The original EWAH implementation
> [ewah-cpp] was in C++. It was then ported to C [ewah-c] by Git
> contributors [ewah-git]. The ported version has been relicensed under
> GPLv2 with Daniel Lemire's permission.
>
> The case with CRoaring is that the implementation already exists in C
> [croaring] and that is the one which is licensed under Apache V2. I'm
> not sure how relicensing works for already existing code.

As long as the author says they are willing to relicense, that would
"work".  It is entirely up to them.

> I suppose we could enquire Daniel Lemire about using the Apache licensed
> code for Git. Let's hope for the best.

Request to relicense it so that we can use it in our GPLv2 project.
Relicensing it under GPLv2, MIT, or BSD, would work for us.

Assuming that we can clear the licensing issues (or we can write our
own implementation from spec), what would the transition plan look
like?  Does our bitmap format carry enough metadata to allow
existing clients who never saw anything but ewah bitmaps to say "ah,
this bitmap file uses encoding I do not understand" and gracefully
fall back to not using the bitmap?

Thanks.


* Re: Can I use CRoaring library in Git?
  2022-07-17 22:00       ` Junio C Hamano
@ 2022-07-17 22:25         ` Taylor Blau
  2022-07-18  8:57           ` Abhradeep Chakraborty
  0 siblings, 1 reply; 25+ messages in thread
From: Taylor Blau @ 2022-07-17 22:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Kaartic Sivaraam, Abhradeep Chakraborty,
	Ævar Arnfjörð Bjarmason,
	git

On Sun, Jul 17, 2022 at 03:00:36PM -0700, Junio C Hamano wrote:
> Kaartic Sivaraam <kaartic.sivaraam@gmail.com> writes:
>
> > The EWAH case is a bit different. The original EWAH implementation
> > [ewah-cpp] was in C++. It was then ported to C [ewah-c] by Git
> > contributors [ewah-git]. The ported version has been relicensed under
> > GPLv2 with Daniel Lemire's permission.
> >
> > The case with CRoaring is that the implementation already exists in C
> > [croaring] and that is the one which is licensed under Apache V2. I'm
> > not sure how relicensing works for already existing code.
>
> As long as the author says they are willing to relicense, that would
> "work".  It is entirely up to them.

Yes, using an existing library would be my vast preference. Not only
because it reduces the amount of work needed to prove out this new
concept (that Roaring+Run provides a speed or space advantage when
compared to EWAH), but because:

  - the existing implementation is widely-used, and would give us
    confidence in adopting a "battle-tested" implementation

  - there is a standard serialization format that is understood in the
    various language re-implementations of CRoaring

The latter point is important for users like libgit2 and JGit who would
also be able to adopt an "off the shelf" solution and have the bitmaps
be read according to the standard format.

> Assuming that we can clear the licensing issues (or we can write our
> own implementation from spec), what would the transition plan look
> like?  Does our bitmap format carry enough metadata to allow
> existing clients who never saw anything but ewah bitmaps to say "ah,
> this bitmap file uses encoding I do not understand" and gracefully
> fall back to not using the bitmap?

Yes, the version field alone does this, since the existing readers know
to ignore a bitmap whose version they do not understand.
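
As an illustration of that fallback, a reader-side check could look roughly
like the sketch below. It assumes the header starts with the 4-byte signature
"BITM" followed by a 2-byte version in network byte order, as described in
Git's bitmap format documentation. The function, enum, and macro names are
hypothetical, and this is not Git's actual loader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BITMAP_SUPPORTED_VERSION 1	/* hypothetical name for this sketch */

enum bitmap_check {
	BITMAP_OK,	/* header understood, bitmap may be used */
	BITMAP_IGNORE	/* unknown or damaged header: behave as if no bitmap exists */
};

static enum bitmap_check check_bitmap_header(const unsigned char *buf, size_t len)
{
	uint16_t version;

	assert(buf != NULL);

	/* Need the signature (4 bytes) plus the version field (2 bytes). */
	if (len < 6 || memcmp(buf, "BITM", 4))
		return BITMAP_IGNORE;

	/* The version is stored big-endian, so assemble it byte by byte. */
	version = (uint16_t)((buf[4] << 8) | buf[5]);
	if (version != BITMAP_SUPPORTED_VERSION)
		return BITMAP_IGNORE;

	return BITMAP_OK;
}
```

Under this scheme a bitmap written with a bumped version number would be
skipped by an older reader, which then falls back to a regular object walk.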

I assume that Abhradeep will want to pursue some format redesign as part
of the transition, though, at least to see if changing the format beyond
a version bump and new compression scheme is worthwhile.

Thanks,
Taylor


* Re: Can I use CRoaring library in Git?
  2022-07-17 22:25         ` Taylor Blau
@ 2022-07-18  8:57           ` Abhradeep Chakraborty
  2022-07-25 22:11             ` Taylor Blau
  0 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-18  8:57 UTC (permalink / raw)
  To: Taylor Blau
  Cc: Junio C Hamano, Kaartic Sivaraam,
	Ævar Arnfjörð Bjarmason, git, Derrick Stolee

On Mon, Jul 18, 2022 at 3:55 AM Taylor Blau <me@ttaylorr.com> wrote:
> > Assuming that we can clear the licensing issues (or we can write our
> > own implementation from spec), what would the transition plan look
> > like?  Does our bitmap format carry enough metadata to allow
> > existing clients who never saw anything but ewah bitmaps to say "ah,
> > this bitmap file uses encoding I do not understand" and gracefully
> > fall back to not using the bitmap?
>
> Yes, the version field alone does this, since the existing readers know
> to ignore a bitmap whose version they do not understand.

Yeah, the version field itself is enough to do this.

> I assume that Abhradeep will want to pursue some format redesign as part
> of the transition, though, at least to see if changing the format beyond
> a version bump and new compression scheme is worthwhile.

I haven't thought much about it until now. As far as I can tell, we won't
need the XOR flag anymore.
My primary goal is to implement Roaring bitmaps as soon as possible and
run performance tests. If the tests give good results, I will think
about redesigning the format.
As far as I can see, most of it will stay the same.

Thanks :)


* Re: Can I use CRoaring library in Git?
  2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
  2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
  2022-07-17 14:43 ` Derrick Stolee
@ 2022-07-18 11:13 ` Jakub Narębski
  2022-07-18 11:38   ` Abhradeep Chakraborty
  2022-07-18 11:48 ` Abhradeep Chakraborty
  2022-07-21  4:07 ` Abhradeep Chakraborty
  4 siblings, 1 reply; 25+ messages in thread
From: Jakub Narębski @ 2022-07-18 11:13 UTC (permalink / raw)
  To: Abhradeep Chakraborty; +Cc: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> Hello,
>
> I need the CRoaring[1] library to use roaring bitmaps. But it has
> Apache license v2 which is not compatible with GPLv2[2].

Actually Apache License v2.0 *is* compatible with GPLv2 and GPLv3
in the sense that you can include the Apache licensed code (like the
CRoaring library) in the GPLv2 project (like Git).

Quote from the cited "Apache License V2.0 and GPL Compatibility"[2]:

  The Free Software Foundation considers the Apache License, Version 2.0
  to be a free software license, compatible with version 3 of the GPL.
  The Software Freedom Law Center provides practical advice for
  developers about including permissively licensed source.

  Apache 2 software can therefore be included in GPLv3 projects, because
  the GPLv3 license accepts our software into GPLv3 works. However,
  GPLv3 software cannot be included in Apache projects. The licenses are
  incompatible in one direction only, and it is a result of ASF's
  licensing philosophy and the GPLv3 authors' interpretation of
  copyright law.

License compatibility is directional.

See also "The Free-Libre / Open Source Software (FLOSS) License Slide"[3]
by David A. Wheeler, which shows which licenses are compatible with
which, as a directed graph.

[3] https://dwheeler.com/essays/floss-license-slide.html

>
> Is there a way to use the CRoaring library in Git? Taylor told me that
> contrib/persistent-https tree is also licensed under Apache License
> version 2.
>
> [1] https://github.com/RoaringBitmap/CRoaring
> [2] https://www.apache.org/licenses/GPL-compatibility.html


* Re: Can I use CRoaring library in Git?
  2022-07-18 11:13 ` Jakub Narębski
@ 2022-07-18 11:38   ` Abhradeep Chakraborty
  2022-07-18 13:38     ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-18 11:38 UTC (permalink / raw)
  To: Jakub Narębski; +Cc: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam

On Mon, Jul 18, 2022 at 4:43 PM Jakub Narębski <jnareb@gmail.com> wrote:
>
> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>
> > Hello,
> >
> > I need the CRoaring[1] library to use roaring bitmaps. But it has
> > Apache license v2 which is not compatible with GPLv2[2].
>
> Actually Apache License v2.0 *is* compatible with GPLv2 and GPLv3
> in the sense that you can include the Apache licensed code (like the
> CRoaring library) in the GPLv2 project (like Git).
>
> Quote from the cited "Apache License V2.0 and GPL Compatibility"[2]:
>
>   The Free Software Foundation considers the Apache License, Version 2.0
>   to be a free software license, compatible with version 3 of the GPL.
>   The Software Freedom Law Center provides practical advice for
>   developers about including permissively licensed source.
>
>   Apache 2 software can therefore be included in GPLv3 projects, because
>   the GPLv3 license accepts our software into GPLv3 works. However,
>   GPLv3 software cannot be included in Apache projects. The licenses are
>   incompatible in one direction only, and it is a result of ASF's
>   licensing philosophy and the GPLv3 authors' interpretation of
>   copyright law.

But the same article also says  -

  Despite our best efforts, the FSF has never considered the Apache License
  to be compatible with GPL version 2, citing the patent termination and
  indemnification provisions as restrictions not present in the older GPL
  license. The Apache Software Foundation believes that you should always
  try to obey the constraints expressed by the copyright holder when
  redistributing their work.


* Re: Can I use CRoaring library in Git?
  2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
                   ` (2 preceding siblings ...)
  2022-07-18 11:13 ` Jakub Narębski
@ 2022-07-18 11:48 ` Abhradeep Chakraborty
  2022-07-18 12:18   ` Derrick Stolee
  2022-07-21  4:07 ` Abhradeep Chakraborty
  4 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-18 11:48 UTC (permalink / raw)
  To: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam,
	Derrick Stolee, Ævar Arnfjörð Bjarmason,
	Jakub Narębski

I just got to know that CRoaring doesn't support Big Endian systems (till now) -

https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps

What do you think about this?


* Re: Can I use CRoaring library in Git?
  2022-07-18 11:48 ` Abhradeep Chakraborty
@ 2022-07-18 12:18   ` Derrick Stolee
  2022-07-18 13:15     ` Abhradeep Chakraborty
                       ` (2 more replies)
  0 siblings, 3 replies; 25+ messages in thread
From: Derrick Stolee @ 2022-07-18 12:18 UTC (permalink / raw)
  To: Abhradeep Chakraborty, git, Junio C Hamano, Taylor Blau,
	Kaartic Sivaraam, Ævar Arnfjörð Bjarmason,
	Jakub Narębski

On 7/18/22 7:48 AM, Abhradeep Chakraborty wrote:
> I just got to know that CRoaring doesn't support Big Endian systems (till now) -
> 
> https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps
> 
> What do you think about this?

Git cares enough about compatibility that that might be a
deal-breaker for taking the code as-is. If we _did_ take it
as-is, then we would need to not make it available on such
machines using compiler macros.
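
A rough sketch of what such gating could look like, assuming the GCC/Clang
predefined macros __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__ are available.
USE_ROARING_BITMAPS and the two function names are hypothetical; they are not
existing Git or CRoaring symbols.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Disable the (hypothetical) roaring code path on hosts known to be big-endian. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define USE_ROARING_BITMAPS 0
#else
#define USE_ROARING_BITMAPS 1
#endif

/* Runtime probe: is the lowest-addressed byte of a 16-bit 1 equal to 1? */
static inline int host_is_little_endian(void)
{
	const uint16_t probe = 1;
	unsigned char first;

	memcpy(&first, &probe, 1);
	return first == 1;
}

/* Whether callers may use roaring bitmaps on this build. */
static inline int roaring_bitmaps_available(void)
{
#if USE_ROARING_BITMAPS
	/* Catch compilers that lack the macros above on a big-endian host. */
	assert(host_is_little_endian());
	return 1;
#else
	return 0;
#endif
}
```

The cost of this approach is that behavior then differs between
architectures, since big-endian builds would simply never see the new format.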

Thanks,
-Stolee


* Re: Can I use CRoaring library in Git?
  2022-07-18 12:18   ` Derrick Stolee
@ 2022-07-18 13:15     ` Abhradeep Chakraborty
  2022-07-18 21:48     ` brian m. carlson
  2022-07-25 22:14     ` Taylor Blau
  2 siblings, 0 replies; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-18 13:15 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam,
	Ævar Arnfjörð Bjarmason, Jakub Narębski

On Mon, Jul 18, 2022 at 5:48 PM Derrick Stolee <derrickstolee@github.com> wrote:
>
> On 7/18/22 7:48 AM, Abhradeep Chakraborty wrote:
> > I just got to know that CRoaring doesn't support Big Endian systems (till now) -
> >
> > https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps
> >
> > What do you think about this?
>
> Git cares enough about compatibility that that might be a
> deal-breaker for taking the code as-is. If we _did_ take it
> as-is, then we would need to not make it available on such
> machines using compiler macros.

Yeah, we can't use the code as-is. We might need to make some
Git-favourable changes on top of it if we use the library.

I still haven't asked him about Relicensing. Let us see what others say.

Thanks :)


* Re: Can I use CRoaring library in Git?
  2022-07-18 11:38   ` Abhradeep Chakraborty
@ 2022-07-18 13:38     ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-18 13:38 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Jakub Narębski, git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam


On Mon, Jul 18 2022, Abhradeep Chakraborty wrote:

> On Mon, Jul 18, 2022 at 4:43 PM Jakub Narębski <jnareb@gmail.com> wrote:
>>
>> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>>
>> > Hello,
>> >
>> > I need the CRoaring[1] library to use roaring bitmaps. But it has
>> > Apache license v2 which is not compatible with GPLv2[2].
>>
>> Actually Apache License v2.0 *is* compatible with GPLv2 and GPLv3
>> in the sense that you can include the Apache licensed code (like the
>> CRoaring library) in the GPLv2 project (like Git).
>>
>> Quote from the cited "Apache License V2.0 and GPL Compatibility"[2]:
>>
>>   The Free Software Foundation considers the Apache License, Version 2.0
>>   to be a free software license, compatible with version 3 of the GPL.
>>   The Software Freedom Law Center provides practical advice for
>>   developers about including permissively licensed source.
>>
>>   Apache 2 software can therefore be included in GPLv3 projects, because
>>   the GPLv3 license accepts our software into GPLv3 works. However,
>>   GPLv3 software cannot be included in Apache projects. The licenses are
>>   incompatible in one direction only, and it is a result of ASF's
>>   licensing philosophy and the GPLv3 authors' interpretation of
>>   copyright law.
>
> But the same article also says  -
>
>   Despite our best efforts, the FSF has never considered the Apache License
>   to be compatible with GPL version 2, citing the patent termination and
>   indemnification provisions as restrictions not present in the older GPL
>   license. The Apache Software Foundation believes that you should always
>   try to obey the constraints expressed by the copyright holder when
>   redistributing their work.

...indeed, and for those that don't remember: around the time the GPLv3
was being discussed & eventually released, having it be compatible with
the Apache license was a major thing that the Apache Foundation and FSF
worked towards.

But we use GPLv2 only, which as you note is explicitly known to be
incompatible with Apache v2.0.



* Re: Can I use CRoaring library in Git?
  2022-07-18 12:18   ` Derrick Stolee
  2022-07-18 13:15     ` Abhradeep Chakraborty
@ 2022-07-18 21:48     ` brian m. carlson
  2022-07-25 22:14     ` Taylor Blau
  2 siblings, 0 replies; 25+ messages in thread
From: brian m. carlson @ 2022-07-18 21:48 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Abhradeep Chakraborty, git, Junio C Hamano, Taylor Blau,
	Kaartic Sivaraam, Ævar Arnfjörð Bjarmason,
	Jakub Narębski


On 2022-07-18 at 12:18:14, Derrick Stolee wrote:
> On 7/18/22 7:48 AM, Abhradeep Chakraborty wrote:
> > I just got to know that CRoaring doesn't support Big Endian systems (till now) -
> > 
> > https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps
> > 
> > What do you think about this?
> 
> Git cares enough about compatibility that that might be a
> deal-breaker for taking the code as-is. If we _did_ take it
> as-is, then we would need to not make it available on such
> machines using compiler macros.

Debian definitely targets big-endian systems and the Debian maintainer
will likely not be amused if functionality differs across systems.  That
tends to add a bunch of hassle to the maintenance process and ends up
resulting in bug reports and a poor user experience.

I certainly strongly feel that our code should be fully functional
across all architectures that make POSIX-compatible assumptions,
including big-endian systems.  I've ported code to make it work on
UltraSPARC before (for endianness and alignment) and it isn't usually
too hard to fix things, so we will likely be able to ship code that's
portable.

In addition, we also need to consider that other systems like NetBSD,
Dragonfly BSD, and OpenBSD are not supported upstream, and thus we will
likely need to patch the code anyway.
-- 
brian m. carlson (he/him or they/them)
Toronto, Ontario, CA



* Re: Can I use CRoaring library in Git?
  2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
                   ` (3 preceding siblings ...)
  2022-07-18 11:48 ` Abhradeep Chakraborty
@ 2022-07-21  4:07 ` Abhradeep Chakraborty
  2022-07-21  6:12   ` Junio C Hamano
  4 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-21  4:07 UTC (permalink / raw)
  To: git, Junio C Hamano, Taylor Blau, Kaartic Sivaraam,
	Derrick Stolee, Ævar Arnfjörð Bjarmason,
	Jakub Narębski

On Sat, Jul 16, 2022 at 7:20 PM Abhradeep Chakraborty
<chakrabortyabhradeep79@gmail.com> wrote:
>
> Hello,
>
> I need the CRoaring[1] library to use roaring bitmaps. But it has
> Apache license v2 which is not compatible with GPLv2[2].

I have reached out to Daniel and he agreed to make CRoaring
dual-licensed under MIT and Apachev2[1].
Now, I can use CRoaring, right?

[1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A

Thanks :)


* Re: Can I use CRoaring library in Git?
  2022-07-21  4:07 ` Abhradeep Chakraborty
@ 2022-07-21  6:12   ` Junio C Hamano
  2022-07-21 12:14     ` Derrick Stolee
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2022-07-21  6:12 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: git, Taylor Blau, Kaartic Sivaraam, Derrick Stolee,
	Ævar Arnfjörð Bjarmason, Jakub Narębski

Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:

> On Sat, Jul 16, 2022 at 7:20 PM Abhradeep Chakraborty
> <chakrabortyabhradeep79@gmail.com> wrote:
>>
>> Hello,
>>
>> I need the CRoaring[1] library to use roaring bitmaps. But it has
>> Apache license v2 which is not compatible with GPLv2[2].
>
> I have reached out to Daniel and he agreed to make CRoaring
> dual-licensed under MIT and Apachev2[1].
> Now, I can use CRoaring, right?
>
> [1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A
>
> Thanks :)

Nice.


* Re: Can I use CRoaring library in Git?
  2022-07-21  6:12   ` Junio C Hamano
@ 2022-07-21 12:14     ` Derrick Stolee
  2022-07-21 13:51       ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Derrick Stolee @ 2022-07-21 12:14 UTC (permalink / raw)
  To: Junio C Hamano, Abhradeep Chakraborty
  Cc: git, Taylor Blau, Kaartic Sivaraam,
	Ævar Arnfjörð Bjarmason, Jakub Narębski

On 7/21/2022 2:12 AM, Junio C Hamano wrote:
> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
> 
>> On Sat, Jul 16, 2022 at 7:20 PM Abhradeep Chakraborty
>> <chakrabortyabhradeep79@gmail.com> wrote:
>>>
>>> Hello,
>>>
>>> I need the CRoaring[1] library to use roaring bitmaps. But it has
>>> Apache license v2 which is not compatible with GPLv2[2].
>>
>> I have reached out to Daniel and he agreed to make CRoaring
>> dual-licensed under MIT and Apachev2[1].
>> Now, I can use CRoaring, right?
>>
>> [1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A
>>
>> Thanks :)
> 
> Nice.

Great news! Thanks for reaching out. I'm pleasantly surprised at
the turnaround. Good luck integrating it into the Git codebase!

-Stolee


* Re: Can I use CRoaring library in Git?
  2022-07-21 12:14     ` Derrick Stolee
@ 2022-07-21 13:51       ` Ævar Arnfjörð Bjarmason
  2022-07-21 14:57         ` Abhradeep Chakraborty
  0 siblings, 1 reply; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-21 13:51 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Junio C Hamano, Abhradeep Chakraborty, git, Taylor Blau,
	Kaartic Sivaraam, Jakub Narębski


On Thu, Jul 21 2022, Derrick Stolee wrote:

> On 7/21/2022 2:12 AM, Junio C Hamano wrote:
>> Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> writes:
>> 
>>> On Sat, Jul 16, 2022 at 7:20 PM Abhradeep Chakraborty
>>> <chakrabortyabhradeep79@gmail.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> I need the CRoaring[1] library to use roaring bitmaps. But it has
>>>> Apache license v2 which is not compatible with GPLv2[2].
>>>
>>> I have reached out to Daniel and he agreed to make CRoaring
>>> dual-licensed under MIT and Apachev2[1].
>>> Now, I can use CRoaring, right?
>>>
>>> [1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A
>>>
>>> Thanks :)
>> 
>> Nice.
>
> Great news! Thanks for reaching out. I'm pleasantly surprised at
> the turnaround. Good luck integrating it into the Git codebase!

It's great that the primary author of the library wants to release it
under a compatible license.

But I feel like I'm missing something here, don't we still need the
other contributors to that code to sign off on such a license change,
and for us to be comfortable with integrating such code?

I tried a one-liner to see who has git-blame-able ranges[1] in the code,
which of course is just a rough approximation of "derived work" and
"copyright holder".

My understanding (again, not a lawyer and all that) is that such
transitions happen one of a few ways:

 A. One entity had been assigned copyright in the first place, and can
    re-license the work. E.g. the FSF requiring copyright assignments
    for anything non-trivial.

 B. The license itself has an "upgrade" clause (e.g. GPLv2 "or later"
    projects being GPLv3 compatible).

 C. All copyright holders (or near enough) agree to
    relicense. E.g. OpenStreetMap went through this process at some
    point.

Aren't we just at the beginning (but already past the most significant
step) of C?

1. $ git ls-files | xargs -P 8 -L 1 git -P blame --porcelain HEAD -- |grep -E '^author ' 2>/dev/null |sort|uniq -c|sort -nr
   2872 author Daniel Lemire
    131 author Owen Kaser
     87 author AE1020
     70 author Cerul Alain
     63 author Andrei Gudkov
     62 author François Saint-Jacques
     50 author Jacob Evans
     40 author Tom Cornebize
     37 author Mats Klepsland
     27 author Luca Deri
     17 author Mario Rugiero
     14 author DarrenJiang13
     11 author Wojciech Muła
     11 author GuillaumeHolley
     10 author Mario J. Rugiero
      9 author Paul Smith
      9 author Matt Olan
      6 author Salvatore Previti
      6 author Brian-Esch
      5 author tony eve
      5 author Chris O'Hara
      4 author Alexander Gallego
      3 author Zachary Dremann
      3 author Simon McVittie
      3 author Richard Odenweller
      3 author Hurricane Lee
      3 author daniel-j-h
      2 author stdpain
      2 author Shawn Cao
      2 author Murali Vemulapati
      2 author longqimin
      1 author Yuce Tekol
      1 author yperbasis
      1 author Saulius Grigaliunas
      1 author plantree
      1 author Nathan Kurz
      1 author Guillaume Holley
      1 author gssiyankai
      1 author Daniel Lem_equal(cm12, array_container_cardinality(AM));
      1 author Amos Bird


* Re: Can I use CRoaring library in Git?
  2022-07-21 13:51       ` Ævar Arnfjörð Bjarmason
@ 2022-07-21 14:57         ` Abhradeep Chakraborty
  2022-07-22 11:07           ` Ævar Arnfjörð Bjarmason
  0 siblings, 1 reply; 25+ messages in thread
From: Abhradeep Chakraborty @ 2022-07-21 14:57 UTC (permalink / raw)
  To: Ævar Arnfjörð Bjarmason
  Cc: Derrick Stolee, Junio C Hamano, git, Taylor Blau,
	Kaartic Sivaraam, Jakub Narębski

On Thu, Jul 21, 2022 at 7:29 PM Ævar Arnfjörð Bjarmason
<avarab@gmail.com> wrote:
>
> It's great that the primary author of the library wants to release it
> under a compatible license.
>
> But I feel like I'm missing something here, don't we still need the
> other contributors to that code to sign off on such a license change,
> and for us to be comfortable with integrating such code?

As far as I can see, they don't use sign-off in any of their commits.
I know what you mean, but the license text uses "The CRoaring
authors" rather than "Daniel Lemire". Below is the text:

    /*
     * MIT License
     *
     * Copyright 2016-2022 The CRoaring authors
     *
     * Permission is hereby granted, free of charge, to any
     * person obtaining a copy of this software and associated
     * ...
     */

So, isn't it enough for us?

> My understanding (again, not a lawyer and all that) is that such
> transitions happen one of a few ways:
>
>  A. One entity had been assigned copyright in the first place, and can
>     re-license the work. E.g. the FSF requiring copyright assignments
>     for anything non-trivial.
>
>  B. The license itself has an "upgrade" clause (e.g. GPLv2 "or later"
>     projects being GPLv3 compatible).
>
>  C. All copyright holders (or near enough) agree to
>     relicense. E.g. OpenStreetMap went through this process at some
>     point.

I got your point here. I am sure that "All copyright holders" have no
problem with this relicensing.

Daniel already said in his comment[1] that they do not have any problem with it.

[1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A/m/t8e09-wPAgAJ

Thanks :)


* Re: Can I use CRoaring library in Git?
  2022-07-21 14:57         ` Abhradeep Chakraborty
@ 2022-07-22 11:07           ` Ævar Arnfjörð Bjarmason
  0 siblings, 0 replies; 25+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2022-07-22 11:07 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Derrick Stolee, Junio C Hamano, git, Taylor Blau,
	Kaartic Sivaraam, Jakub Narębski


On Thu, Jul 21 2022, Abhradeep Chakraborty wrote:

> On Thu, Jul 21, 2022 at 7:29 PM Ævar Arnfjörð Bjarmason
> <avarab@gmail.com> wrote:
>>
>> It's great that the primary author of the library wants to release it
>> under a compatible license.
>>
>> But I feel like I'm missing something here, don't we still need the
>> other contributors to that code to sign off on such a license change,
>> and for us to be comfortable with integrating such code?
>
> As far as I can see, they don't use sign-off in any of their commits.

That's unrelated; it's just a convention that linux.git & git.git (and
maybe some others) use to mean "I pinky promise this is my work, or can
be licensed under the project terms".

It doesn't impact how copyright or software licencing works in general.

> I know what you mean, but the license text uses "The CRoaring
> authors" rather than "Daniel Lemire". Below is the text:
>
>     /*
>      * MIT License
>      *
>      * Copyright 2016-2022 The CRoaring authors
>      *
>      * Permission is hereby granted, free of charge, to any
>      * person obtaining a copy of this software and associated
>      * ...
>      */
>
> So, isn't it enough for us?

That's a commonly used shorthand for not having to exhaustively list all
authors everywhere, but it's unrelated to the process by which
dual-licencing can happen after the fact.

If you and I come up with a 1000 line file together (each contributing
500 lines) and it says "copyright <this file's authors> and we license
it under the GPLv3" that doesn't give either of us permission to then
re-license the work later without the other copyright holder's approval.

>> My understanding (again, not a lawyer and all that) is that such
>> transitions happen one of a few ways:
>>
>>  A. One entity had been assigned copyright in the first place, and can
>>     re-license the work. E.g. the FSF requiring copyright assignments
>>     for anything non-trivial.
>>
>>  B. The license itself has an "upgrade" clause (e.g. GPLv2 "or later"
>>     projects being GPLv3 compatible).
>>
>>  C. All copyright holders (or near enough) agree to
>>     relicense. E.g. OpenStreetMap went through this process at some
>>     point.
>
> I got your point here. I am sure that "All copyright holders" have no
> problem with this relicensing.

Yes, an objection seems unlikely in practice. But I'm asking because it's
not obvious from the linked-to discussion that anyone except the primary
author decided this.

So if we integrate it into git.git and one of those people /would/ have
a problem with it we'd be the ones in trouble.

> Daniel already said in his comment[1] that they do not have any problem with it.
>
> [1] https://groups.google.com/g/roaring-bitmaps/c/0d7KoA79k3A/m/t8e09-wPAgAJ

Anyway, I don't see much of a point in two non-lawyers continuing this
discussion. I just asked in case there was something obvious I was
missing. E.g. the primary author is a professor, perhaps all (or a
substantial number of) the contributors were students at the same
university, and some copyright assignment etc. happened behind the
scenes.

I think it would be prudent, if/when we decide to integrate this code, to
ask our contacts at the SFC to give this a once-over. Luckily, we do have
actual lawyers to call on if needed :)



* Re: Can I use CRoaring library in Git?
  2022-07-18  8:57           ` Abhradeep Chakraborty
@ 2022-07-25 22:11             ` Taylor Blau
  0 siblings, 0 replies; 25+ messages in thread
From: Taylor Blau @ 2022-07-25 22:11 UTC (permalink / raw)
  To: Abhradeep Chakraborty
  Cc: Junio C Hamano, Kaartic Sivaraam,
	Ævar Arnfjörð Bjarmason, git, Derrick Stolee

On Mon, Jul 18, 2022 at 02:27:59PM +0530, Abhradeep Chakraborty wrote:
> > I assume that Abhradeep will want to pursue some format redesign as part
> > of the transition, though, at least to see if changing the format beyond
> > a version bump and new compression scheme is worthwhile.
>
> I haven't thought much about it until now. As far as I can tell, we don't
> need the XOR flag anymore.

I think that would be an interesting experiment to run. I suspect that
XOR-compression is helping us quite a lot with on-disk file size with
EWAH bitmaps, but that may or may not be true with Roaring.

If Roaring can compress the same selection of bitmaps to a comparable
size without the additional layer of XOR-offsets, then I think they are
additional complexity that can be eschewed for now.
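
To make the XOR-offset idea concrete, here is a minimal sketch. It assumes
uncompressed bitmaps held as arrays of 64-bit words; the names xor_words()
and count_nonzero_words() are hypothetical and are not Git or CRoaring
functions. Storing a bitmap as its XOR against a nearby, similar bitmap
leaves mostly zero words for the compressor, and applying the same XOR
again restores the original:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: XOR each word of "base" into "dst" in place.
 * Because (a ^ b) ^ b == a, calling it a second time with the same
 * base restores the original contents of dst.
 */
static void xor_words(uint64_t *dst, const uint64_t *base, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		dst[i] ^= base[i];
}

/*
 * Hypothetical helper: count the words that still carry set bits.
 * Fewer non-zero words generally means less data left to compress.
 */
static size_t count_nonzero_words(const uint64_t *words, size_t nr)
{
	size_t i, count = 0;

	for (i = 0; i < nr; i++)
		if (words[i])
			count++;
	return count;
}
```

Whether this extra layer still pays off depends on how well the underlying
format already compresses similar bitmaps, which is the experiment
suggested above.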

(Keep in mind, we can always revisit that decision if we decide that we
want to add XOR compression back in through another version bump. But it
would be good to make backwards-incompatible changes as infrequently as
possible. So this is a good opportunity for us to be as thorough in our
experimentation as possible).

Thanks,
Taylor


* Re: Can I use CRoaring library in Git?
  2022-07-18 12:18   ` Derrick Stolee
  2022-07-18 13:15     ` Abhradeep Chakraborty
  2022-07-18 21:48     ` brian m. carlson
@ 2022-07-25 22:14     ` Taylor Blau
  2022-07-25 22:35       ` rsbecker
  2 siblings, 1 reply; 25+ messages in thread
From: Taylor Blau @ 2022-07-25 22:14 UTC (permalink / raw)
  To: Derrick Stolee
  Cc: Abhradeep Chakraborty, git, Junio C Hamano, Kaartic Sivaraam,
	Ævar Arnfjörð Bjarmason, Jakub Narębski

On Mon, Jul 18, 2022 at 08:18:14AM -0400, Derrick Stolee wrote:
> On 7/18/22 7:48 AM, Abhradeep Chakraborty wrote:
> > I just got to know that CRoaring doesn't support Big Endian systems (till now) -
> >
> > https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps
> >
> > What do you think about this?
>
> Git cares enough about compatibility that that might be a
> deal-breaker for taking the code as-is. If we _did_ take it
> as-is, then we would need to not make it available on such
> machines using compiler macros.
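
For illustration only, a compile-time guard of the kind described in the
quoted text might look like the sketch below. ROARING_BITMAPS_SUPPORTED
and host_is_little_endian() are hypothetical names, not existing Git or
CRoaring symbols; __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ are macros
predefined by GCC and Clang, so other compilers would need their own
check:

```c
#include <stdint.h>

/*
 * Hypothetical feature macro: enable the code path only when the
 * compiler reports a little-endian target. On compilers that do not
 * predefine these macros the feature is conservatively disabled.
 */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
	__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define ROARING_BITMAPS_SUPPORTED 1
#else
#define ROARING_BITMAPS_SUPPORTED 0
#endif

/*
 * Hypothetical run-time cross-check: look at the first byte in memory
 * of a known 16-bit value. It is 1 only on a little-endian host.
 */
static int host_is_little_endian(void)
{
	const uint16_t probe = 1;

	return *(const unsigned char *)&probe == 1;
}
```

Code that depends on the library would then be wrapped in
"#if ROARING_BITMAPS_SUPPORTED" so that it is simply not built elsewhere.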

I definitely agree here. If I'm understanding CRoaring's implementation
correctly, a bitmap written on a machine that uses big endian would be
unreadable on a little endian machine and vice-versa.

That's definitely *not* the case with the existing EWAH bitmaps, which
are readable on machines using either endianness, since we always write
numbers in network byte order, independent of machine endianness.
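
As a rough sketch of what "always write numbers in network byte order"
means in practice: the helpers below (put_u32_be() and get_u32_be() are
hypothetical names chosen for this example) build and parse a 32-bit value
one byte at a time using shifts, so the bytes on disk are identical no
matter which kind of machine wrote them:

```c
#include <stdint.h>

/*
 * Hypothetical helper: store a 32-bit value most-significant byte
 * first (network byte order). Shifts operate on the value, not on
 * its memory layout, so the result does not depend on the host.
 */
static void put_u32_be(unsigned char *out, uint32_t v)
{
	out[0] = (unsigned char)(v >> 24);
	out[1] = (unsigned char)(v >> 16);
	out[2] = (unsigned char)(v >> 8);
	out[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value stored by put_u32_be(). */
static uint32_t get_u32_be(const unsigned char *in)
{
	return ((uint32_t)in[0] << 24) |
	       ((uint32_t)in[1] << 16) |
	       ((uint32_t)in[2] << 8) |
	       (uint32_t)in[3];
}
```

A format that instead copies in-memory integers straight to disk inherits
the writer's byte order, which is the portability problem described above.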

(I suspect you know all of this already, but just stating here
explicitly for the benefit of the list).

Thanks,
Taylor


* RE: Can I use CRoaring library in Git?
  2022-07-25 22:14     ` Taylor Blau
@ 2022-07-25 22:35       ` rsbecker
  2022-07-25 23:37         ` Taylor Blau
  0 siblings, 1 reply; 25+ messages in thread
From: rsbecker @ 2022-07-25 22:35 UTC (permalink / raw)
  To: 'Taylor Blau', 'Derrick Stolee'
  Cc: 'Abhradeep Chakraborty', 'git',
	'Junio C Hamano', 'Kaartic Sivaraam',
	'Ævar Arnfjörð Bjarmason',
	'Jakub Narębski'

On July 25, 2022 6:15 PM, Taylor Blau wrote:
>On Mon, Jul 18, 2022 at 08:18:14AM -0400, Derrick Stolee wrote:
>> On 7/18/22 7:48 AM, Abhradeep Chakraborty wrote:
>> > I just got to know that CRoaring doesn't support Big Endian systems
>> > (till now) -
>> >
>> > https://groups.google.com/g/roaring-bitmaps/c/CzLmIRnYlps
>> >
>> > What do you think about this?
>>
>> Git cares enough about compatibility that that might be a deal-breaker
>> for taking the code as-is. If we _did_ take it as-is, then we would
>> need to not make it available on such machines using compiler macros.
>
>I definitely agree here. If I'm understanding CRoaring's implementation correctly, a
>bitmap written on a machine that uses big endian would be unreadable on a little
>endian machine and vice-versa.
>
>That's definitely *not* the case with the existing EWAH bitmaps, which are
>readable on machines using either endianness, since we always write numbers in
>network byte order, independent of machine endianness.
>
>(I suspect you know all of this already, but just stating here explicitly for the
>benefit of the list).

Is it possible to use a Clean/Smudge filter to normalize the format to be independent of the endianness of the target platforms?

I have to admit being a fan of that approach.
--Randall



* Re: Can I use CRoaring library in Git?
  2022-07-25 22:35       ` rsbecker
@ 2022-07-25 23:37         ` Taylor Blau
  0 siblings, 0 replies; 25+ messages in thread
From: Taylor Blau @ 2022-07-25 23:37 UTC (permalink / raw)
  To: rsbecker
  Cc: 'Taylor Blau', 'Derrick Stolee',
	'Abhradeep Chakraborty', 'git',
	'Junio C Hamano', 'Kaartic Sivaraam',
	'Ævar Arnfjörð Bjarmason',
	'Jakub Narębski'

On Mon, Jul 25, 2022 at 06:35:30PM -0400, rsbecker@nexbridge.com wrote:
> Is it possible to use a Clean/Smudge filter to normalize the format to
> be independent of the endianness of the target platforms?
>
> I have to admit being a fan of that approach.

We aren't checking these files in, since these are just the .bitmap
files stored in $GIT_DIR/objects/pack. So we'd have to invent such a
mechanism, and I suspect it would be less effort (and more
straightforward to implement) if we write all numbers in network byte
order from the start.

Thanks,
Taylor


end of thread, other threads:[~2022-07-25 23:37 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-16 13:50 Can I use CRoaring library in Git? Abhradeep Chakraborty
2022-07-16 14:16 ` Ævar Arnfjörð Bjarmason
2022-07-16 16:26   ` Abhradeep Chakraborty
2022-07-17 12:25     ` Kaartic Sivaraam
2022-07-17 22:00       ` Junio C Hamano
2022-07-17 22:25         ` Taylor Blau
2022-07-18  8:57           ` Abhradeep Chakraborty
2022-07-25 22:11             ` Taylor Blau
2022-07-17 14:43 ` Derrick Stolee
2022-07-18 11:13 ` Jakub Narębski
2022-07-18 11:38   ` Abhradeep Chakraborty
2022-07-18 13:38     ` Ævar Arnfjörð Bjarmason
2022-07-18 11:48 ` Abhradeep Chakraborty
2022-07-18 12:18   ` Derrick Stolee
2022-07-18 13:15     ` Abhradeep Chakraborty
2022-07-18 21:48     ` brian m. carlson
2022-07-25 22:14     ` Taylor Blau
2022-07-25 22:35       ` rsbecker
2022-07-25 23:37         ` Taylor Blau
2022-07-21  4:07 ` Abhradeep Chakraborty
2022-07-21  6:12   ` Junio C Hamano
2022-07-21 12:14     ` Derrick Stolee
2022-07-21 13:51       ` Ævar Arnfjörð Bjarmason
2022-07-21 14:57         ` Abhradeep Chakraborty
2022-07-22 11:07           ` Ævar Arnfjörð Bjarmason
