* Transparently encrypt repository contents with GPG
@ 2009-03-12 21:19 Matthias Nothhaft
  2009-03-12 21:34 ` Sverre Rabbelier
  2012-04-21 17:25 ` bigbear
  0 siblings, 2 replies; 18+ messages in thread
From: Matthias Nothhaft @ 2009-03-12 21:19 UTC
  To: git

Hi,

I'm new to Git but I really already love it. ;-)

I would like to have a repository that transparently encrypts and
decrypts all files using GPG.

What I need is a way to automatically modify each file

a) before it is written in the repository
b) after it is read from the repository

Is there a way to get this to work somehow? Can someone give me some
hints on where I need to begin?

regards,
Matthias


* Re: Transparently encrypt repository contents with GPG
  2009-03-12 21:19 Transparently encrypt repository contents with GPG Matthias Nothhaft
@ 2009-03-12 21:34 ` Sverre Rabbelier
  2009-03-13 10:46   ` Michael J Gruber
  2012-04-21 17:25 ` bigbear
  1 sibling, 1 reply; 18+ messages in thread
From: Sverre Rabbelier @ 2009-03-12 21:34 UTC
  To: Matthias Nothhaft; +Cc: git

Heya,

On Thu, Mar 12, 2009 at 22:19, Matthias Nothhaft
<matthias.nothhaft@googlemail.com> wrote:
> What I need is a way to automatically modify each file
>
> a) before it is written in the repository
> b) after it is read from the repository

Have a look at smudging, you might not need to touch the git source
code at all ;).

-- 
Cheers,

Sverre Rabbelier


* Re: Transparently encrypt repository contents with GPG
  2009-03-12 21:34 ` Sverre Rabbelier
@ 2009-03-13 10:46   ` Michael J Gruber
  2009-03-13 10:51     ` Sverre Rabbelier
  2009-03-13 11:15     ` Thomas Rast
  0 siblings, 2 replies; 18+ messages in thread
From: Michael J Gruber @ 2009-03-13 10:46 UTC
  To: Sverre Rabbelier; +Cc: Matthias Nothhaft, git

Sverre Rabbelier venit, vidit, dixit 12.03.2009 22:34:
> Heya,
> 
> On Thu, Mar 12, 2009 at 22:19, Matthias Nothhaft
> <matthias.nothhaft@googlemail.com> wrote:
>> What I need is a way to automatically modify each file
>>
>> a) before it is written in the repository
>> b) after it is read from the repository
> 
> Have a look at smudging, you might not need to touch the git source
> code at all ;).
> 

And people asked me not to be cryptic... even though the OP explicitly
asked for encryption, of course ;)

"git help attributes" may help: look for filter and set attributes and
config (filter.$name.{clean,smudge}) accordingly. smudge should probably
decrypt, clean should encrypt.

BTW: Why not use an encrypted file system? That way your work tree would
be encrypted also.

Cheers,
Michael


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 10:46   ` Michael J Gruber
@ 2009-03-13 10:51     ` Sverre Rabbelier
  2009-03-13 11:15     ` Thomas Rast
  1 sibling, 0 replies; 18+ messages in thread
From: Sverre Rabbelier @ 2009-03-13 10:51 UTC
  To: Michael J Gruber; +Cc: Matthias Nothhaft, git

Heya,

On Fri, Mar 13, 2009 at 11:46, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> And people asked me not to be cryptic... even though the OP explicitly
> asked for encryption, of course ;)

I wasn't being cryptic, I just don't remember the details of smudge,
just that it exists, and that it allows you to perform operations on a
file on checkout and on add.

-- 
Cheers,

Sverre Rabbelier


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 10:46   ` Michael J Gruber
  2009-03-13 10:51     ` Sverre Rabbelier
@ 2009-03-13 11:15     ` Thomas Rast
  2009-03-13 11:17       ` Sverre Rabbelier
  1 sibling, 1 reply; 18+ messages in thread
From: Thomas Rast @ 2009-03-13 11:15 UTC
  To: Michael J Gruber; +Cc: Sverre Rabbelier, Matthias Nothhaft, git


Michael J Gruber wrote:
> "git help attributes" may help: look for filter and set attributes and
> config (filter.$name.{clean,smudge}) accordingly. smudge should probably
> decrypt, clean should encrypt.

Wouldn't this trip over the randomness included in all encryption [to
avoid generating the same cyphertext for two separate identical
messages, which gives away some information], which would let git
think the file has been changed as soon as its stat info has changed
(or is just racy)?

Not to mention that this makes most source-oriented features such as
diff, blame, merge, etc., rather useless.
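
The randomness concern can be reproduced without git. The following is a minimal sketch, assuming the `openssl` command-line tool is available; the passphrase and sample text are made up for illustration, and the legacy key derivation (its warning is silenced here) is acceptable only for a demonstration:

```shell
#!/bin/sh
# Demo: salted encryption of identical input yields different ciphertext each run.
pass='demo-passphrase'            # hypothetical passphrase, illustration only
work=$(mktemp -d)
printf 'same content\n' > "$work/plain"

# Encrypt the same file twice; openssl picks a fresh random salt each time.
openssl enc -aes-256-cbc -pass "pass:$pass" -in "$work/plain" -out "$work/c1" 2>/dev/null
openssl enc -aes-256-cbc -pass "pass:$pass" -in "$work/plain" -out "$work/c2" 2>/dev/null

# The two ciphertexts differ ...
if cmp -s "$work/c1" "$work/c2"; then same_cipher=yes; else same_cipher=no; fi

# ... yet both decrypt back to the original content.
openssl enc -d -aes-256-cbc -pass "pass:$pass" -in "$work/c1" -out "$work/p1" 2>/dev/null
openssl enc -d -aes-256-cbc -pass "pass:$pass" -in "$work/c2" -out "$work/p2" 2>/dev/null
if cmp -s "$work/plain" "$work/p1" && cmp -s "$work/plain" "$work/p2"; then
    same_plain=yes
else
    same_plain=no
fi

echo "ciphertexts identical: $same_cipher; plaintexts identical: $same_plain"
rm -rf "$work"
```

A clean filter that encrypts this way would therefore hand git different bytes for the same work tree content on every invocation.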

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 11:15     ` Thomas Rast
@ 2009-03-13 11:17       ` Sverre Rabbelier
  2009-03-13 13:56         ` Michael J Gruber
  0 siblings, 1 reply; 18+ messages in thread
From: Sverre Rabbelier @ 2009-03-13 11:17 UTC
  To: Thomas Rast; +Cc: Michael J Gruber, Matthias Nothhaft, git

Heya,

On Fri, Mar 13, 2009 at 12:15, Thomas Rast <trast@student.ethz.ch> wrote:
> Not to mention that this makes most source-oriented features such as
> diff, blame, merge, etc., rather useless.

I would assume that smudge takes care of this somehow, it'd seem like
a rather useless feature otherwise :).

-- 
Cheers,

Sverre Rabbelier


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 11:17       ` Sverre Rabbelier
@ 2009-03-13 13:56         ` Michael J Gruber
  2009-03-13 14:19           ` Sverre Rabbelier
                             ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Michael J Gruber @ 2009-03-13 13:56 UTC
  To: Sverre Rabbelier; +Cc: Thomas Rast, Michael J Gruber, Matthias Nothhaft, git

Sverre Rabbelier venit, vidit, dixit 13.03.2009 12:17:
> Heya,
> 
> On Fri, Mar 13, 2009 at 12:15, Thomas Rast <trast@student.ethz.ch>
> wrote:
>> Not to mention that this makes most source-oriented features such
>> as diff, blame, merge, etc., rather useless.
> 
> I would assume that smudge takes care of this somehow, it'd seem
> like a rather useless feature otherwise :).

Sverre was being prophetic with the somehow. Here's a working setup
(though I still don't know why not to use luks):

In .gitattributes (or .git/info/a..) use

* filter=gpg diff=gpg

In your config:

[filter "gpg"]
        smudge = gpg -d -q --batch --no-tty
        clean = gpg -ea -q --batch --no-tty -r C920A124
[diff "gpg"]
        textconv = decrypt

This gives you textual diffs even in log! You want to use gpg-agent here.

Now for Sverre's prophecy and the helper I haven't shown you yet: It
turns out that blobs are not smudged before they are fed to textconv!
[Also, it seems that the textconv config does allow parameters, but I
haven't checked thoroughly.]

This means that e.g. when diffing work tree with HEAD textconv is called
twice: once is with a smudged file (from the work tree) and once with a
cleaned file (from HEAD). That's why I needed a small helper script
"decrypt" which does nothing but

#!/bin/sh
gpg -d -q --batch --no-tty "$1" || cat "$1"

Yeah, this assumes gpg errors out because it's fed something unencrypted
(and not encrypted with the wrong key) etc. It's only proof of concept
quality.
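
One way to make such a helper depend less on gpg's exit status is to look for the ASCII-armor header that `gpg -a` writes before attempting decryption. The following is only a sketch under that assumption; the function names are made up, and binary (non-armored) ciphertext would not be detected:

```shell
#!/bin/sh
# is_armored FILE: succeed if FILE starts with an OpenPGP ASCII-armor header.
is_armored() {
    [ "$(head -n 1 "$1")" = '-----BEGIN PGP MESSAGE-----' ]
}

# textconv_decrypt FILE: decrypt armored input, pass anything else through unchanged.
textconv_decrypt() {
    if is_armored "$1"; then
        gpg -d -q --batch --no-tty "$1"
    else
        cat "$1"
    fi
}

# When saved as the "decrypt" helper, the script would end with:
#   textconv_decrypt "$1"
```

With this variant a decryption failure (wrong key, missing agent) surfaces as an error instead of silently printing ciphertext.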

Methinks it's not right that diff fails to call smudge here, is it?

Michael


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 13:56         ` Michael J Gruber
@ 2009-03-13 14:19           ` Sverre Rabbelier
  2009-03-13 17:13           ` Jeff King
  2009-03-13 20:23           ` Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Sverre Rabbelier @ 2009-03-13 14:19 UTC
  To: Michael J Gruber; +Cc: Thomas Rast, Michael J Gruber, Matthias Nothhaft, git

Heya,

On Fri, Mar 13, 2009 at 14:56, Michael J Gruber
<michaeljgruber+gmane@fastmail.fm> wrote:
> Sverre was being prophetic with the somehow. Here's a working setup
> (though I still don't know why not to use luks):

Glad to hear I was right ;). Also awesome that you looked into this
and shared your findings, thanks!

-- 
Cheers,

Sverre Rabbelier


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 13:56         ` Michael J Gruber
  2009-03-13 14:19           ` Sverre Rabbelier
@ 2009-03-13 17:13           ` Jeff King
  2009-03-13 20:23           ` Junio C Hamano
  2 siblings, 0 replies; 18+ messages in thread
From: Jeff King @ 2009-03-13 17:13 UTC
  To: Michael J Gruber
  Cc: Sverre Rabbelier, Thomas Rast, Michael J Gruber, Matthias Nothhaft, git

On Fri, Mar 13, 2009 at 02:56:22PM +0100, Michael J Gruber wrote:

> Sverre was being prophetic with the somehow. Here's a working setup
> (though I still don't know why not to use luks):
> 
> In .gitattributes (or .git/info/a..) use
> 
> * filter=gpg diff=gpg
> 
> In your config:
> 
> [filter "gpg"]
>         smudge = gpg -d -q --batch --no-tty
>         clean = gpg -ea -q --batch --no-tty -r C920A124
> [diff "gpg"]
>         textconv = decrypt
> 
> This gives you textual diffs even in log! You want to use gpg-agent here.

This is not going to work very well in general.  Smudging and cleaning
is about putting the canonical version of a file in the git repo, and
munging it for the working tree. Trying to go backwards is going to lead
to problems, including:

  1. Git sometimes wants to look at content of special files inside
     trees, like .gitignore. Now it can't.

  2. Git uses timestamps and inodes to decide whether files need to be
     looked at all to determine if they are different. So when you do
     a checkout and "git diff", everything will look OK. But when it
     does actually look at file contents, it compares canonical
     versions. And your canonical versions are going to be _different_
     every time you encrypt, even if the content is the same:

       echo content >file
       git add file
       git diff ;# no output
       touch file
       git diff ;# looks like file is totally rewritten

     So you will probably end up with extra cruft in your commits if you
     ever touch files.
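
The same effect can be observed at the object level. The following is a minimal sketch, assuming `git` is installed; the filter name `noisy` is hypothetical and stands in for an encrypting clean filter by prepending a random line, so no GPG key is needed:

```shell
#!/bin/sh
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

# Stand-in for encryption: clean prepends 8 random bytes (as hex); smudge drops that line.
git config filter.noisy.clean 'od -An -N8 -tx1 /dev/urandom; cat'
git config filter.noisy.smudge 'sed 1d'
echo 'file filter=noisy' > .gitattributes
echo content > file

# hash-object applies the clean filter, so identical content gets a new blob ID each time.
id1=$(git hash-object file)
id2=$(git hash-object file)

# Without the filter the ID is stable.
raw1=$(git hash-object --no-filters file)
raw2=$(git hash-object --no-filters file)

echo "filtered: $id1 $id2"
echo "raw:      $raw1 $raw2"
```

Whenever git has to re-read the file (for example after `touch`), it compares such a freshly cleaned version against the stored blob, which is why the file then shows up as modified.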

> Now for Sverre's prophecy and the helper I haven't shown you yet: It
> turns out that blobs are not smudged before they are fed to textconv!
> [Also, it seems that the textconv config does allow parameters, but I
> haven't checked thoroughly.]

I don't think they should be smudged. Smudging is about converting for
the working tree, and the diff is operating on canonical formats. If
anything, I think the error is that we feed smudged data from the
working tree to textconv; we should always be handing it clean data (and
this goes for external diff, too, which I suspect behaves the same way).

I haven't looked, but it probably is a result of the optimization to
reuse worktree files.

-Peff

PS If it isn't obvious, I don't think this smudge/filter technique is
the right way to go about this. But one final comment if you did want to
pursue this: you are using asymmetric encryption in your GPG invocation,
which is going to be a lot slower and the result will take up more
space. Try using a symmetric cipher.


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 13:56         ` Michael J Gruber
  2009-03-13 14:19           ` Sverre Rabbelier
  2009-03-13 17:13           ` Jeff King
@ 2009-03-13 20:23           ` Junio C Hamano
  2009-03-14 11:16             ` Michael J Gruber
  2009-03-17  8:22             ` Jeff King
  2 siblings, 2 replies; 18+ messages in thread
From: Junio C Hamano @ 2009-03-13 20:23 UTC
  To: Michael J Gruber
  Cc: Sverre Rabbelier, Thomas Rast, Michael J Gruber, Matthias Nothhaft, git

Michael J Gruber <michaeljgruber+gmane@fastmail.fm> writes:

> In .gitattributes (or .git/info/a..) use
>
> * filter=gpg diff=gpg
>
> In your config:
>
> [filter "gpg"]
>         smudge = gpg -d -q --batch --no-tty
>         clean = gpg -ea -q --batch --no-tty -r C920A124
> [diff "gpg"]
>         textconv = decrypt
>
> This gives you textual diffs even in log! You want to use gpg-agent here.

Don't do this.

Think why the smudge/clean pair exists.

The version controlled data, the contents, may not be suitable for
consumption in the work tree in its verbatim form.  For example, a cross
platform project would want to consistently use LF line termination inside
a repository, but on a platform whose tools expect CRLF line endings, the
contents cannot be used verbatim.  We "smudge" the contents running
unix2dos when checking things out on such platforms, and "clean" the
platform specific CRLF line endings by running dos2unix when checking
things in.  By doing so, you can see what really got changed between
versions without getting distracted, and more importantly, "you" in this
sentence is not limited to the human end users alone.
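
The line-ending case can be sketched with two small shell functions standing in for dos2unix and unix2dos. The names `clean` and `smudge` are illustrative only, and a real setup would normally rely on git's built-in end-of-line conversion; the point is that `clean` maps every platform-specific form back to one canonical form:

```shell
#!/bin/sh
# clean: work tree form -> repository form (drops CR characters), like dos2unix.
clean() { tr -d '\r'; }

# smudge: repository form -> work tree form (LF to CRLF), like unix2dos.
smudge() { awk '{ printf "%s\r\n", $0 }'; }

# Compare hex dumps so that CR bytes stay visible.
canonical=$(printf 'line one\nline two\n' | od -An -v -tx1)
smudged=$(printf 'line one\nline two\n' | smudge | od -An -v -tx1)
roundtrip=$(printf 'line one\nline two\n' | smudge | clean | od -An -v -tx1)

if [ "$canonical" = "$roundtrip" ]; then
    echo "clean(smudge(x)) is byte-identical to x"
fi
```

An encrypting clean filter breaks exactly this property: it has no single canonical output for a given work tree file.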

git internally runs diff and xdelta to see what was changed, so that:

 * it can reduce storage requirement when it runs pack-objects;

 * it can check what path in the preimage was similar to what other path
   in the postimage, to deduce a rename;

 * it can check what blocks of lines in the postimage came from what other
   blocks of lines in the preimage, to pass blames across file boundaries.

If your "clean" encrypts and "smudge" decrypts, it means you are refusing
all the benefits git offers.  You are making a pair of similar "smudged"
contents totally dissimilar in their "clean" counterparts.  That is simply
backwards.

As the sole raison d'etre of diff.textconv is to allow potentially lossy
conversion (e.g. msword-to-text) applied to the preimage and postimage
pair of contents (that are supposed to be "clean") before giving a textual
diff for human consumption, the above config may appear to work, but if you
really want an encrypted repository, you should be using an encrypting
filesystem.  That would give an added benefit that the work tree
associated with your repository would also be encrypted.


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 20:23           ` Junio C Hamano
@ 2009-03-14 11:16             ` Michael J Gruber
  2009-03-14 18:45               ` Junio C Hamano
  2009-03-17  8:22             ` Jeff King
  1 sibling, 1 reply; 18+ messages in thread
From: Michael J Gruber @ 2009-03-14 11:16 UTC
  To: Junio C Hamano
  Cc: Sverre Rabbelier, Thomas Rast, Michael J Gruber,
	Matthias Nothhaft, git, Jeff King

Junio C Hamano venit, vidit, dixit 13.03.2009 21:23:
> Michael J Gruber <michaeljgruber+gmane@fastmail.fm> writes:
> 
>> In .gitattributes (or .git/info/a..) use
>>
>> * filter=gpg diff=gpg
>>
>> In your config:
>>
>> [filter "gpg"]
>>         smudge = gpg -d -q --batch --no-tty
>>         clean = gpg -ea -q --batch --no-tty -r C920A124
>> [diff "gpg"]
>>         textconv = decrypt
>>
>> This gives you textual diffs even in log! You want to use gpg-agent here.
> 
> Don't do this.
> 
> Think why the smudge/clean pair exists.
> 
> The version controlled data, the contents, may not be suitable for
> consumption in the work tree in its verbatim form.  For example, a cross
> platform project would want to consistently use LF line termination inside
> a repository, but on a platform whose tools expect CRLF line endings, the
> contents cannot be used verbatim.  We "smudge" the contents running
> unix2dos when checking things out on such platforms, and "clean" the
> platform specific CRLF line endings by running dos2unix when checking
> things in.  By doing so, you can see what really got changed between
> versions without getting distracted, and more importantly, "you" in this
> sentence is not limited to the human end users alone.
> 
> git internally runs diff and xdelta to see what was changed, so that:
> 
>  * it can reduce storage requirement when it runs pack-objects;
> 
>  * it can check what path in the preimage was similar to what other path
>    in the postimage, to deduce a rename;
> 
>  * it can check what blocks of lines in the postimage came from what other
>    blocks of lines in the preimage, to pass blames across file boundaries.
> 
> If your "clean" encrypts and "smudge" decrypts, it means you are refusing
> all the benefits git offers.  You are making a pair of similar "smudged"
> contents totally dissimilar in their "clean" counterparts.  That is simply
> backwards.
> 
> As the sole raison d'etre of diff.textconv is to allow potentially lossy
> conversion (e.g. msword-to-text) applied to the preimage and postimage
> pair of contents (that are supposed to be "clean") before giving a textual
> diff for human consumption, the above config may appear to work, but if you
> really want an encrypted repository, you should be using an encrypting
> filesystem.  That would give an added benefit that the work tree
> associated with your repository would also be encrypted.

Exactly. This is why I suggested using cryptfs/luks in my first response
already.

But I don't know the OP's requirements, which is why I also told him how
to do what he wanted, even though it has the drawbacks you and Jeff (and
maybe I) mentioned. Maybe it's an attempt at hosting a semi-private repo
on a public (free) server?

Besides the non-text nature of encrypted content, the problem here is
that d(e(x))=x for all x but e(d(x)) differs from x most probably, and
hopefully randomly, unless you use the right version of debian's openssl
of course ;)

That being said:
git diff calls textconv filters with smudged as well as cleaned files
(when diffing work tree files to blobs), and this does not seem right. I
hope this is not happening with the internal diff, nor with crlf!

Since both the cleaned and the smudged version are supposed to be
"authoritative" (as opposed to the textconv'ed one) one may argue either
way what's the right approach. For internal use, comparing the cleaned
versions may make more sense; for displaying diffs, the checked-out
form, i.e. the smudged versions, makes more sense.

But that is another topic which would need to be substantiated with
tests. It's not completely unlikely I may come up with some, but don't
count on it...

Cheers,
Michael


* Re: Transparently encrypt repository contents with GPG
  2009-03-14 11:16             ` Michael J Gruber
@ 2009-03-14 18:45               ` Junio C Hamano
  2009-03-16 16:01                 ` Michael J Gruber
  0 siblings, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2009-03-14 18:45 UTC
  To: Michael J Gruber
  Cc: Sverre Rabbelier, Thomas Rast, Matthias Nothhaft, git, Jeff King

Michael J Gruber <git@drmicha.warpmail.net> writes:

> Since both the cleaned and the smudged version are supposed to be
> "authoritative" (as opposed to the textconv'ed one) one may argue either
> way what's the right approach.

Smudged one can never be authoritative.  That is the whole point of smudge
filter and in general the whole convert_to_working_tree() infrastructure.
It changes depending on who you are (e.g. on what platform you are on).
So running comparison between two clean versions is the only sane thing to
do.

You could argue textconv should work on smudged contents or on clean
contents before smudging.  As long as it is done consistently, I do not
care either way too deeply, as its output is not supposed to be used for
anything but human consumption.  Two equally sane arrangements would be:

 (1) Start from two clean contents (run convert_to_git() if contents were
     obtained from the work tree), run textconv, run diff, and output the
     result literally; or

 (2) Start from two smudged contents (run convert_to_working_tree() for
     contents taken from the repository), run textconv, run diff, and
     run clean before sending the result to the output.

The former assumes a textconv filter that wants to work on clean
contents, the latter one that expects smudged input.  I probably
would suggest going the former approach, as it is consistent with the
general principle in other parts of the system (the internal processing
happens on clean contents).
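
Arrangement (1) can be sketched as a small pipeline. Everything here is a hypothetical stand-in rather than git's actual code path: `clean` strips carriage returns, `textconv` upper-cases the text for display, and `diff_arrangement_one` is a made-up name:

```shell
#!/bin/sh
# Hypothetical stand-ins: clean strips CRs, textconv upper-cases for display.
clean()    { tr -d '\r'; }
textconv() { tr '[:lower:]' '[:upper:]'; }

# diff_arrangement_one REPO_BLOB WORKTREE_FILE
# The blob is already clean; the work tree file is cleaned first,
# then both sides go through textconv before being compared.
diff_arrangement_one() {
    a=$(mktemp) && b=$(mktemp) || return 2
    textconv < "$1" > "$a"
    clean < "$2" | textconv > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

With this ordering, a work tree file that differs from the blob only in its smudged representation produces an empty diff.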

Both of the above assume that the output should come in clean form;
it is consistent with the way normal diff is generated for consumption by
git-apply. You can certainly argue that the final output should be in
smudged form when textconv is used, as it is purely for human consumption,
and is not even supposed to be fed to apply.


* Re: Transparently encrypt repository contents with GPG
  2009-03-14 18:45               ` Junio C Hamano
@ 2009-03-16 16:01                 ` Michael J Gruber
  2009-03-17  7:40                   ` Jeff King
  0 siblings, 1 reply; 18+ messages in thread
From: Michael J Gruber @ 2009-03-16 16:01 UTC
  To: Junio C Hamano
  Cc: Sverre Rabbelier, Thomas Rast, Matthias Nothhaft, git, Jeff King

Junio C Hamano venit, vidit, dixit 14.03.2009 19:45:
> Michael J Gruber <git@drmicha.warpmail.net> writes:
> 
>> Since both the cleaned and the smudged version are supposed to be
>> "authoritative" (as opposed to the textconv'ed one) one may argue either
>> way what's the right approach.
> 
> Smudged one can never be authoritative.  That is the whole point of smudge
> filter and in general the whole convert_to_working_tree() infrastructure.
> It changes depending on who you are (e.g. on what platform you are on).
> So running comparison between two clean versions is the only sane thing to
> do.

Yes. I guess I'm being too much of a mathematician here: if clean is a
well-defined function, then clean(x) is well defined by specifying x. In
that sense x is equally authoritative.
Again, if smudge is the inverse of clean, i.e. smudge and clean are
bijective, then x differs from y iff clean(x) differs from clean(y).

> You could argue textconv should work on smudged contents or on clean
> contents before smudging.  As long as it is done consistently, I do not
> care either way too deeply, as its output is not supposed to be used for
> anything but human consumption.  Two equally sane arrangements would be:
> 
>  (1) Start from two clean contents (run convert_to_git() if contents were
>      obtained from the work tree), run textconv, run diff, and output the
>      result literally; or
> 
>  (2) Start from two smudged contents (run convert_to_working_tree() for
>      contents taken from the repository), run textconv, run diff, and
>      run clean before sending the result to the output.
> 
> The former assumes a textconv filter that wants to work on clean
> contents, the latter one that expects smudged input.  I probably
> would suggest going the former approach, as it is consistent with the
> general principle in other parts of the system (the internal processing
> happens on clean contents).
> 
> Both of the above assume that the output should come in clean form;
> it is consistent with the way normal diff is generated for consumption by
> git-apply. You can certainly argue that the final output should be in
> smudged form when textconv is used, as it is purely for human consumption,
> and is not even supposed to be fed to apply.

Also, I don't expect clean to be necessarily meaningful when applied to
the result of textconv, and even less so to the output of diff.

Now, a simple test shows that git diff obviously does this when diffing
HEAD to worktree:

diff between HEAD and clean(worktree)

Which is the right thing. It just seems so that textconv is not even
called "in the wrong place of the chain", but messes the diff up in this
way:

diff between textconv(HEAD) and textconv(worktree)

(I expected clean(textconv(worktree)) first, which would be wrong, too).
I.e., the clean filter is ignored completely in the presence of textconv.

OK, I'll stop bugging you until I have checked the existing tests and the
code...

Michael


* Re: Transparently encrypt repository contents with GPG
  2009-03-16 16:01                 ` Michael J Gruber
@ 2009-03-17  7:40                   ` Jeff King
  0 siblings, 0 replies; 18+ messages in thread
From: Jeff King @ 2009-03-17  7:40 UTC
  To: Michael J Gruber
  Cc: Junio C Hamano, Sverre Rabbelier, Thomas Rast, Matthias Nothhaft, git

On Mon, Mar 16, 2009 at 05:01:33PM +0100, Michael J Gruber wrote:

> Now, a simple test shows that git diff obviously does this when diffing
> HEAD to worktree:
> 
> diff between HEAD and clean(worktree)
> 
> Which is the right thing. It just seems so that textconv is not even
> called "in the wrong place of the chain", but messes the diff up in this
> way:
> 
> diff between textconv(HEAD) and textconv(worktree)
> 
> (I expected clean(textconv(worktree)) first, which would be wrong, too).
> I.e., the clean filter is ignored completely in the presence of textconv.

Yeah, I think this should probably be textconv(clean(worktree)) to match
the regular HEAD/worktree diff (if it isn't already). Can you put
together a test that shows the breakage?

-Peff


* Re: Transparently encrypt repository contents with GPG
  2009-03-13 20:23           ` Junio C Hamano
  2009-03-14 11:16             ` Michael J Gruber
@ 2009-03-17  8:22             ` Jeff King
  1 sibling, 0 replies; 18+ messages in thread
From: Jeff King @ 2009-03-17  8:22 UTC
  To: Junio C Hamano
  Cc: Michael J Gruber, Sverre Rabbelier, Thomas Rast,
	Michael J Gruber, Matthias Nothhaft, git

On Fri, Mar 13, 2009 at 01:23:08PM -0700, Junio C Hamano wrote:

> As the sole raison d'etre of diff.textconv is to allow potentially lossy
> conversion (e.g. msword-to-text) applied to the preimage and postimage
> pair of contents (that are supposed to be "clean") before giving a textual
> diff for human consumption, the above config may appear to work, but if you
> really want an encrypted repository, you should be using an encrypting
> filesystem.  That would give an added benefit that the work tree
> associated with your repository would also be encrypted.

I can think of one reason that having git do the encryption might be
beneficial: pushing to an untrusted source.

If you encrypted all blobs but kept trees and commits in plaintext, you
could retain (some of) the benefits of git's incremental push. The
downsides, though, are:

  1. You are revealing the hashes of your blobs' plaintext. Which means
     I can try brute-forcing your blobs by checking against a hash
     function.

  2. The remote can't actually look at the blobs. The most obvious
     problem with this is that you can't send it thin packs, since it
     can't actually resolve deltas.

And given the ensuing mess that it would make of the code to
conditionally say "Oh, we have this object, but you're not allowed to
read it", it is almost certainly not worth it.
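
Downside 1 can be illustrated with a short sketch, assuming `git` is installed. A blob's ID depends only on its content, so anyone who sees the ID in a plaintext tree can test guesses against it; the "secret" and the candidate strings below are made up:

```shell
#!/bin/sh
# The ID an observer would see in a plaintext tree object
# (computed here from the "secret" content for the sake of the demo).
leaked_id=$(printf 'password=hunter2\n' | git hash-object --stdin)

# The observer hashes candidate contents until one matches.
found=''
for guess in 'password=letmein' 'password=hunter2' 'password=123456'; do
    id=$(printf '%s\n' "$guess" | git hash-object --stdin)
    if [ "$id" = "$leaked_id" ]; then
        found=$guess
        break
    fi
done

echo "recovered: $found"
```

This only works for content the observer can plausibly guess, but short configuration files and credentials often fall into that category.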

But maybe somebody can prove me wrong and design a system that allows
efficient encrypted pushing to a non-trusted remote and also doesn't
suck.

-Peff


* Re: Transparently encrypt repository contents with GPG
  2009-03-12 21:19 Transparently encrypt repository contents with GPG Matthias Nothhaft
  2009-03-12 21:34 ` Sverre Rabbelier
@ 2012-04-21 17:25 ` bigbear
  2012-06-17  7:33   ` lalebarde
  1 sibling, 1 reply; 18+ messages in thread
From: bigbear @ 2012-04-21 17:25 UTC
  To: git


Matthias Nothhaft wrote
> 
> Hi,
> 
> I'm new to Git but I really already love it. ;-)
> 
> I would like to have a repository that transparently encrypts and
> decrypts all files using GPG.
> 
> What I need is a way to automatically modify each file
> 
> a) before it is written in the repository
> b) after it is read from the repository
> 
> Is there a way to get this to work somehow? Can someone give me some
> hints on where I need to begin?
> 
> regards,
> Matthias
> 
> 

I came across this during my own search for an encrypted git repo. Matthias,
it looks as if somebody has come up with a "working" system that uses git's
smudge and clean filters.
It seems to me a workable solution for storing a repo with some private
content in its files on an untrusted or possibly public git host.

Transparent Git Encryption
https://gist.github.com/873637
and/or possibly 
https://github.com/shadowhand/git-encrypt

The way to do this is to use git's "smudge" and "clean" filters, but it's
not necessarily recommended for reasons that are explained here by Junio C
Hamano, the maintainer of git:

    http://article.gmane.org/gmane.comp.version-control.git/113221






--
View this message in context: http://git.661346.n2.nabble.com/Transparently-encrypt-repository-contents-with-GPG-tp2470145p7487506.html
Sent from the git mailing list archive at Nabble.com.


* Re: Transparently encrypt repository contents with GPG
  2012-04-21 17:25 ` bigbear
@ 2012-06-17  7:33   ` lalebarde
       [not found]     ` <CAL1Gx-Ufs8TNVeeefAXBnX-eCnEk_DC1w6oJVRPcMcStdL_+-Q@mail.gmail.com>
  0 siblings, 1 reply; 18+ messages in thread
From: lalebarde @ 2012-06-17  7:33 UTC
  To: git

Hi,
I am puzzled by the recommendation of Junio C Hamano, the maintainer of git,
not to encrypt files before pushing them
(http://article.gmane.org/gmane.comp.version-control.git/113221):

Junio C Hamano wrote
> If your "clean" encrypts and "smudge" decrypts, it means you are refusing
> all the benefits git offers.

Junio C Hamano wrote
> the above config may appear to work
*So, does it work or not, or only partially? And if partially, what does not
work?*

Another issue is the use of ECB mode by git-encrypt
(https://github.com/shadowhand/git-encrypt). Some argue it is bad:
http://stackoverflow.com/questions/1220751/how-to-choose-an-aes-encryption-mode-cbc-ecb-ctr-ocb-cfb
(cf. also
http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation#Electronic_codebook_.28ECB.29).

So I made some experiments, taking a 15 MB PDF:

$ openssl enc -base64 -aes-256-ecb -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE1
$ openssl enc -base64 -aes-256-ecb -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE2
$ md5sum WileyE1
d43058d8443777aea871350245d9865b  WileyE1
$ md5sum WileyE2
d43058d8443777aea871350245d9865b  WileyE2

$ openssl enc -base64 -aes-256-ofb -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE1
$ openssl enc -base64 -aes-256-ofb -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE2
503d82849ad53652268d1abdcfbce9de  WileyE1
503d82849ad53652268d1abdcfbce9de  WileyE2

$ openssl enc -base64 -aes-256-cbc -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE1
$ openssl enc -base64 -aes-256-cbc -S 1762851 -k a5G4juy64VVBgfq4 <Wiley.pdf >WileyE2
e726431cbd9ff8780946ddfad775600a  WileyE1
e726431cbd9ff8780946ddfad775600a  WileyE2

*As the hashes are identical from one run to another, I don't understand why
we should stick to the ECB cipher.*
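
For context on the experiment: the runs match because the salt and passphrase are fixed, and that holds for any mode. Where ECB differs is that equal plaintext blocks produce equal ciphertext blocks within a single file. The following is a small sketch, assuming the `openssl` and `od` tools are available; the key and IV are arbitrary demo values:

```shell
#!/bin/sh
key=000102030405060708090a0b0c0d0e0f        # demo key, 128 bits in hex
iv=00000000000000000000000000000000         # demo IV, used by CBC only

# Two identical 16-byte plaintext blocks.
plain='AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'

# Hex dump with one 16-byte cipher block per line (-v keeps repeated lines).
ecb=$(printf '%s' "$plain" | openssl enc -aes-128-ecb -K "$key" -nopad | od -An -v -tx1)
cbc=$(printf '%s' "$plain" | openssl enc -aes-128-cbc -K "$key" -iv "$iv" -nopad | od -An -v -tx1)

ecb_block1=$(printf '%s\n' "$ecb" | sed -n 1p)
ecb_block2=$(printf '%s\n' "$ecb" | sed -n 2p)
cbc_block1=$(printf '%s\n' "$cbc" | sed -n 1p)
cbc_block2=$(printf '%s\n' "$cbc" | sed -n 2p)

echo "ECB: $ecb_block1 | $ecb_block2"
echo "CBC: $cbc_block1 | $cbc_block2"
```

Both modes are deterministic under a fixed key and IV, but only ECB repeats the ciphertext block when the plaintext block repeats.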

Can someone clarify these two points, please?


--
View this message in context: http://git.661346.n2.nabble.com/Transparently-encrypt-repository-contents-with-GPG-tp2470145p7561644.html
Sent from the git mailing list archive at Nabble.com.


* Re: Transparently encrypt repository contents with GPG
       [not found]     ` <CAL1Gx-Ufs8TNVeeefAXBnX-eCnEk_DC1w6oJVRPcMcStdL_+-Q@mail.gmail.com>
@ 2012-06-18 20:03       ` lalebarde
  0 siblings, 0 replies; 18+ messages in thread
From: lalebarde @ 2012-06-18 20:03 UTC
  To: git

Thanks for your clarifications! Stars 2 & 3 are still not clear to me,
probably because I am new to git.

Do you think that a solution, assuming one is found that respects both
git and strong cryptography, would be successful? My analysis is that small
enterprises that do not have many servers or premises may need git hosting.
Even big companies with their own networks may, if they want more security.

TrueCrypt or an encrypted file system on the host is not feasible off the
shelf. One has to set up one's own dedicated server at the host.

For my part, I am afraid to push my projects in the clear to a host. But
possibly I am too paranoid. Do you have an idea of the risk?

--
View this message in context: http://git.661346.n2.nabble.com/Transparently-encrypt-repository-contents-with-GPG-tp2470145p7561687.html
Sent from the git mailing list archive at Nabble.com.
