All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH/RFC] contrib: add win32 credential-helper
@ 2012-03-19 23:06 Erik Faye-Lund
  2012-03-23 21:10 ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Faye-Lund @ 2012-03-19 23:06 UTC (permalink / raw)
  To: git; +Cc: msysgit, peff

This one pretty much sucks. Mem-leaks and a sketchy deletion-filter.

Currently uses "::" as an attribute-separator, but this is not robust
without encoding if the attribute values themselves contains "::".

Since the Windows port of Git expects binary pipes, we need to make
sure the helper-end also sets up binary pipes.

Side-step CRLF-issue in test to make it pass.

Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
---

Here's a hacky and not-quite-as-cool-as-I'd-hope credential-helper
for Windows. The good news is that it Works For Me(tm), but the bad
news is that it sucks.

The code is provided for discussion only so far.

I'm not really sure how to make it less sucky in some regards, part
of this I blame on lacking documentation of the credential-helper
prococol :P

Here's some things I can't find documented:

 1) Encoding of usernames. I'm assuming this is supposed to be
 UTF-8, because SecKeychainFindInternetPassword which is used by
 the OSX-helper is documented to take accountName as UTF-8.

 2) Encoding of passwords. I'm assuming UTF-8, as mixing encodings
 here would be insane :P

 3) How to match credentials for look-up when not supplying all
 attributes. I've implemented a kind-of dodgy matching; I use a
 string that packs all supplied attributes. This seems to work,
 but I'm a bit worried about what happens when people try to mix
 credentials saved with credential.useHttpPath=true with those
 saved with credential.useHttpPath=false.

 4) How to match credentials for deletion when not supplying
 all attributes. It seems from the test-suite that a credential is
 expected to be deleted even if the username is not given. But what
 happens in such a case when there's two different credentials for
 one domain, with different username? The OSX-helper seems to only
 delete one of them (but I'm not entirely sure which one, due to
 lacking documentation from Apple's side).
 
 The OSX-helper also have a comment saying "Require at least a
 protocol and host for removal, which is what git will give us; if
 you want to do  something more fancy, use the Keychain manager".
 This comment puzzles me a bit; won't git also give us path in case
 of credential.useHttpPath=true?

 5) How overwriting credentials work. If there's a credential where
 all attributes except for the password match, will "store" just
 overwrite it? Should it complain?

I'm guessing that a lot of these questions fall in the "who cares"
category. For instance, git probably never tries to create a
credential it already has stored. But at least saying "This cannot
happen" in the documentation makes it a lot easier to write a
credential-helper.

Thoughts?

 contrib/credential/wincred/Makefile                |    8 +
 .../credential/wincred/git-credential-wincred.c    |  320 ++++++++++++++++++++
 t/lib-credential.sh                                |    4 +
 3 files changed, 332 insertions(+), 0 deletions(-)
 create mode 100644 contrib/credential/wincred/Makefile
 create mode 100644 contrib/credential/wincred/git-credential-wincred.c

diff --git a/contrib/credential/wincred/Makefile b/contrib/credential/wincred/Makefile
new file mode 100644
index 0000000..b4f098f
--- /dev/null
+++ b/contrib/credential/wincred/Makefile
@@ -0,0 +1,8 @@
+all: git-credential-wincred.exe
+
+CC = gcc
+RM = rm -f
+CFLAGS = -O2 -Wall
+
+git-credential-wincred.exe : git-credential-wincred.c
+	$(LINK.c) $^ $(LOADLIBES) $(LDLIBS) -o $@
diff --git a/contrib/credential/wincred/git-credential-wincred.c b/contrib/credential/wincred/git-credential-wincred.c
new file mode 100644
index 0000000..760a3dc
--- /dev/null
+++ b/contrib/credential/wincred/git-credential-wincred.c
@@ -0,0 +1,320 @@
+/*
+ * A git credential helper that interface with Windows' Credential Manager
+ *
+ */
+#include <windows.h>
+#include <stdio.h>
+#include <io.h>
+#include <fcntl.h>
+
+/* common helpers */
+
+static void die(const char *err, ...)
+{
+	char msg[4096];
+	va_list params;
+	va_start(params, err);
+	vsnprintf(msg, sizeof(msg), err, params);
+	fprintf(stderr, "%s\n", msg);
+	va_end(params);
+	exit(1);
+}
+
+static void *xmalloc(size_t size)
+{
+	void *ret = malloc(size);
+	if (!ret && !size)
+		ret = malloc(1);
+	if (!ret)
+		 die("Out of memory");
+	return ret;
+}
+
+static char *xstrdup(const char *str)
+{
+	char *ret = strdup(str);
+	if (!ret)
+		die("Out of memory");
+	return ret;
+}
+
+/* MinGW doesn't have wincred.h, so we need to define stuff */
+
+typedef struct _CREDENTIAL_ATTRIBUTE {
+	LPWSTR Keyword;
+	DWORD  Flags;
+	DWORD  ValueSize;
+	LPBYTE Value;
+} CREDENTIAL_ATTRIBUTE, *PCREDENTIAL_ATTRIBUTE;
+
+typedef struct _CREDENTIALW {
+	DWORD                 Flags;
+	DWORD                 Type;
+	LPWSTR                TargetName;
+	LPWSTR                Comment;
+	FILETIME              LastWritten;
+	DWORD                 CredentialBlobSize;
+	LPBYTE                CredentialBlob;
+	DWORD                 Persist;
+	DWORD                 AttributeCount;
+	PCREDENTIAL_ATTRIBUTE Attributes;
+	LPWSTR                TargetAlias;
+	LPWSTR                UserName;
+} CREDENTIALW, *PCREDENTIALW;
+
+#define CRED_TYPE_GENERIC 1
+#define CRED_PERSIST_LOCAL_MACHINE 2
+#define CRED_PACK_GENERIC_CREDENTIALS 4
+
+typedef BOOL (WINAPI *CredWriteWT)(PCREDENTIALW, DWORD);
+typedef BOOL (WINAPI *CredUnPackAuthenticationBufferWT)(DWORD, PVOID, DWORD,
+    LPWSTR, DWORD *, LPWSTR, DWORD *, LPWSTR, DWORD *);
+typedef BOOL (WINAPI *CredEnumerateWT)(LPCWSTR, DWORD, DWORD *,
+    PCREDENTIALW **);
+typedef BOOL (WINAPI *CredPackAuthenticationBufferWT)(DWORD, LPWSTR, LPWSTR,
+    PBYTE, DWORD *);
+typedef VOID (WINAPI *CredFreeT)(PVOID);
+typedef BOOL (WINAPI *CredDeleteWT)(LPCWSTR, DWORD, DWORD);
+
+static HMODULE advapi, credui;
+static CredWriteWT CredWriteW;
+static CredUnPackAuthenticationBufferWT CredUnPackAuthenticationBufferW;
+static CredEnumerateWT CredEnumerateW;
+static CredPackAuthenticationBufferWT CredPackAuthenticationBufferW;
+static CredFreeT CredFree;
+static CredDeleteWT CredDeleteW;
+
+static void load_cred_funcs(void)
+{
+	/* load DLLs */
+	advapi = LoadLibrary("advapi32.dll");
+	credui = LoadLibrary("credui.dll");
+	if (!advapi || !credui)
+		die("failed to load DLLs");
+
+	/* get function pointers */
+	CredWriteW = (CredWriteWT)GetProcAddress(advapi, "CredWriteW");
+	CredUnPackAuthenticationBufferW = (CredUnPackAuthenticationBufferWT)
+	    GetProcAddress(credui, "CredUnPackAuthenticationBufferW");
+	CredEnumerateW = (CredEnumerateWT)GetProcAddress(advapi,
+	    "CredEnumerateW");
+	CredPackAuthenticationBufferW = (CredPackAuthenticationBufferWT)
+	    GetProcAddress(credui, "CredPackAuthenticationBufferW");
+	CredFree = (CredFreeT)GetProcAddress(advapi, "CredFree");
+	CredDeleteW = (CredDeleteWT)GetProcAddress(advapi, "CredDeleteW");
+	if (!CredWriteW || !CredUnPackAuthenticationBufferW ||
+	    !CredEnumerateW || !CredPackAuthenticationBufferW || !CredFree ||
+	    !CredDeleteW)
+		die("failed to load functions");
+}
+
+static char target_buf[1024];
+static char *protocol, *host, *path, *username;
+static WCHAR *wusername, *password, *target;
+
+static void write_item(const char *what, WCHAR *wbuf)
+{
+	char *buf;
+	int len = WideCharToMultiByte(CP_UTF8, 0, wbuf, -1, NULL, 0, NULL,
+	    FALSE);
+	buf = xmalloc(len);
+
+	if (!WideCharToMultiByte(CP_UTF8, 0, wbuf, -1, buf, len, NULL, FALSE))
+		die("WideCharToMultiByte failed!");
+
+	printf("%s=", what);
+	fwrite(buf, 1, len - 1, stdout);
+	putchar('\n');
+	free(buf);
+}
+
+static void get_credential(void)
+{
+	WCHAR *user_buf, *pass_buf;
+	DWORD user_buf_size = 0, pass_buf_size = 0;
+	CREDENTIALW **creds, *cred;
+	DWORD num_creds;
+	WCHAR temp[4096];
+
+	wcscpy(temp, target);
+	wcscat(temp, L"*");
+
+	if (!CredEnumerateW(temp, 0, &num_creds, &creds))
+		return;
+
+	if (!wusername) {
+		/* no username was specified, just pick the first one */
+		cred = creds[0];
+	} else {
+		/* search for the first credential that matches username */
+		int i;
+		cred = NULL;
+		for (i = 0; i < num_creds; ++i)
+			if (!wcscmp(wusername, creds[i]->UserName)) {
+				cred = creds[i];
+				break;
+			}
+		if (!cred)
+			return;
+	}
+
+	CredUnPackAuthenticationBufferW(0, cred->CredentialBlob,
+	    cred->CredentialBlobSize, NULL, &user_buf_size, NULL, NULL,
+	    NULL, &pass_buf_size);
+
+	user_buf = xmalloc(user_buf_size * sizeof(WCHAR));
+	pass_buf = xmalloc(pass_buf_size * sizeof(WCHAR));
+
+	if (!CredUnPackAuthenticationBufferW(0, cred->CredentialBlob,
+	    cred->CredentialBlobSize, user_buf, &user_buf_size, NULL, NULL,
+	    pass_buf, &pass_buf_size))
+		die("CredUnPackAuthenticationBuffer failed");
+
+	CredFree(creds);
+
+	/* zero-terminate (sizes include zero-termination) */
+	user_buf[user_buf_size - 1] = L'\0';
+	pass_buf[pass_buf_size - 1] = L'\0';
+
+	write_item("username", user_buf);
+	write_item("password", pass_buf);
+
+	free(user_buf);
+	free(pass_buf);
+}
+
+static void store_credential(void)
+{
+	CREDENTIALW cred;
+	BYTE *auth_buf;
+	DWORD auth_buf_size = 0;
+
+	if (!wusername || !password)
+		return;
+
+	/* query buffer size */
+	CredPackAuthenticationBufferW(0, wusername, password,
+	    NULL, &auth_buf_size);
+
+	auth_buf = xmalloc(auth_buf_size);
+
+	if (!CredPackAuthenticationBufferW(0, wusername, password,
+	    auth_buf, &auth_buf_size))
+		die("CredPackAuthenticationBuffer failed");
+
+	cred.Flags = 0;
+	cred.Type = CRED_TYPE_GENERIC;
+	cred.TargetName = target;
+	cred.Comment = L"saved by git-credential-wincred";
+	cred.CredentialBlobSize = auth_buf_size;
+	cred.CredentialBlob = auth_buf;
+	cred.Persist = CRED_PERSIST_LOCAL_MACHINE;
+	cred.AttributeCount = 0;
+	cred.Attributes = NULL;
+	cred.TargetAlias = NULL;
+	cred.UserName = wusername;
+	if (!CredWriteW(&cred, 0))
+		die("CredWrite failed");
+}
+
+static void erase_credential(void)
+{
+	WCHAR temp[4096];
+	CREDENTIALW **creds;
+	DWORD num_creds;
+	int i;
+
+	wcscpy(temp, target);
+	wcscat(temp, L"*");
+
+	if (!CredEnumerateW(temp, 0, &num_creds, &creds))
+		return;
+
+	for (i = 0; i < num_creds; ++i)
+		CredDeleteW(creds[i]->TargetName, creds[i]->Type, 0);
+
+	CredFree(creds);
+}
+
+static WCHAR *utf8_to_utf16_dup(const char *str)
+{
+	int wlen = MultiByteToWideChar(CP_UTF8, 0, str, -1, NULL, 0);
+	WCHAR *wstr = xmalloc(sizeof(WCHAR) * wlen);
+	MultiByteToWideChar(CP_UTF8, 0, str, -1, wstr, wlen);
+	return wstr;
+}
+
+static void read_credential(void)
+{
+	char buf[1024];
+
+	while (fgets(buf, sizeof(buf), stdin)) {
+		char *v;
+
+		if (!strcmp(buf, "\n"))
+			break;
+		buf[strlen(buf)-1] = '\0';
+
+		v = strchr(buf, '=');
+		if (!v)
+			die("bad input: %s", buf);
+		*v++ = '\0';
+
+		if (!strcmp(buf, "protocol"))
+			protocol = xstrdup(v);
+		else if (!strcmp(buf, "host"))
+			host = xstrdup(v);
+		else if (!strcmp(buf, "path"))
+			path = xstrdup(v);
+		else if (!strcmp(buf, "username")) {
+			username = xstrdup(v);
+			wusername = utf8_to_utf16_dup(v);
+		} else if (!strcmp(buf, "password"))
+			password = utf8_to_utf16_dup(v);
+	}
+}
+
+int main(int argc, char *argv[])
+{
+	const char *usage =
+	    "Usage: git credential-wincred <get|store|erase>\n";
+
+	if (!argv[1])
+		die(usage);
+
+	/* git use binary pipes to avoid CRLF-issues */
+	_setmode(_fileno(stdin), _O_BINARY);
+	_setmode(_fileno(stdout), _O_BINARY);
+
+	read_credential();
+
+	load_cred_funcs();
+
+	/* prepare 'target', the unique key for the credential */
+	if (!protocol || !host)
+		return 0;
+
+	strncpy(target_buf, "git::protocol=", sizeof(target_buf));
+	strncat(target_buf, protocol, sizeof(target_buf));
+	strncat(target_buf, "::host=", sizeof(target_buf));
+	strncat(target_buf, host, sizeof(target_buf));
+	if (username) {
+		strncat(target_buf, "::user=", sizeof(target_buf));
+		strncat(target_buf, username, sizeof(target_buf));
+	}
+	if (path) {
+		strncat(target_buf, "::path=", sizeof(target_buf));
+		strncat(target_buf, host, sizeof(target_buf));
+	}
+
+	target = utf8_to_utf16_dup(target_buf);
+
+	if (!strcmp(argv[1], "get"))
+		get_credential();
+	else if (!strcmp(argv[1], "store"))
+		store_credential();
+	else if (!strcmp(argv[1], "erase"))
+		erase_credential();
+	/* otherwise, ignore unknown action */
+	return 0;
+}
diff --git a/t/lib-credential.sh b/t/lib-credential.sh
index 4a37cd7..d30ae8a 100755
--- a/t/lib-credential.sh
+++ b/t/lib-credential.sh
@@ -8,6 +8,10 @@ check() {
 	read_chunk >expect-stdout &&
 	read_chunk >expect-stderr &&
 	test-credential "$@" <stdin >stdout 2>stderr &&
+	if test_have_prereq MINGW
+	then
+		dos2unix -q stderr
+	fi &&
 	test_cmp expect-stdout stdout &&
 	test_cmp expect-stderr stderr
 }
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-03-19 23:06 [PATCH/RFC] contrib: add win32 credential-helper Erik Faye-Lund
@ 2012-03-23 21:10 ` Jeff King
  2012-04-02 15:53   ` Erik Faye-Lund
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2012-03-23 21:10 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: git, msysgit

On Tue, Mar 20, 2012 at 12:06:54AM +0100, Erik Faye-Lund wrote:

> This one pretty much sucks. Mem-leaks and a sketchy deletion-filter.

Thanks for moving forward on this. I'm sorry I can't be much help on the
Windows-specific knowledge, but I'll answer what I can.

> Currently uses "::" as an attribute-separator, but this is not robust
> without encoding if the attribute values themselves contains "::".

Yeah. Can you store arbitrary bytes? If so, NUL would be a good
terminator. Otherwise, newline is a reasonable choice, as the protocol
already can't communicate usernames/passwords with newlines (a
limitation that I accepted to make the protocol much simpler for
scripting use).

That being said, I realized when writing credential-store that you can
encode all of the components in a URL, like:

  proto://user:password@host/path

You do have to get the URL encoding/decoding right, of course, but since
it is a standard format, you may have library or OS support (for the
stock helpers, I was able to cheat and just link against git's code).

> I'm not really sure how to make it less sucky in some regards, part
> of this I blame on lacking documentation of the credential-helper
> prococol :P

Heh. OK, I'll take the blame. :)

>  1) Encoding of usernames. I'm assuming this is supposed to be
>  UTF-8, because SecKeychainFindInternetPassword which is used by
>  the OSX-helper is documented to take accountName as UTF-8.

Like many other parts of git, we treat the data as binary goo as much as
possible. So git hands the helper whatever bytes the user provided, and
ships off whatever bytes are provided by the helper over http without
any further processing. The only two exceptions are:

  1. Fields cannot contain NUL (which means wide encodings like utf-16
     are pretty much out).

  2. Fields cannot contain newline (which effectively means that the
     encoding needs to be some ascii superset like utf8 or latin1).

In practice, I would expect most usernames and passwords only contain
ascii bytes, if only because charset issues between the client and
server would lead to insanity.

>  2) Encoding of passwords. I'm assuming UTF-8, as mixing encodings
>  here would be insane :P

Same as above.

>  3) How to match credentials for look-up when not supplying all
>  attributes. I've implemented a kind-of dodgy matching; I use a
>  string that packs all supplied attributes. This seems to work,
>  but I'm a bit worried about what happens when people try to mix
>  credentials saved with credential.useHttpPath=true with those
>  saved with credential.useHttpPath=false.

See how credential_match in credential.c does it. For a get, the input
to the helper is basically a template saying "this is what I want to
match". If you have more information in your database than is in the
template, that's OK; anything the template doesn't specify should not be
considered a restriction.

So if you get:

  protocol=http
  host=example.com

you should match an entry previously stored for:

  protocol=http
  host=example.com
  path=foo.git

because what is being asked is a more generic form (if you had two
entries, one for foo.git and one for bar.git, it's unclear which should
be chosen; my implementations just pick the first match they encounter).

The same thing happens for usernames. You might get asked for a
credential for a specific username, or you might get asked for a
credential of any username at that host. It's nice in the latter case to
be able to respond with a previously-stored credential from the former.

But note that none of this advanced matching is strictly necessary. It's
purely a quality-of-implementation issue in the helper, and that's why I
left the details out of the documentation (also, I hoped many helpers
could just rely on OS-provided libraries to do the heavy lifting of the
match; that's how it is for OS X, but it seems Windows doesn't have
anything equivalent).

So I'm just describing here what git does for credential-store,
credential-cache, and for the internal config matching. You could do a
lot less in the helper. For example, I keep my passwords in a password
wallet that is backed by GPG-encrypted storage, and my config looks
like:

  [credential "https://github.com"]
    username = peff
    helper = "!f() { test $1 = get && echo password=`pass -n github.web.password`; }; f"
  [credential]
    helper = cache

So note that the helper is totally dumb and relies on the internal
config-matching to restrict when it gets used.

Likewise, you could be a lot smarter. You could conflate http and https
protocol entries according to some internal policy. You could
prefix-match paths so that a stored credential for
"http://example.com/foo" would be applied to a request for
"http://example.com/foo/bar.git". In my helpers, I drew the line where I
thought it was reasonable, but there's no reason other helpers have to
follow the exact same rules.

>  4) How to match credentials for deletion when not supplying
>  all attributes. It seems from the test-suite that a credential is
>  expected to be deleted even if the username is not given.

Yeah, that's what I would expect. The input is a template, so delete
anything that matches the template.

> But what happens in such a case when there's two different credentials
> for one domain, with different username? The OSX-helper seems to only
> delete one of them (but I'm not entirely sure which one, due to
> lacking documentation from Apple's side).

I'd expect them both to be deleted if no username is given (and I'm
pretty sure that's what the stock helpers will do). It may be that the
OS X helper does not (I don't recall testing it explicitly). You could
consider it a bug, or merely a different design decision.

When git deletes a credential, it should provide the username (because
it only does so after trying to use the credential and getting a
rejection). Again, I intentionally left corner cases like this
unspecified, because I didn't want to force helpers to behave one way or
the other (especially when git itself doesn't care either way). I hoped
that helper writers would then be free to do whatever behavior made
sense to them and to their implementation.  But from the writer's
perspective, I guess it looks like the behavior is simply
under-specified. :)

>  The OSX-helper also have a comment saying "Require at least a
>  protocol and host for removal, which is what git will give us; if
>  you want to do  something more fancy, use the Keychain manager".
>  This comment puzzles me a bit; won't git also give us path in case
>  of credential.useHttpPath=true?

Yes; git will provide _at least_ the protocol and host. That code and
comment came out of discussion with Junio, where we realized that an
"empty" template would cause everything to be deleted. So doing this:

  git credential-foo erase </dev/null

would delete everything! Which maybe is something you would want to do,
but it seemed too easy to accidentally trigger this destructive
behavior. So I require some input as a sanity-check, and if you want to
do something like "delete all entries", you will have to use the
separate keychain manager program.

Does that make more sense?

>  5) How overwriting credentials work. If there's a credential where
>  all attributes except for the password match, will "store" just
>  overwrite it? Should it complain?

In the stock implementations, we will overwrite. I don't think it
matters to git, though. If the helper gives us a credential that works,
we will not bother parroting it back[1]. If we get one that does not
work, we will erase it.

[1] That was my intent, anyway. But I think it may parrot it back,
    anyway. We should probably set the "approved" flag on a credential
    given to us by a helper, as there is not much point in asking them
    to store it again afterwards.

> I'm guessing that a lot of these questions fall in the "who cares"
> category. For instance, git probably never tries to create a
> credential it already has stored. But at least saying "This cannot
> happen" in the documentation makes it a lot easier to write a
> credential-helper.

The problem is that it is not "this cannot happen", but rather "git does
not currently do this". I wanted to specify as little as possible so
that future versions could change those corner cases if they wanted to
(or somebody could write a more advanced tool to interact with the
helpers).

But maybe we should be more explicit about that in the api-credential
document.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-03-23 21:10 ` Jeff King
@ 2012-04-02 15:53   ` Erik Faye-Lund
  2012-04-03  8:44     ` Erik Faye-Lund
  2012-04-04 10:10     ` Jeff King
  0 siblings, 2 replies; 8+ messages in thread
From: Erik Faye-Lund @ 2012-04-02 15:53 UTC (permalink / raw)
  To: Jeff King; +Cc: git, msysgit

On Fri, Mar 23, 2012 at 10:10 PM, Jeff King <peff@peff.net> wrote:
> On Tue, Mar 20, 2012 at 12:06:54AM +0100, Erik Faye-Lund wrote:
>
>> This one pretty much sucks. Mem-leaks and a sketchy deletion-filter.
>
> Thanks for moving forward on this. I'm sorry I can't be much help on the
> Windows-specific knowledge, but I'll answer what I can.
>

Thanks for following up :)

>> Currently uses "::" as an attribute-separator, but this is not robust
>> without encoding if the attribute values themselves contains "::".
>
> Yeah. Can you store arbitrary bytes? If so, NUL would be a good
> terminator.

No. TargetName is an LPWSTR, which is supposed to be zero-terminated.
There's no way of specifying it's length directly.

> Otherwise, newline is a reasonable choice, as the protocol
> already can't communicate usernames/passwords with newlines (a
> limitation that I accepted to make the protocol much simpler for
> scripting use).
>

This works, but it causes Windows 7's credential manager to glitch in
rendering the credential (adding all the newlines to the end of the
line, and stretching an icon - yuck), which is also a bit unfortunate.

So I'm thinking that escaping the string needs to be done. It can't be
that big of a deal ;)

> That being said, I realized when writing credential-store that you can
> encode all of the components in a URL, like:
>
>  proto://user:password@host/path
>
> You do have to get the URL encoding/decoding right, of course, but since
> it is a standard format, you may have library or OS support (for the
> stock helpers, I was able to cheat and just link against git's code).
>

Yeah, I guess UrlCanonicalize from WinAPI or something along those
lines can be used.

So far I've simply tried to put the stuff in an order that made simple
string-compare sufficient to pass the test. It seems I need to
implement proper matching instead.

>> I'm not really sure how to make it less sucky in some regards, part
>> of this I blame on lacking documentation of the credential-helper
>> prococol :P
>
> Heh. OK, I'll take the blame. :)
>
>>  1) Encoding of usernames. I'm assuming this is supposed to be
>>  UTF-8, because SecKeychainFindInternetPassword which is used by
>>  the OSX-helper is documented to take accountName as UTF-8.
>
> Like many other parts of git, we treat the data as binary goo as much as
> possible. So git hands the helper whatever bytes the user provided, and
> ships off whatever bytes are provided by the helper over http without
> any further processing. The only two exceptions are:
>
>  1. Fields cannot contain NUL (which means wide encodings like utf-16
>     are pretty much out).
>
>  2. Fields cannot contain newline (which effectively means that the
>     encoding needs to be some ascii superset like utf8 or latin1).
>
> In practice, I would expect most usernames and passwords only contain
> ascii bytes, if only because charset issues between the client and
> server would lead to insanity.
>

ASCII unsernames might be common in the UNIX-world, but in the Windows
world this is very much not the case. These functions can be used for
all kinds of services, so I don't think assuming ASCII makes much
sense.

And since OSX documents the encoding, I'm guessing that non-ASCII
usernames isn't entirely unheard of there either.

In general, I think the whole "let's try to get away with not
specifying encoding" is a bit dangerous. Without knowing what encoding
the input and output is, the helpers are pretty much useless for
people. Sure, saying "at least ASCII" helps, but it just takes us
halfway there. And, I think UTF-8 is the least insane option here.
After all, this is an internal protocol; if the credential helper
needs to store something else (like we do for Windows), we can
convert the string. Likewise, if the network protocol the caller is
going to pass this to assumes something else, convert.

>>  2) Encoding of passwords. I'm assuming UTF-8, as mixing encodings
>>  here would be insane :P
>
> Same as above.
>

Again, my objection is pretty much the same as above, modulo the
comment about OSX.

Binary blob password doesn't make sense; human beings have
text-strings as passwords. They don't remember binary blobs.

About the rest of my questions: I think what you're saying pretty much
makes sense. I think some details could be mentioned in
Documentation/technical/api-credentials.txt, and I'll have a look at
adding what I think makes sense after reading through your mail a bit
closer.

Luckily, I'll have some coding time this Easter, so hopefully I'll be
able to finish up a new version soon-ish.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-04-02 15:53   ` Erik Faye-Lund
@ 2012-04-03  8:44     ` Erik Faye-Lund
  2012-04-04 10:10     ` Jeff King
  1 sibling, 0 replies; 8+ messages in thread
From: Erik Faye-Lund @ 2012-04-03  8:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git, msysgit

On Mon, Apr 2, 2012 at 5:53 PM, Erik Faye-Lund <kusmabite@gmail.com> wrote:
> On Fri, Mar 23, 2012 at 10:10 PM, Jeff King <peff@peff.net> wrote:
>> On Tue, Mar 20, 2012 at 12:06:54AM +0100, Erik Faye-Lund wrote:
>>
>>> This one pretty much sucks. Mem-leaks and a sketchy deletion-filter.
>>
>> Thanks for moving forward on this. I'm sorry I can't be much help on the
>> Windows-specific knowledge, but I'll answer what I can.
>>
>
> Thanks for following up :)
>
>>> Currently uses "::" as an attribute-separator, but this is not robust
>>> without encoding if the attribute values themselves contains "::".
>>
>> Yeah. Can you store arbitrary bytes? If so, NUL would be a good
>> terminator.
>
> No. TargetName is an LPWSTR, which is supposed to be zero-terminated.
> There's no way of specifying it's length directly.
>
>> Otherwise, newline is a reasonable choice, as the protocol
>> already can't communicate usernames/passwords with newlines (a
>> limitation that I accepted to make the protocol much simpler for
>> scripting use).
>>
>
> This works, but it causes Windows 7's credential manager to glitch in
> rendering the credential (adding all the newlines to the end of the
> line, and stretching an icon - yuck), which is also a bit unfortunate.
>
> So I'm thinking that escaping the string needs to be done. It can't be
> that big of a deal ;)
>

Hmm, but I must have been a complete moron; the credential structure
in windows has a general purpose credential-attribute system, where I
can store the meta-data I want.

So I can probably use that and get away without any sort of encoding. Yay.

I'll look deeper into it when I find the time.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-04-02 15:53   ` Erik Faye-Lund
  2012-04-03  8:44     ` Erik Faye-Lund
@ 2012-04-04 10:10     ` Jeff King
  2012-04-04 10:37       ` Erik Faye-Lund
  1 sibling, 1 reply; 8+ messages in thread
From: Jeff King @ 2012-04-04 10:10 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: git, msysgit

On Mon, Apr 02, 2012 at 05:53:47PM +0200, Erik Faye-Lund wrote:

> > Otherwise, newline is a reasonable choice, as the protocol
> > already can't communicate usernames/passwords with newlines (a
> > limitation that I accepted to make the protocol much simpler for
> > scripting use).
> 
> This works, but it causes Windows 7's credential manager to glitch in
> rendering the credential (adding all the newlines to the end of the
> line, and stretching an icon - yuck), which is also a bit unfortunate.
>
> So I'm thinking that escaping the string needs to be done. It can't be
> that big of a deal ;)

Gross. I guess backslash-escaping or something similar would be pretty
easy to implement.

> > Like many other parts of git, we treat the data as binary goo as much as
> > possible. So git hands the helper whatever bytes the user provided, and
> > ships off whatever bytes are provided by the helper over http without
> > any further processing. The only two exceptions are:
> [...]
> 
> ASCII unsernames might be common in the UNIX-world, but in the Windows
> world this is very much not the case. These functions can be used for
> all kinds of services, so I don't think assuming ASCII makes much
> sense.
> 
> And since OSX documents the encoding, I'm guessing that non-ASCII
> usernames isn't entirely unheard of there either.

But what does it mean to have an encoding for credentials that will be
sent over http basic-auth? As far as I know, the binary bytes are simply
base64-encoded and sent to the server in a header, and there is no room
for an encoding or any normalization in the password. You just hope that
the remote side is using the same encoding as you, and that there are no
normalization issues (e.g., your username contains "e with accent", but
your input system provides "e" + "combining accent").

It looks like there is a draft rfc to try to help with this:

  http://tools.ietf.org/html/draft-reschke-basicauth-enc-05

but I have no idea if it is adopted at all yet.

> In general, I think the whole "let's try to get away with not
> specifying encoding" is a bit dangerous. Without knowing what encoding
> the input and output is, the helpers are pretty much useless for
> people.

Not necessarily. By being a pass-through, things will usually just work.
That is, if you hand the helper UTF-8 to store, then it should hand you
back UTF-8. Ditto for any other encoding. And if it doesn't work, it's
not the helper's problem, but a mismatch between the http client and
server. The helper's job is to reproduce some bytes.

> Sure, saying "at least ASCII" helps, but it just takes us
> halfway there. And, I think UTF-8 is the least insane option here.

Of course. UTF-8 is pretty much always the least insane option, and I
would hope that everybody is using it. But by being a pass-through, we
don't restrict it. If the client and server both want to use Shift JIS,
then the stock git helpers will work fine. On systems where the OS
mandates a particular encoding (I didn't even check for the OS X helper,
but let us imagine it will barf on non-UTF8 data), then I would expect
everybody to be using that encoding, and the pass-through just works.

If your helper storage really cares, then I think it's sane to assume
you'll get UTF-8 input, and to pass that back out. But I really didn't
want to open the can of worms that is having "getpass()" figure out what
encoding it is getting from the user.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-04-04 10:10     ` Jeff King
@ 2012-04-04 10:37       ` Erik Faye-Lund
  2012-04-04 11:23         ` Jeff King
  0 siblings, 1 reply; 8+ messages in thread
From: Erik Faye-Lund @ 2012-04-04 10:37 UTC (permalink / raw)
  To: Jeff King; +Cc: git, msysgit

On Wed, Apr 4, 2012 at 12:10 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Apr 02, 2012 at 05:53:47PM +0200, Erik Faye-Lund wrote:
>
>> > Otherwise, newline is a reasonable choice, as the protocol
>> > already can't communicate usernames/passwords with newlines (a
>> > limitation that I accepted to make the protocol much simpler for
>> > scripting use).
>>
>> This works, but it causes Windows 7's credential manager to glitch in
>> rendering the credential (adding all the newlines to the end of the
>> line, and stretching an icon - yuck), which is also a bit unfortunate.
>>
>> So I'm thinking that escaping the string needs to be done. It can't be
>> that big of a deal ;)
>
> Gross. I guess backslash-escaping or something similar would be pretty
> easy to implement.
>

As I said in a follow-up mail; Windows' credential stuff has a
general-purpose application-defined data-store. I can use that
instead, it turns out.

>> > Like many other parts of git, we treat the data as binary goo as much as
>> > possible. So git hands the helper whatever bytes the user provided, and
>> > ships off whatever bytes are provided by the helper over http without
>> > any further processing. The only two exceptions are:
>> [...]
>>
>> ASCII unsernames might be common in the UNIX-world, but in the Windows
>> world this is very much not the case. These functions can be used for
>> all kinds of services, so I don't think assuming ASCII makes much
>> sense.
>>
>> And since OSX documents the encoding, I'm guessing that non-ASCII
>> usernames isn't entirely unheard of there either.
>
> But what does it mean to have an encoding for credentials that will be
> sent over http basic-auth? As far as I know, the binary bytes are simply
> base64-encoded and sent to the server in a header, and there is no room
> for an encoding or any normalization in the password. You just hope that
> the remote side is using the same encoding as you, and that there are no
> normalization issues (e.g., your username contains "e with accent", but
> your input system provides "e" + "combining accent").
>
> It looks like there is a draft rfc to try to help with this:
>
>  http://tools.ietf.org/html/draft-reschke-basicauth-enc-05
>
> but I have no idea if it is adopted at all yet.
>

Sure, but this is a problem with HTTP-auth, we don't have to make it a
problem of our credential-api as well. For instance,  we could try to
send the credentials to the server as UTF-8 first, and if that fails
re-try with ISO-8859-1. If neither works, we probably need more info;
if it's worth it somebody could implement a per-remote locale or
something. I'm not saying we should, but it's something that could be
done if the HTTP-auth issue proves to be real.

Losing the encoding EARLY isn't any better than trying to preserve it
for as long as possible, and only lose it when we KNOW we're sending
this down a path where the encoding is undefined? And passing UTF-8
data when the encoding is undefined is probably one of the less insane
things to attempt, no?

>> In general, I think the whole "let's try to get away with not
>> specifying encoding" is a bit dangerous. Without knowing what encoding
>> the input and output is, the helpers are pretty much useless for
>> people.
>
> Not necessarily. By being a pass-through, things will usually just work.
> That is, if you hand the helper UTF-8 to store, then it should hand you
> back UTF-8. Ditto for any other encoding. And if it doesn't work, it's
> not the helper's problem, but a mismatch between the http client and
> server. The helper's job is to reproduce some bytes.
>

The problem is that for some helpers you need to know the encoding in
order to have a user-interface show the data. This goes both for OSX
and Windows. If you mess up the encoding, in the helper, then you'll
mess up the entries in the OS'es key-managers. While this might not be
a problem in practice for US-English users, we've seen for Git for
Windows that e.g. some Asians are really unhappy with their username
becoming some completely incomprehensible string.

Saying "Hey, I'm just passing on what I got here, don't blame the
messenger" isn't a solution, it's adding to the problem. I'm sorry if
I'm a bit blunt here, but being a pass-through is a sorry excuse. User
names and passwords are text-strings by definition; "name" and "word"
both imply it. No one has a binary blob of 0xbadf00d as a username or
password.

Now, my suggestion was to define that these are UTF-8; if we pass it
UTF-8 it will still give back UTF-8. But it can be converted
internally so the OS can show the credential in it's manager.

>> Sure, saying "at least ASCII" helps, but it just takes us
>> halfway there. And, I think UTF-8 is the least insane option here.
>
> Of course. UTF-8 is pretty much always the least insane option, and I
> would hope that everybody is using it. But by being a pass-through, we
> don't restrict it. If the client and server both want to use Shift JIS,
> then the stock git helpers will work fine. On systems where the OS
> mandates a particular encoding (I didn't even check for the OS X helper,
> but let us imagine it will barf on non-UTF8 data), then I would expect
> everybody to be using that encoding, and the pass-through just works.
>
> If your helper storage really cares, then I think it's sane to assume
> you'll get UTF-8 input, and to pass that back out.

UTF-8-by-convention is just pretending the issue wasn't there; why
can't we simply fix it? I really don't think we should actively try to
inherit HTTP-auth's problems...

> But I really didn't
> want to open the can of worms that is having "getpass()" figure out what
> encoding it is getting from the user.

What can of worms? Shouldn't getpass simply get the password in the
current system locale? When the user inputs the string is exactly the
moment we want to convert the "external locale" into a known encoding.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-04-04 10:37       ` Erik Faye-Lund
@ 2012-04-04 11:23         ` Jeff King
  2012-04-04 17:44           ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Jeff King @ 2012-04-04 11:23 UTC (permalink / raw)
  To: Erik Faye-Lund; +Cc: git, msysgit

On Wed, Apr 04, 2012 at 12:37:25PM +0200, Erik Faye-Lund wrote:

> Sure, but this is a problem with HTTP-auth, we don't have to make it a
> problem of our credential-api as well. For instance,  we could try to
> send the credentials to the server as UTF-8 first, and if that fails
> re-try with ISO-8859-1. If neither works, we probably need more info;
> if it's worth it somebody could implement a per-remote locale or
> something. I'm not saying we should, but it's something that could be
> done if the HTTP-auth issue proves to be real.

Ugh. The retry thing is just hideous. I'd much rather wait for a real
solution like the draft rfc I mentioned, and then implement that (or
better yet, it is probably something that curl would implement). But in
the meantime, I haven't seen a single report on the list that what git
is doing is a problem, and I'd much rather come to the same solution
that browsers do than invent some git-specific junk.

> Losing the encoding EARLY isn't any better than trying to preserve it
> for as long as possible, and only lose it when we KNOW we're sending
> this down a path where the encoding is undefined?

I see what you're saying, but there are two points against this:

  1. We don't necessarily even know what the original encoding is.

  2. All code paths currently end in a place where we drop the encoding
     (http auth and certificate passwords). So it's sort of a non-issue
     at this point.

> And passing UTF-8 data when the encoding is undefined is probably one
> of the less insane things to attempt, no?

Not necessarily. If the user types latin1 and their webserver expects
latin1, then we will be breaking their setup. Preserving the bytes as-is
means we will do the right thing in that case. It means we will do the
wrong thing when the user types latin1 and the webserver expects utf-8,
but that is _already_ broken even without credential helpers. Have we
seen any bug reports?

> The problem is that for some helpers you need to know the encoding in
> order to have a user-interface show the data. This goes both for OSX
> and Windows. If you mess up the encoding, in the helper, then you'll
> mess up the entries in the OS'es key-managers. While this might not be
> a problem in practice for US-English users, we've seen for Git for
> Windows that e.g. some Asians are really unhappy with their username
> becoming some completely incomprehensible string.

How are they inputting their username? If they are seeing garbage, then
surely it is coming to git in some encoding that is different than what
the password manager expects. Then we give it to a helper, which hands
it to system storage, which assumes that what we give it is utf-8. So
the error there seems to me to be the layer between the helper and the
system storage, where the string is converted from "a stream of bytes"
to "a utf-8 encoded string".

So why is this a credential helper protocol problem, and not an
implementation issue for the specific helper? How does git find out the
encoding? Presumably by checking $LANG, or whatever system-specific
locale information there is. Can't the helper do the same thing?

> Saying "Hey, I'm just passing on what I got here, don't blame the
> messenger" isn't a solution, it's adding to the problem. I'm sorry if
> I'm a bit blunt here, but being a pass-through is a sorry excuse. User
> names and passwords are text-strings by definition; "name" and "word"
> both imply it. No one has a binary blob of 0xbadf00d as a username or
> password.

I never argued that people are putting in binary blobs. I do argue that
we don't necessarily know the encoding of the input, nor what encoding
is expected for the output. So by preserving bytes, we can at least not
make the situation worse.  The username git sees is some bytes from a
file handle. How do we know what encoding it's in? The concept of "text
string" is not well-defined in the Unix world, and data tends to be
considered as streams of bytes.

We can guess from $LANG, though I'm not sure how foolproof that is in
practice (more on that below).

> Now, my suggestion was to define that these are UTF-8; if we pass it
> UTF-8 it will still give back UTF-8. But it can be converted
> internally so the OS can show the credential in it's manager.

Which implies that git must be converting what the user gives us into
UTF-8. And then converting it back (to what?) when we get the data back
from the helper, before we send it to the webserver? Or do we always
send UTF-8 to the webserver? Won't that break setups that currently
work?

> What can of worms? Shouldn't getpass simply get the password in the
> current system locale? When the user inputs the string is exactly the
> moment we want to convert the "external locale" into a known encoding.

Is that always accurate on every system? If so, then why don't we do it
in every place that we take input from the user? I suspect the answer is
a fear that we would end up corrupting data as often as we end up
helping.

Maybe that fear is unfounded (it is not my decision, but rather general
git design). Maybe this is just a Unix thing, where the attitude is
usually that data is a stream of bytes. I don't know.

-Peff

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH/RFC] contrib: add win32 credential-helper
  2012-04-04 11:23         ` Jeff King
@ 2012-04-04 17:44           ` Junio C Hamano
  0 siblings, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2012-04-04 17:44 UTC (permalink / raw)
  To: Jeff King; +Cc: Erik Faye-Lund, git, msysgit

Jeff King <peff@peff.net> writes:

> On Wed, Apr 04, 2012 at 12:37:25PM +0200, Erik Faye-Lund wrote:
> ...
>> The problem is that for some helpers you need to know the encoding in
>> order to have a user-interface show the data. This goes both for OSX
>> and Windows. If you mess up the encoding, in the helper, then you'll
>> mess up the entries in the OS'es key-managers. While this might not be
>> a problem in practice for US-English users, we've seen for Git for
>> Windows that e.g. some Asians are really unhappy with their username
>> becoming some completely incomprehensible string.
>
> How are they inputting their username? If they are seeing garbage, then
> surely it is coming to git in some encoding that is different than what
> the password manager expects. Then we give it to a helper, which hands
> it to system storage, which assumes that what we give it is utf-8. So
> the error there seems to me to be the layer between the helper and the
> system storage, where the string is converted from "a stream of bytes"
> to "a utf-8 encoded string".
>
> So why is this a credential helper protocol problem, and not an
> implementation issue for the specific helper? How does git find out the
> encoding? Presumably by checking $LANG, or whatever system-specific
> locale information there is. Can't the helper do the same thing?

FWIW, this matches exactly my reaction.

If a helper on a specific system somehow *knows* the user means to give
UTF-8, uses system UI that interacts with the user and gives UTF-16 back,
and uses system store that takes UCS-2, it is entirely sensible for the
helper to convert between these encodings internally and use UTF-8 when
interacting with credential-helper API which only expects "stream of
bytes".  But that is just a specific implementation detail of the helper.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-04-04 17:44 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-19 23:06 [PATCH/RFC] contrib: add win32 credential-helper Erik Faye-Lund
2012-03-23 21:10 ` Jeff King
2012-04-02 15:53   ` Erik Faye-Lund
2012-04-03  8:44     ` Erik Faye-Lund
2012-04-04 10:10     ` Jeff King
2012-04-04 10:37       ` Erik Faye-Lund
2012-04-04 11:23         ` Jeff King
2012-04-04 17:44           ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.