All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
To: Victoria Dye <vdye@github.com>
Cc: Victoria Dye via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, derrickstolee@github.com,
	johannes.schindelin@gmx.de
Subject: Re: [PATCH 0/7] Generalize 'scalar diagnose' into 'git bugreport --diagnose'
Date: Wed, 03 Aug 2022 14:34:04 +0200	[thread overview]
Message-ID: <220803.86pmhhfpjb.gmgdl@evledraar.gmail.com> (raw)
In-Reply-To: <392ae6f1-c2f1-ab0f-c832-f8e53800506a@github.com>


On Tue, Aug 02 2022, Victoria Dye wrote:

> Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Mon, Aug 01 2022, Victoria Dye via GitGitGadget wrote:
>> 
>>> [...] I didn't see a major UX benefit of 'git diagnose' vs 'git
>>> bugreport --diagnose', so I went with the latter, simpler approach.
>> 
>> I really wanted to like this, but I find the end result here really
>> confusing from a UX perspective.
>> 
>> You can now run "git bugreport --diagnose", which creates a giant *.zip
>> file to go along with your *.txt, but your *.txt makes no reference to
>> it.
>> 
>> Should you ... attach it to your bug report to this mailing list, do
>> something else?
>> 
>> The documentation doesn't offer much in the way of hints, other than
>> suggesting (with --no-report) that this --diagnose is for something
>> entirely different (and that's how "scalar" uses it).
>> 
>> I know what it's really for after reading this series, but for "git
>> bugreport" in particular we should be really careful about not making
>> the UX confusing.
>> 
>> The generated *.zip contains some really deep info about your repo (and
>> not just metadata, e.g. copies of the index, various logs etc.), someone
>> e.g. in a proprietary setting really doesn't want to be sharing that
>> info.
>> 
>> So I would like to see real integration into "git bugreport", i.e. for
>> us to smartly report more repository metrics, e.g. approx number of
>> loose objects, the sort of state "__git_ps1" might report, etc.
>> 
>> But I think the end-state here makes things much more confusing for
>> users.
>> 
>
> The "confusing UX" you describe here doesn't seem to be an inherent issue
> with the implementation (nor is it as insurmountable as you're implying),

I'm not implying or saying that it's insurmountable.

I think in principle having such a mode in "git bugreport" would make
sense.

But the UX here seems to be an afterthought. So I wonder if we shouldn't
hold off on it and just have a new *--helper or something instead.

> it
> largely appears to be an issue of under-documentation. I'll improve that in
> V2 [1], but I want clarify what I was/am going for here as well.
>
> In the context of a bug report, the diagnostics are intended to be used as
> supplemental information to aid in debugging (i.e., attached with the report
> in the email to the list).

Per https://lore.kernel.org/git/?q=n%3Azip we don't block *.zip
attachments, but we have fairly low size limits (to the point of
blocking some large patches).

E.g. e448263716f (po/git.pot: this is now a generated file, 2022-05-26)
(https://lore.kernel.org/git/20220526145035.18958-7-worldhello.net@gmail.com/)
was blocked, and it's just over 500k.

Aside from the data-sharing issues that seems like a good addition to
git-bugreport, i.e. tell the user if the attachment would be blocked due
to its size...

> They're especially valuable when a bug reporter
> isn't very familiar with Git internals and they can't reproduce the issue. A
> lot of bugs can be investigated without those diagnostics, though, which is
> why '--diagnose' isn't "on" by default.
>
> There are also valid use-cases (beyond 'scalar diagnose') for '--no-report':
> someone requests more information after looking into an already-generated
> report, or a developer wants to investigate a bug on their own and use the
> diagnostics as a "starting point" to guide more in-depth debugging. 

Yes, it's useful in a lot of cases. I'm just saying that we really need
to bridge the gap of telling the user what they should be doing with
this new file....

> As for the proprietary data issue, I'd be open to having an option to
> configure which diagnostics a user wants (either something like '--diagnose
> <level>' or a separate option entirely). I'm pretty indifferent on the UI,
> though, so I'll defer to other contributors on 1) if they want that feature,
> and 2) what they think that should look like.

I think it's really important that a bug report feature doesn't submit
private data by default. The name of a "--diagnose" option heavily
implies some aggregate metrics etc.

We then attach what to the user are opaque binary files, but which
contain even the file contants of the repository (index files).

I think scalar was developer for "internal" use-cases where such sharing
wasn't an issue for anyone, but I don't want anyone to get in trouble
because they shared their proprietary source code on the ML while trying
to file a bug report, so defaults really matter.

Especially because if there's one thing we've learned from bug reports &
questions to this & git-users is that the one thing you can rely on is
that users routinely don't carefully scrutinize documentation before
trying out various features...

> [1] https://lore.kernel.org/git/f3235afe-25cc-21a4-fc35-56e35d6be0ce@github.com/
>
>>> An alternative implementation considered was creating a new 'git diagnose'
>>> builtin, but the new command would end up duplicating much of
>>> 'builtin/bugreport.c'.
>> 
>> It seems we always "return" from cmd_bugreport() quite quickly, and we
>> basically only share the code to create the output directory. Just
>> duplicating or sharing that seems like a much better approach for now
>> than creating the above UX confusion.
>> 
>> Note that you can also share code between multiple built-ins, even in
>> the same file (see e.g. builtin/{checkout,log}.c). So we could even
>> share something like the safe_create_leading_directories() calling code
>> in bugreport.c without libifying it.
>> 
>
> You deleted the part where I addressed this suggestion directly:

Yes, the "Although that..." sentence, but I commented on the UX
trade-off elsewhere.

>> Although that issue could be overcome with refactoring, I didn't see a
>> major UX benefit of 'git diagnose' vs 'git bugreport --diagnose', so I
>> went with the latter, simpler approach.
>
> And, in the process of writing down my thoughts on the UX above, I've become
> more convinced that including '--diagnose' in 'git bugreport' is the better
> way to present this functionality to users.

We're on the same page there, we're just discussing if/how to make the
end-to-end process clearer to users.

I don't think it's clear in v1, and is sorely missing something like the
discussion we have around "ANONYMIZING" in git-fast-export(1), and we
really should have a "safe by default". Everything I'm noting here would
be addressed by e.g.:

 * The *.txt output would say "you can additionally attach a diagnostic
   *.zip" file etc. etc., noting if/when the user would do that.

 * We'd have that --diagnostics be safe by default in the way
   --anonymize is, e.g. including stats about number of refs & the like,
   not their contents.

 * We could also have e.g. "--diagnostics --no-anonymize", or
   "--diagnostics=full" or whatever, which would be some approximation
   of the current output.

>>> Finally, despite 'scalar diagnose' now being nothing more than a wrapper for
>>> 'git bugreport --diagnose', it is not being deprecated in this series.
>>> Although deprecation -> removal could be a future cleanup effort, 'scalar
>>> diagnose' is kept around for now as an alias for users already accustomed to
>>> using it in 'scalar'.
>> 
>> We don't have a "make install" to get a "scalar" onto user's systems
>> yet, do we really need to worry about those users?
>> 
>> Or is this a reference to the out-of-tree version of "scalar", not
>> git.git's?
>> 
>
> In practice, it's the "out-of-tree Scalar" users that would care the most.
> That said, with Scalar in the Git tree (albeit 'contrib/' and not built by
> default), I think it's reasonable to want to avoid breaking changes if
> possible. The continued existence of 'scalar diagnose' wouldn't really be
> hurting anyone anyway, so there's no pressing need to deprecate it now.

I'm fine with keeping it, but just found the juxtaposition of doing that
& previous discussions about scalar to be surprising.

When it was submitted (from my recollection, I see
Documentation/technical/scalar.txt is rather light on the topic) the
idea was (and this is from my own recollection, I haven't dug up ML
references) that scalar would be "going away", and that the reason to
have it in-tree was to inform the design of the main "git" tooling for
the use-cases scalar is addressing.

So, a bit similar to Cogito (although that was never in-tree).

I'm fine with just keeping it as an alias for "git bugreport
--whatever", but think it would make even more sense to change the
interface a bit as we integrate it to the rest of git, and then "die()"
saying "'scalar xyz' is now 'git abc', mostly....".

Existing scalar users used to 'scalar xyz' would then try the new 'git
abc', and report if/how that fits their use-case.

Then at some point if we do that with all of scalar's features we'd be
at the point of 'git rm'-ing it, as it would have served its stated
purpose.

Just my 0.02, I think in any case having the scalar.txt doc's "Roadmap"
updated to address this would be very useful.

  reply	other threads:[~2022-08-03 13:02 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-01 21:14 [PATCH 0/7] Generalize 'scalar diagnose' into 'git bugreport --diagnose' Victoria Dye via GitGitGadget
2022-08-01 21:14 ` [PATCH 1/7] scalar: use "$GIT_UNZIP" in 'scalar diagnose' test Victoria Dye via GitGitGadget
2022-08-01 21:46   ` Junio C Hamano
2022-08-01 21:14 ` [PATCH 2/7] builtin/bugreport.c: create '--diagnose' option Victoria Dye via GitGitGadget
2022-08-01 22:16   ` Junio C Hamano
2022-08-02 15:40     ` Victoria Dye
2022-08-02  2:17   ` Ævar Arnfjörð Bjarmason
2022-08-01 21:14 ` [PATCH 3/7] builtin/bugreport.c: avoid size_t overflow Victoria Dye via GitGitGadget
2022-08-01 22:18   ` Junio C Hamano
2022-08-02 16:26     ` Victoria Dye
2022-08-02 20:51       ` Junio C Hamano
2022-08-02  2:03   ` Ævar Arnfjörð Bjarmason
2022-08-02 16:26     ` Victoria Dye
2022-08-03 12:25       ` Ævar Arnfjörð Bjarmason
2022-08-01 21:14 ` [PATCH 4/7] builtin/bugreport.c: add directory to archiver more gently Victoria Dye via GitGitGadget
2022-08-01 22:22   ` Junio C Hamano
2022-08-02 15:43     ` Victoria Dye
2022-08-01 21:14 ` [PATCH 5/7] builtin/bugreport.c: add '--no-report' option Victoria Dye via GitGitGadget
2022-08-01 22:31   ` Junio C Hamano
2022-08-02 19:46     ` Victoria Dye
2022-08-01 21:14 ` [PATCH 6/7] scalar: use 'git bugreport --diagnose' in 'scalar diagnose' Victoria Dye via GitGitGadget
2022-08-01 21:14 ` [PATCH 7/7] scalar: update technical doc roadmap Victoria Dye via GitGitGadget
2022-08-01 21:34 ` [PATCH 0/7] Generalize 'scalar diagnose' into 'git bugreport --diagnose' Junio C Hamano
2022-08-02  2:49 ` Ævar Arnfjörð Bjarmason
2022-08-02 19:48   ` Victoria Dye
2022-08-03 12:34     ` Ævar Arnfjörð Bjarmason [this message]
2022-08-04  1:45 ` [PATCH v2 00/10] Generalize 'scalar diagnose' into 'git diagnose' and " Victoria Dye via GitGitGadget
2022-08-04  1:45   ` [PATCH v2 01/10] scalar-diagnose: use "$GIT_UNZIP" in test Victoria Dye via GitGitGadget
2022-08-04  1:45   ` [PATCH v2 02/10] scalar-diagnose: avoid 32-bit overflow of size_t Victoria Dye via GitGitGadget
2022-08-04  1:45   ` [PATCH v2 03/10] scalar-diagnose: add directory to archiver more gently Victoria Dye via GitGitGadget
2022-08-04  6:19     ` Ævar Arnfjörð Bjarmason
2022-08-04 17:12       ` Junio C Hamano
2022-08-04 20:12         ` Ævar Arnfjörð Bjarmason
2022-08-04 21:09           ` Junio C Hamano
2022-08-04  1:45   ` [PATCH v2 04/10] scalar-diagnose: move 'get_disk_info()' to 'compat/' Victoria Dye via GitGitGadget
2022-08-04  1:45   ` [PATCH v2 05/10] scalar-diagnose: move functionality to common location Victoria Dye via GitGitGadget
2022-08-04  6:24     ` Ævar Arnfjörð Bjarmason
2022-08-04  1:45   ` [PATCH v2 06/10] builtin/diagnose.c: create 'git diagnose' builtin Victoria Dye via GitGitGadget
2022-08-04  6:27     ` Ævar Arnfjörð Bjarmason
2022-08-05 19:38       ` Derrick Stolee
2022-08-11 11:06         ` Ævar Arnfjörð Bjarmason
2022-08-05 19:11     ` Derrick Stolee
2022-08-04  1:45   ` [PATCH v2 07/10] builtin/diagnose.c: gate certain data behind '--all' Victoria Dye via GitGitGadget
2022-08-04  6:39     ` Ævar Arnfjörð Bjarmason
2022-08-04  1:45   ` [PATCH v2 08/10] builtin/bugreport.c: create '--diagnose' option Victoria Dye via GitGitGadget
2022-08-05 19:35     ` Derrick Stolee
2022-08-09 23:53       ` Victoria Dye
2022-08-10 12:52         ` Derrick Stolee
2022-08-10 16:13           ` Victoria Dye
2022-08-10 16:47             ` Derrick Stolee
2022-08-04  1:45   ` [PATCH v2 09/10] scalar-diagnose: use 'git diagnose --all' Victoria Dye via GitGitGadget
2022-08-04  6:54     ` Ævar Arnfjörð Bjarmason
2022-08-09 16:54       ` Victoria Dye
2022-08-04  1:45   ` [PATCH v2 10/10] scalar: update technical doc roadmap Victoria Dye via GitGitGadget
2022-08-04 17:22   ` [PATCH v2 00/10] Generalize 'scalar diagnose' into 'git diagnose' and 'git bugreport --diagnose' Junio C Hamano
2022-08-09 16:17     ` Victoria Dye
2022-08-09 16:50       ` Junio C Hamano
2022-08-10 23:34   ` [PATCH v3 00/11] " Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 01/11] scalar-diagnose: use "$GIT_UNZIP" in test Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 02/11] scalar-diagnose: avoid 32-bit overflow of size_t Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 03/11] scalar-diagnose: add directory to archiver more gently Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 04/11] scalar-diagnose: move 'get_disk_info()' to 'compat/' Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 05/11] scalar-diagnose: move functionality to common location Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 06/11] diagnose.c: add option to configure archive contents Victoria Dye via GitGitGadget
2022-08-11  0:16       ` Junio C Hamano
2022-08-12 17:00         ` Victoria Dye
2022-08-11 10:51       ` Ævar Arnfjörð Bjarmason
2022-08-11 15:43         ` Victoria Dye
2022-08-10 23:34     ` [PATCH v3 07/11] builtin/diagnose.c: create 'git diagnose' builtin Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 08/11] builtin/diagnose.c: add '--mode' option Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 09/11] builtin/bugreport.c: create '--diagnose' option Victoria Dye via GitGitGadget
2022-08-11 10:53       ` Ævar Arnfjörð Bjarmason
2022-08-11 15:40         ` Victoria Dye
2022-08-11 20:30           ` Ævar Arnfjörð Bjarmason
2022-08-10 23:34     ` [PATCH v3 10/11] scalar-diagnose: use 'git diagnose --mode=all' Victoria Dye via GitGitGadget
2022-08-10 23:34     ` [PATCH v3 11/11] scalar: update technical doc roadmap Victoria Dye via GitGitGadget
2022-08-12 20:10     ` [PATCH v4 00/11] Generalize 'scalar diagnose' into 'git diagnose' and 'git bugreport --diagnose' Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 01/11] scalar-diagnose: use "$GIT_UNZIP" in test Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 02/11] scalar-diagnose: avoid 32-bit overflow of size_t Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 03/11] scalar-diagnose: add directory to archiver more gently Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 04/11] scalar-diagnose: move 'get_disk_info()' to 'compat/' Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 05/11] scalar-diagnose: move functionality to common location Victoria Dye via GitGitGadget
2022-08-12 20:26         ` Junio C Hamano
2022-08-12 21:00           ` Victoria Dye
2022-08-12 21:20             ` Junio C Hamano
2022-08-12 20:10       ` [PATCH v4 06/11] diagnose.c: add option to configure archive contents Victoria Dye via GitGitGadget
2022-08-12 20:31         ` Junio C Hamano
2022-08-12 20:10       ` [PATCH v4 07/11] builtin/diagnose.c: create 'git diagnose' builtin Victoria Dye via GitGitGadget
2022-08-18 18:43         ` Ævar Arnfjörð Bjarmason
2022-08-18 19:12           ` Junio C Hamano
2022-08-12 20:10       ` [PATCH v4 08/11] builtin/diagnose.c: add '--mode' option Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 09/11] builtin/bugreport.c: create '--diagnose' option Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 10/11] scalar-diagnose: use 'git diagnose --mode=all' Victoria Dye via GitGitGadget
2022-08-12 20:10       ` [PATCH v4 11/11] scalar: update technical doc roadmap Victoria Dye via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=220803.86pmhhfpjb.gmgdl@evledraar.gmail.com \
    --to=avarab@gmail.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=johannes.schindelin@gmx.de \
    --cc=vdye@github.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.