git.vger.kernel.org archive mirror
* Consistent terminology: cached/staged/index
@ 2011-02-13 19:20 Piotr Krukowiecki
  2011-02-13 19:37 ` Jonathan Nieder
  2011-03-01 10:29 ` Jonathan Nieder
  0 siblings, 2 replies; 65+ messages in thread
From: Piotr Krukowiecki @ 2011-02-13 19:20 UTC (permalink / raw)
  To: git

Hi,

is there a plan for using one term instead of three to describe
operations on index?

From quick search:
* "add" mentions index and staging
* all commands except one take "--cached" only
* "diff" also takes "--staged"
* "diff" mentions index and staging
* "log" mentions index
* "reset" mentions index


-- 
Piotrek

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-13 19:20 Consistent terminology: cached/staged/index Piotr Krukowiecki
@ 2011-02-13 19:37 ` Jonathan Nieder
  2011-02-13 22:58   ` Junio C Hamano
  2011-03-01 10:29 ` Jonathan Nieder
  1 sibling, 1 reply; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-13 19:37 UTC (permalink / raw)
  To: Piotr Krukowiecki; +Cc: git

Piotr Krukowiecki wrote:

> is there a plan for using one term instead of three to describe
> operations on index?

No.  But ideas (and especially patches) for improving the
documentation would be appreciated.

> From quick search:
> * "add" mentions index and staging
> * all commands except one take "--cached" only
> * "diff" also takes "--staged"
> * "diff" mentions index and staging
> * "log" mentions index
> * "reset" mentions index

If I understand correctly, the intended semantics are:

--index versus --cached
~~~~~~~~~~~~~~~~~~~~~~~
The place where changes for the next commit get registered is called
the "index file".

Commands that pay attention to the registered content of files rather
than the copies in the work tree use the option name "--cached".  This
is mostly for historical reasons --- early on, it was not obvious that
making the index not match the worktree was going to be useful.

Commands that update the registered content of files in addition to
the worktree use the option name "--index".

--staged
~~~~~~~~
diff takes --staged, but that is only to support some people's habits.

The term "to stage" is generally an abbreviation for "to stage in the
index", meaning "to mark for use in the next commit".  It is used to
paint a certain picture of the process in which one makes sure
everything is just right before committing to the result.
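
The claim that --staged is just another spelling of --cached for diff can be checked by hand. What follows is only a sketch in a throwaway repository; the file name and the identity given to commit are made up for illustration.

```shell
# Throwaway repository; nothing here touches an existing checkout.
repo=$(mktemp -d)
git -C "$repo" init -q
echo one > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    commit -q -m initial

echo two >> "$repo/file.txt"
git -C "$repo" add file.txt        # register this change in the index
echo three >> "$repo/file.txt"     # a later edit, left only in the work tree

# Both spellings should report the same registered change.
cached=$(git -C "$repo" diff --cached)
staged=$(git -C "$repo" diff --staged)
[ "$cached" = "$staged" ] && echo "--staged and --cached agree"
rm -rf "$repo"
```

The unregistered "three" line shows up in neither output, since both options compare the index against HEAD.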

Hope that helps,
Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-02-13 19:37 ` Jonathan Nieder
@ 2011-02-13 22:58   ` Junio C Hamano
  2011-02-14  2:05     ` Miles Bader
                       ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Junio C Hamano @ 2011-02-13 22:58 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Piotr Krukowiecki, git

Jonathan Nieder <jrnieder@gmail.com> writes:

> If I understand correctly, the intended semantics are:
>
> --index versus --cached
> ~~~~~~~~~~~~~~~~~~~~~~~
> The place where changes for the next commit get registered is called
> the "index file".
>
> Commands that pay attention to the registered content of files rather
> than the copies in the work tree use the option name "--cached".  This
> is mostly for historical reasons --- early on, it was not obvious that
> making the index not match the worktree was going to be useful.
>
> Commands that update the registered content of files in addition to
> the worktree use the option name "--index".

Mostly correct, except the "early on, it was not obvious" part.  It was
very obvious from the early days that unlike "cvs commit" or "svn commit"
it was very useful that you can trust "git commit", after preparing the
index with what is and isn't to be included in the commit, won't pick up
debugging cruft you keep around in the working tree.

"cache" was an old name for the index (and is still the established name
in the code).  Some commands make sense to affect both the index and the
working tree (e.g. "apply"), and you give --index to mean "both the index
and the working tree".  Other operating modes make sense only to look at
the index, ignoring any difference between the working tree and the index
(e.g. again "apply"); these take only the cached changes into account and
are invoked with --cached to mean "look only at what is recorded in the
index".
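
To make the "apply" case concrete, here is a rough sketch in a scratch repository. The file name, patch file and identity are assumptions made up for the example; only "git apply --cached" and "git apply --index" come from the discussion.

```shell
repo=$(mktemp -d)
patch=$(mktemp)
git -C "$repo" init -q
git -C "$repo" config user.name Example
git -C "$repo" config user.email example@example.com
printf 'one\n' > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m initial

# Record a one-line change as a patch, then put the file back.
printf 'one\ntwo\n' > "$repo/file.txt"
git -C "$repo" diff > "$patch"
git -C "$repo" checkout -q -- file.txt

# --cached: only the index receives the change.
git -C "$repo" apply --cached "$patch"
worktree_after_cached=$(cat "$repo/file.txt")
index_after_cached=$(git -C "$repo" show :file.txt)

# Start over, then --index: the index and the working tree both receive it.
git -C "$repo" reset -q --hard
git -C "$repo" apply --index "$patch"
worktree_after_index=$(cat "$repo/file.txt")
index_after_index=$(git -C "$repo" show :file.txt)
rm -rf "$repo" "$patch"
```

After the --cached run the working tree copy still has one line while the index has two; after the --index run both have two.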

Some people may find it a good idea to introduce new synonyms --index-only
vs --index-and-working-tree. I personally am not opposed to such a change,
as long as traditional --cached vs --index will keep working for people
who already learned the difference.  These hypothetical new synonyms would
be more descriptive; the necessity to differentiate the two concepts the
two options --cached vs --index try to tell apart is very real, but it was
a hack to use these two particular words --cached vs --index to do so
without trying harder to come up with better words.


> --staged
> ~~~~~~~~
> diff takes --staged, but that is only to support some people's habits.

This one actually needs more historical background to understand why it is
there, as the synonym is not necessary to understand how git works.

Originally, the way to say "what is in the current working tree for this
path is what I want to have in the next commit" was "update-index".  "What
I want to have in the next commit" is "the index", and the operation is
about "updating" that "What I want to have...", so the name of the command
made perfect sense.  "update-index" had a safety valve to prevent careless
invocation of "update-index *" to add all the cruft in the working tree
(there wasn't any .gitignore mechanism in the Porcelain nor in the
plumbing) and by default affected only the paths that are already in the
index.  You needed to say "update-index --add" to include paths that are
not in the index.
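
The safety valve is still there in the plumbing and is easy to see in a scratch repository. A small sketch, with a made-up file name:

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
echo hello > "$repo/new.txt"

# Without --add, a path that is not yet in the index is refused.
if git -C "$repo" update-index new.txt 2>/dev/null; then
    plain=accepted
else
    plain=refused
fi

# With --add, the same path is registered.
git -C "$repo" update-index --add new.txt
tracked=$(git -C "$repo" ls-files)
echo "plain update-index: $plain; index now holds: $tracked"
rm -rf "$repo"
```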

A more user friendly Porcelain "git add" was later implemented in terms of
"update-index --add", but originally it was to add new paths; updating the
contents was still done via "update-index" interface.

This changed in v1.5.0, around the beginning of 2007.  Nicolas Pitre among
others realized that git is about tracking contents, not paths, which
meant that "make the content in the working tree at this moment appear in
the next commit" is equivalent to saying "add this _content_ to the set of
contents that make up the next commit".  "git add" learned to accept both
new paths that were not in the index so far and also paths known to the
index that had old contents for them.

Before v1.5.0, we explained the concept as "we update the set of contents
to be in the next commit" (hence "update-index"); since v1.5.0, we explain
the concept as "we add what's in these paths to the set of contents to be
in the next commit" (hence "add").
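
As an illustration of the post-v1.5.0 behaviour, the sketch below hands "git add" one path the index already knows and one it does not. The file names and identity are invented for the example.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name Example
git -C "$repo" config user.email example@example.com
echo v1 > "$repo/known.txt"
git -C "$repo" add known.txt
git -C "$repo" commit -q -m initial

echo v2 > "$repo/known.txt"    # path already in the index, new content
echo v1 > "$repo/fresh.txt"    # path the index has never seen
git -C "$repo" add known.txt fresh.txt

# Both kinds of path end up in the set of contents for the next commit.
status=$(git -C "$repo" diff --cached --name-status)
printf '%s\n' "$status"
rm -rf "$repo"
```

The name-status listing marks fresh.txt as added (A) and known.txt as modified (M).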

Notice that there is no need for a new terminology "staged" in the above
description?

The semantics of the index haven't changed since, modulo small tweaks
like "add -i" (I borrowed it from Darcs), which allows us to say "add parts
of the changed content" instead of "what's in the file as a whole right
now"; these small tweaks didn't introduce any conceptual change.

The term "stage" comes from "staging area", a term people used to explain
the concept of the index by saying "The index holds set of contents to be
made into the next commit; it is _like_ the staging area".

My feeling is that "to stage" is primarily used, outside "git" circle, as
a logistics term.  If you find it easier to visualize the concept of the
index with "staging area" ("an area where troops and equipment in transit
are assembled before a military operation"), you may find it easier to say
"stage this path ('git add path')", instead of "adding to the set of
contents...".

Although I tried to use the word myself in earlier days, I have never felt
that "staging area" is a very widely known term for non-native speakers of
English, and personally have tended to avoid using it.  I find "adding to
the set of contents..." somewhat easier to understand regardless of your
language background, but it may be just me who is not a native speaker.

In short, "stage" is an unessential synonym that came much later, and that
is why we avoid advertising it even in the document of "git diff" too
heavily.  Unlike the hypothetical --index-only synonym for --cached I
mentioned earlier that adds real value by being more descriptive, "staged"
does not add much value over what it tried to replace.


* Re: Consistent terminology: cached/staged/index
  2011-02-13 22:58   ` Junio C Hamano
@ 2011-02-14  2:05     ` Miles Bader
  2011-02-14  5:57       ` Junio C Hamano
  2011-02-14  3:09     ` Pete Harlan
  2011-02-14 22:32     ` Piotr Krukowiecki
  2 siblings, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14  2:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Piotr Krukowiecki, git

Junio C Hamano <gitster@pobox.com> writes:
> Some people may find it a good idea to introduce new synonyms --index-only
> vs --index-and-working-tree. I personally am not opposed to such a change,

Those are so long that nobody will ever use them though...

One of my big peeves is simply that "git diff --cached" is too long, as
it's an extremely common command (the name isn't exactly intuitive, even
after many years of use, but it's just one of those things you
memorize).

Is there a reason a short version of --cached couldn't be added to
git-diff...?  E.g. "git diff -c"?

Thanks,

-Miles

-- 
Ocean, n. A body of water covering seven-tenths of a world designed for Man -
who has no gills.


* Re: Consistent terminology: cached/staged/index
  2011-02-13 22:58   ` Junio C Hamano
  2011-02-14  2:05     ` Miles Bader
@ 2011-02-14  3:09     ` Pete Harlan
  2011-02-16 23:11       ` Drew Northup
  2011-02-27 21:16       ` Aghiles
  2011-02-14 22:32     ` Piotr Krukowiecki
  2 siblings, 2 replies; 65+ messages in thread
From: Pete Harlan @ 2011-02-14  3:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Piotr Krukowiecki, git

On 02/13/2011 02:58 PM, Junio C Hamano wrote:
>> --staged
>> ~~~~~~~~
>> diff takes --staged, but that is only to support some people's habits.
> The term "stage" comes from "staging area", a term people used to explain
> the concept of the index by saying "The index holds set of contents to be
> made into the next commit; it is _like_ the staging area".
> 
> My feeling is that "to stage" is primarily used, outside "git" circle, as
> a logistics term.  If you find it easier to visualize the concept of the
> index with "staging area" ("an area where troops and equipment in transit
> are assembled before a military operation"), you may find it easier to say
> "stage this path ('git add path')", instead of "adding to the set of
> contents...".

FWIW, when teaching Git I have found that users immediately understand
"staging area", while "index" and "cache" confuse them.

"Index" means to them a numerical index into a data structure.
"Cache" is a local copy of something that exists remotely.  Neither
word describes the concept correctly from a user's perspective.

I learned long ago to type "index" and "cached", but when talking (and
thinking) about Git I find "the staging area" gets the point across
very clearly and moves Git from interesting techie-tool to
world-dominating SCM territory.  I'm surprised that that experience
isn't universal.

--Pete


* Re: Consistent terminology: cached/staged/index
  2011-02-14  2:05     ` Miles Bader
@ 2011-02-14  5:57       ` Junio C Hamano
  2011-02-14  6:27         ` Miles Bader
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2011-02-14  5:57 UTC (permalink / raw)
  To: Miles Bader; +Cc: Jonathan Nieder, Piotr Krukowiecki, git

Miles Bader <miles@gnu.org> writes:

> Is there a reason a short version of --cached couldn't be added to
> git-diff...?  E.g. "git diff -c"?

I'd suspect that we would like to keep the door open for "diff -c" to do
what the users naturally expect, namely, to produce a patch in the copied
context format.

I don't immediately plan to do so myself, though.


* Re: Consistent terminology: cached/staged/index
  2011-02-14  5:57       ` Junio C Hamano
@ 2011-02-14  6:27         ` Miles Bader
  2011-02-14  6:59           ` Johannes Sixt
  0 siblings, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14  6:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, Piotr Krukowiecki, git

On Mon, Feb 14, 2011 at 5:57 AM, Junio C Hamano <gitster@pobox.com> wrote:
>> Is there a reason a short version of --cached couldn't be added to
>> git-diff...?  E.g. "git diff -c"?
>
> I'd suspect that we would like to keep the door open for "diff -c" to do
> what the users naturally expect, namely, to produce a patch in the copied
> context format.

hmm

"git diff -s"  ? ... since --staged is an alias for --cached :)

-miles

-- 
Cat is power.  Cat is peace.


* Re: Consistent terminology: cached/staged/index
  2011-02-14  6:27         ` Miles Bader
@ 2011-02-14  6:59           ` Johannes Sixt
  2011-02-14  7:07             ` Miles Bader
  0 siblings, 1 reply; 65+ messages in thread
From: Johannes Sixt @ 2011-02-14  6:59 UTC (permalink / raw)
  To: Miles Bader; +Cc: Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

Am 2/14/2011 7:27, schrieb Miles Bader:
> "git diff -s"  ? ... since --staged is an alias for --cached :)

git config --global alias.diffc "diff --cached"

?
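
A sketch of how such an alias behaves, using repository-local configuration instead of --global so that nobody's real settings are touched (that substitution, and the file name, are assumptions for the example):

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
# Same alias as above, but stored in this repository's config only.
git -C "$repo" config alias.diffc "diff --cached"

echo hello > "$repo/file.txt"
git -C "$repo" add file.txt

via_alias=$(git -C "$repo" diffc --name-only)
direct=$(git -C "$repo" diff --cached --name-only)
[ "$via_alias" = "$direct" ] && echo "alias and long form agree: $via_alias"
rm -rf "$repo"
```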

-- Hannes


* Re: Consistent terminology: cached/staged/index
  2011-02-14  6:59           ` Johannes Sixt
@ 2011-02-14  7:07             ` Miles Bader
  2011-02-14 10:42               ` Michael J Gruber
  0 siblings, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14  7:07 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

On Mon, Feb 14, 2011 at 6:59 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Am 2/14/2011 7:27, schrieb Miles Bader:
>> "git diff -s"  ? ... since --staged is an alias for --cached :)
>
> git config --global alias.diffc "diff --cached"

"Git should be convenient by default (for commonly used operations)"

-miles

-- 
Cat is power.  Cat is peace.


* Re: Consistent terminology: cached/staged/index
  2011-02-14  7:07             ` Miles Bader
@ 2011-02-14 10:42               ` Michael J Gruber
  2011-02-14 11:04                 ` Miles Bader
  2011-02-14 13:14                 ` Nguyen Thai Ngoc Duy
  0 siblings, 2 replies; 65+ messages in thread
From: Michael J Gruber @ 2011-02-14 10:42 UTC (permalink / raw)
  To: Miles Bader
  Cc: Johannes Sixt, Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

Miles Bader venit, vidit, dixit 14.02.2011 08:07:
> On Mon, Feb 14, 2011 at 6:59 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
>> Am 2/14/2011 7:27, schrieb Miles Bader:
>>> "git diff -s"  ? ... since --staged is an alias for --cached :)
>>
>> git config --global alias.diffc "diff --cached"
> 
> "Git should be convenient by default (for commonly used operations)"
> 
> -miles
> 

git diff --ca<TAB>

;)

At least if "by default" includes using the default bash completion by
default.

Short options should really not be "wasted" easily. "-s" named after "to
stage" is really problematic, as outlined in this thread. It's mainly
used (and has been introduced, I think) by "the other git community", so
to say. I feel that sticking to established terminology (esp. that used
in man pages and command messages) is more helpful for newbies. That
does not exclude using new terms for explaining that terminology, of course.

The term "stage" is in git's documentation all over the place - and
denotes the different versions of a blob involved in a merge.
Admittedly, that's something recorded in the index.
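
For readers who have not met that other meaning of "stage", here is a sketch that provokes a conflict in a scratch repository and lists the stage numbers with "git ls-files --stage". Branch name "side", the file name and the identity are made up for the example.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name Example
git -C "$repo" config user.email example@example.com
echo base > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m base
trunk=$(git -C "$repo" symbolic-ref --short HEAD)

# Change the same line differently on two branches.
git -C "$repo" checkout -q -b side
echo side > "$repo/file.txt"
git -C "$repo" commit -q -a -m side
git -C "$repo" checkout -q "$trunk"
echo trunk > "$repo/file.txt"
git -C "$repo" commit -q -a -m trunk

# The merge is expected to stop with a conflict.
git -C "$repo" merge side >/dev/null 2>&1 || true

# Third column of "ls-files --stage" is the stage number.
stages=$(git -C "$repo" ls-files --stage | awk '{print $3}' | tr '\n' ' ')
echo "stages recorded for file.txt: $stages"
rm -rf "$repo"
```

During the conflict the index holds three entries for the one path: stage 1 (common ancestor), stage 2 (our side) and stage 3 (their side).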

Full disclaimer: I have an alias "staged" for "diff --cached" myself...

Michael


* Re: Consistent terminology: cached/staged/index
  2011-02-14 10:42               ` Michael J Gruber
@ 2011-02-14 11:04                 ` Miles Bader
  2011-02-14 17:12                   ` Junio C Hamano
  2011-02-14 13:14                 ` Nguyen Thai Ngoc Duy
  1 sibling, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14 11:04 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Johannes Sixt, Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

Michael J Gruber <git@drmicha.warpmail.net> writes:
> Short options should really not be "wasted" easily. "-s" named after "to
> stage" is really problematic, as outlined in this thread.

Er, but the point is that this is _such_ a common operation, that a
short option for it would not be "wasted" at all.  [The whole concept of
"wasting" short options doesn't even make sense unless you're willing to
then use the resulting "preserved" options eventually...]

Indeed it seems a little weird that there's not one for this already,
given how common short options are in git generally, often for far less
useful options than --cached/--staged; I can only guess that the reason
is basically historical accident.

As for the exact letter chosen, "-s" seems perfectly fine to me.  Short
options do not need to be "perfect" to be useful, and the connection
with --staged is a perfectly plausible memory aid for that short period
during which people memorize them.

-Miles

-- 
The secret to creativity is knowing how to hide your sources.
  --Albert Einstein


* Re: Consistent terminology: cached/staged/index
  2011-02-14 10:42               ` Michael J Gruber
  2011-02-14 11:04                 ` Miles Bader
@ 2011-02-14 13:14                 ` Nguyen Thai Ngoc Duy
  2011-02-14 13:43                   ` Michael J Gruber
  1 sibling, 1 reply; 65+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-14 13:14 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Miles Bader, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki,
	git, Junio C Hamano

On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Full disclaimer: I have an alias "staged" for "diff --cached" myself...

Be careful with your fingers. There's a command named "git stage".
-- 
Duy


* Re: Consistent terminology: cached/staged/index
  2011-02-14 13:14                 ` Nguyen Thai Ngoc Duy
@ 2011-02-14 13:43                   ` Michael J Gruber
  2011-02-14 13:57                     ` Nguyen Thai Ngoc Duy
  2011-02-14 14:17                     ` Felipe Contreras
  0 siblings, 2 replies; 65+ messages in thread
From: Michael J Gruber @ 2011-02-14 13:43 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Miles Bader, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki,
	git, Junio C Hamano

Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
> 
> Be careful with your fingers. There's a command named "git stage".

I know. Can we remove it as part of 1.8.0? It's our only builtin alias.

Michael


* Re: Consistent terminology: cached/staged/index
  2011-02-14 13:43                   ` Michael J Gruber
@ 2011-02-14 13:57                     ` Nguyen Thai Ngoc Duy
  2011-02-14 14:17                     ` Felipe Contreras
  1 sibling, 0 replies; 65+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-14 13:57 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Miles Bader, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki,
	git, Junio C Hamano

On Mon, Feb 14, 2011 at 8:43 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>> <git@drmicha.warpmail.net> wrote:
>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>
>> Be careful with your fingers. There's a command named "git stage".
>
> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.

It's out in the field. I don't think we can just simply remove it.
It'd be nice though to have a mechanism to override (or even remove,
in your case) builtin commands, or at least porcelain ones. A feature
with a big "your feet is expected to be shot by yourself" warning.
-- 
Duy


* Re: Consistent terminology: cached/staged/index
  2011-02-14 13:43                   ` Michael J Gruber
  2011-02-14 13:57                     ` Nguyen Thai Ngoc Duy
@ 2011-02-14 14:17                     ` Felipe Contreras
  2011-02-14 14:21                       ` Nguyen Thai Ngoc Duy
  2011-02-14 15:24                       ` Michael J Gruber
  1 sibling, 2 replies; 65+ messages in thread
From: Felipe Contreras @ 2011-02-14 14:17 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Nguyen Thai Ngoc Duy, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>> <git@drmicha.warpmail.net> wrote:
>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>
>> Be careful with your fingers. There's a command named "git stage".
>
> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.

I have proposed before to extend 'git stage', so you can do 'git stage
diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
would not conflict with the old behavior of 'git stage $file'.

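# Dispatch on the first argument; anything else is treated as a path to add.
# "$@" is quoted so that paths containing spaces survive intact.
case "$1" in
add)
        shift
        git add "$@"
        ;;
rm)
        shift
        # Drop the path from the index only; the working tree copy stays.
        git rm --cached "$@"
        ;;
diff)
        shift
        git diff --cached "$@"
        ;;
import)
        shift
        # Register every modified or untracked (non-ignored) path.
        git ls-files --modified --others --exclude-standard -z "$@" | \
        git update-index --add --remove -z --stdin
        ;;
ls)
        shift
        git ls-files --stage "$@"
        ;;
*)
        git add "$@"
        ;;
esac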

Cheers.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-02-14 14:17                     ` Felipe Contreras
@ 2011-02-14 14:21                       ` Nguyen Thai Ngoc Duy
  2011-02-14 14:40                         ` Jakub Narebski
  2011-02-14 15:24                       ` Michael J Gruber
  1 sibling, 1 reply; 65+ messages in thread
From: Nguyen Thai Ngoc Duy @ 2011-02-14 14:21 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Michael J Gruber, Miles Bader, Johannes Sixt, Jonathan Nieder,
	Piotr Krukowiecki, git, Junio C Hamano

On Mon, Feb 14, 2011 at 9:17 PM, Felipe Contreras
<felipe.contreras@gmail.com> wrote:
> On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>>> <git@drmicha.warpmail.net> wrote:
>>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>>
>>> Be careful with your fingers. There's a command named "git stage".
>>
>> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.
>
> I have proposed before to extend 'git stage', so you can do 'git stage
> diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
> would not conflict with the old behavior of 'git stage $file'.

It does. What if I want to stage a file named "add", "rm" or "diff"?
-- 
Duy


* Re: Consistent terminology: cached/staged/index
  2011-02-14 14:21                       ` Nguyen Thai Ngoc Duy
@ 2011-02-14 14:40                         ` Jakub Narebski
  0 siblings, 0 replies; 65+ messages in thread
From: Jakub Narebski @ 2011-02-14 14:40 UTC (permalink / raw)
  To: Nguyen Thai Ngoc Duy
  Cc: Felipe Contreras, Michael J Gruber, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

Nguyen Thai Ngoc Duy <pclouds@gmail.com> writes:
> On Mon, Feb 14, 2011 at 9:17 PM, Felipe Contreras
> <felipe.contreras@gmail.com> wrote:
>> On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
>> <git@drmicha.warpmail.net> wrote:
>>> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>>>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>>>> <git@drmicha.warpmail.net> wrote:

>>>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>>>
>>>> Be careful with your fingers. There's a command named "git stage".
>>>
>>> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.
>>
>> I have proposed before to extend 'git stage', so you can do 'git stage
>> diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
>> would not conflict with the old behavior of 'git stage $file'.
> 
> It does. What if I want to stage a file named "add", "rm" or "diff"?

Then you would use

  $ git stage ./diff

or

  $ git stage -- diff

(or even "git add diff" ;-)).

P.S. I haven't checked that above work...
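
One way to check is sketched below. It uses a cut-down stand-in for the proposed dispatcher as a shell function; the name stage_dispatch is hypothetical, and only the "diff" arm and the fallback are reproduced.

```shell
# Hypothetical stand-in for the proposed "git stage" dispatcher.
stage_dispatch() {
    case "$1" in
    diff)
        shift
        git diff --cached "$@"
        ;;
    *)
        git add "$@"
        ;;
    esac
}

repo=$(mktemp -d)
git -C "$repo" init -q
echo hello > "$repo/diff"    # files that share their names with subcommands
echo hello > "$repo/rm"
(
    cd "$repo" || exit 1
    stage_dispatch ./diff    # "./diff" is not the word "diff", so it is added
    stage_dispatch -- rm     # "--" reaches the fallback: git add -- rm
)
staged=$(git -C "$repo" ls-files | tr '\n' ' ')
echo "paths in the index: $staged"
rm -rf "$repo"
```

Both spellings reach the fallback arm in this sketch, so both files get added. A bare "stage_dispatch diff" would still be taken as the subcommand, which is the collision being discussed.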

-- 
Jakub Narebski
Poland
ShadeHawk on #git


* Re: Consistent terminology: cached/staged/index
  2011-02-14 14:17                     ` Felipe Contreras
  2011-02-14 14:21                       ` Nguyen Thai Ngoc Duy
@ 2011-02-14 15:24                       ` Michael J Gruber
  2011-02-14 16:00                         ` Felipe Contreras
  1 sibling, 1 reply; 65+ messages in thread
From: Michael J Gruber @ 2011-02-14 15:24 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Nguyen Thai Ngoc Duy, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

Felipe Contreras venit, vidit, dixit 14.02.2011 15:17:
> On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>>> <git@drmicha.warpmail.net> wrote:
>>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>>
>>> Be careful with your fingers. There's a command named "git stage".
>>
>> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.
> 
> I have proposed before to extend 'git stage', so you can do 'git stage
> diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
> would not conflict with the old behavior of 'git stage $file'.
> 
> case "$1" in
> add)
>         shift
>         git add $@
>         ;;
> rm)
>         shift
>         git rm --cached $@
>         ;;
> diff)
>         shift
>         git diff --cached $@
>         ;;
> import)
>         shift
>         git ls-files --modified --others --exclude-standard -z $@ | \
>         git update-index --add --remove -z --stdin
>         ;;
> ls)
>         shift
>         git ls-files --stage $@
>         ;;
> *)
>         git add $@
>         ;;
> esac
> 
> Cheers.
> 

In principle I like this a lot: a set of commands operating on/with the
stage/index/cache consistently. I think it's similar in (good) spirit
to our earlier attempts at INDEX and WORKTREE pseudo-revs, trying to
give that somewhat nebulous (for noobs) index a more concrete
"appearance", not hidden away in options (--index, --cached) and
defaults (diff against index by default).
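
To spell out the defaults mentioned here, the sketch below counts the added lines reported by three forms of "git diff" in a scratch repository. The file name and identity are invented; the counting with grep is just a convenience for the example.

```shell
repo=$(mktemp -d)
git -C "$repo" init -q
git -C "$repo" config user.name Example
git -C "$repo" config user.email example@example.com
echo one > "$repo/file.txt"
git -C "$repo" add file.txt
git -C "$repo" commit -q -m initial

echo two >> "$repo/file.txt"
git -C "$repo" add file.txt       # "two" is now in the index
echo three >> "$repo/file.txt"    # "three" exists only in the work tree

# Count added content lines; the "+++" header line is excluded.
count_added() { grep -c '^+[^+]'; }
unstaged=$(git -C "$repo" diff | count_added)            # work tree vs index
staged=$(git -C "$repo" diff --cached | count_added)     # index vs HEAD
both=$(git -C "$repo" diff HEAD | count_added)           # work tree vs HEAD
echo "work tree vs index: $unstaged, index vs HEAD: $staged, work tree vs HEAD: $both"
rm -rf "$repo"
```

Plain "git diff" sees only the "three" line, --cached sees only the "two" line, and "git diff HEAD" sees both.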

In our case, however, I think the design principle deviates from our
common form:

git foo bar

usually means "do foo" to "bar", as most of our common commands are
verbs (being applied to the object "bar"). When it comes to subcommands
we do have inconsistencies already (double-dashed vs. undashed, e.g.),
but I'd prefer fewer ;)

Michael


* Re: Consistent terminology: cached/staged/index
  2011-02-14 15:24                       ` Michael J Gruber
@ 2011-02-14 16:00                         ` Felipe Contreras
  2011-02-14 16:04                           ` Michael J Gruber
  0 siblings, 1 reply; 65+ messages in thread
From: Felipe Contreras @ 2011-02-14 16:00 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Nguyen Thai Ngoc Duy, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

On Mon, Feb 14, 2011 at 5:24 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Felipe Contreras venit, vidit, dixit 14.02.2011 15:17:
>> On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
>> <git@drmicha.warpmail.net> wrote:
>>> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>>>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>>>> <git@drmicha.warpmail.net> wrote:
>>>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>>>
>>>> Be careful with your fingers. There's a command named "git stage".
>>>
>>> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.
>>
>> I have proposed before to extend 'git stage', so you can do 'git stage
>> diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
>> would not conflict with the old behavior of 'git stage $file'.

[...]

> In principle I like this a lot: a set of commands operating on/with the
> stage/index/cache consistently. I think it's similar in (good) spirit
> to our earlier attempts at INDEX and WORKTREE pseudo-revs, trying to
> give that somewhat nebulous (for noobs) index a more concrete
> "appearance", not hidden away in options (--index, --cached) and
> defaults (diff against index by default).
>
> In our case, however, I think the design principle deviates from our
> common form:
>
> git foo bar
>
> usually means "do foo" to "bar", as most of our common commands are
> verbs (being applied to the object "bar"). When it comes to subcommands
> we do have inconsistencies already (double-dashed vs. undashed, e.g.),
> but I'd prefer fewer ;)

Except 'git branch', 'git tag', 'git remote', 'git stash', and 'git
submodule'. In fact, every logical object in git seems to have their
own command, except the stage.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-02-14 16:00                         ` Felipe Contreras
@ 2011-02-14 16:04                           ` Michael J Gruber
  2011-02-14 16:27                             ` Felipe Contreras
  0 siblings, 1 reply; 65+ messages in thread
From: Michael J Gruber @ 2011-02-14 16:04 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Nguyen Thai Ngoc Duy, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

Felipe Contreras venit, vidit, dixit 14.02.2011 17:00:
> On Mon, Feb 14, 2011 at 5:24 PM, Michael J Gruber
> <git@drmicha.warpmail.net> wrote:
>> Felipe Contreras venit, vidit, dixit 14.02.2011 15:17:
>>> On Mon, Feb 14, 2011 at 3:43 PM, Michael J Gruber
>>> <git@drmicha.warpmail.net> wrote:
>>>> Nguyen Thai Ngoc Duy venit, vidit, dixit 14.02.2011 14:14:
>>>>> On Mon, Feb 14, 2011 at 5:42 PM, Michael J Gruber
>>>>> <git@drmicha.warpmail.net> wrote:
>>>>>> Full disclaimer: I have an alias "staged" for "diff --cached" myself...
>>>>>
>>>>> Be careful with your fingers. There's a command named "git stage".
>>>>
>>>> I know. Can we remove it as part of 1.8.0? It's our only builtin alias.
>>>
>>> I have proposed before to extend 'git stage', so you can do 'git stage
>>> diff', or if you alias 'git stage' to 'git s', just 'git s diff'. This
>>> would not conflict with the old behavior of 'git stage $file'.
> 
> [...]
> 
>> In principle I like this a lot: a set of commands operating on/with the
>> stage/index/cache consistently. I think it's similar in (good) spirit
>> to our earlier attempts at INDEX and WORKTREE pseudo-revs, trying to
>> give that somewhat nebulous (for noobs) index a more concrete
>> "appearance", not hidden away in options (--index, --cached) and
>> defaults (diff against index by default).
>>
>> In our case, however, I think the design principle deviates from our
>> common form:
>>
>> git foo bar
>>
>> usually means "do foo" to "bar", as most of our common commands are
>> verbs (being applied to the object "bar"). When it comes to subcommands
>> we do have inconsistencies already (double-dashed vs. undashed, e.g.),
>> but I'd prefer fewer ;)
> 
> Except 'git branch', 'git tag', 'git remote', 'git stash', and 'git
> submodule'. In fact, every logical object in git seems to have their
> own command, except the stage.
> 

Yes, remote, stash and submodule are the ones with the different
subcommand handling I mentioned: the subcommand is the verb, and
specified undashed.

We have other commands with double-dashed (i.e. option) subcommands,
such as "branch --set-upstream", and others single-dashed, such as "tag -v".

Note that branch, tag and stash are verbs as well as nouns.

I just think that "git verb object" is the more prevalent order, so that
we should move in that direction if we want to make things better. Other
than that I would have no objection to "git object verb".

Michael


* Re: Consistent terminology: cached/staged/index
  2011-02-14 16:04                           ` Michael J Gruber
@ 2011-02-14 16:27                             ` Felipe Contreras
  0 siblings, 0 replies; 65+ messages in thread
From: Felipe Contreras @ 2011-02-14 16:27 UTC (permalink / raw)
  To: Michael J Gruber
  Cc: Nguyen Thai Ngoc Duy, Miles Bader, Johannes Sixt,
	Jonathan Nieder, Piotr Krukowiecki, git, Junio C Hamano

On Mon, Feb 14, 2011 at 6:04 PM, Michael J Gruber
<git@drmicha.warpmail.net> wrote:
> Felipe Contreras venit, vidit, dixit 14.02.2011 17:00:
>> Except 'git branch', 'git tag', 'git remote', 'git stash', and 'git
>> submodule'. In fact, every logical object in git seems to have their
>> own command, except the stage.
>
> Yes, remote, stash and submodule are the ones with the different
> subcommand handling I mentioned: the subcommand is the verb, and
> specified undashed.
>
> We have other commands with double-dashed (i.e. option) subcommands,
> such as "branch --set-upstream", and others single-dashed, such as "tag -v".
>
> Note that branch, tag and stash are verbs as well as nouns.

So is stage.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-02-14 11:04                 ` Miles Bader
@ 2011-02-14 17:12                   ` Junio C Hamano
  2011-02-14 22:07                     ` Miles Bader
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2011-02-14 17:12 UTC (permalink / raw)
  To: Miles Bader
  Cc: Michael J Gruber, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki, git

Miles Bader <miles@gnu.org> writes:

> Michael J Gruber <git@drmicha.warpmail.net> writes:
>> Short options should really not be "wasted" easily. "-s" named after "to
>> stage" is really problematic, as outlined in this thread.
>
> Er, but the point is that this is _such_ a common operation, that a
> short option for it would not be "wasted" at all.

True, but I am afraid "-c" is not it, as it would certainly be confusing
to users who know what "diff" does before they learn "git diff".

And I'd like to also keep "-i" open for "ignore case", which I actually
wished for the other day while reviewing a topic.  Unlike "-c", I might
implement it myself in the not too distant future when I find time.

Using "-I" (as an abbreviation for "index-only") is tempting, though.

Both "-i" and "-I" are GNU extensions, and the latter traditionally was
useful primarily to ignore cruft left in the file with use of "$Id$", but
we actively discourage its use in git controlled projects, so taking it
over might not be such a big issue.


* Re: Consistent terminology: cached/staged/index
  2011-02-14 17:12                   ` Junio C Hamano
@ 2011-02-14 22:07                     ` Miles Bader
  2011-02-14 22:59                       ` Junio C Hamano
  0 siblings, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14 22:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael J Gruber, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki, git

On Tue, Feb 15, 2011 at 2:12 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Miles Bader <miles@gnu.org> writes:
>> Michael J Gruber <git@drmicha.warpmail.net> writes:
>>> Short options should really not be "wasted" easily. "-s" named after "to
>>> stage" is really problematic, as outlined in this thread.
>>
>> Er, but the point is that this is _such_ a common operation, that a
>> short option for it would not be "wasted" at all.
>
> True, but I am afraid "-c" is not it, as it would certainly be confusing
> to users who know what "diff" does before they learn "git diff".

Er...?

Here we were talking about using "-s" (inspired by "--staged"), which
I suggested because you earlier objected to "-c"...

-miles

-- 
Cat is power.  Cat is peace.


* Re: Consistent terminology: cached/staged/index
  2011-02-13 22:58   ` Junio C Hamano
  2011-02-14  2:05     ` Miles Bader
  2011-02-14  3:09     ` Pete Harlan
@ 2011-02-14 22:32     ` Piotr Krukowiecki
  2011-02-14 23:19       ` Jonathan Nieder
  2 siblings, 1 reply; 65+ messages in thread
From: Piotr Krukowiecki @ 2011-02-14 22:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jonathan Nieder, git

On Sun, Feb 13, 2011 at 11:58 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Jonathan Nieder <jrnieder@gmail.com> writes:
[...]

Thanks for the explanation.

My point is:
1. using multiple terms is confusing
2. using non-descriptive terms is confusing (or at least steepens the
   learning curve)

Ideally only one should be used - the rest should be obsoleted/hidden from
the end user.

Example from git-status:

   - 'git status' outputs <<use "git reset HEAD <file>..." to unstage>>
     But in the man page there is nothing about staging!

   - the output does not mention "index" at all - only files tracked,
     untracked, to be committed

   - man page talks about index or index file exclusively, e.g.:
     "differences between the index file and the current HEAD commit",
     "updated in index", "added to index"


In other places "index" is called "staging area" and the act of updating the
index is called "staging in the index".

I ask: why do we need the "index" term at all?

   - instead of "index" use "staging" and "staging area"
   - instead of "listed in index" use "staged" or "tracked"

What is used internally is one thing, but what the end user (not git developer)
sees does not have to be related.

(I'm not sure about the "tracked vs staged" - maybe we should again get rid of
one of them, at least in some cases.)

In fact it's not that important what it is called, as long as it meets the
points from the beginning of the mail.


As you can see I'm advocating for the use of the "staging" term after all.
I'm new to git and a non-native English speaker. "Staging" seems the clearest
of all the terms. You may see it differently, but please take into
consideration that you are accustomed to the current terms.

"Staging" gives me the feeling of changing states - from working tree to
real commit - which I believe is the purpose of it.


"Caching" means something used e.g. to improve performance. You can read the
cache or update it from the original item - but the cache is just a function
of the original content.

Probably the most common place where users meet a "cache" is the browser
cache. You clear the cache, you set the limit of the cache size, but you don't
expect it to be important. Definitely unlike the "cache" in git.


I didn't like "index" at all. At first I could not understand why you
had chosen such a name. Additionally, in many places it's called "index file".
That increased the confusion - why would I care if it's a file or not?

Now I see you can understand it as indexing files that should be managed by git,
or indexing changes to be introduced. But I still like "staging" better.


I've updated docs for several basic commands to see how it would feel to have
a "staging area" instead of an "index file" - and it's not bad IMO. It was
basically an automatic search&replace, so the result can be improved.



-- 8< --
From: Piotr Krukowiecki <piotr.krukowiecki.news@gmail.com>
Date: Mon, 14 Feb 2011 23:20:07 +0100
Subject: [PATCH] Changed index term to staging area

---
 Documentation/git-add.txt    |   66 +++++++++++++++++++++---------------------
 Documentation/git-apply.txt  |   40 ++++++++++++------------
 Documentation/git-commit.txt |   14 ++++----
 Documentation/git-diff.txt   |   22 +++++++-------
 Documentation/git-status.txt |   22 +++++++-------
 5 files changed, 82 insertions(+), 82 deletions(-)

diff --git a/Documentation/git-add.txt b/Documentation/git-add.txt
index a03448f..54a50b7 100644
--- a/Documentation/git-add.txt
+++ b/Documentation/git-add.txt
@@ -3,7 +3,7 @@ git-add(1)

 NAME
 ----
-git-add - Add file contents to the index
+git-add - Add file contents to the staging area

 SYNOPSIS
 --------
@@ -15,23 +15,23 @@ SYNOPSIS

 DESCRIPTION
 -----------
-This command updates the index using the current content found in
+This command updates the staging area using the current content found in
 the working tree, to prepare the content staged for the next commit.
 It typically adds the current content of existing paths as a whole,
 but with some options it can also be used to add content with
 only part of the changes made to the working tree files applied, or
 remove paths that do not exist in the working tree anymore.

-The "index" holds a snapshot of the content of the working tree, and it
+The "staging area" holds a snapshot of the content of the working tree, and it
 is this snapshot that is taken as the contents of the next commit.  Thus
 after making any changes to the working directory, and before running
 the commit command, you must use the `add` command to add any new or
-modified files to the index.
+modified files to the staging area.

 This command can be performed multiple times before a commit.  It only
 adds the content of the specified file(s) at the time the add command is
 run; if you want subsequent changes included in the next commit, then
-you must run `git add` again to add the new content to the index.
+you must run `git add` again to add the new content to the staging area.

 The `git status` command can be used to obtain a summary of which
 files have changes that are staged for the next commit.
@@ -72,39 +72,39 @@ OPTIONS
 -i::
 --interactive::
 	Add modified contents in the working tree interactively to
-	the index. Optional path arguments may be supplied to limit
+	the staging area. Optional path arguments may be supplied to limit
 	operation to a subset of the working tree. See ``Interactive
 	mode'' for details.

 -p::
 --patch::
-	Interactively choose hunks of patch between the index and the
-	work tree and add them to the index. This gives the user a chance
+	Interactively choose hunks of patch between the staging area and the
+	work tree and add them to the staging area. This gives the user a chance
 	to review the difference before adding modified contents to the
-	index.
+	staging area.
 +
 This effectively runs `add --interactive`, but bypasses the
 initial command menu and directly jumps to the `patch` subcommand.
 See ``Interactive mode'' for details.

 -e, \--edit::
-	Open the diff vs. the index in an editor and let the user
+	Open the diff vs. the staging area in an editor and let the user
 	edit it.  After the editor was closed, adjust the hunk headers
-	and apply the patch to the index.
+	and apply the patch to the staging area.
 +
 The intent of this option is to pick and choose lines of the patch to
 apply, or even to modify the contents of lines to be staged. This can be
 quicker and more flexible than using the interactive hunk selector.
 However, it is easy to confuse oneself and create a patch that does not
-apply to the index. See EDITING PATCHES below.
+apply to the staging area. See EDITING PATCHES below.

 -u::
 --update::
 	Only match <filepattern> against already tracked files in
-	the index rather than the working tree. That means that it
+	the staging area rather than the working tree. That means that it
 	will never stage new files, but that it will stage modified
 	new contents of tracked files and that it will remove files
-	from the index if the corresponding files in the working tree
+	from the staging area if the corresponding files in the working tree
 	have been removed.
 +
 If no <filepattern> is given, default to "."; in other words,
@@ -114,21 +114,21 @@ subdirectories.
 -A::
 --all::
 	Like `-u`, but match <filepattern> against files in the
-	working tree in addition to the index. That means that it
+	working tree in addition to the staging area. That means that it
 	will find new files as well as staging modified content and
 	removing files that are no longer in the working tree.

 -N::
 --intent-to-add::
 	Record only the fact that the path will be added later. An entry
-	for the path is placed in the index with no content. This is
+	for the path is placed in the staging area with no content. This is
 	useful for, among other things, showing the unstaged content of
 	such files with `git diff` and committing them with `git commit
 	-a`.

 --refresh::
 	Don't add the file(s), but only refresh their stat()
-	information in the index.
+	information in the staging area.

 --ignore-errors::
 	If some files could not be added because of errors indexing
@@ -205,8 +205,8 @@ The main command loop has 6 subcommands (plus help and quit).

 status::

-   This shows the change between HEAD and index (i.e. what will be
-   committed if you say `git commit`), and between index and
+   This shows the change between HEAD and staging area (i.e. what will be
+   committed if you say `git commit`), and between staging area and
    working tree files (i.e. what you could stage further before
    `git commit` using `git add`) for each path.  A sample output
    looks like this:
@@ -219,11 +219,11 @@ status::
 +
 It shows that foo.png has differences from HEAD (but that is
 binary so line count cannot be shown) and there is no
-difference between indexed copy and the working tree
+difference between staged copy and the working tree
 version (if the working tree version were also different,
 'binary' would have been shown in place of 'nothing').  The
 other file, git-add{litdd}interactive.perl, has 403 lines added
-and 35 lines deleted if you commit what is in the index, but
+and 35 lines deleted if you commit what is in the staging area, but
 working tree file has further modifications (one addition and
 one deletion).

@@ -254,7 +254,7 @@ Update>> -2
 ------------
 +
 After making the selection, answer with an empty line to stage the
-contents of working tree files for selected paths in the index.
+contents of working tree files for selected paths in the staging area.

 revert::

@@ -265,12 +265,12 @@ revert::
 add untracked::

   This has a very similar UI to 'update' and
-  'revert', and lets you add untracked paths to the index.
+  'revert', and lets you add untracked paths to the staging area.

 patch::

   This lets you choose one path out of a 'status' like selection.
-  After choosing the path, it presents the diff between the index
+  After choosing the path, it presents the diff between the staging area
   and the working tree file and asks you if you want to stage
   the change of each hunk.  You can say:

@@ -290,12 +290,12 @@ patch::
        ? - print help
 +
 After deciding the fate for all hunks, if there is any hunk
-that was chosen, the index is updated with the selected hunks.
+that was chosen, the staging area is updated with the selected hunks.

 diff::

   This lets you review what will be committed (i.e. between
-  HEAD and index).
+  HEAD and staging area).


 EDITING PATCHES
@@ -303,10 +303,10 @@ EDITING PATCHES

 Invoking `git add -e` or selecting `e` from the interactive hunk
 selector will open a patch in your editor; after the editor exits, the
-result is applied to the index. You are free to make arbitrary changes
+result is applied to the staging area. You are free to make arbitrary changes
 to the patch, but note that some changes may have confusing results, or
 even result in a patch that cannot be applied.  If you want to abort the
-operation entirely (i.e., stage nothing new in the index), simply delete
+operation entirely (i.e., stage nothing new in the staging area), simply delete
 all lines of the patch. The list below describes some common things you
 may see in a patch, and which editing operations make sense on them.

@@ -327,13 +327,13 @@ Modified content is represented by "-" lines (removing the old content)
 followed by "{plus}" lines (adding the replacement content). You can
 prevent staging the modification by converting "-" lines to " ", and
 removing "{plus}" lines. Beware that modifying only half of the pair is
-likely to introduce confusing changes to the index.
+likely to introduce confusing changes to the staging area.
 --

 There are also more complex operations that can be performed. But beware
-that because the patch is applied only to the index and not the working
-tree, the working tree will appear to "undo" the change in the index.
-For example, introducing a new line into the index that is in neither
+that because the patch is applied only to the staging area and not the working
+tree, the working tree will appear to "undo" the change in the staging area.
+For example, introducing a new line into the staging area that is in neither
 the HEAD nor the working tree will stage the new line for commit, but
 the line will appear to be reverted in the working tree.

@@ -342,7 +342,7 @@ Avoid using these constructs, or do so with extreme caution.
 --
 removing untouched content::

-Content which does not differ between the index and working tree may be
+Content which does not differ between the staging area and working tree may be
 shown on context lines, beginning with a " " (space).  You can stage
 context lines for removal by converting the space to a "-". The
 resulting working tree file will appear to re-add the content.
diff --git a/Documentation/git-apply.txt b/Documentation/git-apply.txt
index 881652f..9b5a037 100644
--- a/Documentation/git-apply.txt
+++ b/Documentation/git-apply.txt
@@ -3,16 +3,16 @@ git-apply(1)

 NAME
 ----
-git-apply - Apply a patch to files and/or to the index
+git-apply - Apply a patch to files and/or to the staging area


 SYNOPSIS
 --------
 [verse]
-'git apply' [--stat] [--numstat] [--summary] [--check] [--index]
+'git apply' [--stat] [--numstat] [--summary] [--check] [--staged]
 	  [--apply] [--no-add] [--build-fake-ancestor=<file>] [-R | --reverse]
 	  [--allow-binary-replacement | --binary] [--reject] [-z]
-	  [-p<n>] [-C<n>] [--inaccurate-eof] [--recount] [--cached]
+	  [-p<n>] [-C<n>] [--inaccurate-eof] [--recount] [--staged-only]
 	  [--ignore-space-change | --ignore-whitespace ]
 	  [--whitespace=(nowarn|warn|fix|error|error-all)]
 	  [--exclude=<path>] [--include=<path>] [--directory=<root>]
@@ -21,8 +21,8 @@ SYNOPSIS
 DESCRIPTION
 -----------
 Reads the supplied diff output (i.e. "a patch") and applies it to files.
-With the `--index` option the patch is also applied to the index, and
-with the `--cache` option the patch is only applied to the index.
+With the `--staged` option the patch is also applied to the staging area, and
+with the `--staged-only` option the patch is only applied to the staging area.
 Without these options, the command applies the patch only to files,
 and does not require them to be in a git repository.

@@ -55,32 +55,32 @@ OPTIONS

 --check::
 	Instead of applying the patch, see if the patch is
-	applicable to the current working tree and/or the index
-	file and detects errors.  Turns off "apply".
+	applicable to the current working tree and/or the staging
+	area and detects errors.  Turns off "apply".

---index::
+--staged::
 	When `--check` is in effect, or when applying the patch
 	(which is the default when none of the options that
 	disables it is in effect), make sure the patch is
-	applicable to what the current index file records.  If
+	applicable to what the current staging area records.  If
 	the file to be patched in the working tree is not
 	up-to-date, it is flagged as an error.  This flag also
-	causes the index file to be updated.
+	causes the staging area to be updated.

---cached::
+--staged-only::
 	Apply a patch without touching the working tree. Instead take the
-	cached data, apply the patch, and store the result in the index
-	without using the working tree. This implies `--index`.
+	staged data, apply the patch, and store the result in the staging area
+	without using the working tree. This implies `--staged`.

 --build-fake-ancestor=<file>::
-	Newer 'git diff' output has embedded 'index information'
+	Newer 'git diff' output has embedded 'staging area information'
 	for each blob to help identify the original version that
 	the patch applies to.  When this flag is given, and if
 	the original versions of the blobs are available locally,
-	builds a temporary index containing those blobs.
+	builds a temporary staging area containing those blobs.
 +
-When a pure mode change is encountered (which has no index information),
-the information is read from the current index instead.
+When a pure mode change is encountered (which has no staging area information),
+the information is read from the current staging area instead.

 -R::
 --reverse::
@@ -236,13 +236,13 @@ Submodules
 If the patch contains any changes to submodules then 'git apply'
 treats these changes as follows.

-If `--index` is specified (explicitly or implicitly), then the submodule
-commits must match the index exactly for the patch to apply.  If any
+If `--staged` is specified (explicitly or implicitly), then the submodule
+commits must match the staging area exactly for the patch to apply.  If any
 of the submodules are checked-out, then these check-outs are completely
 ignored, i.e., they are not required to be up-to-date or clean and they
 are not updated.

-If `--index` is not specified, then the submodule commits in the patch
+If `--staged` is not specified, then the submodule commits in the patch
 are ignored and only the absence or presence of the corresponding
 subdirectory is checked and (if possible) updated.

diff --git a/Documentation/git-commit.txt b/Documentation/git-commit.txt
index b586c0f..728b2cf 100644
--- a/Documentation/git-commit.txt
+++ b/Documentation/git-commit.txt
@@ -16,26 +16,26 @@ SYNOPSIS

 DESCRIPTION
 -----------
-Stores the current contents of the index in a new commit along
+Stores the current contents of the staging area in a new commit along
 with a log message from the user describing the changes.

 The content to be added can be specified in several ways:

 1. by using 'git add' to incrementally "add" changes to the
-   index before using the 'commit' command (Note: even modified
+   staging area before using the 'commit' command (Note: even modified
    files must be "added");

 2. by using 'git rm' to remove files from the working tree
-   and the index, again before using the 'commit' command;
+   and the staging area, again before using the 'commit' command;

 3. by listing files as arguments to the 'commit' command, in which
-   case the commit will ignore changes staged in the index, and instead
+   case the commit will ignore changes staged in the staging area, and instead
    record the current content of the listed files (which must already
    be known to git);

 4. by using the -a switch with the 'commit' command to automatically
    "add" changes from all known files (i.e. all files that are already
-   listed in the index) and to automatically "rm" files in the index
+   tracked) and to automatically "rm" tracked files
    that have been removed from the working tree, and then perform the
    actual commit;

@@ -273,8 +273,8 @@ EXAMPLES
 --------
 When recording your own work, the contents of modified files in
 your working tree are temporarily stored to a staging area
-called the "index" with 'git add'.  A file can be
-reverted back, only in the index but not in the working tree,
+ with 'git add'.  A file can be
+reverted back, only in the staging area but not in the working tree,
 to that of the last commit with `git reset HEAD -- <file>`,
 which effectively reverts 'git add' and prevents the changes to
 this file from participating in the next commit.  After building
diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 4910510..eab118a 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -10,29 +10,29 @@ SYNOPSIS
 --------
 [verse]
 'git diff' [options] [<commit>] [--] [<path>...]
-'git diff' [options] --cached [<commit>] [--] [<path>...]
+'git diff' [options] --staged [<commit>] [--] [<path>...]
 'git diff' [options] <commit> <commit> [--] [<path>...]
-'git diff' [options] [--no-index] [--] <path> <path>
+'git diff' [options] [--not-staged] [--] <path> <path>

 DESCRIPTION
 -----------
-Show changes between the working tree and the index or a tree, changes
-between the index and a tree, changes between two trees, or changes
+Show changes between the working tree and the staging area or a tree, changes
+between the staging area and a tree, changes between two trees, or changes
 between two files on disk.

 'git diff' [--options] [--] [<path>...]::

 	This form is to view the changes you made relative to
-	the index (staging area for the next commit).  In other
+	the staging area for the next commit.  In other
 	words, the differences are what you _could_ tell git to
-	further add to the index but you still haven't.  You can
+	further add to the staging area but you still haven't.  You can
 	stage these changes by using linkgit:git-add[1].
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --not-staged.

-'git diff' [--options] --cached [<commit>] [--] [<path>...]::
+'git diff' [--options] --staged [<commit>] [--] [<path>...]::

 	This form is to view the changes you staged for the next
 	commit relative to the named <commit>.  Typically you
@@ -40,7 +40,7 @@ directories. This behavior can be forced by --no-index.
 	do not give <commit>, it defaults to HEAD.
 	If HEAD does not exist (e.g. unborned branches) and
 	<commit> is not given, it shows all staged changes.
-	--staged is a synonym of --cached.
+	--cached is a synonym of --staged, will be removed in version 2.0 (or whatever).

 'git diff' [--options] <commit> [--] [<path>...]::

@@ -102,12 +102,12 @@ Various ways to check your working tree::
 +
 ------------
 $ git diff            <1>
-$ git diff --cached   <2>
+$ git diff --staged   <2>
 $ git diff HEAD       <3>
 ------------
 +
 <1> Changes in the working tree not yet staged for the next commit.
-<2> Changes between the index and your last commit; what you
+<2> Changes between the staging area and your last commit; what you
 would be committing if you run "git commit" without "-a" option.
 <3> Changes in the working tree since your last commit; what you
 would be committing if you run "git commit -a"
diff --git a/Documentation/git-status.txt b/Documentation/git-status.txt
index dae190a..65aa798 100644
--- a/Documentation/git-status.txt
+++ b/Documentation/git-status.txt
@@ -12,9 +12,9 @@ SYNOPSIS

 DESCRIPTION
 -----------
-Displays paths that have differences between the index file and the
+Displays paths that have differences between the staging area and the
 current HEAD commit, paths that have differences between the working
-tree and the index file, and paths in the working tree that are not
+tree and the staging area, and paths in the working tree that are not
 tracked by git (and are not ignored by linkgit:gitignore[5]). The first
 are what you _would_ commit by running `git commit`; the second and
 third are what you _could_ commit by running 'git add' before running
@@ -91,7 +91,7 @@ In short-format, the status of each path is shown as

 where `PATH1` is the path in the `HEAD`, and ` -> PATH2` part is
 shown only when `PATH1` corresponds to a different path in the
-index/worktree (i.e. the file is renamed). The 'XY' is a two-letter
+staging area/worktree (i.e. the file is renamed). The 'XY' is a two-letter
 status code.

 The fields (including the `->`) are separated from each other by a
@@ -102,7 +102,7 @@ interior special characters backslash-escaped.

 For paths with merge conflicts, `X` and 'Y' show the modification
 states of each side of the merge. For paths that do not have merge
-conflicts, `X` shows the status of the index, and `Y` shows the status
+conflicts, `X` shows the status of the staging area, and `Y` shows the status
 of the work tree.  For untracked paths, `XY` are `??`.  Other status
 codes can be interpreted as follows:

@@ -119,13 +119,13 @@ Ignored files are not listed.
     X          Y     Meaning
     -------------------------------------------------
               [MD]   not updated
-    M        [ MD]   updated in index
-    A        [ MD]   added to index
-    D         [ M]   deleted from index
-    R        [ MD]   renamed in index
-    C        [ MD]   copied in index
-    [MARC]           index and work tree matches
-    [ MARC]     M    work tree changed since index
+    M        [ MD]   updated in staging area
+    A        [ MD]   added to staging area
+    D         [ M]   deleted from staging area
+    R        [ MD]   renamed in staging area
+    C        [ MD]   copied in staging area
+    [MARC]           staging area and work tree matches
+    [ MARC]     M    work tree changed since staging area
     [ MARC]     D    deleted in work tree
     -------------------------------------------------
     D           D    unmerged, both deleted
-- 
1.7.4.1.26.g00e6e


* Re: Consistent terminology: cached/staged/index
  2011-02-14 22:07                     ` Miles Bader
@ 2011-02-14 22:59                       ` Junio C Hamano
  2011-02-14 23:47                         ` Miles Bader
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2011-02-14 22:59 UTC (permalink / raw)
  To: Miles Bader
  Cc: Michael J Gruber, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki, git

Miles Bader <miles@gnu.org> writes:

> Er...?
>
> Here we were talking about using "-s" (inspired by "--staged"), which
> I suggested because you earlier objected to "-c"...

Not _we were_, but _you_ were.

I actually was hoping that it was obvious that -s is a non-starter from the
messages so far in this thread, as neither --cached nor its more
descriptive spelling --index-only has the character 's' anywhere in it, and
we have been keeping --staged as a low-key synonym for a reason.


* Re: Consistent terminology: cached/staged/index
  2011-02-14 22:32     ` Piotr Krukowiecki
@ 2011-02-14 23:19       ` Jonathan Nieder
  2011-02-15  8:29         ` Pete Harlan
                           ` (2 more replies)
  0 siblings, 3 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-14 23:19 UTC (permalink / raw)
  To: Piotr Krukowiecki; +Cc: Junio C Hamano, git

Hi again,

Piotr Krukowiecki wrote:

> In other places "index" is called "staging area" and act of updating the index
> is called "staging in the index".
>
> I ask: why do we need the "index" term at all?
>
>    - instead of "index" use "staging" and "staging area"
>    - instead of "listed in index" use "staged" or "tracked"

Unlike "staging area", the word "index" is unfamiliar and opaque.  So
there is a sense that there is something to learn.

When people talk about the staging area I tend to get confused.  I
think there's an idea that because it sounds more concrete, there is
less to explain --- or maybe I am just wired the wrong way.

There is a .git/index file, with a well defined file format.  And
there is an in-core copy of the index, too.  It contains:

 - mode and blob name for paths as requested by the user with
   "git add"

 - competing versions for paths whose proposed content is
   uncertain during a merge

 - stat(2) information to speed up comparison with the worktree

There are some other pieces, too --- "intent-to-add" entries added
with "git add -N", cached tree names for unmodified subtrees to
speed up "git commit", and so on.  But the 3 pieces listed above are
the main thing.

"Staging area" only describes the first.
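
The first two pieces can be seen directly with "git ls-files --stage".
What follows is only a rough sketch, assuming a throwaway repository
made with mktemp; the file name, branch name "side" and commit messages
are made up for illustration.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Example User"

echo base > file
git add file
git commit -q -m base

git checkout -q -b side
echo "from side" > file
git commit -q -a -m side

git checkout -q -                  # back to the original branch
echo "from the original branch" > file
git commit -q -a -m ours

# One entry per path, at stage 0: mode, blob name, stage, path
before=$(git ls-files --stage)

# The merge conflicts, so "git merge" exits non-zero
git merge side >/dev/null 2>&1 || true

# The conflicted path now has stages 1 (common ancestor), 2 (ours), 3 (theirs)
during=$(git ls-files --stage)
stages=$(echo "$during" | cut -f1 | cut -d' ' -f3 | tr -d '\n')

echo "$before"
echo "$during"
```

The stat(2) data is not part of this listing; only the mode, blob name
and stage number of each entry are printed.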

All that said, I am not against formulations like "content of the next
commit" that might be more concrete from a user's point of view.

[...]
>  --refresh::
>  	Don't add the file(s), but only refresh their stat()
> -	information in the index.
> +	information in the staging area.

git add/update-index --refresh are precisely meant for _not_ changing
the content of the next commit, so this particular change seems
confusing.

Hoping that is clearer.  Thanks for caring.
Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-02-14 22:59                       ` Junio C Hamano
@ 2011-02-14 23:47                         ` Miles Bader
  2011-02-15  0:12                           ` Junio C Hamano
  0 siblings, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-14 23:47 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Michael J Gruber, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki, git

On Tue, Feb 15, 2011 at 7:59 AM, Junio C Hamano <gitster@pobox.com> wrote:
> I actually was hoping that it was obvious that -s is a non-starter from the
> messages so far in this thread, as neither --cached nor its more
> descriptive spelling --index-only has character 's' anywhere in it, and we
> have been keeping --staged as a low-key synonym for a reason.

It was not at all obvious.  Even if you like --cached more than
--staged, there's a difference between advocating "--staged", and
using "-s" as a short-option for the operation which --cached /
--staged invoke.

Short option names are often a compromise, because clearly there are
often conflicts.  That _doesn't_ mean that one should simply not have
a short option, when a "perfect" choice cannot be found.  If a
"perfect" short-option isn't available, then usually one turns to
somewhat less perfect choices, trying to at least find some heuristic
that can make them easier to memorize -- because in the end, short
options must be memorized (and if they are truly common operations,
this isn't generally difficult; it's memorizing _rarely_ used short
options that's hard).

Of the various choices, "-s" does at least have such a heuristic
connection to an appropriate long option ("-i" is arguably worse than
-s, because it doesn't have any such connection...).  Can you suggest
something better?

[BTW, isn't the name "--index-only" something of a misnomer?  If
something is called "--XXX-only", that implies that the default
operation uses "XXX + something else" instead of XXX, but that
otherwise they are the same.  However in fact the difference in
behavior resulting from --cached is more subtle: it changes _both_
sides of the diff (default: worktree<->index; --cached: index<->HEAD).
 The names --cached and --staged actually capture this well -- they
basically say "the default is worktree changes, and --cached/--staged
diffs cached/staged changes instead" -- but the name "--index-only"
does not.]

-Miles

-- 
Cat is power.  Cat is peace.


* Re: Consistent terminology: cached/staged/index
  2011-02-14 23:47                         ` Miles Bader
@ 2011-02-15  0:12                           ` Junio C Hamano
  0 siblings, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2011-02-15  0:12 UTC (permalink / raw)
  To: Miles Bader
  Cc: Michael J Gruber, Johannes Sixt, Jonathan Nieder, Piotr Krukowiecki, git

Miles Bader <miles@gnu.org> writes:

> [BTW, isn't the name "--index-only" something of a misnomer?  If
> something is called "--XXX-only", that implies that the default
> operation uses "XXX + something else" instead of XXX, but that
> otherwise they are the same.  However in fact the difference in
> behavior resulting from --cached is more subtle: it changes _both_
> sides of the diff (default: worktree<->index; --cached: index<->HEAD).

Not really.

There are three entities involved: a tree-ish, the index, and the working
tree.  Because the index is a singleton, when you say "compare the index
with...", you only have two choices, either compare it against a tree-ish,
or compare it with the working tree.  If you want to do the latter, you
just use the command with neither --cached nor a tree-ish.

The --cached form defaults to HEAD only because --cached mode is about
comparing the index against a tree-ish (think about "diff --cached HEAD^").

The same thing for --index-only.  The moment you said "compare the index
with...", there are only two other things to compare it against and that
is the only reason why you do not have to write HEAD.

This is a tangent, but the natural patch-flow is for you to prepare your
change in the working tree, add the changes to the index, and then build a
tree out of the index into a commit.

That is why "diff" shows changes in the working tree relative to what is
in the index, "diff --cached [<tree-ish>]" shows changes in the index
relative to the tree-ish (defaulting to HEAD).  The natural flow of the
development determines the natural direction of comparison between these
entities.

It does not make sense to compare in the other direction (i.e. how is the
index different compared to the working tree) _unless_ you are
contemplating to revert some changes you have made, and -R is there
exactly for that reason (here I am responding to the idea some people had
in an earlier incarnation of this thread of saying "diff INDEX HEAD",
"diff HEAD WORKTREE" etc., using pseudo <ref> syntax, and explaining why
it is not such a good idea---and why this is a tangent).
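
The directions of comparison described above can be tried out in a
scratch repository. This is a minimal sketch, assuming a temporary
directory from mktemp; the file name and its contents are placeholders.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Example User"

echo "version 1" > file
git add file
git commit -q -m first

echo "version 22" > file     # change prepared in the working tree
git add file                 # ...and added to the index
echo "version 333" > file    # further work, not yet added

worktree_vs_index=$(git diff)                      # 22 -> 333
index_vs_head=$(git diff --cached)                 # 1 -> 22
index_vs_head_explicit=$(git diff --cached HEAD)   # same, tree-ish spelled out
reversed=$(git diff -R)                            # 333 -> 22, the revert direction

echo "$worktree_vs_index"
echo "$index_vs_head"
```

Leaving out the tree-ish and writing HEAD explicitly give the same
output here, which is the "defaults to HEAD" behaviour.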


* Re: Consistent terminology: cached/staged/index
  2011-02-14 23:19       ` Jonathan Nieder
@ 2011-02-15  8:29         ` Pete Harlan
  2011-02-15  9:00           ` Jonathan Nieder
  2011-02-15 18:15         ` Piotr Krukowiecki
  2011-02-26 21:09         ` Felipe Contreras
  2 siblings, 1 reply; 65+ messages in thread
From: Pete Harlan @ 2011-02-15  8:29 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Piotr Krukowiecki, Junio C Hamano, git

On 02/14/2011 03:19 PM, Jonathan Nieder wrote:
> Hi again,
> 
> Piotr Krukowiecki wrote:
> 
>> In other places "index" is called "staging area" and act of updating the index
>> is called "staging in the index".
>>
>> I ask: why do we need the "index" term at all?
>>
>>    - instead of "index" use "staging" and "staging area"
>>    - instead of "listed in index" use "staged" or "tracked"
> 
> Unlike "staging area", the word "index" is unfamiliar and opaque.  So
> there is a sense that there is something to learn.
>
> When people talk about the staging area I tend to get confused.  I
> think there's an idea that because it sounds more concrete, there is
> less to explain --- or maybe I am just wired the wrong way.
> 
> There is a .git/index file, with a well defined file format.  And
> there is an in-core copy of the index, too.  It contains:
> 
>  - mode and blob name for paths as requested by the user with
>    "git add"
> 
>  - competing versions for paths whose proposed content is
>    uncertain during a merge
> 
>  - stat(2) information to speed up comparison with the worktree
> 
> There are some other pieces, too --- "intent-to-add" entries added
> with "git add -N", cached tree names for unmodified subtrees to
> speed up "git commit", and so on.  But the 3 pieces listed above are
> the main thing.

Thank you for that explanation.

> "Staging area" only describes the first.

...which to me means only that "staging area" isn't enough to fully
describe what Git can do.

From the user's perspective, merge conflict resolution is a separate
process from staging a commit; where does Git's usability benefit from
blending the two concepts by referring (in command syntax and
manpages) to their common internal data structure?

One of Git's charms is the simplicity of blobs, trees, commits and
tags and how those ingredients prove tremendously useful in developing
software.  And I don't think anyone can use Git well without fully
understanding what those structures are (and are not).

But I believe the rest of Git's internals are in a different category.
Regardless of how elegant the solution may be, as a user I can use Git
well without knowing _how_ Git can tell that foo.c contains staged and
unstaged changes.  Nor do I need to know how it knows that bar.c is in
conflict.  I don't need to know precisely how it implements its packed
object database to use it effectively.

Part of the issue could be that one intimately familiar with Git's
internals may find a process oriented interface irritating ("Why must
it say 'staging area' when it's just updating the index?"), while one
unfamiliar with the internals has the opposite reaction ("Why must it
make me use the internal name of the staging area?").

Someone suggested using a different top-level name for Git to allow
for completely rewriting the interface.  I expect that it's this
difference of perspective that makes that appear necessary.  I believe
that a rewrite is the wrong approach, but I believe that abstractions
like "staging area" move the user-interface a little more toward the
user and that there's value in that.

--Pete

> All that said, I am not against formulations like "content of the next
> commit" that might be more concrete from a user's point of view.
> 
> [...]
>>  --refresh::
>>  	Don't add the file(s), but only refresh their stat()
>> -	information in the index.
>> +	information in the staging area.
> 
> git add/update-index --refresh are precisely meant for _not_ changing
> the content of the next commit, so this particular change seems
> confusing.
> 
> Hoping that is clearer.  Thanks for caring.
> Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-02-15  8:29         ` Pete Harlan
@ 2011-02-15  9:00           ` Jonathan Nieder
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-15  9:00 UTC (permalink / raw)
  To: Pete Harlan; +Cc: Piotr Krukowiecki, Junio C Hamano, git

Hi Pete,

Pete Harlan wrote:

> Part of the issue could be that one intimately familiar with Git's
> internals may find a process oriented interface irritating ("Why must
> it say 'staging area' when it's just updating the index?")

No, no.  I agree there's a problem to solve here.  The current
documentation for git (e.g., the user manual) has a nice, coherent,
user-oriented narrative about trees, commits, and blobs, and meanwhile
it is hard to find a clear story about the index.

Such a story would have to describe the conflict resolution process.
When you encounter a merge conflict, how do you resolve it?  The best
I can do for now is to point to the user manual[1].

http://www.kernel.org/pub/software/scm/git/docs/user-manual.html#conflict-resolution

I even think it is okay to say "The index is a sort of staging area
for your next commit".  Because that is true.  But it is not the full
story, so if one wants to give the index a new name --- which is a
costly thing to do, anyway --- then I do not think "the staging area"
works.

I feel bad to only be presenting complications instead of an alternate
solution.  I do consider workflow oriented explanations very useful.
I've been giving technical explanations in this thread as background
for future storytelling, in the hope that someone more talented than I
am can digest it into a good narrative.

Jonathan

[1] Maybe the process is overdesigned.  After all, what would we lose
by saying

 - an unmerged path just gets an "unmerged" flag set, meaning that the
   path is not ready for commit yet
 - to get the copy from the common ancestor, use
	git show $(git merge-base HEAD MERGE_HEAD):path/to/file
 - to get the copy from HEAD, use
	git show HEAD:path/to/file
 - likewise to get the copy from MERGE_HEAD

And while I can give answers about why that is a bad interface
(recomputing the merge base is a waste of time; in a recursive merge
the merge base is not a real commit; if there were renames, the copy
from HEAD could be HEAD:other/path and it is hard to find what
other/path is), are those answers enough to justify learning this new
trick?

So we need a better story.
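
For comparison, here is a rough sketch of both ways of reading the three
versions during a conflict: by commit, as in the list above, and by
stage number from the index (the ":<n>:<path>" syntax). It assumes a
throwaway repository with a simple two-branch conflict, no renames and a
single merge base; all names in it are made up.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Example User"

echo base > file
git add file
git commit -q -m base

git checkout -q -b side
echo "from side" > file
git commit -q -a -m side

git checkout -q -                  # back to the original branch
echo "from the original branch" > file
git commit -q -a -m ours

git merge side >/dev/null 2>&1 || true   # conflicts, exits non-zero

# Looking the versions up by commit
base_by_commit=$(git show "$(git merge-base HEAD MERGE_HEAD)":file)
ours_by_commit=$(git show HEAD:file)
theirs_by_commit=$(git show MERGE_HEAD:file)

# Reading what the index recorded: stages 1, 2 and 3 of the path
base_by_stage=$(git show :1:file)
ours_by_stage=$(git show :2:file)
theirs_by_stage=$(git show :3:file)

echo "$base_by_stage / $ours_by_stage / $theirs_by_stage"
```

In this simple case the two approaches agree. The cases where they would
differ (renames, recursive merges) are the ones listed above.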


* Re: Consistent terminology: cached/staged/index
  2011-02-14 23:19       ` Jonathan Nieder
  2011-02-15  8:29         ` Pete Harlan
@ 2011-02-15 18:15         ` Piotr Krukowiecki
  2011-02-15 18:38           ` Jonathan Nieder
  2011-02-26 21:09         ` Felipe Contreras
  2 siblings, 1 reply; 65+ messages in thread
From: Piotr Krukowiecki @ 2011-02-15 18:15 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Junio C Hamano, git

On Tue, Feb 15, 2011 at 12:19 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> Hi again,
>
> Piotr Krukowiecki wrote:
>>  --refresh::
>>       Don't add the file(s), but only refresh their stat()
>> -     information in the index.
>> +     information in the staging area.
>
> git add/update-index --refresh are precisely meant for _not_ changing
> the content of the next commit, so this particular change seems
> confusing.

If there is no staging - no commit, then you're right. But then you don't
have to mention index at all:

  --refresh::
       Don't add the file(s), but only refresh their stat()
       information.

I completely agree with Pete Harlan - for a normal user git internals are
not relevant - index is just part of git. How or where the stat information is
refreshed does not matter.

In the same way you don't write that it's done by function refresh_index().


> Hoping that is clearer.  Thanks for caring.
> Jonathan

Thanks for the explanation.


-- 
Piotrek


* Re: Consistent terminology: cached/staged/index
  2011-02-15 18:15         ` Piotr Krukowiecki
@ 2011-02-15 18:38           ` Jonathan Nieder
  0 siblings, 0 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-15 18:38 UTC (permalink / raw)
  To: Piotr Krukowiecki; +Cc: Junio C Hamano, git

Piotr Krukowiecki wrote:
>> Piotr Krukowiecki wrote:

>>>  --refresh::
>>>       Don't add the file(s), but only refresh their stat()
>>> -     information in the index.
>>> +     information in the staging area.
[...]
> If there is no staging - no commit, then you're right. But then you don't
> have to mention index at all:
>
>   --refresh::
>        Don't add the file(s), but only refresh their stat()
>        information.

Yes, that sounds like an improvement.  Though I'd suggest something
like:

  --refresh::
	Don't add the files' content and mode, but refresh their stat(2)
	information if it is out of date.  For example, you'd want to
	do this after restoring a repository from backup, to link up
	the stat index details with the proper files.

The exact wording could use tweaking, but hopefully the idea is clear
(to explain what the option is actually used for).
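
As a rough illustration of that use, the sketch below changes only a
file's timestamp and then refreshes. It assumes a throwaway repository
from mktemp; the file name and the date passed to touch are arbitrary.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Example User"

echo content > file
git add file
git commit -q -m first

# Change only the timestamp, much as a restore from backup might
touch -t 200001010000 file

# Plumbing such as diff-files goes by the recorded stat data, so it is
# expected to list the file here although the content is unchanged
stale=$(git diff-files --name-only)

entry_before=$(git ls-files --stage file)
git update-index --refresh          # re-read stat(2) data only
entry_after=$(git ls-files --stage file)

fresh=$(git diff-files --name-only)
echo "before refresh: '$stale'  after refresh: '$fresh'"
```

The mode and blob name of the entry are the same before and after; only
the stat information is brought up to date.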

> index is just part of git. How or where the stat information is
> refreshed does not matter.

I agree with that.  That this is (1) specific to that index, so the
operation needs to be repeated if you use GIT_INDEX_FILE to work with
a second index and (2) has as its only purpose speeding up operations
that compare the index to the worktree are relevant, though.
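
Point (1) can be sketched with a second index file. This is only an
illustration under assumptions: a throwaway repository, and a made-up
path ".git/alt-index" for the second index.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.email you@example.com
git config user.name "Example User"

echo content > file
git add file
git commit -q -m first

# A second index populated from HEAD; its entries carry no stat data yet
alt="$repo/.git/alt-index"
GIT_INDEX_FILE="$alt" git read-tree HEAD
alt_before=$(GIT_INDEX_FILE="$alt" git diff-files --name-only)

# Refreshing the default index leaves the second one as it was
git update-index --refresh
alt_still=$(GIT_INDEX_FILE="$alt" git diff-files --name-only)

# The operation has to be repeated for the second index
GIT_INDEX_FILE="$alt" git update-index --refresh
alt_after=$(GIT_INDEX_FILE="$alt" git diff-files --name-only)

echo "before: '$alt_before'  still: '$alt_still'  after: '$alt_after'"
```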

Anyway, I don't want to argue.  Many of the places pointed out in
the manual could use help.  It could even involve inserting the
phrase "a staging area".

Hopefully I have made clear why excising the word "index" from git
vocabulary (like the word "current directory cache" was eventually
eliminated over time in the past) does not seem like a good idea when
we don't even have a good alternative for it.  As the original post
mentioned, using three terms in documentation for fundamentally the
same thing is going to get confusing after a while.  Why not just use
one ("the index")?

Sorry for the ramble.
Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-02-14  3:09     ` Pete Harlan
@ 2011-02-16 23:11       ` Drew Northup
  2011-02-26 20:36         ` Felipe Contreras
  2011-02-27 21:16       ` Aghiles
  1 sibling, 1 reply; 65+ messages in thread
From: Drew Northup @ 2011-02-16 23:11 UTC (permalink / raw)
  To: Pete Harlan; +Cc: Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git


On Sun, 2011-02-13 at 19:09 -0800, Pete Harlan wrote:
> On 02/13/2011 02:58 PM, Junio C Hamano wrote:
> >> --staged
> >> ~~~~~~~~
> >> diff takes --staged, but that is only to support some people's habits.
> > The term "stage" comes from "staging area", a term people used to explain
> > the concept of the index by saying "The index holds set of contents to be
> > made into the next commit; it is _like_ the staging area".
> > 
> > My feeling is that "to stage" is primarily used, outside "git" circle, as
> > a logistics term.  If you find it easier to visualize the concept of the
> > index with "staging area" ("an area where troops and equipment in transit
> > are assembled before a military operation"), you may find it easier to say
> > "stage this path ('git add path')", instead of "adding to the set of
> > contents...".
> 
> FWIW, when teaching Git I have found that users immediately understand
> "staging area", while "index" and "cache" confuse them.
> 
> "Index" means to them a numerical index into a data structure.
> "Cache" is a local copy of something that exists remotely.  Neither
> word describes the concept correctly from a user's perspective.

According to the dictionary (actually, more than one) "cache" is a
hidden storage space. I'm pretty sure that's the sense most global and
therefore most appropriate to thinking about Git. (It certainly
describes correctly what web browser caches and on-CPU caches are doing.)
One would only think the definition you gave applied if they didn't know
that squirrels "cache" nuts. I don't think that the problem is the
idiom.

> I learned long ago to type "index" and "cached", but when talking (and
> thinking) about Git I find "the staging area" gets the point across
> very clearly and moves Git from interesting techie-tool to
> world-dominating SCM territory.  I'm surprised that that experience
> isn't universal.

Perhaps that helps you associate it with other SCM/VCS software, but it
didn't help me. When I realized that the "index" is called that BECAUSE
IT IS AN INDEX (of content/data states for a pending commit operation)
the sky cleared and the sun came out.

In all reality the closest thing Git has to an actual staging area is
all of the objects in .git/objects only recorded by the index itself.
Git-stored objects not compressed into pack files could technically be
described as "cached" using the standard definition--they aren't visible
in the working directory. Unfortunately this probably just muddies the
water for all too many users.
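
The claim that added content already sits in the object store before
any commit exists can be checked with a small sketch like the one below.
It assumes a fresh throwaway repository; the file name "draft" and its
content are invented for the example.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q

echo "not committed yet" > draft
git add draft

# The index entry names a blob: mode, blob name, stage, path
blob=$(git ls-files --stage draft | cut -d' ' -f2)

# That blob can already be read back, although no commit has been made
type=$(git cat-file -t "$blob")
content=$(git cat-file -p "$blob")

echo "$type: $content"
```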

So, in summary--the index is real, objects "cached" pending
commit/cleanup/packing are real; any "staging area" is a rhetorical
combination of the two. Given that rhetorical device may not work in all
languages (as Junio mentioned earlier) I don't recommend that we rely on
it.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Consistent terminology: cached/staged/index
  2011-02-16 23:11       ` Drew Northup
@ 2011-02-26 20:36         ` Felipe Contreras
  2011-02-27 15:30           ` Drew Northup
  0 siblings, 1 reply; 65+ messages in thread
From: Felipe Contreras @ 2011-02-26 20:36 UTC (permalink / raw)
  To: Drew Northup
  Cc: Pete Harlan, Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

On Thu, Feb 17, 2011 at 1:11 AM, Drew Northup <drew.northup@maine.edu> wrote:
>
> On Sun, 2011-02-13 at 19:09 -0800, Pete Harlan wrote:
>> On 02/13/2011 02:58 PM, Junio C Hamano wrote:
>> >> --staged
>> >> ~~~~~~~~
>> >> diff takes --staged, but that is only to support some people's habits.
>> > The term "stage" comes from "staging area", a term people used to explain
>> > the concept of the index by saying "The index holds set of contents to be
>> > made into the next commit; it is _like_ the staging area".
>> >
>> > My feeling is that "to stage" is primarily used, outside "git" circle, as
>> > a logistics term.  If you find it easier to visualize the concept of the
>> > index with "staging area" ("an area where troops and equipment in transit
>> > are assembled before a military operation"), you may find it easier to say
>> > "stage this path ('git add path')", instead of "adding to the set of
>> > contents...".
>>
>> FWIW, when teaching Git I have found that users immediately understand
>> "staging area", while "index" and "cache" confuse them.
>>
>> "Index" means to them a numerical index into a data structure.
>> "Cache" is a local copy of something that exists remotely.  Neither
>> word describes the concept correctly from a user's perspective.
>
> According to the dictionary (actually, more than one) "cache" is a
> hidden storage space. I'm pretty sure that's the sense most global and
> therefore most appropriate to thinking about Git. (It certainly
> describes correctly what web browser caches and on-CPU caches are doing.)
> One would only think the definition you gave applied if they didn't know
> that squirrels "cache" nuts. I don't think that the problem is the
> idiom.

Not really. If a squirrel "caches" nuts, it means a squirrel is
putting them in a hidden place to save them for future use. So, in the
future, if said squirrel wants a nut, it doesn't have to look for it
in the trees, just go to the cache. So the cache makes it easier to
access whatever you want.

IOW; if you don't cache something, you would have more trouble getting
it, but you still can.

That's not what Git is doing. Git is not putting changes in a place so
they can be more easily accessed in the future. It is using a temporary
device that allows the commit to be built through an extended period
of time. It's not a cache.

>> I learned long ago to type "index" and "cached", but when talking (and
>> thinking) about Git I find "the staging area" gets the point across
>> very clearly and moves Git from interesting techie-tool to
>> world-dominating SCM territory.  I'm surprised that that experience
>> isn't universal.
>
> Perhaps that helps you associate it with other SCM/VCS software, but it
> didn't help me. When I realized that the "index" is called that BECAUSE
> IT IS AN INDEX (of content/data states for a pending commit operation)
> the sky cleared and the sun came out.

That's not an index. An index is a guide of pointers to something
else. It allows you to find whatever you are looking for by looking in
a small table of pointers instead of looking through all the samples.

IOW; if you don't index something, you would have more trouble finding
it, but you still can.

That's not what Git is doing.

> In all reality the closest thing Git has to an actual staging area is
> all of the objects in .git/objects only recorded by the index itself.
> Git-stored objects not compressed into pack files could technically be
> described as "cached" using the standard definition--they aren't visible
> in the working directory. Unfortunately this probably just muddies the
> water for all too many users.

That's irrelevant. You can implement the same functionality in many
other ways. How it is implemented doesn't matter, what matters is what
the user experiences.

> So, in summary--the index is real, objects "cached" pending
> commit/cleanup/packing are real; any "staging area" is a rhetorical
> combination of the two. Given that rhetorical device may not work in all
> languages (as Junio mentioned earlier) I don't recommend that we rely on
> it.

Branches and tags are "rhetorical" devices as well. But behind the scenes
they are just refs. Shall we disregard 'branch' and 'tag'?

No. What Git does behind the scenes is irrelevant to the user. What
matters is what the device does, not how it is implemented; the
implementation might change. "Stage" is the perfect word; both a verb
and a noun that express a temporary space where things are prepared
for their final form.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-02-14 23:19       ` Jonathan Nieder
  2011-02-15  8:29         ` Pete Harlan
  2011-02-15 18:15         ` Piotr Krukowiecki
@ 2011-02-26 21:09         ` Felipe Contreras
  2011-02-26 21:51           ` Jonathan Nieder
                             ` (2 more replies)
  2 siblings, 3 replies; 65+ messages in thread
From: Felipe Contreras @ 2011-02-26 21:09 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Piotr Krukowiecki, Junio C Hamano, git

On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> When people talk about the staging area I tend to get confused.  I
> think there's an idea that because it sounds more concrete, there is
> less to explain --- or maybe I am just wired the wrong way.

I don't like the phrase "staging area". A "stage" already has an area.
You put things on the stage. Sometimes there are multiple stages.

> There is a .git/index file, with a well defined file format.  And
> there is an in-core copy of the index, too.  It contains:
>
>  - mode and blob name for paths as requested by the user with
>   "git add"

A commit stage.

>  - competing versions for paths whose proposed content is
>   uncertain during a merge

Multiple commit stages.

>  - stat(2) information to speed up comparison with the worktree

If only a subset of the files are there, it's an 'index', if not, then
I'd say it's a 'registry'. Anyway, it's something the user shouldn't
care about.

Cheers.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-02-26 21:09         ` Felipe Contreras
@ 2011-02-26 21:51           ` Jonathan Nieder
  2011-02-27  0:01             ` Miles Bader
  2011-02-27  0:16             ` Felipe Contreras
  2011-02-27  8:43           ` Jeff King
  2011-02-27 18:46           ` Phil Hord
  2 siblings, 2 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-26 21:51 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Piotr Krukowiecki, Junio C Hamano, git

Hi Felipe et al,

Felipe Contreras wrote:
> On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:

>>  - mode and blob name for paths as requested by the user with
>>   "git add"
>
> A commit stage.
>
>>  - competing versions for paths whose proposed content is
>>   uncertain during a merge
>
> Multiple commit stages.
>
>>  - stat(2) information to speed up comparison with the worktree
>
> If only a subset of the files are there, it's an 'index', if not, then
> I'd say it's a 'registry'.

These terms you suggest aren't the established ones (as I'm sure you
know).  Just as with everyday language, there is some resistance to
moving to new terms that have not been established for a while.  In
everyday language, many terms gained popularity by

 - appearing in some document that people read for another reason
 - describing the notion they are meant to describe clearly (or
   having some other feature that makes them likeable)

This is how "staging area" has been gaining popularity, I think ---
some (out-of-tree) documentation that is good for other reasons uses
it, and it really does seem to be a clearer term than "index" for
"place where the next commit is being prepared".  Unfortunately, I do
not think it is a clearer term than "index" for "the git index, which
contains stat() information and pointers to blobs that either belong
in the next commit or are participating in a merge conflict".  So it
does not seem to justify rewriting everything to use it.

Which suggests one way forward --- if you believe you have terms that
do describe those concepts clearly, one way to promote them is to
write some good, clear (out-of-tree, to begin with) documentation
using them.  Presumably this documentation would also mention that
other people use other terms to avoid confusing the reader.

Hope that helps,
Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-02-26 21:51           ` Jonathan Nieder
@ 2011-02-27  0:01             ` Miles Bader
  2011-02-27  0:16             ` Felipe Contreras
  1 sibling, 0 replies; 65+ messages in thread
From: Miles Bader @ 2011-02-27  0:01 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Felipe Contreras, Piotr Krukowiecki, Junio C Hamano, git

Jonathan Nieder <jrnieder@gmail.com> writes:
> This is how "staging area" has been gaining popularity, I think ---
> some (out-of-tree) documentation that is good for other reasons uses
> it, and it really does seem to be a clearer term than "index" for
> "place where the next commit is being prepared".

Also "magit" uses the label "Staging area:" for the list of files to be
committed -- and the key-binding to add a file to that list is "s"...

-Miles

-- 
Christian, n. One who follows the teachings of Christ so long as they are not
inconsistent with a life of sin.


* Re: Consistent terminology: cached/staged/index
  2011-02-26 21:51           ` Jonathan Nieder
  2011-02-27  0:01             ` Miles Bader
@ 2011-02-27  0:16             ` Felipe Contreras
  2011-02-27  0:46               ` Jonathan Nieder
  2011-02-27  8:15               ` Junio C Hamano
  1 sibling, 2 replies; 65+ messages in thread
From: Felipe Contreras @ 2011-02-27  0:16 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: Piotr Krukowiecki, Junio C Hamano, git

On Sat, Feb 26, 2011 at 11:51 PM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> These terms you suggest aren't the established ones (as I'm sure you
> know).  Just as with everyday language, there is some resistance to
> moving to new terms that have not been established for a while.  In
> everyday language, many terms gained popularity by
>
>  - appearing in some document that people read for another reason
>  - describing the notion they are meant to describe clearly (or
>   having some other feature that makes them likeable)

There's always resistance, but 1.8 is supposed to contain stuff as "if
git was written from scratch". I think this makes sense as one of
them.

> This is how "staging area" has been gaining popularity, I think ---
> some (out-of-tree) documentation that is good for other reasons uses
> it, and it really does seem to be a clearer term than "index" for
> "place where the next commit is being prepared".  Unfortunately, I do
> not think it is a clearer term than "index" for "the git index, which
> contains stat() information and pointers to blobs that either belong
> in the next commit or are participating in a merge conflict".  So it
> does not seem to justify rewriting everything to use it.

Why should the users care about the stat() information? Or how the
merge conflicts are being tracked? That's plumbing, not porcelain.

-- 
Felipe Contreras

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27  0:16             ` Felipe Contreras
@ 2011-02-27  0:46               ` Jonathan Nieder
  2011-02-27  8:15               ` Junio C Hamano
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-02-27  0:46 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Piotr Krukowiecki, Junio C Hamano, git

Hi,

Felipe Contreras wrote:
[out of order for convenience]

> Why should the users care about the stat() information? Or how the
> merge conflicts are being tracked?

The second question is very easy to answer (depending on what "how"
means, of course).  Because people integrating changes from multiple
places need to be able to resolve a conflicted merge.

> That's plumbing, not porcelain.

I don't disagree.  The analogy is almost perfect.

And the thing is, in the real world, people know about plumbing.  They
don't care about the details, but they know there are these things
called pipes, and that water tends to flow downward, and that if one
of them freezes, it will burst.  This knowledge is useful.

Likewise, it is useful to know:

 - After you use "cp -a" to copy a repository, the first operation
   you perform is going to be slower.  The cached stat() information
   is stale.

 - Until you run "git add", there is only one copy of your data, in
   the worktree.  After you run "git add", there are two copies.
   Once you run "git commit", that second copy will last at least
   as long as your commit does.

   So there is some chance of recovery from fat-finger mistakes,
   even before a commit.

 - During a merge, you can mark your progress by collapsing index
   entries with 'git add'.  "git diff" will show the state of the
   merge.  You can read the competing versions of a file with
   "git show :2:path/to/file" and "git show :3:path/to/file".

 - Index-only operations tend to be faster, since

    (1) the cached blobs are not changing, so we can save time
        stat(2)-ing and read(2)-ing files
    (2) blobs are compressed: less I/O.  Longstanding blobs are
        in pack files: good caching and I/O patterns.

   So you can speed up your slow "git grep" by using
   "git grep --cached".

 - When scripting, you can use a temporary index file to avoid
   affecting the remembered worktree state.
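
A few of the points above can be tried in a throwaway repository. The
following is only a sketch: the file names, the identity settings and
the scratch index path ".git/scratch-index" are made up for the
example and are not taken from the message. It shows recovering the
registered copy after clobbering the work tree file, an index-only
"git grep --cached", and a temporary index selected through the
GIT_INDEX_FILE environment variable.

```shell
#!/bin/sh
# Sketch only: everything happens in a fresh temporary repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .

echo 'first draft' >notes.txt
git add notes.txt                   # a second copy now lives in the object store

echo 'oops, clobbered' >notes.txt   # fat-finger the work tree copy
recovered=$(git show :notes.txt)    # ":path" names the copy registered in the index

# Index-only search: reads the registered blob, not the work tree file.
hits=$(git grep --cached -l 'first draft')

# A temporary index (hypothetical path) leaves the real .git/index alone.
# The file must not exist yet; git creates it on first write.
tmp_index="$repo/.git/scratch-index"
echo 'scratch' >scratch.txt
GIT_INDEX_FILE="$tmp_index" git add scratch.txt

in_real=$(git ls-files)                              # entries in the real index
in_tmp=$(GIT_INDEX_FILE="$tmp_index" git ls-files)   # entries in the scratch index

echo "recovered: $recovered"
echo "grep --cached hit: $hits"
echo "real index: $in_real / scratch index: $in_tmp"
```

The scratch index is pointed at a path that does not exist yet rather
than at an empty file, on the assumption that git may refuse to read a
zero-length index file.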

But so what?  I have nothing against clearer terms.  I am just saying
that (1) we should be explaining these things somewhere and (2) a
global s/index/only one of the things the index does/ is a bad idea,
because it would make the documentation *wrong*.

> There's always resistance, but 1.8 is supposed to contain stuff as "if
> git was written from scratch".

I thought 1.8 was supposed to provide an opportunity to correct some
long-known mistakes that we had been holding back on for backward
compatibility reasons.  That doesn't mean we should forget the cost of
change.

Thanks for your work, and hope that helps.
Jonathan

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27  0:16             ` Felipe Contreras
  2011-02-27  0:46               ` Jonathan Nieder
@ 2011-02-27  8:15               ` Junio C Hamano
  1 sibling, 0 replies; 65+ messages in thread
From: Junio C Hamano @ 2011-02-27  8:15 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Jonathan Nieder, Piotr Krukowiecki, git

Felipe Contreras <felipe.contreras@gmail.com> writes:

> There's always resistance, but 1.8 is supposed to contain stuff as "if
> git was written from scratch".

Yes, the 1.8.0 is indeed an opportunity to rethink, based on the wisdom we
have gained over the years since the current git was written.

If there has already been a clear consensus that we would have done
something differently if we knew better, it is an opportunity to first
discuss if there is a way to correct these earlier mistakes in a way that
does not have to introduce incompatibility, and if it is not feasible,
discuss a plan to ease incompatible changes in without hurting existing
users too much.

A new discussion or proposal is fine, but you should be able to see that
an effort to start building consensus from now is very much outside the
scope of the discussion for the 1.8.0 we have been having.

Besides, taking what other people said already in the thread also into
account, it looks to me that what you are advocating is too premature to
be called a consensus yet.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-26 21:09         ` Felipe Contreras
  2011-02-26 21:51           ` Jonathan Nieder
@ 2011-02-27  8:43           ` Jeff King
  2011-02-27  9:21             ` Miles Bader
  2011-02-27 15:34             ` Drew Northup
  2011-02-27 18:46           ` Phil Hord
  2 siblings, 2 replies; 65+ messages in thread
From: Jeff King @ 2011-02-27  8:43 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git

On Sat, Feb 26, 2011 at 11:09:14PM +0200, Felipe Contreras wrote:

> On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> > When people talk about the staging area I tend to get confused.  I
> > think there's an idea that because it sounds more concrete, there is
> > less to explain --- or maybe I am just wired the wrong way.
> 
> I don't like the phrase "staging area". A "stage" already has an area.
> You put things on the stage. Sometimes there are multiple stages.

As a native English speaker, this makes no sense to me. A stage as a
noun is either:

  1. a raised platform where you give performances

  2. a phase that some process goes through (e.g., "the early stages of
     Alzheimer's disease")

Whereas the term "staging area" is a stopping point on a journey for
collecting and organizing items. I couldn't find a definite etymology
online, but it seems to be military in origin (e.g., you would send all
your tanks to a staging area, then once assembled and organized, begin
your attack). You can't just call it "staging", which is not a noun, and
the term "stage" is not a synonym. "Staging area" has a very particular
meaning.

So the term "staging area" makes perfect sense to me; it is where we
collect changes to make a commit. I am willing to accept that does not
to others (native English speakers or no), and that we may need to come
up with a better term. But I think just calling it "the stage" is even
worse; it loses the concept that it is a place for collecting and
organizing.

-Peff

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27  8:43           ` Jeff King
@ 2011-02-27  9:21             ` Miles Bader
  2011-02-27 22:28               ` Jon Seymour
  2011-02-27 15:34             ` Drew Northup
  1 sibling, 1 reply; 65+ messages in thread
From: Miles Bader @ 2011-02-27  9:21 UTC (permalink / raw)
  To: Jeff King
  Cc: Felipe Contreras, Jonathan Nieder, Piotr Krukowiecki,
	Junio C Hamano, git

Jeff King <peff@peff.net> writes:
> So the term "staging area" makes perfect sense to me; it is where we
> collect changes to make a commit. I am willing to accept that does not
> to others (native English speakers or no), and that we may need to come
> up with a better term. But I think just calling it "the stage" is even
> worse; it loses the concept that it is a place for collecting and
> organizing.

Agreed.

"Staging area" is a good noun (phrase) for this.  "Stage" is a good verb
(for "move into the staging area"), but isn't intuitive as a noun.

-miles

-- 
In New York, most people don't have cars, so if you want to kill a person, you
have to take the subway to their house.  And sometimes on the way, the train
is delayed and you get impatient, so you have to kill someone on the subway.
  [George Carlin]

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-26 20:36         ` Felipe Contreras
@ 2011-02-27 15:30           ` Drew Northup
  0 siblings, 0 replies; 65+ messages in thread
From: Drew Northup @ 2011-02-27 15:30 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Pete Harlan, Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git


On Sat, 2011-02-26 at 22:36 +0200, Felipe Contreras wrote:
> On Thu, Feb 17, 2011 at 1:11 AM, Drew Northup <drew.northup@maine.edu> wrote:
> >
> > On Sun, 2011-02-13 at 19:09 -0800, Pete Harlan wrote:
> >> On 02/13/2011 02:58 PM, Junio C Hamano wrote:
> >> >> --staged
> >> >> ~~~~~~~~
> >> >> diff takes --staged, but that is only to support some people's habits.
> >> > The term "stage" comes from "staging area", a term people used to explain
> >> > the concept of the index by saying "The index holds set of contents to be
> >> > made into the next commit; it is _like_ the staging area".
> >> >
> >> > My feeling is that "to stage" is primarily used, outside "git" circle, as
> >> > a logistics term.  If you find it easier to visualize the concept of the
> >> > index with "staging area" ("an area where troops and equipment in transit
> >> > are assembled before a military operation"), you may find it easier to say
> >> > "stage this path ('git add path')", instead of "adding to the set of
> >> > contents...".
> >>
> >> FWIW, when teaching Git I have found that users immediately understand
> >> "staging area", while "index" and "cache" confuse them.
> >>
> >> "Index" means to them a numerical index into a data structure.
> >> "Cache" is a local copy of something that exists remotely.  Neither
> >> word describes the concept correctly from a user's perspective.
> >
> > According to the dictionary (actually, more than one) "cache" is a
> > hidden storage space. I'm pretty sure that's the sense most global and
> > therefore most appropriate to thinking about Git. (It certainly
> > describes correctly what web browser cache and on-CPU cache is doing.)
> > One would only think the definition you gave applied if they didn't know
> > that squirrels "cache" nuts. I don't think that the problem is the
> > idiom.
> 
> Not really. If a squirrel "caches" nuts, it means a squirrel is
> putting them in a hidden place to save them for future use. So, in the
> future, if said squirrel wants a nut, it doesn't have to look for it
> in the trees, just go to the cache. So the cache makes it easier to
> access whatever you want.
> 
> IOW; if you don't cache something, you would have more trouble getting
> it, but you still can.
> 
> That's not what Git is doing. Git is not putting changes in a place so
> they can be more easily accessed in the future. It is using a temporary
> device that allows the commit to be built through an extended period
> of time. It's not a cache.

As I noted earlier, "cache" classically has nothing whatsoever to do
with temporality, it is a descriptor of visibility. Any notion of
temporality or intentionality is imposed by the reader. THAT'S THE
PROBLEM. 

> >> I learned long ago to type "index" and "cached", but when talking (and
> >> thinking) about Git I find "the staging area" gets the point across
> >> very clearly and moves Git from interesting techie-tool to
> >> world-dominating SCM territory.  I'm surprised that that experience
> >> isn't universal.
> >
> > Perhaps that helps you associate it with other SCM/VCS software, but it
> > didn't help me. When I realized that the "index" is called that BECAUSE
> > IT IS AN INDEX (of content/data states for a pending commit operation)
> > the sky cleared and the sun came out.
> 
> That's not an index. An index is a guide of pointers to something
> else. It allows you to find whatever you are looking for by looking in
> small table of pointers instead of looking through all the samples.
> 
> IOW; if you don't index something, you would have more trouble finding
> it, but you still can.
> 
> That's not what Git is doing.

Index: "That which guides, points out, informs, or directs" [1913
Edition Webster's Dictionary--new one says something pretty similar if
not the same].
As far as I can tell Git is using the "Index" to do just that. Again, I
am discarding all notions of connotation here and focusing solely on the
denotation of the word. Besides, it is still possible to build a commit
with git without the "Index"; it is a real royal pain--and not in the
least advisable for day-to-day use.
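
For the curious, one way this might look with plumbing commands is
sketched below. This is an illustration under assumptions, not a
recommended workflow: the file name "hello.txt", the commit message and
the identity settings are invented for the example. A blob is written
with "git hash-object -w", a tree is assembled from ls-tree formatted
text with "git mktree", and "git commit-tree" wraps that tree in a
commit, all without touching the index.

```shell
#!/bin/sh
# Sketch only: build a commit object by hand in a throwaway repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# 1. Store the content as a blob object.
blob=$(echo 'hello' | git hash-object -w --stdin)

# 2. Describe a one-entry tree in ls-tree format: mode, type, id, TAB, name.
tree=$(printf '100644 blob %s\thello.txt\n' "$blob" | git mktree)

# 3. Wrap the tree in a commit; the message is read from stdin.
commit=$(echo 'built without the index' | git commit-tree "$tree")

listed=$(git ls-tree --name-only "$commit")   # what the commit contains
kind=$(git cat-file -t "$commit")             # object type of the result
staged=$(git ls-files)                        # the index was never written

echo "commit $commit contains: $listed"
```

Note that the resulting commit is not attached to any branch; something
like "git update-ref" would still be needed to make it reachable.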

> > In all reality the closest thing Git has to an actual staging area is
> > all of the objects in .git/objects only recorded by the index itself.
> > Git-stored objects not compressed into pack files could technically be
> > described as "cached" using the standard definition--they aren't visible
> > in the working directory. Unfortunately this probably just muddies the
> > water for all too many users.
> 
> That's irrelevant. You can implement the same functionality in many
> other ways. How it is implemented doesn't matter, what matters is what
> the user experiences.

Please re-read what I said, more slowly and without notion of previous
disagreement if you can muster it. We both agree that the notion of
caching here is superfluous to most users. Alas, I am not one to say
that what any one user experiences should dictate to us who all users
SHOULD experience Git. It is fairly clear to me that isn't what is
currently happening and any efforts to force the matter thus far haven't
helped matters much if at all.

> > So, in summary--the index is real, objects "cached" pending
> > commit/cleanup/packing are real; any "staging area" is a rhetorical
> > combination of the two. Given that rhetorical device may not work in all
> > languages (as Junio mentioned earlier) I don't recommend that we rely on
> > it.
> 
> Branches and tags are "rhetorical" devices as well. But behind scenes
> they are just refs. Shall we disregard 'branch' and 'tag'?
> 
> No. What Git does behind scenes is irrelevant to the user. What
> matters is what the device does, not how it is implemented; the
> implementation might change. "Stage" is the perfect word; both verb
> and a noun that express a temporary space where things are prepared
> for their final form.

Yes they (branches and tags) are. They also have a "physical"
manifestation. A "staging area" does not. This obviously is of little
importance to you (as a user--I know you do more than that), but would
matter a great deal to somebody like myself currently mulling over how
to craft a contribution to this project.

Alas, as Junio pointed out earlier, "stage" is a metaphor of limited
utility (it also means a large number of things in English alone--I tend
to think of theaters and not states when I read it). In fact, it opens
up more questions: "Staged where? In a cache. Where is the cache? It
doesn't really exist, but it is a combination of the Index and
under-referenced objects in the object store acting as a cache. Why? How
does it do that?....." We are therefore where we started. Users are just
as confused as they were before, and we're looking for a good watering
hole to cluster at and come up with a better way to explain it without
getting into the gritty details.

Details sometimes matter, sometimes they don't, and much more often the
reality is halfway between the two. Currently I think that Git is in
that middle state. Discarding outright the notion of the Index and of
caching doesn't make sense (as, at some level, that's what's happening),
yet staging isn't perfect either. That's my point.

(Please also see my pending reply to Jeff's missive from 8:43 UTC
today.)

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27  8:43           ` Jeff King
  2011-02-27  9:21             ` Miles Bader
@ 2011-02-27 15:34             ` Drew Northup
  2011-02-28 23:03               ` Jeff King
       [not found]               ` <878466.93199.1298934204331.JavaMail.trustmail@mail1.terreactive.ch>
  1 sibling, 2 replies; 65+ messages in thread
From: Drew Northup @ 2011-02-27 15:34 UTC (permalink / raw)
  To: Jeff King
  Cc: Felipe Contreras, Jonathan Nieder, Piotr Krukowiecki,
	Junio C Hamano, git


On Sun, 2011-02-27 at 03:43 -0500, Jeff King wrote:
> On Sat, Feb 26, 2011 at 11:09:14PM +0200, Felipe Contreras wrote:
> 
> > On Tue, Feb 15, 2011 at 1:19 AM, Jonathan Nieder <jrnieder@gmail.com> wrote:
> > > When people talk about the staging area I tend to get confused.  I
> > > think there's an idea that because it sounds more concrete, there is
> > > less to explain --- or maybe I am just wired the wrong way.
> > 
> > I don't like the phrase "staging area". A "stage" already has an area.
> > You put things on the stage. Sometimes there are multiple stages.
> 
> As a native English speaker, this makes no sense to me. A stage as a
> noun is either:
> 
>   1. a raised platform where you give performances
> 
>   2. a phase that some process goes through (e.g., "the early stages of
>      Alzheimer's disease")

I definitely appreciate this notion. The equivalence of "stage ===
status of something, given place and or time" is itself metaphorical in
nature. I don't know how translatable the idiom is.

> Whereas the term "staging area" is a stopping point on a journey for
> collecting and organizing items. I couldn't find a definite etymology
> online, but it seems to be military in origin (e.g., you would send all
> your tanks to a staging area, then once assembled and organized, begin
> your attack). You can't just call it "staging", which is not a noun, and
> the term "stage" is not a synonym. "Staging area" has a very particular
> meaning.

I would have to check, but I believe you would find it linked to
metaphorical language about the "stage on which a battle is
fought" (battleground) and the fact that forces are sometimes organized
into formation--as they would appear upon a stage--in such an area
(before a parade or a march, for instance).

> So the term "staging area" makes perfect sense to me; it is where we
> collect changes to make a commit. I am willing to accept that does not
> to others (native English speakers or no), and that we may need to come
> up with a better term. But I think just calling it "the stage" is even
> worse; it loses the concept that it is a place for collecting and
> organizing.
> 
> -Peff

The concept of a "staging area" is definitely of limited use for many of
us attempting to learn how git works. The very fact that the object
cache and the Index (or multiple, as is useful at times) are distinct
elements is useful and should be mentioned somewhere. Alas, creating in
the user's mind that there is a distinct unified "staging area" acts
against this dissemination of knowledge. It definitely didn't help me.

If we use "staging area made up of the object store and information kept
in the Index" then we tie a knot on everything, make it clear that it
may be more complex than that--and you don't have to care, and we do not
foreclose on the possibility of more complete explanation later. That
does not bother me. We do however need to recognize that "staging area"
is an idiom of limited portability and deal with that appropriately. 

A particular Three Stooges episode comes to mind here for me. The Three,
in one scene, are getting dressed up to go to an estate (a relative of
one of them has died) to collect an inheritance. They are jumping up and
down yelling "We're gonna get rich!" in the English original. However,
the only timing-appropriate thing the translator could
think of when producing the Spanish voice-over was "Vamos a
vestirse" (we're going to get dressed). Obviously this made them seem
like more utter fools than they were, but equally obviously the meaning
of the idiom "gonna get rich" was lost on the translator. This is what
has been replaying in my mind since Junio brought up the limited
portability of the notion of a "staging area" a little while back. He's
right--many idioms do not survive translation. This is why we need
to make the documentation robust and technically correct while also
attempting to be nice to new users.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-26 21:09         ` Felipe Contreras
  2011-02-26 21:51           ` Jonathan Nieder
  2011-02-27  8:43           ` Jeff King
@ 2011-02-27 18:46           ` Phil Hord
  2 siblings, 0 replies; 65+ messages in thread
From: Phil Hord @ 2011-02-27 18:46 UTC (permalink / raw)
  To: Felipe Contreras; +Cc: Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git

On 02/26/2011 04:09 PM, Felipe Contreras wrote:
> I don't like the phrase "staging area". A "stage" already has an area.
> You put things on the stage. Sometimes there are multiple stages.

A "staging area" (idiomatically, perhaps) is a location where things are
collected to be organized before deployment.  Sounds a lot like our index.

http://en.wikipedia.org/wiki/Staging_area

> If only a subset of the files are there, it's an 'index', if not, then
> I'd say it's a 'registry'. Anyway, it's something the user shouldn't
> care about.

When we pack up our kayak club for a trip, we stage equipment we're
bringing.  Eventually we make a decision about which equipment is going
and which is staying.  The decision is codified by the equipment we
leave in the staging area versus the equipment we remove to local
storage.  Everyone seems to understand the term when we use it in this
context.

I think the parade analogy is also pretty common.

I like "staging area(n)/stage(v)" better than "index" or "cache" because
of the connotation in English.  But if it doesn't translate well, the
search may need to go on.  Maybe we can fall back on stdc methods and
invent generic terms like strcpy.  How about "xnar"?

Phil

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-14  3:09     ` Pete Harlan
  2011-02-16 23:11       ` Drew Northup
@ 2011-02-27 21:16       ` Aghiles
  2011-02-28 20:53         ` Drew Northup
  1 sibling, 1 reply; 65+ messages in thread
From: Aghiles @ 2011-02-27 21:16 UTC (permalink / raw)
  To: Pete Harlan; +Cc: Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git

> FWIW, when teaching Git I have found that users immediately understand
> "staging area", while "index" and "cache" confuse them.

FWIW, same here.

-- aghiles

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27  9:21             ` Miles Bader
@ 2011-02-27 22:28               ` Jon Seymour
  2011-02-27 23:57                 ` Junio C Hamano
  0 siblings, 1 reply; 65+ messages in thread
From: Jon Seymour @ 2011-02-27 22:28 UTC (permalink / raw)
  To: Miles Bader
  Cc: Jeff King, Felipe Contreras, Jonathan Nieder, Piotr Krukowiecki,
	Junio C Hamano, git

On Sun, Feb 27, 2011 at 8:21 PM, Miles Bader <miles@gnu.org> wrote:
> Jeff King <peff@peff.net> writes:
>> So the term "staging area" makes perfect sense to me; it is where we
>> collect changes to make a commit. I am willing to accept that does not
>> to others (native English speakers or no), and that we may need to come
>> up with a better term. But I think just calling it "the stage" is even
>> worse; it loses the concept that it is a place for collecting and
>> organizing.
>
> Agreed.
>
> "Staging area" is a good noun (phrase) for this.  "Stage" is a good verb
> (for "move into the staging area"), but isn't intuitive as a noun.
>

When used to describe a pre-production environment, the noun in my
experience is inevitably 'staging' (short for staging environment)
rather than 'stage', which is consistent with the origin Jeff posits.

I guess the noun 'stage' does have a use in git-speak to refer to the
different arms of an unresolved merge.

jon.

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27 22:28               ` Jon Seymour
@ 2011-02-27 23:57                 ` Junio C Hamano
  2011-02-28  9:38                   ` Michael J Gruber
  0 siblings, 1 reply; 65+ messages in thread
From: Junio C Hamano @ 2011-02-27 23:57 UTC (permalink / raw)
  To: Jon Seymour
  Cc: Miles Bader, Jeff King, Felipe Contreras, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, git

Jon Seymour <jon.seymour@gmail.com> writes:

> I guess the noun 'stage' does have a use in git-speak to refer to the
> different arms of an unresolved merge.

That is correct.

For some historical background around "cache" and "index", this

  http://thread.gmane.org/gmane.comp.version-control.git/780/focus=924

may shed some light.

    From: Linus Torvalds <torvalds@osdl.org>
    Subject: Re: [RFC] Possible strategy cleanup for git add/remove/diff etc.
    Date: Tue, 19 Apr 2005 18:51:06 -0700 (PDT)
    Message-ID: <Pine.LNX.4.58.0504191846290.6467@ppc970.osdl.org>

    That is indeed the whole point of the index file. In my world-view, the
    index file does _everything_. It's the staging area ("work file"), it's
    the merging area ("merge directory") and it's the cache file ("stat
    cache").

And this one:

  http://thread.gmane.org/gmane.comp.version-control.git/6670/focus=6863

is even more illuminating.

Notice that the word "staging area" is used in the old article as a way to
explain one of the three important aspects of the index, and the other
article that is about nailing down the terminology, the word does not even
come into the picture at all (one reason being that it will confuse
readers if "staging area" is used too casually in a document to precisely
define terminology, which needs to explain the merge stage(s) in the
index).
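
The merge stages mentioned above can be seen directly with
"git ls-files --stage". The sketch below is an illustration only: the
branch name "side", the file "f.txt" and the identity settings are
assumptions made for the example. It provokes a conflict, lists the
stage numbers recorded in the index, reads two of the competing
versions with ":2:" and ":3:", and then collapses the entries with
"git add".

```shell
#!/bin/sh
# Sketch only: provoke a merge conflict in a throwaway repository.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

echo base >f.txt
git add f.txt
git commit -q -m base
main=$(git symbolic-ref --short HEAD)   # default branch name depends on config

git checkout -q -b side
echo theirs >f.txt
git commit -q -am side

git checkout -q "$main"
echo ours >f.txt
git commit -q -am ours

git merge side >/dev/null 2>&1 || true  # the conflict is expected here

# Third column of "ls-files --stage" is the stage number of each entry.
stages=$(git ls-files --stage f.txt | awk '{print $3}' | tr -d '\n')
ours=$(git show :2:f.txt)               # stage 2: the current branch's version
theirs=$(git show :3:f.txt)             # stage 3: the version being merged in

git add f.txt                           # collapse the entries back to stage 0
after=$(git ls-files --stage f.txt | awk '{print $3}')

echo "stages during conflict: $stages, after git add: $after"
```

Here "git add" records the file as it stands, conflict markers
included; a real resolution would edit the file first.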

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27 23:57                 ` Junio C Hamano
@ 2011-02-28  9:38                   ` Michael J Gruber
  0 siblings, 0 replies; 65+ messages in thread
From: Michael J Gruber @ 2011-02-28  9:38 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jon Seymour, Miles Bader, Jeff King, Felipe Contreras,
	Jonathan Nieder, Piotr Krukowiecki, git

Junio C Hamano venit, vidit, dixit 28.02.2011 00:57:
> Jon Seymour <jon.seymour@gmail.com> writes:
> 
>> I guess the noun 'stage' does have a use in git-speak to refer to the
>> different arms of an unresolved merge.
> 
> That is correct.
> 
> For some historical background around "cache" and "index", this
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/780/focus=924
> 
> may shed some light.
> 
>     From: Linus Torvalds <torvalds@osdl.org>
>     Subject: Re: [RFC] Possible strategy cleanup for git add/remove/diff etc.
>     Date: Tue, 19 Apr 2005 18:51:06 -0700 (PDT)
>     Message-ID: <Pine.LNX.4.58.0504191846290.6467@ppc970.osdl.org>
> 
>     That is indeed the whole point of the index file. In my world-view, the
>     index file does _everything_. It's the staging area ("work file"), it's
>     the merging area ("merge directory") and it's the cache file ("stat
>     cache").
> 
> And this one:
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/6670/focus=6863
> 
> is even more illuminating.
> 
> Notice that the word "staging area" is used in the old article as a way to
> explain one of the three important aspects of the index, and the other
> article that is about nailing down the terminology, the word does not even
> come into the picture at all (one reason being that it will confuse
> readers if "staging area" is used too casually in a document to precisely
> define terminology, which needs to explain the merge stage(s) in the
> index).

Oh, the classics :)

Thanks for an illuminating and entertaining read!

Michael

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27 21:16       ` Aghiles
@ 2011-02-28 20:53         ` Drew Northup
  0 siblings, 0 replies; 65+ messages in thread
From: Drew Northup @ 2011-02-28 20:53 UTC (permalink / raw)
  To: Aghiles
  Cc: Pete Harlan, Junio C Hamano, Jonathan Nieder, Piotr Krukowiecki, git


On Sun, 2011-02-27 at 16:16 -0500, Aghiles wrote:
> > FWIW, when teaching Git I have found that users immediately understand
> > "staging area", while "index" and "cache" confuse them.
> 
> FWIW, same here.

I would really like to hear the actual presentation. What one says in
person in front of a classroom and what one puts in a manpage are
frequently not the same thing--and there's a good reason for that. 
If nothing else, we could come up with a better presentation at the end!

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

^ permalink raw reply	[flat|nested] 65+ messages in thread

* Re: Consistent terminology: cached/staged/index
  2011-02-27 15:34             ` Drew Northup
@ 2011-02-28 23:03               ` Jeff King
  2011-03-01  9:11                 ` David
       [not found]               ` <878466.93199.1298934204331.JavaMail.trustmail@mail1.terreactive.ch>
  1 sibling, 1 reply; 65+ messages in thread
From: Jeff King @ 2011-02-28 23:03 UTC (permalink / raw)
  To: Drew Northup
  Cc: Felipe Contreras, Jonathan Nieder, Piotr Krukowiecki,
	Junio C Hamano, git

On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:

> The concept of a "staging area" is definitely of limited use for many of
> us attempting to learn how git works. The very fact that the object
> cache and the Index (or multiple, as is useful at times) are distinct
> elements is useful and should be mentioned somewhere.

Now your terminology has _me_ confused. What is the "object cache"?

> Alas, creating in the user's mind that there is a distinct unified
> "staging area" acts against this dissemination of knowledge. It
> definitely didn't help me.

I'm not sure what you mean by "distinct unified staging area". It is a
conceptual idea that you will put your changes somewhere, and when they
look good to you, then you will finalize them in some way.

But note that it is a mental model. The fact that it is implemented
inside the index, along with the stat cache, doesn't need to be relevant
to the user. And the fact that the actual content is in the object
store, with sha1-identifiers in the index, is not relevant either. At
least I don't think so, and I am usually of the opinion that we should
expose the data structures to the user, so that their mental model can
match what is actually happening. But in this case, I think they can
still have a pretty useful but simpler mental model.

> If we use "staging area made up of the object store and information kept
> in the Index" then we tie a knot on everything, make it clear that it
> may be more complex than that--and you don't have to care, and we do not
> foreclose on the possibility of more complete explanation later. That
> does not bother me. We do however need to recognize that "staging area"
> is an idiom of limited portability and deal with that appropriately.

Sure, I'm willing to accept that the specific words of the idiom aren't
good for people with different backgrounds.

One analogy I like for the index is that it's a bucket. It starts out
full of files from the last commit. You can put new, changed files in
the bucket. When it looks good, you dump the bucket into a commit. You
can have multiple buckets if you want. You can pull files from other
commits and put them in the bucket. You can take files out of the bucket
and put them in your work tree.

So maybe it should just be called "the bucket"?

I'm not sure that's a good idea, because while the analogy makes sense,
it doesn't by itself convey any meaning. That is, knowing the concept, I
can see that bucket is a fine term. But hearing about git's bucket, I
have no clue what it means. Whereas "staging area" I think is a bit more
specific, _if_ you know what a staging area is.

So there are two questions:

  1. Is there a more universal term that means something like "staging
     area"?

  2. Is the term "staging area", while meaningful to some, actually
     _worse_ to others than a term like "bucket"? That is, does it sound
     complex and scary, when it is really a simple thing? And while
     people won't know what the "git bucket" is off the bat, it is
     relatively easy to learn.

     And obviously, replace "bucket" here with whatever term makes more
     sense.

> A particular Three Stooges episode comes to mind here for me.

Wow, 180,000 messages and this is somehow the first Three Stooges
analogy on the git list.

-Peff


* Re: Consistent terminology: cached/staged/index
       [not found]               ` <878466.93199.1298934204331.JavaMail.trustmail@mail1.terreactive.ch>
@ 2011-03-01  8:43                 ` Victor Engmark
  0 siblings, 0 replies; 65+ messages in thread
From: Victor Engmark @ 2011-03-01  8:43 UTC (permalink / raw)
  To: Jeff King
  Cc: Drew Northup, Felipe Contreras, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, git

On 03/01/2011 12:03 AM, Jeff King wrote:
> On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:

> One analogy I like for the index is that it's a bucket. It starts out
> full of files from the last commit. You can put new, changed files in
> the bucket. When it looks good, you dump the bucket into a commit. You
> can have multiple buckets if you want. You can pull files from other
> commits and put them in the bucket. You can take files out of the bucket
> and put them in your work tree.
> 
> So maybe it should just be called "the bucket"?
> 
> I'm not sure that's a good idea, because while the analogy makes sense,
> it doesn't by itself convey any meaning. That is, knowing the concept, I
> can see that bucket is a fine term. But hearing about git's bucket, I
> have no clue what it means. Whereas "staging area" I think is a bit more
> specific, _if_ you know what a staging area is.
> 
> So there are two questions:
> 
>   1. Is there a more universal term that means something like "staging
>      area"?
> 
>   2. Is the term "staging area", while meaningful to some, actually
>      _worse_ to others than a term like "bucket"? That is, does it sound
>      complex and scary, when it is really a simple thing. And while
>      people won't know what the "git bucket" is off the bat, it is
>      relatively easy to learn.

I like the name "git bucket", as in "a git bit bucket", but semantically
the connection is just "a container". Especially for beginners this can
result in the wrong connotations:
* Limited size. A modern hard disk is vastly larger than most Git
repositories, making it more like a container ship than a bucket.
* Definite size. Hard disk space availability varies with time, unlike
most containers.
* Non-linear use. A full physical bucket could be used for many
different things, but a full git bucket can either be forgotten (with
checkout), remembered temporarily (with stash), or remembered
permanently (with commit).
* Container-specific features irrelevant for git: Handles, translucency
(or not), depth, material, dimensions of the opening...

How about a metaphor like "plan"? You either cancel/undo it (git
checkout), postpone/shelve it (git stash), resume/continue it (git
stash apply) or commit to it. Coming from the desktop metaphor, I
personally like `git undo`, `git postpone/resume` and `git commit` -
They give a clear sense of direction towards the commit, and much
clearer verbs for those new to VC in general.

-- 
Victor Engmark


* Re: Consistent terminology: cached/staged/index
  2011-02-28 23:03               ` Jeff King
@ 2011-03-01  9:11                 ` David
  2011-03-01  9:15                   ` Matthieu Moy
  2011-03-01  9:27                   ` Alexey Feldgendler
  0 siblings, 2 replies; 65+ messages in thread
From: David @ 2011-03-01  9:11 UTC (permalink / raw)
  To: Jeff King
  Cc: Drew Northup, Felipe Contreras, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, git

On 1 March 2011 10:03, Jeff King <peff@peff.net> wrote:
> On Sun, Feb 27, 2011 at 10:34:00AM -0500, Drew Northup wrote:
>
> I'm not sure what you mean by "distinct unified staging area". It is a
> conceptual idea that you will put your changes somewhere, and when they
> look good to you, then you will finalize them in some way.
>
> But note that it is a mental model. The fact that it is implemented
> inside the index, along with the stat cache, doesn't need to be relevant
> to the user. And the fact that the actual content is in the object
> store, with sha1-identifiers in the index, is not relevant either. At
> least I don't think so, and I am usually of the opinion that we should
> expose the data structures to the user, so that their mental model can
> match what is actually happening. But in this case, I think they can
> still have a pretty useful but simpler mental model.
>
>> If we use "staging area made up of the object store and information kept
>> in the Index" then we tie a knot on everything, make it clear that it
>> may be more complex than that--and you don't have to care, and we do not
>> foreclose on the possibility of more complete explanation later. That
>> does not bother me. We do however need to recognize that "staging area"
>> is an idiom of limited portability and deal with that appropriately.
>
> Sure, I'm willing to accept that the specific words of the idiom aren't
> good for people with different backgrounds.
>
> One analogy I like for the index is that it's a bucket. It starts out
> full of files from the last commit. You can put new, changed files in
> the bucket. When it looks good, you dump the bucket into a commit. You
> can have multiple buckets if you want. You can pull files from other
> commits and put them in the bucket. You can take files out of the bucket
> and put them in your work tree.
>
> So maybe it should just be called "the bucket"?
>
> I'm not sure that's a good idea, because while the analogy makes sense,
> it doesn't by itself convey any meaning. That is, knowing the concept, I
> can see that bucket is a fine term. But hearing about git's bucket, I
> have no clue what it means. Whereas "staging area" I think is a bit more
> specific, _if_ you know what a staging area is.
>
> So there are two questions:
>
>  1. Is there a more universal term that means something like "staging
>     area"?
>
>  2. Is the term "staging area", while meaningful to some, actually
>     _worse_ to others than a term like "bucket"? That is, does it sound
>     complex and scary, when it is really a simple thing. And while
>     people won't know what the "git bucket" is off the bat, it is
>     relatively easy to learn.
>
>     And obviously, replace "bucket" here with whatever term makes more
>     sense.

A suggestion: could your conceptual bucket be named "the precommit"?

Motives for this suggestion are:
1)  I imagine this word will be readily translatable;
2) Using an invented word like this neatly avoids the complication of
the various different connotations associated with existing words like
"index", "cache", and "stage" that others have raised.

The "precommit" would be a user concept that merely specifies the
content of the next commit. Its purpose is to simplify the user
interface and the documentation. For example, man git-status would
read like this:

"git status displays paths that have differences between the precommit
and the current HEAD commit, paths that have differences between the
working tree and the precommit, and paths in the working tree that are
not tracked by git."
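
The three comparisons in that description can be sketched with a toy
model. This is only an illustration, assuming each of HEAD, the
"precommit" and the working tree is a plain mapping from path to
content; the names head, index, worktree and status() are made up for
the example and are not git internals.

```python
# Toy model: each "tree" is a dict mapping path -> content.
# The names head, index, worktree and status() are illustrative only.

def status(head, index, worktree):
    """Return the three path lists that the description above mentions."""
    # Paths that differ between the index ("precommit") and HEAD.
    staged = sorted(
        p for p in set(head) | set(index)
        if head.get(p) != index.get(p)
    )
    # Paths in the index whose working-tree copy differs (or is missing).
    unstaged = sorted(p for p in index if worktree.get(p) != index[p])
    # Paths in the working tree that the index does not know about.
    untracked = sorted(p for p in worktree if p not in index)
    return staged, unstaged, untracked


head = {"a.txt": "one", "b.txt": "two"}
index = {"a.txt": "one", "b.txt": "two, edited and added"}
worktree = {
    "a.txt": "one, edited but not added",
    "b.txt": "two, edited and added",
    "notes.txt": "never added",
}

staged, unstaged, untracked = status(head, index, worktree)
print("changes to be committed:", staged)
print("changes not yet added:  ", unstaged)
print("untracked:              ", untracked)
```

The point of the sketch is that the user-facing report needs only the
three sets of paths, whatever data structures sit underneath.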

The "precommit" is not to be associated with any specific data structure
in the implementation. For users who want more understanding, it can
be explained that the precommit is implemented by a combination of
data structures, which are then free to be named anything appropriate
to their individual function (e.g. "the index file") without triggering
all the issues that gave rise to this thread.


* Re: Consistent terminology: cached/staged/index
  2011-03-01  9:11                 ` David
@ 2011-03-01  9:15                   ` Matthieu Moy
  2011-03-01  9:32                     ` Alexei Sholik
  2011-03-01  9:27                   ` Alexey Feldgendler
  1 sibling, 1 reply; 65+ messages in thread
From: Matthieu Moy @ 2011-03-01  9:15 UTC (permalink / raw)
  To: David
  Cc: Jeff King, Drew Northup, Felipe Contreras, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, git

David <bouncingcats@gmail.com> writes:

> A suggestion: could your conceptual bucket be named as "the
> precommit".

I actually like it.

Maybe "precommit area", or "precommit something", because "precommit"
could be seen either as an action (like the pre-commit hook) or as a
place to put stuff.

As a non-native speaker, I didn't know what "staging area" really meant
in English, but the "area" part of the expression immediately made sense
to me. Had it been called the "foobar-ing area", I would have found it
more intuitive than cache or index ;-).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/


* Re: Consistent terminology: cached/staged/index
  2011-03-01  9:11                 ` David
  2011-03-01  9:15                   ` Matthieu Moy
@ 2011-03-01  9:27                   ` Alexey Feldgendler
  2011-03-01 16:46                     ` Drew Northup
  1 sibling, 1 reply; 65+ messages in thread
From: Alexey Feldgendler @ 2011-03-01  9:27 UTC (permalink / raw)
  To: git

On Tue, 01 Mar 2011 10:11:11 +0100, David <bouncingcats@gmail.com> wrote:

> A suggestion: could your conceptual bucket be named as "the precommit".
>
> Motives for this suggestion are:
> 1)  I imagine this word will be readily translatable;

Less so than “staging area”, at least into Russian.

Just my two cents.


-- 
Alexey Feldgendler
Software Developer, Desktop Team, Opera Software ASA
[ICQ: 115226275] http://my.opera.com/feldgendler/


* Re: Consistent terminology: cached/staged/index
  2011-03-01  9:15                   ` Matthieu Moy
@ 2011-03-01  9:32                     ` Alexei Sholik
  2011-03-01 17:02                       ` Drew Northup
  0 siblings, 1 reply; 65+ messages in thread
From: Alexei Sholik @ 2011-03-01  9:32 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: David, Jeff King, Drew Northup, Felipe Contreras,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git

On 1 March 2011 11:15, Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> wrote:
> David <bouncingcats@gmail.com> writes:
>
>> A suggestion: could your conceptual bucket be named as "the
>> precommit".
>
> ...
>
> As a non-native speaker, I didn't know what "staging area" really meant
> in English, but the "area" part of the expression immediately made sense
> to me. Had it been called the "foobar-ing area", I would have found it
> more intuitive than cache or index ;-).

Hello everyone,
I'm not a very experienced git-user and I still remember how it felt
when I started learning git. I don't recall the exact tutorial I used
(probably it was the 'Pro Git' Book), but anyway, it used the term
"staging area" and "to stage changes" from the outset. I'm also not a
native English speaker and I hadn't even heard of the term "to stage"
before, but managed to grasp at once what "to stage changes" meant.

As for names such as "bucket" and "precommit", I don't think they will
do. There are already a lot of resources for beginners on the internet,
and many of them use "staging area" and "index". There's no need
to rename the staging area. The only source of confusion as I see it
comes from the interchangeable usage of the terms "staging area" and
"index" ("staged" and "cached" being the other confusing pair of
words).

I guess people who are familiar with git use the word "index"
because it's easier to type. But it confuses an unprepared reader. Any
solution to this confusion should address these points:
 - clarify that "index" means the same thing as the "staging area" (in
the man pages, if it isn't there already?)
 - replace "cached" with "staged" for consistency with the term
"staging area" (I guess none of you would like to replace it with
"indexed" instead :-P)

Best regards,
Alexei Sholik


* Re: Consistent terminology: cached/staged/index
  2011-02-13 19:20 Consistent terminology: cached/staged/index Piotr Krukowiecki
  2011-02-13 19:37 ` Jonathan Nieder
@ 2011-03-01 10:29 ` Jonathan Nieder
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-03-01 10:29 UTC (permalink / raw)
  To: Piotr Krukowiecki
  Cc: git, Alexei Sholik, Matthieu Moy, David, Jeff King, Drew Northup,
	Felipe Contreras, Jonathan Nieder, Junio C Hamano

Hi again,

Piotr Krukowiecki wrote:

> is there a plan for using one term

To summarize: everyone knows what the staging area is, no one seems to
know what the index is, and the --cached options are confusing.

We need a new description (terminology, or better yet, story) for
"git's view of the work tree", since just saying "the index! the
index!" without a myth behind it confuses people.

Various commands take --cached (porcelain):

. git diff --cached	- view staged changes relative to the named tree.
. git grep --cached	- search in the staging area instead of the worktree.
. git rm --cached	- only remove from the index.

(plumbing):

. git apply --cached	- apply a patch without touching the worktree.
. git ls-files --cached	- list paths that will have content in the next commit.

It would be reasonable to introduce a synonym --index-only.  That can
be confusing if you don't view the staging area as representing git's
deluded idea of what's in the work tree, though.  For the same reason
and some others, --no-worktree / --ignore-worktree wouldn't work so
well (e.g., "git ls-files --no-worktree" would be terribly confusing).
So, um, we're stuck?

Various commands take --index or related options (porcelain):

. git filter-branch --index-filter	- let hook tweak index before commit
. git stash apply --index	- revive the stashed index changes, too
. git stash save --keep-index	- do not stash changes already added to index

(toys):

. git grep --no-index	- just act as a better "grep"; do not look for .git
. git diff --no-index	- just act as a better "diff"; do not look for .git

(plumbing):

. git apply --index	- next commit will have the patch applied, too
. git checkout-index --index	- update stat() cache while at it
. git read-tree --index-output	- write output to a different index file
. git update-index --index-info	- apply changes in ls-tree or ls-files format
. GIT_INDEX_FILE	- where information about the worktree goes

It would be possible to introduce synonyms along the lines of
GIT_STAGING_AREA_FILE, keeping in mind that they also affect the
merging process (and some of them also affect the stat() cache), if
that seems like the right thing to do.
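
For "git apply", the option names in the lists above pick which copy
of a file is updated: no option touches only the work tree, --cached
only the index, and --index both. A minimal sketch of that rule,
assuming a toy model in which the index and the work tree are plain
dicts and apply_change() is a hypothetical stand-in for applying a
patch:

```python
def apply_change(path, new_content, index, worktree, mode="worktree"):
    """Toy stand-in for 'git apply' with no option, --cached, or --index."""
    if mode not in ("worktree", "cached", "index"):
        raise ValueError(mode)
    if mode in ("cached", "index"):
        index[path] = new_content       # the staged copy changes
    if mode in ("worktree", "index"):
        worktree[path] = new_content    # the file on disk changes


def fresh():
    # Start each run with identical index and work tree contents.
    return {"f.txt": "old"}, {"f.txt": "old"}


results = {}
for mode in ("worktree", "cached", "index"):
    index, worktree = fresh()
    apply_change("f.txt", "new", index, worktree, mode)
    results[mode] = (index["f.txt"], worktree["f.txt"])
    print(mode, "-> index:", results[mode][0], "worktree:", results[mode][1])
```

The real command does more than this (for example, --index also checks
that the two copies match before applying); the sketch only shows
which side each option writes to.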


* Re: Consistent terminology: cached/staged/index
  2011-03-01  9:27                   ` Alexey Feldgendler
@ 2011-03-01 16:46                     ` Drew Northup
  2011-03-04 17:18                       ` Felipe Contreras
  0 siblings, 1 reply; 65+ messages in thread
From: Drew Northup @ 2011-03-01 16:46 UTC (permalink / raw)
  To: Alexey Feldgendler
  Cc: git, David, Jeff King, Felipe Contreras, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, Michael J Gruber, Jon Seymour,
	Miles Bader


On Tue, 2011-03-01 at 10:27 +0100, Alexey Feldgendler wrote:
> On Tue, 01 Mar 2011 10:11:11 +0100, David <bouncingcats@gmail.com> wrote:
> 
> > A suggestion: could your conceptual bucket be named as "the precommit".
> >
> > Motives for this suggestion are:
> > 1)  I imagine this word will be readily translatable;
> 
> Less so than “staging area”, at least into Russian.
> 
> Just my two cents.

I was starting to think about "commit preparation area" this morning,
but it sounds horribly long. Would "Prep area" work provided that the
longer version has already been introduced into the discussion? This
provides a similar language metaphor to "staging area" hopefully without
the translation problem.

Also, I still think that it is important to note somewhere that the way
that git handles commits is not the way that most users are likely to
imagine (the Index doesn't contain the blob objects itself; a finalized
commit is not just a bundled collection of everything as somebody might
expect; etc) so this "Prep area" is a logical space not completely
analogous to stuff found in the ".git" directory. Pretending that
complexity does not exist will not help; letting the users know that
they don't need to grok all of the details to get started is, on the
other hand, quite important.

(Reconstructing the CC list... let me know if I left you out, spammed
you, etc...)

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Consistent terminology: cached/staged/index
  2011-03-01  9:32                     ` Alexei Sholik
@ 2011-03-01 17:02                       ` Drew Northup
  2011-03-01 17:30                         ` Alexei Sholik
  0 siblings, 1 reply; 65+ messages in thread
From: Drew Northup @ 2011-03-01 17:02 UTC (permalink / raw)
  To: Alexei Sholik
  Cc: Matthieu Moy, David, Jeff King, Felipe Contreras,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git


On Tue, 2011-03-01 at 11:32 +0200, Alexei Sholik wrote:

> I guess people who are familiar with git use the word "index"
> because it's easier to type. But it confuses an unprepared reader. Any
> solution to this confusion should address these points:
>  - clarify that "index" means the same thing as the "staging area" (in
> the man pages, if it isn't there already?)

Alas, this isn't quite true. Blobs are copied to the .git/objects
directory (which I referred to earlier as an object store without proper
qualification) with each "git add" action AND are noted in the Index at
the same time. Therefore the Index quite literally contains
information about the blobs to be committed without containing the blobs
themselves. This is why I find any specific equivalence between Index
and "staging area" distasteful--it is misleading.
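
A minimal sketch of that split, assuming a toy model with two dicts:
object_store stands in for .git/objects and index for the Index. The
blob naming (SHA-1 over a "blob <size>\0" header plus the content) is
what git does for blobs; everything else is simplified. The real Index
also records file modes and stat data, and real objects are stored
compressed. The helper names blob_id() and add() are made up for the
example.

```python
import hashlib

object_store = {}   # stands in for .git/objects: blob id -> content
index = {}          # stands in for the Index: path -> blob id


def blob_id(content: bytes) -> str:
    # Git names a blob by the SHA-1 of a short header plus the content.
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def add(path: str, content: bytes) -> str:
    """Toy 'git add': store the blob, then record only its id in the index."""
    oid = blob_id(content)
    object_store[oid] = content
    index[path] = oid
    return oid


oid = add("hello.txt", b"hello\n")
print(index)              # path -> id; no file content in here
print(object_store[oid])  # the content itself lives in the object store
```

The id printed for a path should match what "git hash-object" reports
for a file with the same content.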

(Yes, I made that mistake as well--helped along by a lot of third-party
documentation referring to a specific cache or a specific "staging area"
without noting that those were tools to understand the logical function
of git but did not have anything to do with implementation. When you
claim to be explaining "how something works" you should be doing just
that.)

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Consistent terminology: cached/staged/index
  2011-03-01 17:02                       ` Drew Northup
@ 2011-03-01 17:30                         ` Alexei Sholik
  2011-03-01 17:41                           ` Drew Northup
  0 siblings, 1 reply; 65+ messages in thread
From: Alexei Sholik @ 2011-03-01 17:30 UTC (permalink / raw)
  To: Drew Northup
  Cc: Matthieu Moy, David, Jeff King, Felipe Contreras,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git

On 1 March 2011 19:02, Drew Northup <drew.northup@maine.edu> wrote:
>
> On Tue, 2011-03-01 at 11:32 +0200, Alexei Sholik wrote:
>
>> I guess people who are familiar with git use the word "index"
>> because it's easier to type. But it confuses an unprepared reader. Any
>> solution to this confusion should address these points:
>>  - clarify that "index" means the same thing as the "staging area" (in
>> the man pages, if it isn't there already?)
>
> Alas, this isn't quite true. Blobs are copied to the .git/objects
> directory (which I referred to earlier as an object store without proper
> qualification) with each "git add" action AND are noted in the Index at
> the same time. Therefore the Index is quite literally containing
> information about the blobs to be committed without containing the blobs
> themselves. This is why I find any specific equivalence between Index
> and "staging area" distasteful--it is misleading.

There's no reason to make it more confusing by telling all the
implementation details users are not interested in.

Once I add a modified file to the index (via 'git add') or even add a new
file, its content is already tracked by git. This is the most relevant
part.

It is not relevant from the user's point of view whether it's already
in .git/objects or not. Once I've staged a file, I can rm it and then
'git checkout' it again to the version that's remembered in the
staging area, i.e. I will not lose its contents once it's been
staged.
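
That round trip can be sketched with a toy model, assuming the staging
area and the work tree are plain dicts; stage() and checkout() are
hypothetical stand-ins for 'git add' and 'git checkout -- <path>', not
real git calls.

```python
index = {}      # toy staging area: path -> staged content
worktree = {}   # toy work tree: path -> current content


def stage(path):
    """Toy 'git add': remember the current work tree content."""
    index[path] = worktree[path]


def checkout(path):
    """Toy 'git checkout -- <path>': rewrite the file from the staged copy."""
    worktree[path] = index[path]


worktree["notes.txt"] = "draft 1"
stage("notes.txt")

del worktree["notes.txt"]   # like 'rm notes.txt'
checkout("notes.txt")       # the staged version comes back
print(worktree["notes.txt"])
```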

If what you're trying to say is that new users think of the 'staging
area' as some place where the content is stored before a subsequent
commit, there's nothing bad about it. If they try to find out
about its concrete location in the fs, they'll eventually find out
about the index and its true nature in terms of implementation.

--
Best regards,
Alexei Sholik


* Re: Consistent terminology: cached/staged/index
  2011-03-01 17:30                         ` Alexei Sholik
@ 2011-03-01 17:41                           ` Drew Northup
  0 siblings, 0 replies; 65+ messages in thread
From: Drew Northup @ 2011-03-01 17:41 UTC (permalink / raw)
  To: Alexei Sholik
  Cc: Matthieu Moy, David, Jeff King, Felipe Contreras,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano, git


On Tue, 2011-03-01 at 19:30 +0200, Alexei Sholik wrote:
> On 1 March 2011 19:02, Drew Northup <drew.northup@maine.edu> wrote:
> >
> > On Tue, 2011-03-01 at 11:32 +0200, Alexei Sholik wrote:
> >
> >> I guess people who are familiar with git use the word "index"
> >> because it's easier to type. But it confuses an unprepared reader. Any
> >> solution to this confusion should address these points:
> >>  - clarify that "index" means the same thing as the "staging area" (in
> >> the man pages, if it isn't there already?)
> >
> > Alas, this isn't quite true. Blobs are copied to the .git/objects
> > directory (which I referred to earlier as an object store without proper
> > qualification) with each "git add" action AND are noted in the Index at
> > the same time. Therefore the Index is quite literally containing
> > information about the blobs to be committed without containing the blobs
> > themselves. This is why I find any specific equivalence between Index
> > and "staging area" distasteful--it is misleading.
> 
> There's no reason to make it more confusing by telling all the
> implementation details users are not interested in.

I am not advocating that.

> Once I add a modified file to index (via 'git add') or even add a new
> file, its content is already tracked by git. This is the most relevant
> part.

Agreed.

> It is not relevant from the user's point of view whether it's already
> in .git/objects or not. Once I've staged a file, I can rm it and then
> 'git checkout' it again to the version that's remembered in the
> staging area, i.e. I will not lose its contents once it's been
> staged.
> 
> If what you're trying to say is that new users think of the 'staging
> area' as some place where the content is stored before a subsequent
> commit, there's nothing bad about it. If they try to find out
> about its concrete location in the fs, they'll eventually find out
> about the index and its true nature in terms of implementation.

My argument is that we should use "staging area" or "preparation area"
or whatever we end up using as tools to explain the USAGE of Git without
implying that IT WORKS THAT WAY DEEP INSIDE. That's why I don't want to
claim that the Index is (or means the same thing as) a staging area--we
shouldn't be bothering beginner users with the Index yet anyway. Just
saying that the information gets put into files in .git that act as a
"staging area" is good enough--we don't need to remove all mentions
of "Index" or "cache" from the documentation.
Unfortunately, if this is not done carefully we end up with people
complaining that the documentation is inconsistent when it is often just
blunt and indelicately worded.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59


* Re: Consistent terminology: cached/staged/index
  2011-03-01 16:46                     ` Drew Northup
@ 2011-03-04 17:18                       ` Felipe Contreras
  2011-03-05  4:53                         ` Miles Bader
  0 siblings, 1 reply; 65+ messages in thread
From: Felipe Contreras @ 2011-03-04 17:18 UTC (permalink / raw)
  To: Drew Northup
  Cc: Alexey Feldgendler, git, David, Jeff King, Jonathan Nieder,
	Piotr Krukowiecki, Junio C Hamano, Michael J Gruber, Jon Seymour,
	Miles Bader

On Tue, Mar 1, 2011 at 6:46 PM, Drew Northup <drew.northup@maine.edu> wrote:
>
> On Tue, 2011-03-01 at 10:27 +0100, Alexey Feldgendler wrote:
>> On Tue, 01 Mar 2011 10:11:11 +0100, David <bouncingcats@gmail.com> wrote:
>>
>> > A suggestion: could your conceptual bucket be named as "the precommit".
>> >
>> > Motives for this suggestion are:
>> > 1)  I imagine this word will be readily translatable;
>>
>> Less so than “staging area”, at least into Russian.
>>
>> Just my two cents.
>
> I was starting to think about "commit preparation area" this morning,
> but it sounds horribly long. Would "Prep area" work provided that the
> longer version has already been introduced into the discussion? This
> provides a similar language metaphor to "staging area" hopefully without
> the translation problem.
>
> Also, I still think that it is important to note somewhere that the way
> that git handles commits is not the way that most users are likely to
> imagine (the Index doesn't contain the blob objects itself; a finalized
> commit is not just a bundled collection of everything as somebody might
> expect; etc) so this "Prep area" is a logical space not completely
> analogous to stuff found in the ".git" directory. Pretending that
> complexity does not exist will not help; letting the users know that
> they don't need to grok all of the details to get started is, on the
> other hand, quite important.

At first I liked this proposal, but then I thought about 'git diff
--preped' (doesn't really sound right). I think the term should:

 1) Have a nice noun version; staging area, preparation area
 2) Have a nice verb version; to stage, to prep
 3) Have a nice past-participle; staged, cached

Casting? Forging? I don't know, staging always seems right.

Cheers.

-- 
Felipe Contreras


* Re: Consistent terminology: cached/staged/index
  2011-03-04 17:18                       ` Felipe Contreras
@ 2011-03-05  4:53                         ` Miles Bader
  2011-03-05  5:00                           ` Jonathan Nieder
  2011-03-06 12:44                           ` Drew Northup
  0 siblings, 2 replies; 65+ messages in thread
From: Miles Bader @ 2011-03-05  4:53 UTC (permalink / raw)
  To: Felipe Contreras
  Cc: Drew Northup, Alexey Feldgendler, git, David, Jeff King,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano,
	Michael J Gruber, Jon Seymour

2011/3/5 Felipe Contreras <felipe.contreras@gmail.com>:
> First I liked this proposal, but then I thought about 'git diff
> --preped' (doesn't really sound right). I think the term should:
>
>  1) Have a nice noun version; staging area, preparation area
>  2) Have a nice verb version; to stage, to prep
>  3) Have a nice past-participle; staged, cached
>
> Casting? Forging? I don't know, staging always seems right.

I agree.

I don't know why so many people seem to be trying so hard to come up with
alternatives to "staged" and "staging area", when the latter are
actually quite good; so far all the suggestions have been much more
awkward and less intuitive.

It's true that "staging area" and "stage" as a verb are most intuitive
for native English speakers, but so far none of the alternatives
really seem any better for non-native speakers.  _All_ of these terms
are "learned" to some degree, and in that sense are arbitrary, but the
smoothness and intuitiveness of "staging area"/"stage" for English
speakers is a real plus I think.

As for translations, is it even an issue?  If term "XXX" is the
optimum term in some other language, then that should be the
translation for that language, _regardless_ of what the English term
used is.

-miles

-- 
Cat is power.  Cat is peace.


* Re: Consistent terminology: cached/staged/index
  2011-03-05  4:53                         ` Miles Bader
@ 2011-03-05  5:00                           ` Jonathan Nieder
  2011-03-06 12:44                           ` Drew Northup
  1 sibling, 0 replies; 65+ messages in thread
From: Jonathan Nieder @ 2011-03-05  5:00 UTC (permalink / raw)
  To: Miles Bader
  Cc: Felipe Contreras, Drew Northup, Alexey Feldgendler, git, David,
	Jeff King, Piotr Krukowiecki, Junio C Hamano, Michael J Gruber,
	Jon Seymour

Miles Bader wrote:

> I don't know why so many people seem to be trying so hard to come up with
> alternatives to "staged" and "staging area", when the latter are
> actually quite good; so far all the suggestions have been much more
> awkward and less intuitive.

*nod*  Actually, "staging area" is intuitive on first reading and
"stage" less so to me, while the alternatives tend to be more
confusing for what it's worth.

I think this thread has outlived its usefulness.  Could people with
concrete proposals (e.g., "the description of 'git add' is confusing
in such-and-such way; this rewording makes it clearer", or "the
glossary does not describe the staging area very well; how about
this description") please start new threads?

Thanks,
Jonathan


* Re: Consistent terminology: cached/staged/index
  2011-03-05  4:53                         ` Miles Bader
  2011-03-05  5:00                           ` Jonathan Nieder
@ 2011-03-06 12:44                           ` Drew Northup
  1 sibling, 0 replies; 65+ messages in thread
From: Drew Northup @ 2011-03-06 12:44 UTC (permalink / raw)
  To: Miles Bader
  Cc: Felipe Contreras, Alexey Feldgendler, git, David, Jeff King,
	Jonathan Nieder, Piotr Krukowiecki, Junio C Hamano,
	Michael J Gruber, Jon Seymour


On Sat, 2011-03-05 at 13:53 +0900, Miles Bader wrote:
> 2011/3/5 Felipe Contreras <felipe.contreras@gmail.com>:
> > First I liked this proposal, but then I thought about 'git diff
> > --preped' (doesn't really sound right). I think the term should:
> >
> >  1) Have a nice noun version; staging area, preparation area
> >  2) Have a nice verb version; to stage, to prep
> >  3) Have a nice past-participle; staged, cached
> >
> > Casting? Forging? I don't know, staging always seems right.
> 
> I agree.
> 
> I don't know why so many people seem to be trying so hard to come up
> with alternatives to "staged" and "staging area", when the latter are
> actually quite good; so far all the suggestions have been much more
> awkward and less intuitive.
> 
> It's true that "staging area" and "stage" as a verb are most intuitive
> for native English speakers, but so far none of the alternatives
> really seem any better for non-native speakers.  _All_ of these terms
> are "learned" to some degree, and in that sense are arbitrary, but the
> smoothness and intuitiveness of "staging area"/"stage" for English
> speakers is a real plus I think.

It has already been pointed out that this isn't always quite as
intuitive as it sounds to many. I think we'd be flogging a dead horse to
continue discussing that.

> As for translations, is it even an issue?  If term "XXX" is the
> optimum term in some other language, then that should be the
> translation for that language, _regardless_ of what the English term
> used is.
> 
> -miles

Having translated stuff before, and having helped clean up / finish
translations from other languages to English, I can say that it most
certainly DOES MATTER what the idiom used in the source language is.
Unless I, the translator, know more about how something works than the
core developers who wrote it, I am highly dependent on the explanations
they have used. That is why it is important to have a complete and
portable metaphor. In fact, that's exactly what I was thinking about
when I suggested "commit preparation area" earlier in this thread: the
translation to Spanish is a tad verbose, but it is entirely clear without
further jiggering or expectation of specific cultural knowledge.

I'm not sure why that "fails" the equally arbitrary participle-mapping
test... It sure has one, but as a native English speaker and a brutal
editor I am perfectly comfortable with the notion that not all verbs
have natural noun forms and vice-versa.

-- 
-Drew Northup
________________________________________________
"As opposed to vegetable or mineral error?"
-John Pescatore, SANS NewsBites Vol. 12 Num. 59

Thread overview: 65+ messages
2011-02-13 19:20 Consistent terminology: cached/staged/index Piotr Krukowiecki
2011-02-13 19:37 ` Jonathan Nieder
2011-02-13 22:58   ` Junio C Hamano
2011-02-14  2:05     ` Miles Bader
2011-02-14  5:57       ` Junio C Hamano
2011-02-14  6:27         ` Miles Bader
2011-02-14  6:59           ` Johannes Sixt
2011-02-14  7:07             ` Miles Bader
2011-02-14 10:42               ` Michael J Gruber
2011-02-14 11:04                 ` Miles Bader
2011-02-14 17:12                   ` Junio C Hamano
2011-02-14 22:07                     ` Miles Bader
2011-02-14 22:59                       ` Junio C Hamano
2011-02-14 23:47                         ` Miles Bader
2011-02-15  0:12                           ` Junio C Hamano
2011-02-14 13:14                 ` Nguyen Thai Ngoc Duy
2011-02-14 13:43                   ` Michael J Gruber
2011-02-14 13:57                     ` Nguyen Thai Ngoc Duy
2011-02-14 14:17                     ` Felipe Contreras
2011-02-14 14:21                       ` Nguyen Thai Ngoc Duy
2011-02-14 14:40                         ` Jakub Narebski
2011-02-14 15:24                       ` Michael J Gruber
2011-02-14 16:00                         ` Felipe Contreras
2011-02-14 16:04                           ` Michael J Gruber
2011-02-14 16:27                             ` Felipe Contreras
2011-02-14  3:09     ` Pete Harlan
2011-02-16 23:11       ` Drew Northup
2011-02-26 20:36         ` Felipe Contreras
2011-02-27 15:30           ` Drew Northup
2011-02-27 21:16       ` Aghiles
2011-02-28 20:53         ` Drew Northup
2011-02-14 22:32     ` Piotr Krukowiecki
2011-02-14 23:19       ` Jonathan Nieder
2011-02-15  8:29         ` Pete Harlan
2011-02-15  9:00           ` Jonathan Nieder
2011-02-15 18:15         ` Piotr Krukowiecki
2011-02-15 18:38           ` Jonathan Nieder
2011-02-26 21:09         ` Felipe Contreras
2011-02-26 21:51           ` Jonathan Nieder
2011-02-27  0:01             ` Miles Bader
2011-02-27  0:16             ` Felipe Contreras
2011-02-27  0:46               ` Jonathan Nieder
2011-02-27  8:15               ` Junio C Hamano
2011-02-27  8:43           ` Jeff King
2011-02-27  9:21             ` Miles Bader
2011-02-27 22:28               ` Jon Seymour
2011-02-27 23:57                 ` Junio C Hamano
2011-02-28  9:38                   ` Michael J Gruber
2011-02-27 15:34             ` Drew Northup
2011-02-28 23:03               ` Jeff King
2011-03-01  9:11                 ` David
2011-03-01  9:15                   ` Matthieu Moy
2011-03-01  9:32                     ` Alexei Sholik
2011-03-01 17:02                       ` Drew Northup
2011-03-01 17:30                         ` Alexei Sholik
2011-03-01 17:41                           ` Drew Northup
2011-03-01  9:27                   ` Alexey Feldgendler
2011-03-01 16:46                     ` Drew Northup
2011-03-04 17:18                       ` Felipe Contreras
2011-03-05  4:53                         ` Miles Bader
2011-03-05  5:00                           ` Jonathan Nieder
2011-03-06 12:44                           ` Drew Northup
     [not found]               ` <878466.93199.1298934204331.JavaMail.trustmail@mail1.terreactive.ch>
2011-03-01  8:43                 ` Victor Engmark
2011-02-27 18:46           ` Phil Hord
2011-03-01 10:29 ` Jonathan Nieder
