All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Torsten Bögershausen" <tboegi@web.de>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH v2 1/1] Introduce git add --renormalize .
Date: Tue, 7 Nov 2017 18:26:59 +0100	[thread overview]
Message-ID: <20171107172659.GA11119@tor.lan> (raw)
In-Reply-To: <xmqqvaimeixm.fsf@gitster.mtv.corp.google.com>

[snip]
> 
> > @@ -175,6 +175,12 @@ for "git add --no-all <pathspec>...", i.e. ignored removed files.
> >  	warning (e.g., if you are manually performing operations on
> >  	submodules).
> >  
> > +--renormalize::
> > +	Normalizes the line endings from CRLF to LF of tracked files.
> > +	This applies to files which are either "text" or "text=auto"
> > +	in .gitattributes (or core.autocrlf is true or input)
> > +	--renormalize implies -u
> > +
> 
> OK.

I think the fact, that clean filters are re-run, and re-evaluated
in case they are changed, should be made more clear here.
I don't know how to explain it better that CRLF conversion and/or filters are
re-applied, this is an attempt:


--renormalize::
	Normalizes the line endings from CRLF to LF of tracked files,
	if the .gitattributes or core.autocrlf say so.
	Additionally the clean and ident filters, if any, are re-run.
	--renormalize implies -u





> 
> > +static int renormalize_tracked_files(const struct pathspec *pathspec, int flags)
> > +{
> > +	int i, retval = 0;
> > +
> > +	for (i = 0; i < active_nr; i++) {
> > +		struct cache_entry *ce = active_cache[i];
> > +
> > +		if (ce_stage(ce))
> > +			continue; /* do not touch unmerged paths */
> > +		if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
> > +			continue; /* do not touch non blobs */
> > +		if (pathspec && !ce_path_match(ce, pathspec, NULL))
> > +			continue;
> > +		retval |= add_file_to_cache(ce->name, flags | HASH_RENORMALIZE);
> 
> We are removing the entry and then adding the same entry under the
> same name, and iteration over the active_cache[] from 0 through
> active_nr should be safe, I guess.
> 
> > ...
> > -	add_new_files = !take_worktree_changes && !refresh_only;
> > +	add_new_files = !take_worktree_changes && !refresh_only && !add_renormalize;
> 
> If renormalize is given, we will *not* take new files, good.
> 
> > @@ -500,7 +521,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
> >  
> >  	plug_bulk_checkin();
> >  
> > -	exit_status |= add_files_to_cache(prefix, &pathspec, flags);
> > +	if (add_renormalize)
> > +		exit_status |= renormalize_tracked_files(&pathspec, flags);
> > +	else
> > +		exit_status |= add_files_to_cache(prefix, &pathspec, flags);
> >  
> >  	if (add_new_files)
> >  		exit_status |= add_files(&dir, flags);
> 
> OK.
> 
> > ...
> >  	int add_option = (ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE|
> >  			  (intent_only ? ADD_CACHE_NEW_ONLY : 0));
> > +	int newflags = HASH_WRITE_OBJECT;
> > +
> > +	if (flags & HASH_RENORMALIZE)
> > +		newflags |= HASH_RENORMALIZE;
> > ...
> > @@ -678,19 +682,23 @@ int add_to_index(struct index_state *istate, const char *path, struct stat *st,
> >  	if (ignore_case) {
> >  		adjust_dirname_case(istate, ce->name);
> >  	}
> > +	if (!(flags & HASH_RENORMALIZE)) {
> > +		alias = index_file_exists(istate, ce->name,
> > +					  ce_namelen(ce), ignore_case);
> > +		if (alias &&
> > +		    !ce_stage(alias) &&
> > +		    !ie_match_stat(istate, alias, st, ce_option)) {
> > +			/* Nothing changed, really */
> > +			if (!S_ISGITLINK(alias->ce_mode))
> > +				ce_mark_uptodate(alias);
> > +			alias->ce_flags |= CE_ADDED;
> 
> OK, so RENORMALIZE option forces the code to skip the "does the path
> exist already?  maybe we can do without adding it?" check.
> 
> >  	if (!intent_only) {
> > -		if (index_path(&ce->oid, path, st, HASH_WRITE_OBJECT)) {
> > +		if (index_path(&ce->oid, path, st, newflags)) {
> 
> And then we do hash it.  We later do add_index_entry() on this thing
> and we have OK_TO_REPLACE bit in the add_option, so we are good to go.
> 
> > diff --git a/sha1_file.c b/sha1_file.c
> > index 10c3a0083d..15abb184c2 100644
> > --- a/sha1_file.c
> > +++ b/sha1_file.c
> > @@ -74,6 +74,18 @@ static struct cached_object *find_cached_object(const unsigned char *sha1)
> >  	return NULL;
> >  }
> >  
> > +
> > +static enum safe_crlf get_safe_crlf(unsigned flags)
> > +{
> > +	if (flags & HASH_RENORMALIZE)
> > +		return SAFE_CRLF_RENORMALIZE;
> > +	else if (flags & HASH_WRITE_OBJECT)
> > +		return safe_crlf;
> > +	else
> > +		return SAFE_CRLF_FALSE;
> > +}
> > +
> > +
> >  int mkdir_in_gitdir(const char *path)
> >  {
> >  	if (mkdir(path, 0777)) {
> > @@ -1680,7 +1692,7 @@ static int index_mem(unsigned char *sha1, void *buf, size_t size,
> >  	if ((type == OBJ_BLOB) && path) {
> >  		struct strbuf nbuf = STRBUF_INIT;
> >  		if (convert_to_git(&the_index, path, buf, size, &nbuf,
> > -				   write_object ? safe_crlf : SAFE_CRLF_FALSE)) {
> > +				   get_safe_crlf(flags))) {
> >  			buf = strbuf_detach(&nbuf, &size);
> >  			re_allocated = 1;
> >  		}
> > @@ -1714,7 +1726,7 @@ static int index_stream_convert_blob(unsigned char *sha1, int fd,
> >  	assert(would_convert_to_git_filter_fd(path));
> >  
> >  	convert_to_git_filter_fd(&the_index, path, fd, &sbuf,
> > -				 write_object ? safe_crlf : SAFE_CRLF_FALSE);
> > +				 get_safe_crlf(flags));
> >  
> >  	if (write_object)
> >  		ret = write_sha1_file(sbuf.buf, sbuf.len, typename(OBJ_BLOB),
> 
> OK.  We used to force CRLF_FALSE when we are not writing it out; now
> we have three choices, and a new helper helps us isolating the logic
> to make that choice.
> 
> > diff --git a/t/t0025-crlf-renormalize.sh b/t/t0025-crlf-renormalize.sh
> > new file mode 100755
> > index 0000000000..fb1ed631d2
> > --- /dev/null
> > +++ b/t/t0025-crlf-renormalize.sh
> > @@ -0,0 +1,30 @@
> > +#!/bin/sh
> > +
> > +test_description='CRLF renormalization'
> > +
> > +. ./test-lib.sh
> > +
> > +test_expect_success setup '
> > +	git config core.autocrlf false &&
> > +	printf "LINEONE\nLINETWO\nLINETHREE"     >LF.txt &&
> > +	printf "LINEONE\r\nLINETWO\r\nLINETHREE" >CRLF.txt &&
> > +	printf "LINEONE\r\nLINETWO\nLINETHREE"   >CRLF_mix_LF.txt &&
> 
> Did you mean to make all these files end with an incomplete line?  I
> think it does not hurt but it is misleading---the reader would try
> to see if the incomplete lines are significant and necessary part of
> the test, which is not, and would end up wasting time, no?
> 
> > +	git add . &&
> > +	git commit -m initial
> > +'
> > +
> > +test_expect_success 'renormalize CRLF in repo' '
> > +	echo "*.txt text=auto" >.gitattributes &&
> > +	git add --renormalize "*.txt" &&
> > +cat >expect <<EOF &&
> > +i/lf w/crlf attr/text=auto CRLF.txt
> > +i/lf w/lf attr/text=auto LF.txt
> > +i/lf w/mixed attr/text=auto CRLF_mix_LF.txt
> > +EOF
> 
> Perhaps use the <<-\EOF pattern?
> 
> I'd suggest squashing this in (or I can do so myself if there is no
> other change needed).
> 
> Thanks.  Looks mostly good.

Thanks for review and proposed fixes and help.

[snip the TC. Adding line endings is good)

  reply	other threads:[~2017-11-07 17:27 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-03 15:00 Line ending normalization doesn't work as expected Robert Dailey
2017-10-03 16:26 ` Torsten Bögershausen
2017-10-03 17:23   ` Robert Dailey
2017-10-03 19:19     ` Torsten Bögershausen
2017-10-04  2:00       ` Junio C Hamano
2017-10-04 16:26         ` Robert Dailey
2017-10-04 16:59           ` Jonathan Nieder
2017-10-04 18:03             ` Robert Dailey
2017-10-05  1:31             ` Junio C Hamano
2017-10-05  1:46               ` Jonathan Nieder
2017-10-04 21:17           ` Torsten Bögershausen
2017-10-05  1:38             ` Junio C Hamano
2017-10-05  3:31               ` Junio C Hamano
2017-10-05 21:42                 ` Torsten Bögershausen
2017-10-06  0:33                   ` Junio C Hamano
2017-10-06 17:58                     ` Torsten Bögershausen
2017-10-16 16:49                 ` [PATCH v1 1/1] Introduce git add --renormalize tboegi
2017-10-16 17:34                   ` Junio C Hamano
2017-10-30 16:29                     ` [PATCH v2 " tboegi
2017-11-07  5:50                       ` Junio C Hamano
2017-11-07 17:26                         ` Torsten Bögershausen [this message]
2017-11-08  0:37                           ` Junio C Hamano
2017-11-09 18:47                             ` Torsten Bögershausen
2017-11-10  0:22                               ` Junio C Hamano
2017-11-12 20:08                                 ` Torsten Bögershausen
2017-11-16 16:38                     ` [PATCH v3 " tboegi
2017-11-17  1:24                       ` Junio C Hamano
2017-11-17 20:44                       ` Eric Sunshine
2017-11-18  1:47                         ` Junio C Hamano
2018-02-15 15:24         ` Line ending normalization doesn't work as expected Robert Dailey
2018-02-15 19:16           ` Junio C Hamano
2018-02-15 21:47             ` Robert Dailey
2018-02-16 16:34           ` Torsten Bögershausen
2018-02-16 17:19             ` Robert Dailey

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171107172659.GA11119@tor.lan \
    --to=tboegi@web.de \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.