All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: git@vger.kernel.org
Cc: Taylor Blau <me@ttaylorr.com>, Toon Claes <toon@iotcl.com>,
	Phillip Wood <phillip.wood123@gmail.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter
Date: Fri, 2 Jun 2023 15:02:38 +0200	[thread overview]
Message-ID: <8127eeac97200da9aafccdf16cb7ba06f68b0121.1685710884.git.ps@pks.im> (raw)
In-Reply-To: <cover.1685710884.git.ps@pks.im>

[-- Attachment #1: Type: text/plain, Size: 3114 bytes --]

Many of our commands support reading input that is separated either via
newlines or via NUL characters. Furthermore, in order to be a better
cross platform citizen, these commands typically know to strip the CRLF
sequence so that we also support reading newline-separated inputs on
e.g. the Windows platform. This results in the following kind of awkward
pattern:

```
struct strbuf input = STRBUF_INIT;

while (1) {
	int ret;

	if (nul_terminated)
		ret = strbuf_getline_nul(&input, stdin);
	else
		ret = strbuf_getline(&input, stdin);
	if (ret)
		break;

	...
}
```

Introduce a new CRLF-aware helper function that can read up to a user
specified delimiter. If the delimiter is `\n` the function knows to also
strip CRLF, otherwise it will only strip the specified delimiter. This
results in the following, much more readable code pattern:

```
struct strbuf input = STRBUF_INIT;

while (strbuf_getdelim_strip_crlf(&input, stdin, delim) != EOF) {
	...
}
```

The new function will be used in a subsequent commit.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 strbuf.c | 11 ++++++++---
 strbuf.h | 12 ++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/strbuf.c b/strbuf.c
index 08eec8f1d8..31dc48c0ab 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -721,11 +721,11 @@ static int strbuf_getdelim(struct strbuf *sb, FILE *fp, int term)
 	return 0;
 }
 
-int strbuf_getline(struct strbuf *sb, FILE *fp)
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term)
 {
-	if (strbuf_getwholeline(sb, fp, '\n'))
+	if (strbuf_getwholeline(sb, fp, term))
 		return EOF;
-	if (sb->buf[sb->len - 1] == '\n') {
+	if (term == '\n' && sb->buf[sb->len - 1] == '\n') {
 		strbuf_setlen(sb, sb->len - 1);
 		if (sb->len && sb->buf[sb->len - 1] == '\r')
 			strbuf_setlen(sb, sb->len - 1);
@@ -733,6 +733,11 @@ int strbuf_getline(struct strbuf *sb, FILE *fp)
 	return 0;
 }
 
+int strbuf_getline(struct strbuf *sb, FILE *fp)
+{
+	return strbuf_getdelim_strip_crlf(sb, fp, '\n');
+}
+
 int strbuf_getline_lf(struct strbuf *sb, FILE *fp)
 {
 	return strbuf_getdelim(sb, fp, '\n');
diff --git a/strbuf.h b/strbuf.h
index 3dfeadb44c..0e69b656bc 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -475,6 +475,18 @@ int strbuf_readlink(struct strbuf *sb, const char *path, size_t hint);
  */
 ssize_t strbuf_write(struct strbuf *sb, FILE *stream);
 
+/**
+ * Read from a FILE * until the specified terminator is encountered,
+ * overwriting the existing contents of the strbuf.
+ *
+ * Reading stops after the terminator or at EOF.  The terminator is
+ * removed from the buffer before returning.  If the terminator is LF
+ * and if it is preceded by a CR, then the whole CRLF is stripped.
+ * Returns 0 unless there was nothing left before EOF, in which case
+ * it returns `EOF`.
+ */
+int strbuf_getdelim_strip_crlf(struct strbuf *sb, FILE *fp, int term);
+
 /**
  * Read a line from a FILE *, overwriting the existing contents of
  * the strbuf.  The strbuf_getline*() family of functions share
-- 
2.40.1


[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  parent reply	other threads:[~2023-06-02 13:02 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-06-02 13:02 [PATCH 0/5] cat-file: introduce NUL-terminated output format Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-02 13:02 ` Patrick Steinhardt [this message]
2023-06-02 13:02 ` [PATCH 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-02 13:02 ` [PATCH 5/5] cat-file: Introduce new option to delimit output with NUL characters Patrick Steinhardt
2023-06-05 15:47   ` Phillip Wood
2023-06-05 23:54     ` Junio C Hamano
2023-06-06  4:52       ` Patrick Steinhardt
2023-06-06  5:22         ` Junio C Hamano
2023-06-06  5:31           ` Patrick Steinhardt
2023-06-12 19:12             ` Junio C Hamano
2023-06-06  5:00     ` Patrick Steinhardt
2023-06-06  1:23   ` Junio C Hamano
2023-06-03  1:44 ` [PATCH 0/5] cat-file: introduce NUL-terminated output format Junio C Hamano
2023-06-06  5:19 ` [PATCH v2 0/5] catfile: " Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 1/5] t1006: don't strip timestamps from expected results Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 2/5] t1006: modernize test style to use `test_cmp` Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 3/5] strbuf: provide CRLF-aware helper to read until a specified delimiter Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 4/5] cat-file: simplify reading from standard input Patrick Steinhardt
2023-06-06  5:19   ` [PATCH v2 5/5] cat-file: introduce option to delimit input and output with NUL Patrick Steinhardt
2023-06-12 20:43   ` [PATCH v2 0/5] catfile: introduce NUL-terminated output format Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8127eeac97200da9aafccdf16cb7ba06f68b0121.1685710884.git.ps@pks.im \
    --to=ps@pks.im \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=me@ttaylorr.com \
    --cc=phillip.wood123@gmail.com \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.