git.vger.kernel.org archive mirror
From: Stephen Hicks <stephenhicks@gmail.com>
To: git@vger.kernel.org
Subject: Prematurely-closed stdin using async NodeJS to smudge large file on Mac
Date: Wed, 14 Dec 2022 22:59:05 -0500
Message-ID: <CAKNkOnNAvV0tTkQXyRXjTiiUqPpu2cJSg5emrODu3AoGm79v+A@mail.gmail.com>

I recently observed some odd behavior when setting up a repo on a new
computer.  I isolated it to a case where git seems to be closing stdin
prematurely.

Repro steps:
```
# Make a new repo
mkdir smudge-test
cd smudge-test
git init

# Grab the first directory on the path (NOTE: we'll write to $BIN/smudge)
BIN=${PATH%%:*}

# Set up the smudge filter
echo 'file filter=smudge' > .gitattributes
cat >> .git/config <<EOF
[filter "smudge"]
  clean = smudge
  smudge = smudge
  required
EOF

# Write the smudge script into $BIN/smudge (the quoted 'EOF' keeps the
# shell from expanding anything inside the script)
cat > "$BIN/smudge" <<'EOF'
#!/usr/bin/env node
const fs = require('fs');
fs.promises.readFile('/dev/stdin').then(src => {
  console.error('Read ' + src.length + ' bytes');
  process.exit(1);
});
EOF
chmod +x "$BIN/smudge"

# Make a 2 MB file to smudge
head -c 2000000 /dev/random > file

# Add it to the repo (git add runs the clean filter, which is the same
# script here)
git add .
```

What I see is that every time I run `git add .`, it reports a
different number of bytes read.  If I truncate the file down to (say)
1 MB, it works consistently every time (note: the add still fails
because smudge exits 1, but the number of bytes read is consistent,
which is what I'm looking for).  The problems seem to start somewhere
around 1.2 MB or 1.3 MB for me.

Changing the NodeJS script to use `fs.readFileSync` seems to fix it
(FWIW this is my current workaround), so it appears to be something
peculiar to how NodeJS handles the main task exiting before all of its
async work is done.  Piping the file to smudge directly (rather than
via git) works consistently as well, so it's apparently an interaction
with how git is handling the pipe.  I also put together a quick shell
script as a wrapper.  If the shell script `cat`s stdin to a temp file
first and redirects the temp file to NodeJS, it works consistently.
If it just redirects /dev/stdin directly to NodeJS, it's inconsistent.
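
The temp-file wrapper described above might look roughly like the
following sketch.  The function name `spool_stdin` and the use of
`mktemp` are assumptions for illustration; this is not the original
wrapper script.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption): spool all of stdin into a
# temp file first, then run the given command with that file as its stdin.
spool_stdin() {
  spool_tmp=$(mktemp) || return 1
  # Drain the pipe completely before the real filter starts.
  cat > "$spool_tmp"
  # Run the wrapped command reading from the regular file, not the pipe.
  "$@" < "$spool_tmp"
  spool_status=$?
  rm -f "$spool_tmp"
  # Pass the wrapped command's exit status back to the caller (git).
  return "$spool_status"
}
```

A wrapper script on the path would then end with something like
`spool_stdin node /path/to/real-filter.js`, where the path is a
placeholder.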

I'm running git 2.39.0 freshly-installed via MacPorts on macOS
Monterey 12.6.1, Darwin Kernel 21.6.0, Node v18.12.1.
