All of lore.kernel.org
 help / color / mirror / Atom feed
From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Elijah Newren <newren@gmail.com>
Cc: Elijah Newren via GitGitGadget <gitgitgadget@gmail.com>,
	Git Mailing List <git@vger.kernel.org>,
	blees@dcon.de, Junio C Hamano <gitster@pobox.com>,
	Kyle Meyer <kyle@kyleam.com>, Samuel Lijin <sxlijin@gmail.com>
Subject: Re: [PATCH v2 6/8] dir: fix checks on common prefix directory
Date: Tue, 17 Dec 2019 01:04:36 +0100 (CET)	[thread overview]
Message-ID: <nycvar.QRO.7.76.6.1912170101230.46@tvgsbejvaqbjf.bet> (raw)
In-Reply-To: <CABPp-BGoC_D6LzzMNyf30wFssTU2WA1kTLmFvJ2Do+Tfg4+YQA@mail.gmail.com>

Hi Elijah,

On Mon, 16 Dec 2019, Elijah Newren wrote:

> On Mon, Dec 16, 2019 at 5:51 AM Elijah Newren <newren@gmail.com> wrote:
> >
> > On Sun, Dec 15, 2019 at 2:29 AM Johannes Schindelin
> > <Johannes.Schindelin@gmx.de> wrote:
> > >
> > > Hi Elijah,
> > >
> > > I have not had time to dive deeply into this, but I know that it _does_
> > > cause a ton of segmentation faults in the `shears/pu` branch (where all of
> > > Git for Windows' patches are rebased on top of `pu`):
> >
> > Weird.  If it's going to cause segmentation faults at all, it would
> > certainly do it all over the place, but I tested the patches on the
> > major platforms using your Azure Pipelines setup on git.git so it
> > should be good on all the platforms.  Did your shears/pu branch make
> > some other changes to the setup?

Not really.

> > > On Tue, 10 Dec 2019, Elijah Newren via GitGitGadget wrote:
> > >
> > > > diff --git a/dir.c b/dir.c
> > > > index 645b44ea64..9c71a9ac21 100644
> > > > --- a/dir.c
> > > > +++ b/dir.c
> > > > @@ -2102,37 +2102,69 @@ static int treat_leading_path(struct dir_struct *dir,
> > > >                             const struct pathspec *pathspec)
> > > >  {
> > > >       struct strbuf sb = STRBUF_INIT;
> > > > -     int baselen, rc = 0;
> > > > +     int prevlen, baselen;
> > > >       const char *cp;
> > > > +     struct cached_dir cdir;
> > > > +     struct dirent de;
> > > > +     enum path_treatment state = path_none;
> > > > +
> > > > +     /*
> > > > +      * For each directory component of path, we are going to check whether
> > > > +      * that path is relevant given the pathspec.  For example, if path is
> > > > +      *    foo/bar/baz/
> > > > +      * then we will ask treat_path() whether we should go into foo, then
> > > > +      * whether we should go into bar, then whether baz is relevant.
> > > > +      * Checking each is important because e.g. if path is
> > > > +      *    .git/info/
> > > > +      * then we need to check .git to know we shouldn't traverse it.
> > > > +      * If the return from treat_path() is:
> > > > +      *    * path_none, for any path, we return false.
> > > > +      *    * path_recurse, for all path components, we return true
> > > > +      *    * <anything else> for some intermediate component, we make sure
> > > > +      *        to add that path to the relevant list but return false
> > > > +      *        signifying that we shouldn't recurse into it.
> > > > +      */
> > > >
> > > >       while (len && path[len - 1] == '/')
> > > >               len--;
> > > >       if (!len)
> > > >               return 1;
> > > > +
> > > > +     memset(&cdir, 0, sizeof(cdir));
> > > > +     memset(&de, 0, sizeof(de));
> > > > +     cdir.de = &de;
> > > > +     de.d_type = DT_DIR;
> > >
> > > So here, `de` is zeroed out, and therefore `de.d_name` is `NULL`.
> >
> > Um, yeah...didn't I have an allocation of de.d_name here?  It will
> > always have a subset of path copied into it, so an allocation of len+1
> > is plenty long enough.
>
> Actually, it looks like I looked up the definition of dirent
> previously and forgot by the time you emailed.  On linux, from
> /usr/include/bits/dirent.h:
>
> struct dirent
>   {
>     ....
>     unsigned char d_type;
>     char d_name[256];           /* We must not include limits.h! */
>   };
>
> and from compat/win32/dirent.h defines it as:
>
> struct dirent {
>         unsigned char d_type;      /* file type to prevent lstat after
> readdir */
>         char d_name[MAX_PATH * 3]; /* file name (* 3 for UTF-8 conversion) */
> };
>
> and 'man dirent' on Mac OS X says it's defined as:
>
> struct dirent {
>         ...
>         _uint8_t d_type;
>         _unit8_t d_namlen;   /* length of string in d_name */
>         char    d_name[255+1];  /* name must be no longer than this */
> }
>
> so, allocating it would be incorrect and my memset would just fill
> d_name with nul characters.
>
>
> But the raises the question...what kind of segfaults are you getting?
> Can you link to any builds or post any stack traces?  Can I duplicate
> with some copy of git-for-windows on linux?

If you care to look at our very own `compat/win32/dirent.h`, you will see
this:

struct dirent {
        unsigned char d_type; /* file type to prevent lstat after readdir */
        char *d_name;         /* file name */
};

And looking at
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/dirent.h.html, I
do not see any guarantee of that `[256]` at all:

The <dirent.h> header shall [...] define the structure dirent which shall
include the following members:

[XSI][Option Start]
ino_t  d_ino       File serial number.
[Option End]
char   d_name[]    Filename string of entry.

You will notice that not even `d_type` is guaranteed.

Ciao,
Dscho

  parent reply	other threads:[~2019-12-17  0:05 UTC|newest]

Thread overview: 69+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-12-09 20:47 [PATCH 0/8] Directory traversal bugs Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 1/8] t3011: demonstrate directory traversal failures Elijah Newren via GitGitGadget
2019-12-09 21:06   ` Denton Liu
2019-12-09 20:47 ` [PATCH 2/8] Revert "dir.c: make 'git-status --ignored' work within leading directories" Elijah Newren via GitGitGadget
2019-12-09 21:32   ` Denton Liu
2019-12-09 21:51     ` Elijah Newren
2019-12-09 22:09     ` Eric Sunshine
2019-12-09 20:47 ` [PATCH 3/8] dir: remove stray quote character in comment Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 4/8] dir: exit before wildcard fall-through if there is no wildcard Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 5/8] dir: break part of read_directory_recursive() out for reuse Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 6/8] dir: fix checks on common prefix directory Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 7/8] dir: synchronize treat_leading_path() and read_directory_recursive() Elijah Newren via GitGitGadget
2019-12-09 20:47 ` [PATCH 8/8] dir: consolidate similar code in treat_directory() Elijah Newren via GitGitGadget
2019-12-10 20:00 ` [PATCH v2 0/8] Directory traversal bugs Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 1/8] t3011: demonstrate directory traversal failures Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 2/8] Revert "dir.c: make 'git-status --ignored' work within leading directories" Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 3/8] dir: remove stray quote character in comment Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 4/8] dir: exit before wildcard fall-through if there is no wildcard Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 5/8] dir: break part of read_directory_recursive() out for reuse Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 6/8] dir: fix checks on common prefix directory Elijah Newren via GitGitGadget
2019-12-15 10:29     ` Johannes Schindelin
2019-12-16 13:51       ` Elijah Newren
2019-12-16 16:00         ` Elijah Newren
2019-12-16 18:13           ` Junio C Hamano
2019-12-16 21:08             ` Elijah Newren
2019-12-16 21:25               ` Junio C Hamano
2019-12-16 22:39                 ` Elijah Newren
2019-12-17  0:04           ` Johannes Schindelin [this message]
2019-12-17  0:14             ` Junio C Hamano
2019-12-17 11:08               ` Johannes Schindelin
2019-12-17 17:33                 ` Junio C Hamano
2019-12-17 19:32                   ` Johannes Schindelin
2019-12-17  5:26             ` Elijah Newren
2019-12-17 11:15               ` Johannes Schindelin
2019-12-17 16:58                 ` Elijah Newren
2019-12-10 20:00   ` [PATCH v2 7/8] dir: synchronize treat_leading_path() and read_directory_recursive() Elijah Newren via GitGitGadget
2019-12-10 20:00   ` [PATCH v2 8/8] dir: consolidate similar code in treat_directory() Elijah Newren via GitGitGadget
2019-12-17  8:33   ` [PATCH v3 0/3] Directory traversal bugs Elijah Newren via GitGitGadget
2019-12-17  8:33     ` [PATCH v3 1/3] t3011: demonstrate directory traversal failures Elijah Newren via GitGitGadget
2019-12-17  8:33     ` [PATCH v3 2/3] dir: remove stray quote character in comment Elijah Newren via GitGitGadget
2019-12-17  8:33     ` [PATCH v3 3/3] dir: exit before wildcard fall-through if there is no wildcard Elijah Newren via GitGitGadget
2019-12-17 11:18     ` [PATCH v3 0/3] Directory traversal bugs Johannes Schindelin
2019-12-17 18:24       ` Junio C Hamano
2019-12-21 22:05         ` Johannes Schindelin
2019-12-18 19:29     ` [PATCH v4 0/8] " Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 1/8] t3011: demonstrate directory traversal failures Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 2/8] Revert "dir.c: make 'git-status --ignored' work within leading directories" Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 3/8] dir: remove stray quote character in comment Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 4/8] dir: exit before wildcard fall-through if there is no wildcard Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 5/8] dir: break part of read_directory_recursive() out for reuse Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 6/8] dir: fix checks on common prefix directory Elijah Newren via GitGitGadget
2019-12-18 21:29         ` Junio C Hamano
2019-12-19 20:23           ` Elijah Newren
2019-12-19 22:24             ` Jeff King
2019-12-20 17:00               ` Elijah Newren
2019-12-20 21:14                 ` Jeff King
2019-12-20 18:01               ` Junio C Hamano
2019-12-20 21:15                 ` Jeff King
2019-12-18 19:29       ` [PATCH v4 7/8] dir: synchronize treat_leading_path() and read_directory_recursive() Elijah Newren via GitGitGadget
2019-12-18 19:29       ` [PATCH v4 8/8] dir: consolidate similar code in treat_directory() Elijah Newren via GitGitGadget
2019-12-19 21:28       ` [PATCH v5 0/8] Directory traversal bugs Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 1/8] t3011: demonstrate directory traversal failures Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 2/8] Revert "dir.c: make 'git-status --ignored' work within leading directories" Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 3/8] dir: remove stray quote character in comment Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 4/8] dir: exit before wildcard fall-through if there is no wildcard Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 5/8] dir: break part of read_directory_recursive() out for reuse Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 6/8] dir: fix checks on common prefix directory Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 7/8] dir: synchronize treat_leading_path() and read_directory_recursive() Elijah Newren via GitGitGadget
2019-12-19 21:28         ` [PATCH v5 8/8] dir: consolidate similar code in treat_directory() Elijah Newren via GitGitGadget

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.QRO.7.76.6.1912170101230.46@tvgsbejvaqbjf.bet \
    --to=johannes.schindelin@gmx.de \
    --cc=blees@dcon.de \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=kyle@kyleam.com \
    --cc=newren@gmail.com \
    --cc=sxlijin@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.