From: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
To: Jeff King <peff@peff.net>
Cc: git <git@vger.kernel.org>,
	"Christian Couder" <christian.couder@gmail.com>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>,
	"Оля Тележная" <olyatelezhnaya@gmail.com>
Subject: Re: [GSoC] How to protect cached_objects
Date: Sat, 25 May 2019 11:42:58 -0300
Message-ID: <CAHd-oW4Y6bN+6SCJNEg+w0GMWQcUdiHmTbQfCpJ5J283xU8X+w@mail.gmail.com>
In-Reply-To: <20190524061352.GB25694@sigill.intra.peff.net>

On Fri, May 24, 2019 at 3:13 AM Jeff King <peff@peff.net> wrote:
>
> On Thu, May 23, 2019 at 01:51:47PM -0300, Matheus Tavares Bernardino wrote:
>
> > As one of my first tasks in GSoC, I'm looking to protect the global
> > states at sha1-file.c for future parallelizations. Currently, I'm
> > analyzing how to deal with the cached_objects array, which is a small
> > set of in-memory objects that read_object_file() is able to return
> > although they don't really exist on disk. The only current user of
> > this set is git-blame, which adds a fake commit containing
> > non-committed changes.
> >
> > As it is now, if we start parallelizing blame, cached_objects won't be
> > a problem since it is written to only once, at the beginning, and read
> > from a couple times later, with no possible race conditions.
> >
> > But should we make these operations thread safe for future uses that
> > could involve potential parallel writes and reads too?
> >
> > If so, we have two options:
> > - Make the array thread local, which would oblige us to replicate data, or
> > - Protect it with locks, which could impact the sequential
> > performance. We could have a macro here, to skip locking on
> > single-threaded use cases. But we don't know, a priori, the number of
> > threads that would want to use the pack access code.
>
> It seems like a lot of the sha1-reading code is 99% read-only, but very
> occasionally will require a write (e.g., refreshing the packed_git list
> when we fail a lookup, or manipulating the set of cached mmap windows).
>
> I think pthreads has read/write locks, where many readers can hold the
> lock simultaneously but a writer blocks readers (and other writers).
> Then in the common case we'd only pay the price to take the lock, and
> not deal with contention. I don't know how expensive it is to take such
> a read lock; it's presumably just a few instructions but implies a
> memory barrier. Maybe it's worth timing?

The pthread_rwlock_t, right? Nice! I didn't know about this type of
lock. It will be very handy in other situations as well, thanks for
the suggestion.

> -Peff

