From: Ori Rawlings <orirawlings@gmail.com>
To: git@vger.kernel.org
Cc: Vitor Antunes <vitor.hda@gmail.com>,
	Lars Schneider <larsxschneider@gmail.com>,
	Luke Diamand <luke@diamand.org>, Pete Wyckoff <pw@padd.com>,
	Ori Rawlings <orirawlings@gmail.com>
Subject: [PATCH] [git-p4.py] Add --checkpoint-period option to sync/clone
Date: Mon, 12 Sep 2016 17:02:12 -0500
Message-ID: <1473717733-65682-1-git-send-email-orirawlings@gmail.com>

Importing a long history from Perforce into git with the git-p4 tool
can be especially challenging. The `git p4 clone` operation is
all-or-nothing: if it fails partway through, none of the work done so
far is kept. Under real-world conditions like network unreliability or
a busy Perforce server, `git p4 clone` and `git p4 sync` operations can
easily fail, forcing the user to restart the import from the beginning.
The longer the history being imported, the more likely a fault occurs
during the process. Long enough imports thus become statistically
unlikely to ever succeed.

I'm looking for feedback on a potential approach to addressing the
problem. My idea is to leverage the checkpoint feature of git
fast-import. I've included a patch which exposes a new option,
--checkpoint-period, to the sync/clone commands in the git-p4 tool.
The option enables explicit checkpoints on a periodic basis
(approximately every x seconds).
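
To illustrate the idea (this is a minimal sketch, not the code in the
patch), the periodic checkpoint could look roughly like the following.
The class name `PeriodicCheckpointer` and its `clock` parameter are
hypothetical; the only thing taken from git itself is the `checkpoint`
command of the fast-import stream format.

```python
import io
import time


class PeriodicCheckpointer:
    """Write a fast-import `checkpoint` command at most once per period.

    Hypothetical helper for illustration; not part of git-p4.py.
    """

    def __init__(self, stream, period_seconds, clock=time.monotonic):
        self.stream = stream          # binary stream feeding `git fast-import`
        self.period = period_seconds  # None disables checkpointing
        self.clock = clock            # injectable so the logic can be tested
        self.last = clock()

    def maybe_checkpoint(self):
        """Call after each imported change; True if a checkpoint was written."""
        now = self.clock()
        if self.period is None or now - self.last < self.period:
            return False
        # `checkpoint` asks fast-import to flush its packfile and update refs.
        self.stream.write(b"checkpoint\n\n")
        self.stream.flush()
        self.last = now
        return True
```

Because the check runs only between changes, the actual interval is
"at least x seconds", which is why the period is approximate.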

If the sync/clone command fails while processing Perforce changes,
the user can craft a new `git p4 sync` command that identifies the
changes that have already been imported and imports only changes more
recent than the last successful checkpoint.
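
As a rough sketch of how the resume point could be found, the snippet
below reads the change number from the `[git-p4: ...]` line that git-p4
appends to each imported commit message, then builds a Perforce
revision range starting at the next change. The function names are
hypothetical, and I'm assuming the resulting range would be appended to
the depot path given to `git p4 sync`; I have not verified that this
works for every clone/sync configuration.

```python
import re

# Matches the metadata line git-p4 appends to imported commit messages, e.g.
#   [git-p4: depot-paths = "//depot/proj/": change = 1234]
GIT_P4_LINE = re.compile(
    r'\[git-p4: depot-paths = "(?P<paths>[^"]*)": change = (?P<change>\d+)'
)


def last_imported_change(commit_message):
    """Return (depot_paths, change_number), or None if no git-p4 line is found."""
    match = GIT_P4_LINE.search(commit_message)
    if match is None:
        return None
    return match.group("paths"), int(match.group("change"))


def resume_range(change_number):
    """Perforce revision range covering everything after change_number."""
    return "@%d,#head" % (change_number + 1)
```

The commit message would come from the tip of the imported branch after
the last successful checkpoint, for example via `git log -1`.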

Assuming this approach makes sense, there are a few questions/items I
have:

  1. To add tests for this option, I'm thinking I'd need to simulate a 
     Perforce server or client that exits abnormally after first 
     processing some operations successfully. I'm looking for 
     suggestions on sane approaches for implementing that.
  2. From a usability perspective, I think it makes sense to print
     a message upon clone/sync failure if the user has enabled the
     option. This message would describe how long ago the last
     successful checkpoint was completed and document which command(s)
     to execute to continue importing Perforce changes. Ideally, the
     command to continue would be exactly the same as the command
     which failed, but today, clone will ignore any commits already
     imported to git. There are some lingering TODO comments in
     git-p4.py suggesting that clone should try to avoid reimporting
     changes. I don't mind taking a stab at addressing the TODO, but
     am worried I'll quickly encounter edge cases in the clone/sync
     features I don't understand.
  3. This is my first attempt at a git contribution, so I'm definitely 
     looking for feedback on commit messages, etc.


Cheers!

Ori Rawlings (1):
  [git-p4.py] Add --checkpoint-period option to sync/clone

 git-p4.py | 8 ++++++++
 1 file changed, 8 insertions(+)

-- 
2.7.4 (Apple Git-66)


Thread overview: 7+ messages
2016-09-12 22:02 Ori Rawlings [this message]
2016-09-12 22:02 ` [PATCH] [git-p4.py] Add --checkpoint-period option to sync/clone Ori Rawlings
2016-09-13  8:10   ` Luke Diamand
2016-09-15 21:17 ` [PATCH v2 0/1] git-p4: " Ori Rawlings
2016-09-15 21:17   ` [PATCH v2 1/1] " Ori Rawlings
2016-09-16 16:19     ` Lars Schneider
2016-09-16 17:43       ` Ori Rawlings
