* Git-Mediawiki : Question about Jeff King's import script
@ 2011-05-26 15:18 Claire Fousse
  2011-05-26 15:42 ` Jeff King
  2011-05-27 12:45 ` Alexandre Dulaunoy
  0 siblings, 2 replies; 4+ messages in thread
From: Claire Fousse @ 2011-05-26 15:18 UTC (permalink / raw)
  To: peff; +Cc: git, matthieu.moy, Sylvain Boulme

Dear Jeff King,
We are the four students in charge of the Git-Mediawiki project
proposed by Matthieu Moy.
In case you missed our email, here is a link to our last message, which
gives some information about the project:
http://www.spinics.net/lists/git/msg158701.html
We based our script on what you called a few months ago the "quick and
dirty perl script" for the import part and have a few questions about
it.
First of all, just in case, here is your original script :
http://article.gmane.org/gmane.comp.version-control.git/167560

It seems that you first used a hashmap, which is later transformed
into a flat list / array. What is the reasoning behind this? Why not
create an array right away?

Thanks for the script and for any information you can give us,

Regards,
The Git-Mediawiki team, Arnaud Lacurie, David Amouyal, Claire Fousse &
Jeremie Nikaes.


* Re: Git-Mediawiki : Question about Jeff King's import script
  2011-05-26 15:18 Git-Mediawiki : Question about Jeff King's import script Claire Fousse
@ 2011-05-26 15:42 ` Jeff King
  2011-05-27  9:05   ` Claire Fousse
  2011-05-27 12:45 ` Alexandre Dulaunoy
  1 sibling, 1 reply; 4+ messages in thread
From: Jeff King @ 2011-05-26 15:42 UTC (permalink / raw)
  To: Claire Fousse; +Cc: git, matthieu.moy, Sylvain Boulme

On Thu, May 26, 2011 at 05:18:11PM +0200, Claire Fousse wrote:

> We based our script on what you called a few months ago the "quick and
> dirty perl script" for the import part and have a few questions about
> it.
> First of all, just in case, here is your original script :
> http://article.gmane.org/gmane.comp.version-control.git/167560
> 
> It seems that you first used a hashmap, which is later transformed
> into a flat list / array. What is the reasoning behind this? Why not
> create an array right away?

The hashmap is actually backed by an on-disk key/value database.  The
purpose of this was to allow resuming an import that had failed in the
middle (since even for a moderate-sized wiki like the git wiki, the
import was quite slow).

So the hashmap is indexed by page id, and each value contains an array
of revisions for that page. If we see a page id that we've already done,
we can skip importing it.
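
As a rough illustration of that resumable layout, here is a minimal
sketch in Python (the original script is Perl, so this is not its code).
It assumes Python's standard `shelve` module as the on-disk key/value
store; `import_pages`, `db_path` and `fetch_revisions` are hypothetical
names standing in for whatever actually talks to the wiki.

```python
import shelve


def import_pages(db_path, page_ids, fetch_revisions):
    """Fetch the revisions of each page, skipping pages already on disk.

    fetch_revisions is a caller-supplied (hypothetical) function that
    returns the list of revisions for one page id.
    Returns the page ids that were actually fetched during this run.
    """
    fetched = []
    # The shelf behaves like a dict but persists between runs.
    with shelve.open(db_path) as pages:
        for page_id in page_ids:
            key = str(page_id)  # shelve keys must be strings
            if key in pages:
                continue  # stored by an earlier run, nothing to do
            pages[key] = fetch_revisions(page_id)
            fetched.append(page_id)
    return fetched
```

If a run dies halfway through, calling `import_pages` again with the same
`db_path` only fetches the pages that are still missing.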

If you wanted to do it all at once, yes, you could build a flat array of
revisions, with each revision mentioning the page that it came from, and
just keep appending to the array as you read more data from the wiki.
And then at the end, sort that array based on timestamp to get the
chronological ordering of changes.
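
A minimal Python sketch of that flattening step, under some assumptions:
`pages` is a mapping from page id to a list of revision dicts, and each
revision carries a `timestamp` field holding an ISO 8601 UTC string, so
that plain string comparison gives chronological order. The function and
field names are illustrative, not taken from the original script.

```python
def chronological_revisions(pages):
    """Flatten {page_id: [revision, ...]} into one time-ordered list.

    Each revision is assumed to be a dict with a "timestamp" key whose
    value is an ISO 8601 UTC string such as "2011-05-26T15:18:11Z".
    """
    revisions = []
    for page_id, page_revisions in pages.items():
        for rev in page_revisions:
            # Record which page the revision came from.
            revisions.append({**rev, "page": page_id})
    # ISO 8601 strings in the same timezone sort chronologically.
    revisions.sort(key=lambda rev: rev["timestamp"])
    return revisions
```

The sorted list can then be walked in order to create one commit per
revision.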

Hope that helps,
-Peff


* Re: Git-Mediawiki : Question about Jeff King's import script
  2011-05-26 15:42 ` Jeff King
@ 2011-05-27  9:05   ` Claire Fousse
  0 siblings, 0 replies; 4+ messages in thread
From: Claire Fousse @ 2011-05-27  9:05 UTC (permalink / raw)
  To: Jeff King; +Cc: git, matthieu.moy, Sylvain Boulme

Thanks for your answer. That helped a lot.

-- 
Claire Fousse


* Re: Git-Mediawiki : Question about Jeff King's import script
  2011-05-26 15:18 Git-Mediawiki : Question about Jeff King's import script Claire Fousse
  2011-05-26 15:42 ` Jeff King
@ 2011-05-27 12:45 ` Alexandre Dulaunoy
  1 sibling, 0 replies; 4+ messages in thread
From: Alexandre Dulaunoy @ 2011-05-27 12:45 UTC (permalink / raw)
  To: Claire Fousse; +Cc: peff, git, matthieu.moy, Sylvain Boulme

On Thu, May 26, 2011 at 5:18 PM, Claire Fousse <claire.fousse@gmail.com> wrote:
> Dear Jeff King,
> We are the four students in charge of the Git-Mediawiki project
> proposed by Matthieu Moy.
> In case you missed our email, here is a link to our last message, which
> gives some information about the project:
> http://www.spinics.net/lists/git/msg158701.html
> We based our script on what you called a few months ago the "quick and
> dirty perl script" for the import part and have a few questions about
> it.
> First of all, just in case, here is your original script :
> http://article.gmane.org/gmane.comp.version-control.git/167560
>
> It seems that you first used a hashmap, which is later transformed
> into a flat list / array. What is the reasoning behind this? Why not
> create an array right away?
>
> Thanks for the script and for any information you can give us,

In a similar spirit, there is this Ruby script:

https://github.com/singpolyma/git-mediawiki/blob/master/clone.rb

using the Mediawiki API. Quite nifty.

Hope this helps,



-- 
--                   Alexandre Dulaunoy (adulau) -- http://www.foo.be/
--                             http://www.foo.be/cgi-bin/wiki.pl/Diary
--         "Knowledge can create problems, it is not through ignorance
--                                that we can solve them" Isaac Asimov

