From: "Jesus M. Gonzalez-Barahona" <jgb@bitergia.com>
To: Priya <vppriya9@gmail.com>, Xen-devel <xen-devel@lists.xen.org>
Cc: Lars Kurth <lars.kurth@xenproject.org>,
	Daniel Izquierdo <dizquierdo@bitergia.com>
Subject: Re: Regarding Outreachy project on Improving CR Dashboard
Date: Wed, 06 Apr 2016 01:23:07 +0200
Message-ID: <1459898587.7498.95.camel@bitergia.com>
In-Reply-To: <CAGwjOLOznuNy=B2hEg6ghdFcGfOX35UdygyjPxdjPJidvPkUyw@mail.gmail.com>

On Tue, 2016-04-05 at 22:05 +0530, Priya wrote:
> Hello all, 
> 
> I have completed coding the initial task of grouping the email threads
> using the Zawinski algorithm and then adding a "property" entry to the
> JSON for the messages that belong to the same email thread.
> 
> You can see my git repo [1]. The new.json is the output of my script
> and out.json is the output of Perceval. 
> 
> Also, I have updated the README.md file regarding the execution
> procedures in github.
> 
> Instructions
> ============
> 
> git clone https://github.com/priya299/Dashboard.git
> 
> cd Dashboard
> 
> python createjson.py 'Perceval output file' 'mbox file' 'output_file'
> 
> eg: python createjson.py out.json xen-devel-2016-03 new.json
> 
> A JSON file, "new.json", will be created in which each message
> belonging to a thread has an additional attribute, "property". The
> "property" attribute holds the message ID of the first message in the
> thread.
> 
> Now, I will be pushing new.json into the Elasticsearch db [2].
> Please give me your valuable feedback about my progress. 
> 
> [1]:https://github.com/priya299/Dashboard
> [2]:https://www.elastic.co/guide/en/kibana/3.0/import-some-data.html

Hi, Priya. To begin with, could you please integrate your code with the
Perceval iterator? That is, you can run Perceval on the mailing list
archive directly from your code, which makes "out.json" unnecessary.
That way, the invocation of the script would be more like:

python createjson.py xen-devel-2016-03 new.json

In other words, createjson.py would use Perceval to parse the mailing
list archive. To this end, the Perceval mbox backend is a class which,
once instantiated, provides an iterator method, fetch(), that you can
run inside a loop. On each iteration of the loop, you get the
equivalent of a JSON element in out.json.

The code would be similar to:

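-------------------------------
# Import the submodule explicitly; a bare "import perceval" is not
# guaranteed to load perceval.backends.mbox.
import perceval.backends.mbox

# mbox_url identifies the mailing list; mbox_file_name is where the
# mbox archive is stored. Both are set by your script.
mbox_parser = perceval.backends.mbox.MBox(
  origin=mbox_url,
  dirpath=mbox_file_name
)
# fetch() yields one parsed message per iteration
for item in mbox_parser.fetch():
  thread_id = find_thread(item)  # your threading function
  ...
---------------------------------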
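
For illustration only: find_thread() in the loop above is not defined
in this message. What follows is a minimal sketch, assuming each item
behaves like a dict keyed by header name ("Message-ID", "In-Reply-To",
"References"), which may not match exactly what Perceval returns, and
that a message is seen after the messages it replies to. It is a
simplified parent lookup, not the full Zawinski algorithm, and it takes
an extra argument (roots, a hypothetical name) that remembers the
thread root of every message seen so far.

```python
def find_thread(item, roots):
    """Return the Message-ID of the first known message in item's thread.

    roots is a dict mapping each Message-ID already processed to the
    Message-ID of its thread root; it is updated in place.
    """
    msg_id = item["Message-ID"]
    # References lists ancestors oldest first; In-Reply-To names the parent.
    references = (item.get("References") or "").split()
    parent = (item.get("In-Reply-To") or "").strip()

    if references:
        candidate = references[0]
    elif parent:
        candidate = parent
    else:
        # No ancestors: this message starts its own thread.
        candidate = msg_id

    # If the candidate was already assigned to a thread, reuse its root.
    root = roots.get(candidate, candidate)
    roots[msg_id] = root
    return root


# Hypothetical messages, reduced to the headers the sketch needs.
roots = {}
first = {"Message-ID": "<a@example.org>"}
reply = {"Message-ID": "<b@example.org>", "In-Reply-To": "<a@example.org>"}
nested = {"Message-ID": "<c@example.org>", "In-Reply-To": "<b@example.org>"}
thread_ids = [find_thread(m, roots) for m in (first, reply, nested)]
```

All three messages end up with the same thread identifier, the
Message-ID of the first one. Messages that arrive before their parents
would need the more complete handling the Zawinski algorithm provides.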

Some details about the Perceval mbox class:

http://perceval.readthedocs.org/en/master/perceval.backends.html#module-perceval.backends.mbox

If you have trouble running the Perceval backend as an iterator, please
let me know.

In addition, you can use argparse to read the command-line arguments.
It is easy and convenient.
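
As a rough sketch of what that could look like for the two-argument
invocation suggested earlier (the argument names "mbox" and "output"
and the description text are only assumptions, pick whatever fits your
script):

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Add a thread identifier to each message of an mbox archive."
    )
    # Two positional arguments, in the order used on the command line.
    parser.add_argument("mbox", help="mbox archive to parse")
    parser.add_argument("output", help="JSON file to write")
    return parser.parse_args(argv)


# Equivalent to: python createjson.py xen-devel-2016-03 new.json
args = parse_args(["xen-devel-2016-03", "new.json"])
```

After this, args.mbox and args.output hold the two values, and running
the script with -h prints a usage message generated by argparse.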

Saludos,

	Jesus.

-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
