* Outreachy project - Xen Code Review Dashboard
@ 2017-03-31  1:48 Vaishnavi Ramesh Jayaraman
  2017-03-31 11:01 ` Lars Kurth
  2017-04-02 18:20 ` Jesus M. Gonzalez-Barahona
  0 siblings, 2 replies; 16+ messages in thread
From: Vaishnavi Ramesh Jayaraman @ 2017-03-31  1:48 UTC (permalink / raw)
  To: metrics-grimoire; +Cc: xen-devel, jgb, lars.kurth


Hi,
I am Vaishnavi, interested in contributing to the Xen Project as part of
the Outreachy Program. I am particularly interested in working on the Xen
Code Review Dashboard.

I have worked on the ElasticSearch-Logstash-Kibana (ELK) stack
previously and am comfortable with JavaScript.

It would be great if you could give me pointers on how to get started!

Also, I am unable to join the mailing list for this project -
metrics-grimoire@lists.libresoft.es
<https://lists.libresoft.es/listinfo/metrics-grimoire>

Could you please add me? Thanks!

Looking forward to contributing!

Vaishnavi


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: Outreachy project - Xen Code Review Dashboard
  2017-03-31  1:48 Outreachy project - Xen Code Review Dashboard Vaishnavi Ramesh Jayaraman
@ 2017-03-31 11:01 ` Lars Kurth
  2017-04-02 18:20 ` Jesus M. Gonzalez-Barahona
  1 sibling, 0 replies; 16+ messages in thread
From: Lars Kurth @ 2017-03-31 11:01 UTC (permalink / raw)
  To: Vaishnavi Ramesh Jayaraman
  Cc: xen-devel, Jesus M. Gonzalez-Barahona, metrics-grimoire, lars.kurth


Hi,

> On 31 Mar 2017, at 02:48, Vaishnavi Ramesh Jayaraman <vaishnavi.ur777@gmail.com> wrote:
> 
> Hi,
> I am Vaishnavi, interested in contributing to the Xen Project as part of the Outreachy Program. I am particularly interested in working on the Xen Code Review Dashboard.
> 
> I have worked on the ElasticSearch - Logstash- Kibana (ELK) stack previously and am comfortable with Javascript.
> 
> It would be great if you could give me pointers on how to get started!

You may want to look at http://markmail.org/message/7adkmords3imkswd <http://markmail.org/message/7adkmords3imkswd>

Regards
Lars


* Re: Outreachy project - Xen Code Review Dashboard
  2017-03-31  1:48 Outreachy project - Xen Code Review Dashboard Vaishnavi Ramesh Jayaraman
  2017-03-31 11:01 ` Lars Kurth
@ 2017-04-02 18:20 ` Jesus M. Gonzalez-Barahona
  2017-04-02 20:09   ` vaishnavi.ur777
  1 sibling, 1 reply; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-02 18:20 UTC (permalink / raw)
  To: Vaishnavi Ramesh Jayaraman; +Cc: xen-devel, lars.kurth

On Thu, 2017-03-30 at 18:48 -0700, Vaishnavi Ramesh Jayaraman wrote:
> Hi,
> I am Vaishnavi, interested in contributing to the Xen Project as part
> of the Outreachy Program. I am particularly interested in working on
> the Xen Code Review Dashboard.
> 
> I have worked on the ElasticSearch - Logstash- Kibana (ELK) stack
> previously and am comfortable with Javascript.
> 
> It would be great if you could give me pointers on how to get
> started!
> 
> Also, I am unable to join the mailing list for this project - metrics
> -grimoire@lists.libresoft.es 

Hi, Vaishnavi,

First of all, thanks for your interest.

And now, a word of warning: this project will mainly require knowledge
of Python and NoSQL (ElasticSearch, in particular). I see you're familiar
with ELK; what about Python?

[Lars, as I just commented in another message, I now notice this is
wrong in the project description at
https://wiki.xenproject.org/wiki/Outreach_Program_Projects
Sorry about that. Could we change it?]

If you're still interested, I guess it would be good to have a quick
IRC chat and discuss next steps.

Saludos,

	Jesus.

> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah



* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-02 18:20 ` Jesus M. Gonzalez-Barahona
@ 2017-04-02 20:09   ` vaishnavi.ur777
  2017-04-02 21:14     ` Jesus M. Gonzalez-Barahona
  0 siblings, 1 reply; 16+ messages in thread
From: vaishnavi.ur777 @ 2017-04-02 20:09 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona; +Cc: xen-devel, lars.kurth

Hi Jesus,
I understand that the Perceval scripts are written in Python. I am familiar with Python and comfortable working in it. So does the microtask remain the same, that is, to run Perceval on the xen-devel list and save the output to Elasticsearch?

Also, when are you generally online on IRC? 

Thanks a lot!

Vaishnavi

> On Apr 2, 2017, at 11:20 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> 
>> On Thu, 2017-03-30 at 18:48 -0700, Vaishnavi Ramesh Jayaraman wrote:
>> Hi,
>> I am Vaishnavi, interested in contributing to the Xen Project as part
>> of the Outreachy Program. I am particularly interested in working on
>> the Xen Code Review Dashboard.
>> 
>> I have worked on the ElasticSearch - Logstash- Kibana (ELK) stack
>> previously and am comfortable with Javascript.
>> 
>> It would be great if you could give me pointers on how to get
>> started!
>> 
>> Also, I am unable to join the mailing list for this project - metrics
>> -grimoire@lists.libresoft.es 
> 
> Hi, Vaishnavi,
> 
> First of all, thanks for your interest.
> 
> And now, a warning notice: this project will require mainly Python and
> noSQL (ElasticSearch, in particular) knowledge. I see you're familiar
> with ELK, what about Python?
> 
> [Lars, as I just commented in another message, I now notice this is
> wrong in the project description at
> https://wiki.xenproject.org/wiki/Outreach_Program_Projects
> sorry about that. Could we change it?]
> 
> If you're still interested, I guess it would be good to have a quick
> IRC chat, and discuss about next steps.
> 
> Saludos,
> 
>    Jesus.
> 
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> https://lists.xen.org/xen-devel
> -- 
> Bitergia: http://bitergia.com
> /me at Twitter: https://twitter.com/jgbarah
> 


* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-02 20:09   ` vaishnavi.ur777
@ 2017-04-02 21:14     ` Jesus M. Gonzalez-Barahona
  0 siblings, 0 replies; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-02 21:14 UTC (permalink / raw)
  To: vaishnavi.ur777; +Cc: xen-devel, lars.kurth

On Sun, 2017-04-02 at 13:09 -0700, vaishnavi.ur777@gmail.com wrote:
> Hi Jesus,
> I understand that the Perceval scripts are written in Python. I am
> familiar with Python and comfortable working in it. So does the micro
> task remain the same? To run Perceval on xen-devel list and save
> output to elastic search ?

Yes, you can start that way, as the link sent by Lars suggests.

> Also, when are you generally online on IRC? 

At different times. Feel free to ping me (jgbarah at #metrics-grimoire
on Freenode) if you see me. But maybe it is better to agree on a time
slot. I'm in the CEST timezone. What timezone are you in?

Saludos,

	Jesus.

> Thanks a lot!
> 
> Vaishnavi
> 
> > On Apr 2, 2017, at 11:20 AM, Jesus M. Gonzalez-Barahona <jgb@biterg
> > ia.com> wrote:
> > 
> > > On Thu, 2017-03-30 at 18:48 -0700, Vaishnavi Ramesh Jayaraman
> > > wrote:
> > > Hi,
> > > I am Vaishnavi, interested in contributing to the Xen Project as
> > > part
> > > of the Outreachy Program. I am particularly interested in working
> > > on
> > > the Xen Code Review Dashboard.
> > > 
> > > I have worked on the ElasticSearch - Logstash- Kibana (ELK) stack
> > > previously and am comfortable with Javascript.
> > > 
> > > It would be great if you could give me pointers on how to get
> > > started!
> > > 
> > > Also, I am unable to join the mailing list for this project -
> > > metrics
> > > -grimoire@lists.libresoft.es 
> > 
> > Hi, Vaishnavi,
> > 
> > First of all, thanks for your interest.
> > 
> > And now, a warning notice: this project will require mainly Python
> > and
> > noSQL (ElasticSearch, in particular) knowledge. I see you're
> > familiar
> > with ELK, what about Python?
> > 
> > [Lars, as I just commented in another message, I now notice this is
> > wrong in the project description at
> > https://wiki.xenproject.org/wiki/Outreach_Program_Projects
> > sorry about that. Could we change it?]
> > 
> > If you're still interested, I guess it would be good to have a
> > quick
> > IRC chat, and discuss about next steps.
> > 
> > Saludos,
> > 
> >    Jesus.
> > 
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > https://lists.xen.org/xen-devel
> > 
> > -- 
> > Bitergia: http://bitergia.com
> > /me at Twitter: https://twitter.com/jgbarah
> > 
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah



* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-17  9:04               ` Jesus M. Gonzalez-Barahona
@ 2017-04-19 23:22                 ` Heather Booker
  0 siblings, 0 replies; 16+ messages in thread
From: Heather Booker @ 2017-04-19 23:22 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona; +Cc: xen-devel


Hi Jesus,

I have a version of the task running. I'd love it if you could take a
look and let me know if there are any changes you'd like to see.

It's at https://github.com/heatherbooker/xen-outreachy

It gets the mailboxes, analyzes them using Perceval and an
implementation of jwz's well-known threading algorithm
(https://www.jwz.org/doc/threading.html), then indexes them
in Elasticsearch.
Each document in ES is a message, with its id being the
Message-ID and type being a modified Subject line from the
first message in a thread.
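
A minimal sketch of the thread-labeling step described above; it is not
the code in the linked repository. It assumes each message is a plain
dict keyed by header name (roughly the shape Perceval's mbox output is
expected to have), follows only In-Reply-To and References instead of
the full jwz algorithm, and stores the thread root in a hypothetical
`thread_id` field rather than in the document type. The action dicts use
the format accepted by `elasticsearch.helpers.bulk`.

```python
def thread_roots(messages):
    """Map every Message-ID to the Message-ID of its thread root.

    Simplified: the parent is In-Reply-To, or else the last entry of
    References. The full jwz algorithm does considerably more.
    """
    parent = {}
    for msg in messages:
        refs = msg.get("References", "").split()
        reply_to = msg.get("In-Reply-To", "").strip()
        parent[msg["Message-ID"]] = reply_to or (refs[-1] if refs else None)

    def root_of(message_id):
        seen = set()  # guards against reference loops in broken archives
        while parent.get(message_id) and message_id not in seen:
            seen.add(message_id)
            message_id = parent[message_id]
        return message_id

    return {msg["Message-ID"]: root_of(msg["Message-ID"]) for msg in messages}


def to_actions(messages, index="xen-devel"):
    """Yield one bulk action per message, using Message-ID as document id.

    The index name and the thread_id field are illustrative assumptions.
    """
    roots = thread_roots(messages)
    for msg in messages:
        doc = dict(msg, thread_id=roots[msg["Message-ID"]])
        yield {"_index": index, "_id": msg["Message-ID"], "_source": doc}
```

With a client at hand, the generator could be passed to
`elasticsearch.helpers.bulk(client, to_actions(messages))`.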

I hope this is what was intended for the task!

PS - Should I continue copying these messages to the
whole xen-devel mailing list, or is sending them to you
sufficient?

Thanks!

Heather

On Mon, Apr 17, 2017 at 2:04 AM, Jesus M. Gonzalez-Barahona <
jgb@bitergia.com> wrote:

> On Sun, 2017-04-16 at 21:26 -0700, Heather Booker wrote:
> > Hi Jesus!
> >
> > I appreciate the info on the unicode error. I might have missed it,
> > but I also asked about the general microtask specifications. Here
> > was my original inquiry:
> > > And to clarify, my understanding is that the final result of
> > > this task
> > > is an index of Xen data, with two types: commits and messages.
> > > Each commit document should contain its original information
> > > from git, plus the name of the branch it was developed in. And
> > > should only the mbox messages which appear to be associated
> > > with a specific commit exist in the final index? Is there some
> > > key information in messages that is supposed to indicate the
> > > association of a given commit with a git branch? I would be
> > > grateful if you could specify the end goal a little more. :D
> >
> > Yeah, so overall I'm not sure I understand the relationship of
> > branches to the mailing list messages. Is this to be a simple
> > string parsing task wherein I should scan the message body
> > for the word "branch"? (I am guessing not ;P)
>
> I'm sorry, I understood that text was about the project, not about the
> microtask. The microtask is about either:
>
> * Producing an ES index with messages labeled by thread (by applying a
> threading algorithm to messages retrieved from archives), or
>
> * Producing an ES index with commits labeled by branch (by following
> refs and parent information in the output produced by Perceval).
>
> In the complete project, both will be used to produce the final indexes
> that power the code review dashboard.
>
> > I will be happy to get back on developing once I better grasp
> > the goal! :)
>
> More clear now?
>
> If you want, let's schedule some IRC slot for clarifying whatever is
> not clear.
>
>         Jesus.
>
> > Thanks!
> >
> > Heather
> >
> > On Sun, Apr 16, 2017 at 4:23 PM, Jesus M. Gonzalez-Barahona <jgb@bite
> > rgia.com> wrote:
> > > On Thu, 2017-04-13 at 00:47 -0700, Heather Booker wrote:
> > > > Hi,
> > > >
> > > > I submitted an application for this code review dashboard and
> > > > would love to keep working on the microtask once I get some
> > > > more info. :)
> > >
> > > Great! I answered your message, could you progress with the task?
> > >
> > > > I also came up with a general idea of how the project might be
> > > > split up - any feedback on this would be welcome! I wrote:
> > > >
> > > > "As said by Jesus, the big picture of this project will be
> > > porting
> > > > everything behind the current code review dashboard to use
> > > > Grimoire Lab tools, from the current state of using
> > > > MetricsGrimoire and custom scripts. I expect this would involve
> > > > Perceval for analyzing data, and Grimoire Elk may be useful in
> > > > further stages, or may be too general - this is something I would
> > > > wish to explore.
> > > > This project will also involve a migration from SQL to
> > > Elasticsearch
> > > > - because I believe the relevant data is mostly / all available
> > > in
> > > > places online, I am unsure whether this would need to be a direct
> > > > migration. However, looking at the current SQL setup would be
> > > > beneficial to understanding the desired format of the
> > > Elasticsearch
> > > > indexes.
> > > > I would love to dive into this project and have 3 main parts -
> > > > getting
> > > > data into ES, turning it into dashboard displays, and then fine
> > > > tuning
> > > > and perhaps augmenting the dashboard to improve its usefulness.
> > > > Getting data into ES may seem simple but I believe that once it
> > > > needs to be used for the dashboard, many realizations will pop up
> > > > - thus I’d like to leave maybe 2-3 weeks for that first step, 6-7
> > > > weeks
> > > > for the visualizations (which will include querying the data),
> > > and
> > > > the
> > > > final 3 weeks for touch ups and improvements."
> > >
> > > The plan could be sound, but would need some tweaks, once your
> > > skills
> > > in Python are clear, which could be the main blocker for the first
> > > stages.
> > >
> > > > Does this sound like an accurate summary and reasonable
> > > timeline?
> > > > And I am guessing that from Jesus's involvement with the threads
> > > > that Jesus would be the mentor, is that correct? :)
> > >
> > > Yes, I would be ;-)
> > >
> > >         Jesus.
> > >
> > > > Thanks!
> > > >
> > > > Heather
> > > >
> > > >
> > > > On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@
> > > gmai
> > > > l.com> wrote:
> > > > > Hi Jesus,
> > > > >
> > > > > While using the Elasticsearch python library
> > > > > (https://elasticsearch-py.readthedocs.io/en/master/) to add
> > > mbox
> > > > > messages to an index, I would get a UnicodeEncodeError:
> > > > > "'utf-8' codec can't encode character '\udca0' in position 767:
> > > > > surrogates not allowed".
> > > > >
> > > > > Investigating in Grimoire elk https://github.com/grim
> > > > > oirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae0
> > > > > 8639deacc/grimoire_elk/elk/mbox.py#L200 seems to show that
> > > > > perhaps that tool instead uses Latin-1 encoding, but I found
> > > that
> > > > > to then produce a serialization error (their custom error
> > > message:
> > > > > "Unable to serialize %r (type: %s)"). I suppose this is because
> > > > > now it's bytes; of course, converting back to string after
> > > encoding
> > > > > just cycles back to the first error.
> > > > >
> > > > > As somewhat of a Python newbie I don't really know how to
> > > tackle
> > > > > this! My thought atm is to splice the offending character out
> > > > > of the message.
> > > > >
> > > > > And to clarify, my understanding is that the final result of
> > > this
> > > > > task
> > > > > is an index of Xen data, with two types: commits and messages.
> > > > > Each commit document should contain its original information
> > > > > from git, plus the name of the branch it was developed in. And
> > > > > should only the mbox messages which appear to be associated
> > > > > with a specific commit exist in the final index? Is there some
> > > > > key information in messages that is supposed to indicate the
> > > > > association of a given commit with a git branch? I would be
> > > > > grateful if you could specify the end goal a little more. :D
> > > > >
> > > > > Thanks so much!
> > > > >
> > > > > Heather
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jg
> > > b@bi
> > > > > tergia.com> wrote:
> > > > > > On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> > > > > > > Hi Jesus,
> > > > > > >
> > > > > > > Thanks for your reply!
> > > > > > >
> > > > > > > So about the task, instructions say after analyzing mboxes
> > > with
> > > > > > > Perceval to
> > > > > > > "store the resulting raw index in ElasticSearch" - what
> > > does
> > > > > > raw
> > > > > > > index mean?
> > > > > >
> > > > > > In this context, I mean "storing the JSON documents produced
> > > by
> > > > > > Perceval in an ElasticSearch index, as such". ElasticSearch
> > > > > > stores JSON
> > > > > > documents, so it is just uploading the output of Perceval to
> > > it.
> > > > > >
> > > > > > > In terms of figuring out the elasticsearch structure, do I
> > > want
> > > > > > an
> > > > > > > index
> > > > > > > (xen-devel mbox) with a type (message) and each object from
> > > the
> > > > > > > perceval
> > > > > > > output to be one document? Or should it be more fine-
> > > grained?
> > > > > >
> > > > > > Exactly.
> > > > > >
> > > > > > Saludos,
> > > > > >
> > > > > >         Jesus.
> > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Heather
> > > > > > >
> > > > > > > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona
> > > <jgb
> > > > > > @biter
> > > > > > > gia.com> wrote:
> > > > > > > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > > > > > > > Hi!
> > > > > > > > >
> > > > > > > > > I'd love to work on the Code Review Dashboard project
> > > for
> > > > > > this
> > > > > > > > round
> > > > > > > > > of Outreachy.
> > > > > > > >
> > > > > > > > Great!!
> > > > > > > >
> > > > > > > > > Are the steps outlined
> > > > > > > > > here http://markmail.org/message/7adkmords3imkswd still
> > > the
> > > > > > first
> > > > > > > > > contribution you'd like to see?
> > > > > > > >
> > > > > > > > Yes.
> > > > > > > >
> > > > > > > > > So is this a project that has been worked on in
> > > previous
> > > > > > rounds
> > > > > > > > of
> > > > > > > > > GSOC/Outreachy also?
> > > > > > > > > If so is there a place to find links to the previous
> > > > > > participants
> > > > > > > > > blogs? :)
> > > > > > > >
> > > > > > > > No. We had one participation at some point, but couldn't
> > > even
> > > > > > start
> > > > > > > > for
> > > > > > > > personal reasons. There are some people considering
> > > working
> > > > > > on this
> > > > > > > > for
> > > > > > > > this next round of Outreachy, however. You'll see their
> > > > > > messages in
> > > > > > > > this mailing list.
> > > > > > > >
> > > > > > > > > Should questions about how the
> > > specifications/completion of
> > > > > > the
> > > > > > > > > microtask be addressed to
> > > > > > > > > IRC or this list? If IRC, which channel - #xen-opw or
> > > > > > #metrics-
> > > > > > > > > grimoire? On that note, I'm
> > > > > > > > > curious why #metrics-grimoire is the listed channel on
> > > the
> > > > > > > > project
> > > > > > > > > page - are main contributors
> > > > > > > > > involved in both projects? Or is it just because the
> > > Xen
> > > > > > > > dashboard
> > > > > > > > > doesn't have a channel?
> > > > > > > >
> > > > > > > > The code review is for the Xen project, but it is done
> > > with
> > > > > > (I
> > > > > > > > mean,
> > > > > > > > > the software used for it is) GrimoireLab, which for
> > > > > > historical
> > > > > > > > reasons
> > > > > > > > uses the #metrics-grimoire channel. That's why it is
> > > likely
> > > > > > that
> > > > > > > > you
> > > > > > > > find somebody from the project there.
> > > > > > > >
> > > > > > > > If you have questions, and find me around in IRC, please
> > > ping
> > > > > > me.
> > > > > > > > If
> > > > > > > > I'm not available, please send an email message.
> > > > > > > >
> > > > > > > > Saludos,
> > > > > > > >
> > > > > > > >         Jesus.
> > > > > > > >
> > > > > > > > > Thanks!
> > > > > > > > >
> > > > > > > > > Heather
> > > > > > > > > _______________________________________________
> > > > > > > > > Xen-devel mailing list
> > > > > > > > > Xen-devel@lists.xen.org
> > > > > > > > > https://lists.xen.org/xen-devel
> > > > > > > > --
> > > > > > > > Bitergia: http://bitergia.com
> > > > > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Xen-devel mailing list
> > > > > > > Xen-devel@lists.xen.org
> > > > > > > https://lists.xen.org/xen-devel
> > > > > > --
> > > > > > Bitergia: http://bitergia.com
> > > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > Xen-devel mailing list
> > > > Xen-devel@lists.xen.org
> > > > https://lists.xen.org/xen-devel
> > > --
> > > Bitergia: http://bitergia.com
> > > /me at Twitter: https://twitter.com/jgbarah
> > >
> > >
> >
> >
> --
> Bitergia: http://bitergia.com
> /me at Twitter: https://twitter.com/jgbarah
>
>
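
On the UnicodeEncodeError quoted further up ("surrogates not allowed"):
a character such as '\udca0' is typically what Python produces when
undecodable bytes are read with the `surrogateescape` error handler. A
hedged sketch of one way to make such documents serializable is below.
It assumes the surrogates come from that handler (other lone surrogates
would still raise) and that replacing the bad bytes with U+FFFD is
acceptable; the `scrub` helper is hypothetical, not part of Perceval or
GrimoireELK.

```python
def scrub(value):
    """Return a copy of value whose strings can be encoded as UTF-8.

    Escaped bytes are turned back into raw bytes, then decoded again;
    anything that is still not valid UTF-8 becomes U+FFFD.
    """
    if isinstance(value, str):
        raw = value.encode("utf-8", errors="surrogateescape")
        return raw.decode("utf-8", errors="replace")
    if isinstance(value, dict):
        return {key: scrub(item) for key, item in value.items()}
    if isinstance(value, list):
        return [scrub(item) for item in value]
    return value  # numbers, booleans, None are left untouched
```

Applying `scrub` to each message dict before indexing keeps the rest of
the text intact instead of splicing characters out by position.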


* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-17  4:26             ` Heather Booker
@ 2017-04-17  9:04               ` Jesus M. Gonzalez-Barahona
  2017-04-19 23:22                 ` Heather Booker
  0 siblings, 1 reply; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-17  9:04 UTC (permalink / raw)
  To: Heather Booker; +Cc: xen-devel

On Sun, 2017-04-16 at 21:26 -0700, Heather Booker wrote:
> Hi Jesus!
> 
> I appreciate the info on the unicode error. I might have missed it,
> but I also asked about the general microtask specifications. Here
> was my original inquiry:
> > And to clarify, my understanding is that the final result of
> > this task
> > is an index of Xen data, with two types: commits and messages.
> > Each commit document should contain its original information
> > from git, plus the name of the branch it was developed in. And
> > should only the mbox messages which appear to be associated
> > with a specific commit exist in the final index? Is there some
> > key information in messages that is supposed to indicate the
> > association of a given commit with a git branch? I would be
> > grateful if you could specify the end goal a little more. :D
> 
> Yeah, so overall I'm not sure I understand the relationship of
> branches to the mailing list messages. Is this to be a simple
> string parsing task wherein I should scan the message body
> for the word "branch"? (I am guessing not ;P)

I'm sorry, I understood that text was about the project, not about the
microtask. The microtask is about either:

* Producing an ES index with messages labeled by thread (by applying a
threading algorithm to messages retrieved from archives), or

* Producing an ES index with commits labeled by branch (by following
refs and parent information in the output produced by Perceval).

In the complete project, both will be used to produce the final indexes
that power the code review dashboard.
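
As a rough illustration of the second option (a sketch under
assumptions, not the project's implementation): given a mapping from
each commit hash to its parent hashes and a mapping from branch name to
tip commit, both of which are assumed to have been extracted from the
Perceval git output beforehand, every commit can be labeled with the
branches whose history reaches it. The function and argument names are
hypothetical.

```python
from collections import deque


def label_commits_by_branch(parents, heads):
    """Label commits with the branches that contain them.

    parents: dict of commit hash -> list of parent commit hashes
    heads:   dict of branch name -> hash of the branch tip
    Returns a dict of commit hash -> sorted list of branch names.
    """
    branches = {}
    for branch, tip in heads.items():
        seen = set()
        queue = deque([tip])
        while queue:  # breadth-first walk from the tip through parents
            commit = queue.popleft()
            if commit in seen:
                continue
            seen.add(commit)
            branches.setdefault(commit, set()).add(branch)
            queue.extend(parents.get(commit, []))
    return {commit: sorted(names) for commit, names in branches.items()}
```

Note that this labels a commit with every branch it is reachable from,
which is not necessarily the branch it was originally developed in.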

> I will be happy to get back on developing once I better grasp
> the goal! :)

Is it clearer now?

If you want, let's schedule an IRC slot to clarify whatever is still
unclear.

	Jesus.

> Thanks!
> 
> Heather
> 
> On Sun, Apr 16, 2017 at 4:23 PM, Jesus M. Gonzalez-Barahona <jgb@bite
> rgia.com> wrote:
> > On Thu, 2017-04-13 at 00:47 -0700, Heather Booker wrote:
> > > Hi,
> > >
> > > I submitted an application for this code review dashboard and
> > > would love to keep working on the microtask once I get some
> > > more info. :)
> > 
> > Great! I answered your message, could you progress with the task?
> > 
> > > I also came up with a general idea of how the project might be
> > > split up - any feedback on this would be welcome! I wrote:
> > >
> > > "As said by Jesus, the big picture of this project will be
> > porting
> > > everything behind the current code review dashboard to use
> > > Grimoire Lab tools, from the current state of using
> > > MetricsGrimoire and custom scripts. I expect this would involve
> > > Perceval for analyzing data, and Grimoire Elk may be useful in
> > > further stages, or may be too general - this is something I would
> > > wish to explore.
> > > This project will also involve a migration from SQL to
> > Elasticsearch
> > > - because I believe the relevant data is mostly / all available
> > in
> > > places online, I am unsure whether this would need to be a direct
> > > migration. However, looking at the current SQL setup would be
> > > beneficial to understanding the desired format of the
> > Elasticsearch
> > > indexes.
> > > I would love to dive into this project and have 3 main parts -
> > > getting
> > > data into ES, turning it into dashboard displays, and then fine
> > > tuning
> > > and perhaps augmenting the dashboard to improve its usefulness.
> > > Getting data into ES may seem simple but I believe that once it
> > > needs to be used for the dashboard, many realizations will pop up
> > > - thus I’d like to leave maybe 2-3 weeks for that first step, 6-7
> > > weeks
> > > for the visualizations (which will include querying the data),
> > and
> > > the
> > > final 3 weeks for touch ups and improvements."
> > 
> > The plan could be sound, but would need some tweaks, once your
> > skills
> > in Python are clear, which could be the main blocker for the first
> > stages.
> > 
> > > Does this sound like an accurate summary and reasonable
> > timeline? 
> > > And I am guessing that from Jesus's involvement with the threads
> > > that Jesus would be the mentor, is that correct? :)
> > 
> > Yes, I would be ;-)
> > 
> >         Jesus.
> > 
> > > Thanks!
> > >
> > > Heather
> > >
> > >
> > > On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@
> > gmai
> > > l.com> wrote:
> > > > Hi Jesus,
> > > >
> > > > While using the Elasticsearch python library
> > > > (https://elasticsearch-py.readthedocs.io/en/master/) to add
> > mbox
> > > > messages to an index, I would get a UnicodeEncodeError:
> > > > "'utf-8' codec can't encode character '\udca0' in position 767:
> > > > surrogates not allowed".
> > > >
> > > > Investigating in Grimoire elk https://github.com/grim
> > > > oirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae0
> > > > 8639deacc/grimoire_elk/elk/mbox.py#L200 seems to show that 
> > > > perhaps that tool instead uses Latin-1 encoding, but I found
> > that
> > > > to then produce a serialization error (their custom error
> > message:
> > > > "Unable to serialize %r (type: %s)"). I suppose this is because
> > > > now it's bytes; of course, converting back to string after
> > encoding
> > > > just cycles back to the first error.
> > > >
> > > > As somewhat of a Python newbie I don't really know how to
> > tackle
> > > > this! My thought atm is to splice the offending character out
> > > > of the message. 
> > > >
> > > > And to clarify, my understanding is that the final result of
> > this
> > > > task
> > > > is an index of Xen data, with two types: commits and messages.
> > > > Each commit document should contain its original information
> > > > from git, plus the name of the branch it was developed in. And
> > > > should only the mbox messages which appear to be associated
> > > > with a specific commit exist in the final index? Is there some
> > > > key information in messages that is supposed to indicate the
> > > > association of a given commit with a git branch? I would be
> > > > grateful if you could specify the end goal a little more. :D
> > > >
> > > > Thanks so much!
> > > >
> > > > Heather
> > > >
> > > >
> > > >
> > > > On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jg
> > b@bi
> > > > tergia.com> wrote:
> > > > > On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> > > > > > Hi Jesus, 
> > > > > >
> > > > > > Thanks for your reply!
> > > > > >
> > > > > > So about the task, instructions say after analyzing mboxes
> > with
> > > > > > Perceval to
> > > > > > "store the resulting raw index in ElasticSearch" - what
> > does
> > > > > raw
> > > > > > index mean?
> > > > >
> > > > > In this context, I mean "storing the JSON documents produced
> > by
> > > > > Perceval in an ElasticSearch index, as such". ElasticSearch
> > > > > stores JSON
> > > > > documents, so it is just uploading the output of Perceval to
> > it.
> > > > >
> > > > > > In terms of figuring out the elasticsearch structure, do I
> > want
> > > > > an
> > > > > > index
> > > > > > (xen-devel mbox) with a type (message) and each object from
> > the
> > > > > > perceval
> > > > > > output to be one document? Or should it be more fine-
> > grained?
> > > > >
> > > > > Exactly.
> > > > >
> > > > > Saludos,
> > > > >
> > > > >         Jesus.
> > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Heather
> > > > > >
> > > > > > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona
> > <jgb
> > > > > @biter
> > > > > > gia.com> wrote:
> > > > > > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > > > > > > Hi!
> > > > > > > >
> > > > > > > > I'd love to work on the Code Review Dashboard project
> > for
> > > > > this
> > > > > > > round
> > > > > > > > of Outreachy.
> > > > > > >
> > > > > > > Great!!
> > > > > > >
> > > > > > > > Are the steps outlined
> > > > > > > > here http://markmail.org/message/7adkmords3imkswd still
> > the
> > > > > first
> > > > > > > > contribution you'd like to see?
> > > > > > >
> > > > > > > Yes.
> > > > > > >
> > > > > > > > So is this a project that has been worked on in
> > previous
> > > > > rounds
> > > > > > > of
> > > > > > > > GSOC/Outreachy also?
> > > > > > > > If so is there a place to find links to the previous
> > > > > participants
> > > > > > > > blogs? :)
> > > > > > >
> > > > > > > No. We had one participation at some point, but couldn't
> > even
> > > > > start
> > > > > > > for
> > > > > > > personal reasons. There are some people considering
> > working
> > > > > on this
> > > > > > > for
> > > > > > > this next round of Outreachy, however. You'll see their
> > > > > messages in
> > > > > > > this mailing list.
> > > > > > >
> > > > > > > > Should questions about how the
> > specifications/completion of
> > > > > the
> > > > > > > > microtask be addressed to
> > > > > > > > IRC or this list? If IRC, which channel - #xen-opw or
> > > > > #metrics-
> > > > > > > > grimoire? On that note, I'm 
> > > > > > > > curious why #metrics-grimoire is the listed channel on
> > the
> > > > > > > project
> > > > > > > > page - are main contributors
> > > > > > > > involved in both projects? Or is it just because the
> > Xen
> > > > > > > dashboard
> > > > > > > > doesn't have a channel?
> > > > > > >
> > > > > > > The code review is for the Xen project, but it is done
> > with
> > > > > (I
> > > > > > > mean,
> > > > > > > > the software used for it is) GrimoireLab, which for
> > > > > historical
> > > > > > > reasons
> > > > > > > uses the #metrics-grimoire channel. That's why it is
> > likely
> > > > > that
> > > > > > > you
> > > > > > > find somebody from the project there.
> > > > > > >
> > > > > > > If you have questions, and find me around in IRC, please
> > ping
> > > > > me.
> > > > > > > If
> > > > > > > I'm not available, please send an email message.
> > > > > > >
> > > > > > > Saludos,
> > > > > > >
> > > > > > >         Jesus.
> > > > > > >
> > > > > > > > Thanks!
> > > > > > > >
> > > > > > > > Heather
> > > > > > > > _______________________________________________
> > > > > > > > Xen-devel mailing list
> > > > > > > > Xen-devel@lists.xen.org
> > > > > > > > https://lists.xen.org/xen-devel
> > > > > > > --
> > > > > > > Bitergia: http://bitergia.com
> > > > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > _______________________________________________
> > > > > > Xen-devel mailing list
> > > > > > Xen-devel@lists.xen.org
> > > > > > https://lists.xen.org/xen-devel
> > > > > --
> > > > > Bitergia: http://bitergia.com
> > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > >
> > > > >
> > > >
> > > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > https://lists.xen.org/xen-devel
> > --
> > Bitergia: http://bitergia.com
> > /me at Twitter: https://twitter.com/jgbarah
> > 
> > 
> 
> 
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-16 23:23           ` Jesus M. Gonzalez-Barahona
@ 2017-04-17  4:26             ` Heather Booker
  2017-04-17  9:04               ` Jesus M. Gonzalez-Barahona
  0 siblings, 1 reply; 16+ messages in thread
From: Heather Booker @ 2017-04-17  4:26 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 9795 bytes --]

Hi Jesus!

I appreciate the info on the unicode error. I might have missed it,
but I also asked about the general microtask specifications. Here
was my original inquiry:
> And to clarify, my understanding is that the final result of this task
> is an index of Xen data, with two types: commits and messages.
> Each commit document should contain its original information
> from git, plus the name of the branch it was developed in. And
> should only the mbox messages which appear to be associated
> with a specific commit exist in the final index? Is there some
> key information in messages that is supposed to indicate the
> association of a given commit with a git branch? I would be
> grateful if you could specify the end goal a little more. :D

Yeah, so overall I'm not sure I understand the relationship of
branches to the mailing list messages. Is this to be a simple
string parsing task wherein I should scan the message body
for the word "branch"? (I am guessing not ;P)

I will be happy to get back on developing once I better grasp
the goal! :)

Thanks!

Heather

On Sun, Apr 16, 2017 at 4:23 PM, Jesus M. Gonzalez-Barahona <
jgb@bitergia.com> wrote:

> On Thu, 2017-04-13 at 00:47 -0700, Heather Booker wrote:
> > Hi,
> >
> > I submitted an application for this code review dashboard and
> > would love to keep working on the microtask once I get some
> > more info. :)
>
> Great! I answered your message, could you progress with the task?
>
> > I also came up with a general idea of how the project might be
> > split up - any feedback on this would be welcome! I wrote:
> >
> > "As said by Jesus, the big picture of this project will be porting
> > everything behind the current code review dashboard to use
> > Grimoire Lab tools, from the current state of using
> > MetricsGrimoire and custom scripts. I expect this would involve
> > Perceval for analyzing data, and Grimoire Elk may be useful in
> > further stages, or may be too general - this is something I would
> > wish to explore.
> > This project will also involve a migration from SQL to Elasticsearch
> > - because I believe the relevant data is mostly / all available in
> > places online, I am unsure whether this would need to be a direct
> > migration. However, looking at the current SQL setup would be
> > beneficial to understanding the desired format of the Elasticsearch
> > indexes.
> > I would love to dive into this project and have 3 main parts -
> > getting
> > data into ES, turning it into dashboard displays, and then fine
> > tuning
> > and perhaps augmenting the dashboard to improve its usefulness.
> > Getting data into ES may seem simple but I believe that once it
> > needs to be used for the dashboard, many realizations will pop up
> > - thus I’d like to leave maybe 2-3 weeks for that first step, 6-7
> > weeks
> > for the visualizations (which will include querying the data), and
> > the
> > final 3 weeks for touch ups and improvements."
>
> The plan could be sound, but would need some tweaks, once your skills
> in Python are clear, which could be the main blocker for the first
> stages.
>
> > Does this sound like an accurate summary and reasonable timeline?
> > And I am guessing that from Jesus's involvement with the threads
> > that Jesus would be the mentor, is that correct? :)
>
> Yes, I would be ;-)
>
>         Jesus.
>
> > Thanks!
> >
> > Heather
> >
> >
> > On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@gmail.com> wrote:
> > > Hi Jesus,
> > >
> > > While using the Elasticsearch python library
> > > (https://elasticsearch-py.readthedocs.io/en/master/) to add mbox
> > > messages to an index, I would get a UnicodeEncodeError:
> > > "'utf-8' codec can't encode character '\udca0' in position 767:
> > > surrogates not allowed".
> > >
> > > Investigating in Grimoire elk
> > > https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200
> > > seems to show that
> > > perhaps that tool instead uses Latin-1 encoding, but I found that
> > > to then produce a serialization error (their custom error message:
> > > "Unable to serialize %r (type: %s)"). I suppose this is because
> > > now it's bytes; of course, converting back to string after encoding
> > > just cycles back to the first error.
> > >
> > > As somewhat of a Python newbie I don't really know how to tackle
> > > this! My thought atm is to splice the offending character out
> > > of the message.
> > >
> > > And to clarify, my understanding is that the final result of this
> > > task
> > > is an index of Xen data, with two types: commits and messages.
> > > Each commit document should contain its original information
> > > from git, plus the name of the branch it was developed in. And
> > > should only the mbox messages which appear to be associated
> > > with a specific commit exist in the final index? Is there some
> > > key information in messages that is supposed to indicate the
> > > association of a given commit with a git branch? I would be
> > > grateful if you could specify the end goal a little more. :D
> > >
> > > Thanks so much!
> > >
> > > Heather
> > >
> > >
> > >
> > > On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > > > On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> > > > > Hi Jesus,
> > > > >
> > > > > Thanks for your reply!
> > > > >
> > > > > So about the task, instructions say after analyzing mboxes with
> > > > > Perceval to
> > > > > "store the resulting raw index in ElasticSearch" - what does
> > > > raw
> > > > > index mean?
> > > >
> > > > In this context, I mean "storing the JSON documents produced by
> > > > Perceval in an ElasticSearch index, as such". ElasticSearch
> > > > stores JSON
> > > > documents, so it is just uploading the output of Perceval to it.
> > > >
> > > > > In terms of figuring out the elasticsearch structure, do I want
> > > > an
> > > > > index
> > > > > (xen-devel mbox) with a type (message) and each object from the
> > > > > perceval
> > > > > output to be one document? Or should it be more fine-grained?
> > > >
> > > > Exactly.
> > > >
> > > > Saludos,
> > > >
> > > >         Jesus.
> > > >
> > > > > Cheers,
> > > > >
> > > > > Heather
> > > > >
> > > > > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > > > > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > > > > > Hi!
> > > > > > >
> > > > > > > I'd love to work on the Code Review Dashboard project for
> > > > this
> > > > > > round
> > > > > > > of Outreachy.
> > > > > >
> > > > > > Great!!
> > > > > >
> > > > > > > Are the steps outlined
> > > > > > > here http://markmail.org/message/7adkmords3imkswd still the
> > > > first
> > > > > > > contribution you'd like to see?
> > > > > >
> > > > > > Yes.
> > > > > >
> > > > > > > So is this a project that has been worked on in previous
> > > > rounds
> > > > > > of
> > > > > > > GSOC/Outreachy also?
> > > > > > > If so is there a place to find links to the previous
> > > > participants
> > > > > > > blogs? :)
> > > > > >
> > > > > > No. We had one participation at some point, but couldn't even
> > > > start
> > > > > > for
> > > > > > personal reasons. There are some people considering working
> > > > on this
> > > > > > for
> > > > > > this next round of Outreachy, however. You'll see their
> > > > messages in
> > > > > > this mailing list.
> > > > > >
> > > > > > > Should questions about how the specifications/completion of
> > > > the
> > > > > > > microtask be addressed to
> > > > > > > IRC or this list? If IRC, which channel - #xen-opw or
> > > > #metrics-
> > > > > > > grimoire? On that note, I'm
> > > > > > > curious why #metrics-grimoire is the listed channel on the
> > > > > > project
> > > > > > > page - are main contributors
> > > > > > > involved in both projects? Or is it just because the Xen
> > > > > > dashboard
> > > > > > > doesn't have a channel?
> > > > > >
> > > > > > The code review is for the Xen project, but it is done with
> > > > (I
> > > > > > mean,
> > > > > > > the software used for it is) GrimoireLab, which for
> > > > historical
> > > > > > reasons
> > > > > > uses the #metrics-grimoire channel. That's why it is likely
> > > > that
> > > > > > you
> > > > > > find somebody from the project there.
> > > > > >
> > > > > > If you have questions, and find me around in IRC, please ping
> > > > me.
> > > > > > If
> > > > > > I'm not available, please send an email message.
> > > > > >
> > > > > > Saludos,
> > > > > >
> > > > > >         Jesus.
> > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > > > Heather
> > > > > > > _______________________________________________
> > > > > > > Xen-devel mailing list
> > > > > > > Xen-devel@lists.xen.org
> > > > > > > https://lists.xen.org/xen-devel
> > > > > > --
> > > > > > Bitergia: http://bitergia.com
> > > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > > >
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > Xen-devel mailing list
> > > > > Xen-devel@lists.xen.org
> > > > > https://lists.xen.org/xen-devel
> > > > --
> > > > Bitergia: http://bitergia.com
> > > > /me at Twitter: https://twitter.com/jgbarah
> > > >
> > > >
> > >
> > >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > https://lists.xen.org/xen-devel
> --
> Bitergia: http://bitergia.com
> /me at Twitter: https://twitter.com/jgbarah
>
>

[-- Attachment #1.2: Type: text/html, Size: 15459 bytes --]


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-13  7:47         ` Heather Booker
@ 2017-04-16 23:23           ` Jesus M. Gonzalez-Barahona
  2017-04-17  4:26             ` Heather Booker
  0 siblings, 1 reply; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-16 23:23 UTC (permalink / raw)
  To: Heather Booker, lars.kurth; +Cc: xen-devel

On Thu, 2017-04-13 at 00:47 -0700, Heather Booker wrote:
> Hi,
> 
> I submitted an application for this code review dashboard and
> would love to keep working on the microtask once I get some
> more info. :)

Great! I answered your message. Were you able to make progress with the task?

> I also came up with a general idea of how the project might be
> split up - any feedback on this would be welcome! I wrote:
> 
> "As said by Jesus, the big picture of this project will be porting
> everything behind the current code review dashboard to use
> Grimoire Lab tools, from the current state of using
> MetricsGrimoire and custom scripts. I expect this would involve
> Perceval for analyzing data, and Grimoire Elk may be useful in
> further stages, or may be too general - this is something I would
> wish to explore.
> This project will also involve a migration from SQL to Elasticsearch
> - because I believe the relevant data is mostly / all available in
> places online, I am unsure whether this would need to be a direct
> migration. However, looking at the current SQL setup would be
> beneficial to understanding the desired format of the Elasticsearch
> indexes.
> I would love to dive into this project and have 3 main parts -
> getting
> data into ES, turning it into dashboard displays, and then fine
> tuning
> and perhaps augmenting the dashboard to improve its usefulness.
> Getting data into ES may seem simple but I believe that once it
> needs to be used for the dashboard, many realizations will pop up
> - thus I’d like to leave maybe 2-3 weeks for that first step, 6-7
> weeks
> for the visualizations (which will include querying the data), and
> the
> final 3 weeks for touch ups and improvements."

The plan could be sound, but would need some tweaks, once your skills
in Python are clear, which could be the main blocker for the first
stages.

> Does this sound like an accurate summary and reasonable timeline? 
> And I am guessing that from Jesus's involvement with the threads
> that Jesus would be the mentor, is that correct? :)

Yes, I would be ;-)

	Jesus.

> Thanks!
> 
> Heather
> 
> 
> On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@gmail.com> wrote:
> > Hi Jesus,
> > 
> > While using the Elasticsearch python library
> > (https://elasticsearch-py.readthedocs.io/en/master/) to add mbox
> > messages to an index, I would get a UnicodeEncodeError:
> > "'utf-8' codec can't encode character '\udca0' in position 767:
> > surrogates not allowed".
> > 
> > Investigating in Grimoire elk
> > https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200
> > seems to show that
> > perhaps that tool instead uses Latin-1 encoding, but I found that
> > to then produce a serialization error (their custom error message:
> > "Unable to serialize %r (type: %s)"). I suppose this is because
> > now it's bytes; of course, converting back to string after encoding
> > just cycles back to the first error.
> > 
> > As somewhat of a Python newbie I don't really know how to tackle
> > this! My thought atm is to splice the offending character out
> > of the message. 
> > 
> > And to clarify, my understanding is that the final result of this
> > task
> > is an index of Xen data, with two types: commits and messages.
> > Each commit document should contain its original information
> > from git, plus the name of the branch it was developed in. And
> > should only the mbox messages which appear to be associated
> > with a specific commit exist in the final index? Is there some
> > key information in messages that is supposed to indicate the
> > association of a given commit with a git branch? I would be
> > grateful if you could specify the end goal a little more. :D
> > 
> > Thanks so much!
> > 
> > Heather
> > 
> > 
> > 
> > On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > > On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> > > > Hi Jesus, 
> > > >
> > > > Thanks for your reply!
> > > >
> > > > So about the task, instructions say after analyzing mboxes with
> > > > Perceval to
> > > > "store the resulting raw index in ElasticSearch" - what does
> > > raw
> > > > index mean?
> > > 
> > > In this context, I mean "storing the JSON documents produced by
> > > Perceval in an ElasticSearch index, as such". ElasticSearch
> > > stores JSON
> > > documents, so it is just uploading the output of Perceval to it.
> > > 
> > > > In terms of figuring out the elasticsearch structure, do I want
> > > an
> > > > index
> > > > (xen-devel mbox) with a type (message) and each object from the
> > > > perceval
> > > > output to be one document? Or should it be more fine-grained?
> > > 
> > > Exactly.
> > > 
> > > Saludos,
> > > 
> > >         Jesus.
> > > 
> > > > Cheers,
> > > >
> > > > Heather
> > > >
> > > > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > > > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > > > > Hi!
> > > > > >
> > > > > > I'd love to work on the Code Review Dashboard project for
> > > this
> > > > > round
> > > > > > of Outreachy.
> > > > >
> > > > > Great!!
> > > > >
> > > > > > Are the steps outlined
> > > > > > here http://markmail.org/message/7adkmords3imkswd still the
> > > first
> > > > > > contribution you'd like to see?
> > > > >
> > > > > Yes.
> > > > >
> > > > > > So is this a project that has been worked on in previous
> > > rounds
> > > > > of
> > > > > > GSOC/Outreachy also?
> > > > > > If so is there a place to find links to the previous
> > > participants
> > > > > > blogs? :)
> > > > >
> > > > > No. We had one participation at some point, but couldn't even
> > > start
> > > > > for
> > > > > personal reasons. There are some people considering working
> > > on this
> > > > > for
> > > > > this next round of Outreachy, however. You'll see their
> > > messages in
> > > > > this mailing list.
> > > > >
> > > > > > Should questions about how the specifications/completion of
> > > the
> > > > > > microtask be addressed to
> > > > > > IRC or this list? If IRC, which channel - #xen-opw or
> > > #metrics-
> > > > > > grimoire? On that note, I'm 
> > > > > > curious why #metrics-grimoire is the listed channel on the
> > > > > project
> > > > > > page - are main contributors
> > > > > > involved in both projects? Or is it just because the Xen
> > > > > dashboard
> > > > > > doesn't have a channel?
> > > > >
> > > > > The code review is for the Xen project, but it is done with
> > > (I
> > > > > mean,
> > > > > > the software used for it is) GrimoireLab, which for
> > > historical
> > > > > reasons
> > > > > uses the #metrics-grimoire channel. That's why it is likely
> > > that
> > > > > you
> > > > > find somebody from the project there.
> > > > >
> > > > > If you have questions, and find me around in IRC, please ping
> > > me.
> > > > > If
> > > > > I'm not available, please send an email message.
> > > > >
> > > > > Saludos,
> > > > >
> > > > >         Jesus.
> > > > >
> > > > > > Thanks!
> > > > > >
> > > > > > Heather
> > > > > > _______________________________________________
> > > > > > Xen-devel mailing list
> > > > > > Xen-devel@lists.xen.org
> > > > > > https://lists.xen.org/xen-devel
> > > > > --
> > > > > Bitergia: http://bitergia.com
> > > > > /me at Twitter: https://twitter.com/jgbarah
> > > > >
> > > > >
> > > >
> > > > _______________________________________________
> > > > Xen-devel mailing list
> > > > Xen-devel@lists.xen.org
> > > > https://lists.xen.org/xen-devel
> > > --
> > > Bitergia: http://bitergia.com
> > > /me at Twitter: https://twitter.com/jgbarah
> > > 
> > > 
> > 
> > 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-10  4:50       ` Heather Booker
  2017-04-13  7:47         ` Heather Booker
@ 2017-04-13 12:21         ` Jesus M. Gonzalez-Barahona
  1 sibling, 0 replies; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-13 12:21 UTC (permalink / raw)
  To: Heather Booker; +Cc: xen-devel

On Sun, 2017-04-09 at 21:50 -0700, Heather Booker wrote:
> Hi Jesus,
> 
> While using the Elasticsearch python library
> (https://elasticsearch-py.readthedocs.io/en/master/) to add mbox
> messages to an index, I would get a UnicodeEncodeError:
> "'utf-8' codec can't encode character '\udca0' in position 767:
> surrogates not allowed".
> 

What happens here is that Perceval makes some assumptions about character
encoding when reading messages (to convert them to Unicode strings).
When a byte does not fit those assumptions, it is converted to a
"surrogate" code point. Those surrogates cannot be encoded as utf8, since
the "surrogate" range of Unicode is only meant for converting back to the
original encoding. But JSON expects the encoding to be utf8, so no luck
here.

The trick is to provide a serializer which either skips those messages,
or produces an "escaped" encoding for them.

See http://lucumr.pocoo.org/2013/7/2/the-updated-guide-to-unicode/ for
a detailed explanation.
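
A minimal sketch of the "escaped" option, assuming the items are plain
Python dicts as loaded from Perceval's JSON output. The names clean_text
and clean_item are made up for this illustration; they are not part of
Perceval, GrimoireELK or elasticsearch-py. Each lone surrogate is
rewritten as a visible backslash escape, so the result can be encoded as
utf8:

```python
import json


def clean_text(text):
    """Return text with lone surrogates rewritten as literal \\udcXX escapes."""
    # Characters that UTF-8 cannot encode (lone surrogates) are handed to the
    # 'backslashreplace' error handler, which emits an ASCII escape instead.
    return text.encode("utf-8", "backslashreplace").decode("utf-8")


def clean_item(value):
    """Recursively clean every string in a dict/list structure.

    Assumes dict keys are strings, as they are in data loaded from JSON.
    """
    if isinstance(value, str):
        return clean_text(value)
    if isinstance(value, dict):
        return {clean_text(k): clean_item(v) for k, v in value.items()}
    if isinstance(value, list):
        return [clean_item(v) for v in value]
    return value  # numbers, booleans and None are left untouched


# Hypothetical item; '\udca0' stands for a byte that could not be decoded.
item = {"data": {"Subject": "Re: patch", "body": "caf\udca0"}}
payload = json.dumps(clean_item(item), ensure_ascii=False).encode("utf-8")
```

The cleaned item can then be passed to the indexing call as usual. The
other option, skipping the message, would probably amount to catching
the UnicodeEncodeError around that call instead.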

Please, let me know if you can work from here...

	Jesus.

> Investigating in Grimoire elk
> https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200
> seems to show that
> perhaps that tool instead uses Latin-1 encoding, but I found that
> to then produce a serialization error (their custom error message:
> "Unable to serialize %r (type: %s)"). I suppose this is because
> now it's bytes; of course, converting back to string after encoding
> just cycles back to the first error.
> 
> As somewhat of a Python newbie I don't really know how to tackle
> this! My thought atm is to splice the offending character out
> of the message. 
> 
> And to clarify, my understanding is that the final result of this
> task
> is an index of Xen data, with two types: commits and messages.
> Each commit document should contain its original information
> from git, plus the name of the branch it was developed in. And
> should only the mbox messages which appear to be associated
> with a specific commit exist in the final index? Is there some
> key information in messages that is supposed to indicate the
> association of a given commit with a git branch? I would be
> grateful if you could specify the end goal a little more. :D
> 
> Thanks so much!
> 
> Heather
> 
> 
> 
> On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> > > Hi Jesus, 
> > >
> > > Thanks for your reply!
> > >
> > > So about the task, instructions say after analyzing mboxes with
> > > Perceval to
> > > "store the resulting raw index in ElasticSearch" - what does raw
> > > index mean?
> > 
> > In this context, I mean "storing the JSON documents produced by
> > Perceval in an ElasticSearch index, as such". ElasticSearch stores
> > JSON
> > documents, so it is just uploading the output of Perceval to it.
> > 
> > > In terms of figuring out the elasticsearch structure, do I want
> > an
> > > index
> > > (xen-devel mbox) with a type (message) and each object from the
> > > perceval
> > > output to be one document? Or should it be more fine-grained?
> > 
> > Exactly.
> > 
> > Saludos,
> > 
> >         Jesus.
> > 
> > > Cheers,
> > >
> > > Heather
> > >
> > > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona <jgb@bitergia.com> wrote:
> > > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> > > > > Hi!
> > > > >
> > > > > I'd love to work on the Code Review Dashboard project for
> > this
> > > > round
> > > > > of Outreachy.
> > > >
> > > > Great!!
> > > >
> > > > > Are the steps outlined
> > > > > here http://markmail.org/message/7adkmords3imkswd still the
> > first
> > > > > contribution you'd like to see?
> > > >
> > > > Yes.
> > > >
> > > > > So is this a project that has been worked on in previous
> > rounds
> > > > of
> > > > > GSOC/Outreachy also?
> > > > > If so is there a place to find links to the previous
> > participants
> > > > > blogs? :)
> > > >
> > > > No. We had one participation at some point, but couldn't even
> > start
> > > > for
> > > > personal reasons. There are some people considering working on
> > this
> > > > for
> > > > this next round of Outreachy, however. You'll see their
> > messages in
> > > > this mailing list.
> > > >
> > > > > Should questions about how the specifications/completion of
> > the
> > > > > microtask be addressed to
> > > > > IRC or this list? If IRC, which channel - #xen-opw or
> > #metrics-
> > > > > grimoire? On that note, I'm 
> > > > > curious why #metrics-grimoire is the listed channel on the
> > > > project
> > > > > page - are main contributors
> > > > > involved in both projects? Or is it just because the Xen
> > > > dashboard
> > > > > doesn't have a channel?
> > > >
> > > > The code review is for the Xen project, but it is done with (I
> > > > mean,
> > > > > the software used for it is) GrimoireLab, which for historical
> > > > reasons
> > > > uses the #metrics-grimoire channel. That's why it is likely
> > that
> > > > you
> > > > find somebody from the project there.
> > > >
> > > > If you have questions, and find me around in IRC, please ping
> > me.
> > > > If
> > > > I'm not available, please send an email message.
> > > >
> > > > Saludos,
> > > >
> > > >         Jesus.
> > > >
> > > > > Thanks!
> > > > >
> > > > > Heather
> > > > > _______________________________________________
> > > > > Xen-devel mailing list
> > > > > Xen-devel@lists.xen.org
> > > > > https://lists.xen.org/xen-devel
> > > > --
> > > > Bitergia: http://bitergia.com
> > > > /me at Twitter: https://twitter.com/jgbarah
> > > >
> > > >
> > >
> > > _______________________________________________
> > > Xen-devel mailing list
> > > Xen-devel@lists.xen.org
> > > https://lists.xen.org/xen-devel
> > --
> > Bitergia: http://bitergia.com
> > /me at Twitter: https://twitter.com/jgbarah
> > 
> > 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-10  4:50       ` Heather Booker
@ 2017-04-13  7:47         ` Heather Booker
  2017-04-16 23:23           ` Jesus M. Gonzalez-Barahona
  2017-04-13 12:21         ` Jesus M. Gonzalez-Barahona
  1 sibling, 1 reply; 16+ messages in thread
From: Heather Booker @ 2017-04-13  7:47 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona, lars.kurth; +Cc: xen-devel


[-- Attachment #1.1: Type: text/plain, Size: 6977 bytes --]

Hi,

I submitted an application for this code review dashboard and
would love to keep working on the microtask once I get some
more info. :)

I also came up with a general idea of how the project might be
split up - any feedback on this would be welcome! I wrote:

"As said by Jesus, the big picture of this project will be porting
everything behind the current code review dashboard to use
Grimoire Lab tools, from the current state of using
MetricsGrimoire and custom scripts. I expect this would involve
Perceval for analyzing data, and Grimoire Elk may be useful in
further stages, or may be too general - this is something I would
wish to explore.
This project will also involve a migration from SQL to Elasticsearch
- because I believe the relevant data is mostly / all available in
places online, I am unsure whether this would need to be a direct
migration. However, looking at the current SQL setup would be
beneficial to understanding the desired format of the Elasticsearch
indexes.
I would love to dive into this project and have 3 main parts - getting
data into ES, turning it into dashboard displays, and then fine tuning
and perhaps augmenting the dashboard to improve its usefulness.
Getting data into ES may seem simple but I believe that once it
needs to be used for the dashboard, many realizations will pop up
- thus I’d like to leave maybe 2-3 weeks for that first step, 6-7 weeks
for the visualizations (which will include querying the data), and the
final 3 weeks for touch ups and improvements."

Does this sound like an accurate summary and reasonable timeline?
And I am guessing that from Jesus's involvement with the threads
that Jesus would be the mentor, is that correct? :)

Thanks!

Heather


On Sun, Apr 9, 2017 at 9:50 PM, Heather Booker <heather.j.booker@gmail.com>
wrote:

> Hi Jesus,
>
> While using the Elasticsearch python library
> (https://elasticsearch-py.readthedocs.io/en/master/) to add mbox
> messages to an index, I would get a UnicodeEncodeError:
> "'utf-8' codec can't encode character '\udca0' in position 767:
> surrogates not allowed".
>
> Investigating in Grimoire elk
> https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200
> seems to show that
> perhaps that tool instead uses Latin-1 encoding, but I found that
> to then produce a serialization error (their custom error message:
> "Unable to serialize %r (type: %s)"). I suppose this is because
> now it's bytes; of course, converting back to string after encoding
> just cycles back to the first error.
>
> As somewhat of a Python newbie I don't really know how to tackle
> this! My thought atm is to splice the offending character out
> of the message.
>
> And to clarify, my understanding is that the final result of this task
> is an index of Xen data, with two types: commits and messages.
> Each commit document should contain its original information
> from git, plus the name of the branch it was developed in. And
> should only the mbox messages which appear to be associated
> with a specific commit exist in the final index? Is there some
> key information in messages that is supposed to indicate the
> association of a given commit with a git branch? I would be
> grateful if you could specify the end goal a little more. :D
>
> Thanks so much!
>
> Heather
>
>
>
> On Sat, Apr 8, 2017 at 10:02 AM, Jesus M. Gonzalez-Barahona <
> jgb@bitergia.com> wrote:
>
>> On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
>> > Hi Jesus,
>> >
>> > Thanks for your reply!
>> >
>> > So about the task, instructions say after analyzing mboxes with
>> > Perceval to
>> > "store the resulting raw index in ElasticSearch" - what does raw
>> > index mean?
>>
>> In this context, I mean "storing the JSON documents produced by
>> Perceval in an ElasticSearch index, as such". ElasticSearch stores JSON
>> documents, so it is just uploading the output of Perceval to it.
>>
>> > In terms of figuring out the elasticsearch structure, do I want an
>> > index
>> > (xen-devel mbox) with a type (message) and each object from the
>> > perceval
>> > output to be one document? Or should it be more fine-grained?
>>
>> Exactly.
>>
>> Saludos,
>>
>>         Jesus.
>>
>> > Cheers,
>> >
>> > Heather
>> >
>> > On Thu, Apr 6, 2017 at 7:05 AM, Jesus M. Gonzalez-Barahona <jgb@biter
>> > gia.com> wrote:
>> > > On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
>> > > > Hi!
>> > > >
>> > > > I'd love to work on the Code Review Dashboard project for this
>> > > round
>> > > > of Outreachy.
>> > >
>> > > Great!!
>> > >
>> > > > Are the steps outlined
>> > > > here http://markmail.org/message/7adkmords3imkswd still the first
>> > > > contribution you'd like to see?
>> > >
>> > > Yes.
>> > >
>> > > > So is this a project that has been worked on in previous rounds
>> > > of
>> > > > GSOC/Outreachy also?
>> > > > If so is there a place to find links to the previous participants
>> > > > blogs? :)
>> > >
>> > > No. We had one participation at some point, but couldn't even start
>> > > for
>> > > personal reasons. There are some people considering working on this
>> > > for
>> > > this next round of Outreachy, however. You'll see their messages in
>> > > this mailing list.
>> > >
>> > > > Should questions about how the specifications/completion of the
>> > > > microtask be addressed to
>> > > > IRC or this list? If IRC, which channel - #xen-opw or #metrics-
>> > > > grimoire? On that note, I'm
>> > > > curious why #metrics-grimoire is the listed channel on the
>> > > project
>> > > > page - are main contributors
>> > > > involved in both projects? Or is it just because the Xen
>> > > dashboard
>> > > > doesn't have a channel?
>> > >
>> > > The code review is for the Xen project, but it is done with (I
>> > > mean,
>> > > the ssoftware used for it is) GrimoireLab, which for historical
>> > > reasons
>> > > uses the #metrics-grimoire channel. That's why it is likely that
>> > > you
>> > > find somebody from the project there.
>> > >
>> > > If you have questions, and find me around in IRC, please ping me.
>> > > If
>> > > I'm not available, please send an email message.
>> > >
>> > > Saludos,
>> > >
>> > >         Jesus.
>> > >
>> > > > Thanks!
>> > > >
>> > > > Heather
>> > > > _______________________________________________
>> > > > Xen-devel mailing list
>> > > > Xen-devel@lists.xen.org
>> > > > https://lists.xen.org/xen-devel
>> > > --
>> > > Bitergia: http://bitergia.com
>> > > /me at Twitter: https://twitter.com/jgbarah
>> > >
>> > >
>> >
>> > _______________________________________________
>> > Xen-devel mailing list
>> > Xen-devel@lists.xen.org
>> > https://lists.xen.org/xen-devel
>> --
>> Bitergia: http://bitergia.com
>> /me at Twitter: https://twitter.com/jgbarah
>>
>>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-08 17:02     ` Jesus M. Gonzalez-Barahona
@ 2017-04-10  4:50       ` Heather Booker
  2017-04-13  7:47         ` Heather Booker
  2017-04-13 12:21         ` Jesus M. Gonzalez-Barahona
  0 siblings, 2 replies; 16+ messages in thread
From: Heather Booker @ 2017-04-10  4:50 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona; +Cc: xen-devel


Hi Jesus,

While using the Elasticsearch Python library
(https://elasticsearch-py.readthedocs.io/en/master/) to add mbox
messages to an index, I get a UnicodeEncodeError:
"'utf-8' codec can't encode character '\udca0' in position 767:
surrogates not allowed".

Looking at GrimoireELK
(https://github.com/grimoirelab/GrimoireELK/blob/96b00bc682485976104a6825ca63ae08639deacc/grimoire_elk/elk/mbox.py#L200),
it seems that tool perhaps uses Latin-1 encoding instead, but I found that
this then produces a serialization error (their custom error message:
"Unable to serialize %r (type: %s)"). I suppose this is because
the value is now bytes; of course, converting back to a string after
encoding just cycles back to the first error.

As somewhat of a Python newbie I don't really know how to tackle
this! My thought atm is to splice the offending character out
of the message.
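
One possible alternative to splicing characters out by hand is sketched
below. It assumes the lone surrogate came from a mail parser that decoded
undecodable bytes with Python's "surrogateescape" error handler, which is
where code points in the range U+DC80..U+DCFF (such as '\udca0') normally
come from. The helper names scrub_text and scrub_item and the sample
message are hypothetical, not part of Perceval or GrimoireELK.

```python
import json


def scrub_text(text):
    """Replace surrogate-escaped bytes in text with U+FFFD.

    Assumption: the surrogates are in U+DC80..U+DCFF, i.e. produced by
    decoding with errors="surrogateescape". Other lone surrogates would
    still raise UnicodeEncodeError here.
    """
    # surrogateescape turns each escaped code point back into its raw byte;
    # decoding with "replace" maps bytes that are not valid UTF-8 to U+FFFD.
    raw = text.encode("utf-8", "surrogateescape")
    return raw.decode("utf-8", "replace")


def scrub_item(value):
    """Walk a JSON-like structure and scrub every string in it."""
    if isinstance(value, str):
        return scrub_text(value)
    if isinstance(value, dict):
        return {scrub_item(k): scrub_item(v) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub_item(v) for v in value]
    return value


# Hypothetical message resembling parsed mbox output.
message = {"Subject": "Re: patch", "body": {"plain": "before\udca0after"}}
clean = scrub_item(message)

# Serializing without ASCII escaping and encoding as UTF-8 now succeeds;
# doing the same with the original message raises UnicodeEncodeError.
payload = json.dumps(clean, ensure_ascii=False).encode("utf-8")
```

The offending byte is replaced with U+FFFD rather than removed, so
character positions in the body stay stable and the damage remains visible
in the indexed document.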

And to clarify, my understanding is that the final result of this task
is an index of Xen data, with two types: commits and messages.
Each commit document should contain its original information
from git, plus the name of the branch it was developed in. And
should only the mbox messages which appear to be associated
with a specific commit exist in the final index? Is there some
key information in messages that is supposed to indicate the
association of a given commit with a git branch? I would be
grateful if you could specify the end goal a little more. :D

Thanks so much!

Heather




^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-07 22:49   ` Heather Booker
@ 2017-04-08 17:02     ` Jesus M. Gonzalez-Barahona
  2017-04-10  4:50       ` Heather Booker
  0 siblings, 1 reply; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-08 17:02 UTC (permalink / raw)
  To: Heather Booker, xen-devel

On Fri, 2017-04-07 at 15:49 -0700, Heather Booker wrote:
> Hi Jesus, 
> 
> Thanks for your reply!
> 
> So about the task, instructions say after analyzing mboxes with Perceval to
> "store the resulting raw index in ElasticSearch" - what does raw index mean?

In this context, I mean "storing the JSON documents produced by
Perceval in an ElasticSearch index, as such". ElasticSearch stores JSON
documents, so it is just uploading the output of Perceval to it.

> In terms of figuring out the elasticsearch structure, do I want an index
> (xen-devel mbox) with a type (message) and each object from the perceval
> output to be one document? Or should it be more fine-grained?

Exactly.
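
The structure agreed on above (one index, one type, one document per
Perceval item) could be sketched as follows. This is a minimal
illustration, not the project's actual loader: the index name
"xen-devel-mbox", the function to_bulk_actions and the sample items are
hypothetical, and it assumes each Perceval item carries a "uuid" field that
can serve as the document id. The dictionaries use the action shape that
elasticsearch-py's helpers.bulk() accepts, but the sketch itself does not
import the library or contact a cluster.

```python
def to_bulk_actions(items, index="xen-devel-mbox", doc_type="message"):
    """Yield one bulk action per Perceval item, stored as-is."""
    for item in items:
        yield {
            "_index": index,       # hypothetical index name
            "_type": doc_type,     # mapping type; omitted on newer clusters
            "_id": item["uuid"],   # assumed unique id of the Perceval item
            "_source": item,       # the raw JSON document, unchanged
        }


# Hypothetical, heavily trimmed items standing in for Perceval output.
items = [
    {"uuid": "a1", "origin": "xen-devel", "data": {"Subject": "[PATCH] one"}},
    {"uuid": "b2", "origin": "xen-devel", "data": {"Subject": "Re: [PATCH] one"}},
]

actions = list(to_bulk_actions(items))
# With elasticsearch-py installed and a cluster running, these actions
# could be passed to elasticsearch.helpers.bulk(client, actions).
```

Using the item's own id as "_id" means that re-running the upload
overwrites existing documents instead of duplicating them.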

Saludos,

	Jesus.

-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-06 14:05 ` Jesus M. Gonzalez-Barahona
@ 2017-04-07 22:49   ` Heather Booker
  2017-04-08 17:02     ` Jesus M. Gonzalez-Barahona
  0 siblings, 1 reply; 16+ messages in thread
From: Heather Booker @ 2017-04-07 22:49 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona, xen-devel


Hi Jesus,

Thanks for your reply!

So about the task, instructions say after analyzing mboxes with Perceval to
"store the resulting raw index in ElasticSearch" - what does raw index mean?
In terms of figuring out the elasticsearch structure, do I want an index
(xen-devel mbox) with a type (message) and each object from the perceval
output to be one document? Or should it be more fine-grained?

Cheers,

Heather


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: Outreachy project - Xen Code Review Dashboard
  2017-04-05 23:43 Heather Booker
@ 2017-04-06 14:05 ` Jesus M. Gonzalez-Barahona
  2017-04-07 22:49   ` Heather Booker
  0 siblings, 1 reply; 16+ messages in thread
From: Jesus M. Gonzalez-Barahona @ 2017-04-06 14:05 UTC (permalink / raw)
  To: Heather Booker, xen-devel, lars.kurth

On Wed, 2017-04-05 at 16:43 -0700, Heather Booker wrote:
> Hi!
> 
> I'd love to work on the Code Review Dashboard project for this round
> of Outreachy.

Great!!

> Are the steps outlined
> here http://markmail.org/message/7adkmords3imkswd still the first
> contribution you'd like to see?

Yes.

> So is this a project that has been worked on in previous rounds of
> GSOC/Outreachy also?
> If so is there a place to find links to the previous participants
> blogs? :)

No. We had one participant at some point, but they couldn't even start for
personal reasons. There are some people considering working on this for
this next round of Outreachy, however. You'll see their messages on
this mailing list.

> Should questions about the specifications/completion of the microtask
> be addressed to IRC or this list? If IRC, which channel - #xen-opw or
> #metrics-grimoire? On that note, I'm curious why #metrics-grimoire is
> the listed channel on the project page - are main contributors involved
> in both projects? Or is it just because the Xen dashboard doesn't have
> a channel?

The code review is for the Xen project, but it is done with (I mean,
the software used for it is) GrimoireLab, which for historical reasons
uses the #metrics-grimoire channel. That's why it is likely that you
will find somebody from the project there.

If you have questions, and find me around in IRC, please ping me. If
I'm not available, please send an email message.

Saludos,

	Jesus.

> Thanks!
> 
> Heather
-- 
Bitergia: http://bitergia.com
/me at Twitter: https://twitter.com/jgbarah


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Outreachy project - Xen Code Review Dashboard
@ 2017-04-05 23:43 Heather Booker
  2017-04-06 14:05 ` Jesus M. Gonzalez-Barahona
  0 siblings, 1 reply; 16+ messages in thread
From: Heather Booker @ 2017-04-05 23:43 UTC (permalink / raw)
  To: xen-devel, lars.kurth, jgb


Hi!

I'd love to work on the Code Review Dashboard project for this round of
Outreachy.

Are the steps outlined here http://markmail.org/message/7adkmords3imkswd
still the first contribution you'd like to see?

So is this a project that has been worked on in previous rounds of
GSOC/Outreachy also?
If so, is there a place to find links to the previous participants' blogs? :)

Should questions about the specifications/completion of the microtask
be addressed to IRC or this list? If IRC, which channel - #xen-opw or
#metrics-grimoire? On that note, I'm curious why #metrics-grimoire is
the listed channel on the project page - are main contributors involved
in both projects? Or is it just because the Xen dashboard doesn't have
a channel?

Thanks!

Heather

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2017-04-19 23:22 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-31  1:48 Outreachy project - Xen Code Review Dashboard Vaishnavi Ramesh Jayaraman
2017-03-31 11:01 ` Lars Kurth
2017-04-02 18:20 ` Jesus M. Gonzalez-Barahona
2017-04-02 20:09   ` vaishnavi.ur777
2017-04-02 21:14     ` Jesus M. Gonzalez-Barahona
2017-04-05 23:43 Heather Booker
2017-04-06 14:05 ` Jesus M. Gonzalez-Barahona
2017-04-07 22:49   ` Heather Booker
2017-04-08 17:02     ` Jesus M. Gonzalez-Barahona
2017-04-10  4:50       ` Heather Booker
2017-04-13  7:47         ` Heather Booker
2017-04-16 23:23           ` Jesus M. Gonzalez-Barahona
2017-04-17  4:26             ` Heather Booker
2017-04-17  9:04               ` Jesus M. Gonzalez-Barahona
2017-04-19 23:22                 ` Heather Booker
2017-04-13 12:21         ` Jesus M. Gonzalez-Barahona
