* [outreachy] progress
@ 2017-04-09 10:36 Vaishnavi Ramesh Jayaraman
  0 siblings, 0 replies; 2+ messages in thread
From: Vaishnavi Ramesh Jayaraman @ 2017-04-09 10:36 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona; +Cc: xen-devel, xen-devel



Hi,

I can now upload the JSON documents obtained from Perceval to
ElasticSearch and query them back to get the results. Below is my
output:

----------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------

>>> es_result = es.search(index="messages", doc_type="summary",
...     body={"query": {"match": {"from": "Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>"}}})

>>> print(es_result)

{'took': 13, 'timed_out': False, '_shards': {'total': 5, 'successful': 5,
'failed': 0}, 'hits': {'total': 48, 'max_score': 9.135099, 'hits':
[{'_index': 'messages', '_type': 'summary', '_id': 'AVtR-Nw_GwYUfHJIT4Q4',
'_score': 9.135099, '_source': {'from': 'Konrad Rzeszutek Wilk <
konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct] Roadmap for
QEMU-traditional', 'date': '2014-04-18T17:01:08-04:00'}}, {'_index':
'messages', '_type': 'summary', '_id': 'AVtSJjdrGwYUfHJIT4SE', '_score':
9.135099, '_source': {'from': 'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>',
'subject': 'Re: [xen-tct] [AGENDA] Monthly Xen.org Technical Call (April
9)', 'date': '2014-04-09T09:45:26-04:00'}}, {'_index': 'messages', '_type':
'summary', '_id': 'AVtSJjgkGwYUfHJIT4SZ', '_score': 9.135099, '_source':
{'from': 'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>', 'subject': 'Re:
[xen-tct] Roadmap for QEMU-traditional', 'date':
'2014-04-18T10:01:42-04:00'}}, {'_index': 'messages', '_type': 'summary',
'_id': 'AVtSJjgtGwYUfHJIT4Sa', '_score': 9.135099, '_source': {'from':
'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct]
Roadmap for QEMU-traditional', 'date': '2014-04-18T10:01:42-04:00'}},
{'_index': 'messages', '_type': 'summary', '_id': 'AVtSJjhRGwYUfHJIT4Se',
'_score': 9.135099, '_source': {'from': 'Konrad Rzeszutek Wilk <
konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct] Roadmap for
QEMU-traditional', 'date': '2014-04-18T17:01:08-04:00'}}, {'_index':
'messages', '_type': 'summary', '_id': 'AVtSNDuSGwYUfHJIT4S_', '_score':
9.135099, '_source': {'from': 'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>',
'subject': 'Re: [xen-tct] Roadmap for QEMU-traditional', 'date':
'2014-04-18T10:01:42-04:00'}}, {'_index': 'messages', '_type': 'summary',
'_id': 'AVtSNNd8GwYUfHJIT4TI', '_score': 9.135099, '_source': {'from':
'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct]
[AGENDA] Monthly Xen.org Technical Call (April 9)', 'date':
'2014-04-09T09:45:26-04:00'}}, {'_index': 'messages', '_type': 'summary',
'_id': 'AVtSIzL-GwYUfHJIT4RE', '_score': 9.05406, '_source': {'from':
'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct]
[AGENDA] Monthly Xen.org Technical Call (April 9)', 'date':
'2014-04-09T09:45:26-04:00'}}, {'_index': 'messages', '_type': 'summary',
'_id': 'AVtR9q9ZGwYUfHJIT4QU', '_score': 9.05406, '_source': {'from':
'Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct]
Roadmap for QEMU-traditional', 'date': '2014-04-18T10:01:42-04:00'}},
{'_index': 'messages', '_type': 'summary', '_id': 'AVtR9q-JGwYUfHJIT4QZ',
'_score': 9.05406, '_source': {'from': 'Konrad Rzeszutek Wilk <
konrad.wilk@oracle.com>', 'subject': 'Re: [xen-tct] Roadmap for
QEMU-traditional', 'date': '2014-04-18T17:01:08-04:00'}}]}}

----------------------------------------------------------------------------------------------

----------------------------------------------------------------------------------------------
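Several of the hits above share the same sender, subject and date but carry different `_id` values, which may mean the same message was indexed more than once. A minimal sketch for spotting this, assuming only the response layout shown above (the function name `group_duplicate_hits` is hypothetical and not part of the original script):

```python
def group_duplicate_hits(es_result):
    """Group hit _ids by their (from, subject, date) triple.

    es_result is a search response dict shaped like the one printed above.
    Any group holding more than one _id is a candidate duplicate.
    """
    groups = {}
    for hit in es_result["hits"]["hits"]:
        source = hit["_source"]
        key = (source["from"], source["subject"], source["date"])
        groups.setdefault(key, []).append(hit["_id"])
    return groups


def duplicate_groups(es_result):
    """Return only the groups that contain more than one document."""
    return {k: ids for k, ids in group_duplicate_hits(es_result).items()
            if len(ids) > 1}
```

If duplicates are confirmed, one possible remedy (an assumption, not something the thread states) is to use each message's Message-ID as the document `_id`, so that re-running the upload overwrites documents instead of adding new ones.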

I am still working on parsing the command-line arguments. Also, how
should I proceed next? Should I annotate the threads and then upload the
documents to ElasticSearch again with their message IDs? Which algorithm
should be used?


thanks

Vaishnavi
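One way the "annotate and upload again" step might look is a set of partial updates rather than a full re-upload. The sketch below only builds the action dictionaries in the shape accepted by `elasticsearch.helpers.bulk`; the field name `thread_id` and the function name are hypothetical, and the index and type names are taken from the query above.

```python
def thread_update_actions(thread_of, index="messages", doc_type="summary"):
    """Yield one partial-update action per already-indexed document.

    thread_of maps an Elasticsearch document _id to the thread identifier
    computed for that message (the "thread_id" field name is an assumption).
    """
    for doc_id, thread_id in thread_of.items():
        yield {
            "_op_type": "update",   # partial update instead of re-indexing
            "_index": index,
            "_type": doc_type,      # mapping types exist in the ES version used here
            "_id": doc_id,
            "doc": {"thread_id": thread_id},
        }
```

With a connected client `es`, the generator could be passed on as `helpers.bulk(es, thread_update_actions(mapping))`, so only the new field travels over the wire.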


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel


* Re: [outreachy] progress
@ 2017-04-16 18:50 Vaishnavi Ramesh Jayaraman
  0 siblings, 0 replies; 2+ messages in thread
From: Vaishnavi Ramesh Jayaraman @ 2017-04-16 18:50 UTC (permalink / raw)
  To: Jesus M. Gonzalez-Barahona, Lars Kurth; +Cc: xen-devel, xen-devel



[11:24] <vr34> Hi!
[11:24] <vr34> This is Vaishnavi, outreachy applicant
[11:25] <vr34> i found a python implementation for the jwz threading algo
[11:25] <jgbarah> Hi, Vaishnavi
[11:25] <vr34> https://github.com/akuchling/jwzthreading/blob/master/jwzthreading.py
[11:25] <vr34> so what this does is allots a message id for each of the
threads, right?
[11:26] <jgbarah> That's good. Use it as an inspiration if that suits you.
But you need to write your code...
[11:26] <vr34> Oh okay
[11:27] <jgbarah> However, since this is a microtask, no problem if you
start with a version which uses this code
[11:27] <vr34> what i have done till now is written a script that parses
cmd line args and parses and uploads mbox json docs to es
[11:27] <jgbarah> the main problem for using it *as such* is that very
likely it is suboptimal, since it assumes you have access to all messages
[11:28] <vr34> now what's left is - use the threading algo to get message
ids and add this to the json documents and then upload again, am i right?
[11:28] <jgbarah> which in our case is not true, since you have them in the
database, and the idea would be to minimize traffic with it
[11:28] <jgbarah> Yes, that is
[11:28] <jgbarah> So, if you want, try to do it in two phases:
[11:29] <jgbarah> in one, you can use the code you found. Forget about
efficiency, and just make it work
[11:29] <jgbarah> In a second one, you can check if you can improve
performance by using your own code.
[11:29] <jgbarah> The first one will tell about how you reuse code, which
is important
[11:30] <jgbarah> The second one would tell about how you code the
algorithm in a certain scenario
[11:30] <jgbarah> Both are important...
[11:30] <vr34> okay, got it!
[11:30] <jgbarah> To be transparent to others pursuing this project,
please send a message to the mailing list,
[11:30] <jgbarah> pointing to this implementation you found, and this
conversation, please.
[11:30] <jgbarah> A log of it would be enough.
[11:30] <vr34> i had also sent you a mail with a link to my github repo
[11:31] <vr34> Yes sure, will do
[11:31] <jgbarah> Of course, the fact that you looked for, and found, that
implementation, will be credited to you
[11:31] <jgbarah> I saw it (the message) but still didn't look at the code.
Thanks.
[11:32] <jgbarah> Are you stumbling on any blocker?
[11:32] <vr34> sure,  thanks, i'll mail you if i have any further queries
[11:33] <vr34> i haven't yet started implementing the algo.. will
definitely let you know when i have issues.. thanks a lot!
[11:33] <jgbarah> Good. Thanks! Please, keep me updated.
[11:33] <vr34> Sure.
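As a rough illustration of the annotation step discussed in the log, the sketch below assigns every message the Message-ID of its thread root by following reply links. It is a deliberate simplification and not the JWZ algorithm: it does no subject-based grouping and no pruning of empty containers. The dictionary keys (`message_id`, `in_reply_to`, `references`) are assumed names, not the fields of the original documents.

```python
def thread_roots(messages):
    """Map each Message-ID to the Message-ID of its thread root.

    messages is an iterable of dicts with a "message_id" key and optional
    "in_reply_to" (str) and "references" (list of str) keys.
    """
    parent = {}
    for msg in messages:
        refs = msg.get("references") or []
        # Prefer In-Reply-To; otherwise use the last References entry,
        # which by convention is the direct parent.
        parent[msg["message_id"]] = msg.get("in_reply_to") or (refs[-1] if refs else None)

    def root_of(mid):
        seen = set()
        # Walk up the parent links; the seen set guards against reply loops.
        while parent.get(mid) and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    return {mid: root_of(mid) for mid in parent}
```

Replies to a message that is missing from the input end up sharing that missing Message-ID as their root, so they still land in the same thread. Because the function works on the full list in memory, it corresponds to the first, "just make it work" phase; it does not address minimizing traffic with the database.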


On Fri, Apr 14, 2017 at 11:06 AM, Vaishnavi Ramesh Jayaraman <
vaishnavi.ur777@gmail.com> wrote:

> Hi,
>
> I have applied to Outreachy for the project - Xen Code Review Dashboard
> and based on Jesus' suggestions I have made an initial contribution (there
> are more changes to be made, which I am still working on).
>
> Link to the contribution -
> https://github.com/vrameshj/dashboard/blob/master/tests.py
>
> I have created a script that accepts the mbox link as a command line
> argument, parses it and uploads the JSON documents that are obtained as an
> output from Perceval to ElasticSearch. The results can be queried too.
>
> I am currently working on annotating the threads with their message ids.
>
> Also, below is the timeline of the work I plan to accomplish:-
>
> Month 1 - The work during the first month would be centered on getting
> extensive information both from mailing lists and git repositories using
> Perceval, and then storing it in ElasticSearch.
> Month 2 - During the second month, scripts would have to be ported to use
> ElasticSearch data instead of SQL.
> Month 3 - The task would be to improve the dashboard in Kibana. Various
> visualizations like pie charts, bar charts and histograms could be added to
> help understand the logs better.
>
> Thanks
> Vaishnavi
>
>
>
>
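The script described in the quoted message is not reproduced in this thread. The following is only a sketch of the same idea under stated assumptions: it uses Python's standard `mailbox` module as a stand-in for Perceval, takes a local mbox path rather than a link, and the function names and document fields are hypothetical.

```python
import argparse
import mailbox


def parse_args(argv=None):
    """Read the mbox location from the command line."""
    parser = argparse.ArgumentParser(description="Summarize an mbox file")
    parser.add_argument("mbox", help="path to a local mbox file")
    return parser.parse_args(argv)


def summarize_mbox(path):
    """Return one small summary dict per message in the mbox file."""
    box = mailbox.mbox(path, create=False)  # fail if the file is missing
    try:
        return [
            {
                "message_id": msg.get("Message-ID"),
                "from": msg.get("From"),
                "subject": msg.get("Subject"),
                "date": msg.get("Date"),
                "in_reply_to": msg.get("In-Reply-To"),
            }
            for msg in box
        ]
    finally:
        box.close()
```

A caller could combine the two as `summarize_mbox(parse_args().mbox)` and then index each dict, for example with `es.index(index="messages", doc_type="summary", body=doc)`.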
