GSOC 2015 Call for Ideas

SergeStinckwich

GSOC 2015 Call for Ideas

Dear Pharoers,

This year the Pharo consortium (and community) is going to take part in
the Google Summer of Code[1] as a standalone organization. This is
an opportunity to promote Pharo, get some work done, and have students
paid.

We are currently at the most important stage: preparing the
organization application, in the hope that we will be accepted and
granted a decent number of project slots. Everyone can help with the
application by submitting ideas for student projects.

The current list can be found at:
https://github.com/pharo-project/pharo-project-proposals/blob/master/Topics.st

It is in STON format, and the rendered result is published at: http://gsoc.pharo.org/

Please add your ideas following the format of the existing projects and
open a pull request with them (you will need a GitHub account).
Preferably submit ideas with possible mentors, but if none are
available at the moment, ideas without mentors are also welcome.

The template for submitting a project is:

PharoTopic new
    title: 'The name of your project';
    contact: 'email address';
    supervisors: 'Supervisors names';
    keywords: 'keywords separated by spaces';
    context: 'a description of the context of the project';
    goal: 'description of the goal';
    level: 'Beginner or Intermediate or Advanced';
    yourself.
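
As an illustration, a filled-in entry might look like the sketch below. Every value is hypothetical (the title, contact address and supervisor name are made up for the example); only the selectors come from the template. Note that an apostrophe inside a Smalltalk string literal has to be doubled.

```smalltalk
"Hypothetical entry: all values are placeholders, only the selectors come from the template."
PharoTopic new
	title: 'Data analysis framework';
	contact: 'jane.doe@example.org';
	supervisors: 'Jane Doe';
	keywords: 'data csv json statistics';
	"The doubled apostrophe below yields a single one in the resulting string."
	context: 'Exploring messy data with Pharo''s current tools takes manual work.';
	goal: 'Provide tools that guess column types and fill in missing values.';
	level: 'Intermediate';
	yourself.
```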

We will need a lot of project ideas before February 20th, 2015, the
deadline for applying to GSOC 2015.

Do not hesitate to ask questions. The administrators of this year’s
application are Serge Stinckwich <[hidden email]> and
Yuriy Tymchuk <[hidden email]>.

If you don't know how to edit the list, please send your project
following the template to the administrators.

[1]: https://www.google-melange.com/gsoc/homepage/google/gsoc2015

Cheers,
--
Serge Stinckwich
UCBN & UMI UMMISCO 209 (IRD/UPMC)
Every DSL ends up being Smalltalk
http://www.doesnotunderstand.org/

_______________________________________________
Moose-dev mailing list
[hidden email]
https://www.iam.unibe.ch/mailman/listinfo/moose-dev

SergeStinckwich

Re: GSOC 2015 Call for Ideas

We have about 45 project ideas at the moment.
We really need more project ideas from more people (not only the RMOD guys).

Even if you only have a vague idea, you can contribute.

Thank you.

Regards,

--
Serge Stinckwich

Sven Van Caekenberghe-2

Re: [Pharo-users] GSOC 2015 Call for Ideas

> On 18 Feb 2015, at 09:52, Andrea Ferretti <[hidden email]> wrote:
>
> Also, these tasks
> often involve consuming data from various sources, such as CSV and
> JSON files. NeoCSV and NeoJSON are still a little too rigid for the
> task - libraries like pandas let you just feed in a CSV file and try to
> make heads or tails of the content without having to define too much of
> a schema beforehand

Both NeoCSV and NeoJSON can operate in two ways: (1) without defining any schema, or (2) with schemas and mappings defined. The quick-and-dirty exploratory style is most certainly possible.

'my-data.csv' asFileReference readStreamDo: [ :in | (NeoCSVReader on: in) upToEnd ].

=> an array of arrays

'my-data.json' asFileReference readStreamDo: [ :in | (NeoJSONReader on: in) next ].

=> objects structured using dictionaries and arrays

Sven

Sven Van Caekenberghe-2

Re: [Pharo-users] GSOC 2015 Call for Ideas

> On 18 Feb 2015, at 10:26, Andrea Ferretti <[hidden email]> wrote:
>
> Thank you Sven. I think this should be emphasized and prominent on the
> home page*. Still, libraries such as pandas are even more lenient,
> doing things such as:
>
> - autodetecting which fields are numeric in CSV files
> - letting you fill in missing data based on statistics (for instance, you
> can say: where the field `age` is missing, use the average age)
>
> Probably there is room for something built on top of Neo
>
>
> * By the way, I suggest that the documentation on Neo could benefit
> from a reorganization. Right now, the first topic in the NeoJSON
> paper introduces JSON itself. I would argue that everyone who tries
> to use the library already knows what JSON is. Still, there is no
> example of how to read JSON from a file in the whole document.

These libraries (NeoCSV, NeoJSON, STON) were all written with only a dependency on a limited character stream API. It was a design decision not to depend on a File API, because at the time we were transitioning from the old FileStreams to FileSystem.
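
Because only a character stream is required, the readers can be tried without touching the file system at all. A minimal sketch, assuming NeoCSV and NeoJSON are loaded in the image; the sample data is made up for the example:

```smalltalk
| csvText rows json |
"A String's readStream is all either reader needs as input."
csvText := 'name,age' , String lf , 'Ann,34' , String lf , 'Bob,27'.
rows := (NeoCSVReader on: csvText readStream) upToEnd.
"No schema was defined, so each record is an Array of Strings and the
 header line is simply the first record."

json := (NeoJSONReader on: '{"x": 1, "y": [true, null]}' readStream) next.
"json is a Dictionary keyed by Strings; (json at: 'y') is an Array."
```

The same expressions work on a file stream, which is what the `readStreamDo:` one-liners earlier in the thread rely on.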

And I disagree about the JSON introduction ;-) You might know it, but that is not the case for everyone. Likewise, not everyone knows CSV, HTTP, ...

But I do agree that sometimes I too would like a convenience method here or there ;-)

Sven Van Caekenberghe-2

Re: [Pharo-users] GSOC 2015 Call for Ideas

Well, you are certainly free to contribute.

Heuristic interpretation of data could be useful, but it looks like an addition on top; the core library should be fast and efficient.
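
A rough sketch of what such a layer on top could look like, written against plain collections rather than any existing Neo API. The literal array stands in for the array of arrays that a schema-less CSV read returns; representing the missing cell as nil is an assumption, as is treating a column as numeric only when every present cell consists of digits:

```smalltalk
| rows header data isNumeric converted ageIndex known mean filled |
"Stand-in for the result of a schema-less CSV read; nil marks a missing cell (assumption)."
rows := #( #('name' 'age') #('Ann' '34') #('Bob' nil) #('Cy' '26') ).
header := rows first.
data := rows allButFirst.

"A column counts as numeric when every non-nil cell consists only of digits."
isNumeric := (1 to: header size) collect: [ :i |
	data allSatisfy: [ :row |
		(row at: i) isNil or: [ (row at: i) isAllDigits ] ] ].

"Convert the cells of numeric columns to integers; leave everything else alone."
converted := data collect: [ :row |
	row withIndexCollect: [ :cell :i |
		((isNumeric at: i) and: [ cell notNil ])
			ifTrue: [ cell asInteger ]
			ifFalse: [ cell ] ] ].

"Fill missing ages with the mean of the known ages."
ageIndex := header indexOf: 'age'.
known := (converted collect: [ :row | row at: ageIndex ]) reject: [ :each | each isNil ].
mean := (known inject: 0 into: [ :sum :each | sum + each ]) / known size.
filled := converted collect: [ :row |
	row copy
		at: ageIndex put: ((row at: ageIndex) ifNil: [ mean ]);
		yourself ].
```

Keeping this in a separate pass over the parsed records leaves the reader itself untouched, so the guessing costs nothing for users who already know their schema.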

> On 18 Feb 2015, at 10:35, Andrea Ferretti <[hidden email]> wrote:
>
> For an example of what I am talking about, see
>
> http://pandas.pydata.org/pandas-docs/version/0.15.2/io.html#csv-text-files
>
> I agree that this is definitely too many options, but it gets the job
> done for quick and dirty exploration.
>
> The fact is that working with a dump of a table from your db, whose
> content you know, requires different tools than exploring the latest
> open data that your local municipality has put online, using yet
> another messy format.
>
> Enterprise programmers deal more often with the former, data
> scientists with the latter, and I think there is room for both kinds of
> tools.

Sven Van Caekenberghe-2

Re: [Pharo-users] GSOC 2015 Call for Ideas

OK, try making a proposal then. http://gsoc.pharo.org has the instructions and the current list; you probably know more about data science than I do.

> On 18 Feb 2015, at 10:53, Andrea Ferretti <[hidden email]> wrote:
>
> I am sorry if the previous messages came off as too harsh. The Neo
> tools are perfectly fine for their intended use.
>
> What I was trying to say is that a good idea for a SoC project would
> be to develop a framework for data analysis that would be useful for
> data scientists, and in particular this would include something to
> import unstructured data more freely.

SergeStinckwich

Re: [Pharo-dev] [Pharo-users] GSOC 2015 Call for Ideas

Sorry Andrea, I didn't see your message because I'm not on the pharo-users
mailing list, only on pharo-dev.
I'm also really interested in having a GSOC project to develop a data
analysis framework.
Let's talk and work out a proposal together.

Regards,
--
Serge Stinckwich

SergeStinckwich

Re: GSOC 2015 Call for Ideas

Dear all,

Last week, Uko and I submitted the Pharo proposal for GSOC 2015, and we
are now waiting for the answer from Google.
Accepted organisations will be announced on March 2, 2015.

We now have more than 50 project ideas, but we are still looking for
more ideas from the community!

Don't be shy, propose your project idea here:
https://github.com/pharo-project/pharo-project-proposals/blob/master/Topics.st

Regards,
