NLP Library + Generating Documentation

Myroslava Romaniuk
For my end-of-term university programming project, which I am doing in Pharo, I wrote a method that returns a Bag of all the words used in the course descriptions (I am working with the Udacity API). Each word appears in the Bag as many times as it occurs in the text, so I can use the words as keywords and determine which technologies are mentioned most often in each course.

I am wondering how I could remove 'this' from the list of repeated words, and how to strip punctuation marks attached to a word. Should I use sentence segmentation from this NLP library, or is there another way to do that? I don't really need verbs either, so I suppose I should use Entity Recognition from the NLP library?
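
To make the question concrete, here is a minimal playground sketch of one possible approach that uses only standard collection and string messages rather than an NLP library. The sample text, the stop-word list, and all variable names are hypothetical, and the call to the Udacity API is left out; it does not attempt verb removal, which would need part-of-speech information.

```smalltalk
"Hypothetical sample text; in the project this would come from the Udacity API."
| description separators stopWords words keywords |
description := 'This course covers Python. In this course, Python and SQL are used!'.

"Characters treated as word separators: whitespace plus common punctuation."
separators := ' .,;:!?()' , String tab , String cr , String lf.

"Illustrative stop-word list (an assumption, not a standard one); extend as needed."
stopWords := #('this' 'the' 'a' 'an' 'and' 'in' 'are' 'is' 'of' 'to').

"Split on the separators so punctuation never stays attached to a word,
 then drop every word that appears in the stop-word list."
words := (description asLowercase substrings: separators)
	reject: [ :each | stopWords includes: each ].

"A Bag keeps each distinct word together with its number of occurrences."
keywords := words asBag.

keywords occurrencesOf: 'python'.  "2 for the sample text above"
keywords sortedCounts.             "count -> word associations, most frequent first"
```

Splitting on punctuation and filtering against a fixed list is crude next to real tokenization, but it may be enough when the goal is only to count technology names.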

One more thing: since this is a university project, it needs to be sufficiently documented. Is there a way to generate documentation in Pharo?

Cheers,
Myroslava