Hi all, As you may know, I'm working on some improvements for the String class. So far I have implemented some missing tests. Now I would like to add new methods that could be useful, based on the Ruby API (http://www.ruby-doc.org/core-2.1.0/String.html). These are a few of the methods that I'm planning to implement:
Could you help me find out whether these methods are already available in the String class?
If you have any ideas for new methods for the String class, they would be really welcome.
Cheers, Daniela Meneses |
Daniela, you should try the Method Finder. Open it and select Examples in the drop-down list; then you can type examples and see whether a method already implements them. For example, 'abcab' . 'a' . 'bcb' shows that copyWithoutAll: is the method, but it expects a character as second argument: 'abcab' . $a . 'bcb'. Stef On 24 Feb 2014, at 18:30, Daniela Meneses <[hidden email]> wrote:
|
In reply to this post by Daniela Meneses
Hi Daniela, 2014-02-24 14:30 GMT-03:00 Daniela Meneses <[hidden email]>:
We could have an information retrieval API for approximate string matching, i.e. Levenshtein distance (already implemented, in various versions) and Hamming distance; these two are the most used and simplest edit distances. Then you have longest common subsequence and longest common substring (they are implemented in a package called "Fuzz", #longestCommonSubsequenceWith: ). There is also the shift-or algorithm adapted for approximate matches (also implemented); fuzzy phrasing is another world as well. Many applications use the Damerau edit distance. Bioinformatics uses Needleman-Wunsch and Smith-Waterman, but they call them "aligners" :) but you don't want to code the optimized version in Smalltalk; some say it could take years. All edit distances out there have specific requirements, and none is better than the others for all cases. For example, Jaro-Winkler is useful for short one-word strings. You have a lot of options for research. Smalltalkers here are very experienced and clever and always give good advice, so don't be afraid to ask. Cheers, Hernán
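To make the two simplest edit distances named above concrete, here is a minimal sketch in Python. It is not the code of any of the Smalltalk packages mentioned in this thread; the function names `hamming_distance` and `levenshtein_distance` are hypothetical and chosen only for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(x != y for x, y in zip(a, b))


def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn a into b (row-by-row dynamic programming)."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]
```

Hamming distance only applies to strings of the same length, whereas Levenshtein distance also accounts for insertions and deletions, which is one reason no single distance fits every use case.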
|
I’m not sure that all these edit distances should be part of the String core API. What would be good, though, is to have a chapter describing them. This chapter would work well with the bioSmalltalk one :)
|
On 26.02.2014 at 09:50, Pharo4Stef <[hidden email]> wrote: I’m pretty sure they shouldn’t. Most of these are most likely for special applications, so they are a perfect candidate for a string extension package. A truly modular entity that could load each of them individually would be perfect, but we don’t have the proper tools yet. Unless, of course, each of those algorithms is composed of multiple classes and would fit naturally in a package. But the most important prerequisite would be to make a separate package out of it. Did I understand correctly that those are part of bioSmalltalk? Then the problem is that useful things are buried in a specialized application. I often encounter this: I don’t know about some code because it is buried inside another project, or I know about it and cannot use it because it is tied closely to a project. My 2 cents, Norbert
|
2014-02-26 7:10 GMT-03:00 Norbert Hartl <[hidden email]>:
I am absolutely in favor of a separate package for information retrieval algorithms. From what I've seen, some algorithms require optimization through dynamic programming (automata, matrices, etc.), and that would lead to multiple classes, assuming you don't want to clutter the String class.
No. Those algorithms are spread over different packages in repositories like SqueakSource, Cincom Store, etc. Hernán |
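As an illustration of the dynamic-programming matrices mentioned above, the following Python sketch computes a longest common substring. It is an assumption-laden example, not the code from the "Fuzz" package or any other repository named in this thread; the function name `longest_common_substring` is hypothetical.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return one longest contiguous substring shared by a and b."""
    best_len = 0  # length of the best match found so far
    best_end = 0  # index in a just past the end of that match
    # previous[j] is the length of the common suffix of the processed
    # prefix of a and b[:j]; only two rows of the matrix are kept.
    previous = [0] * (len(b) + 1)
    for i, x in enumerate(a, start=1):
        current = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the common suffix ending at the previous characters.
                current[j] = previous[j - 1] + 1
                if current[j] > best_len:
                    best_len = current[j]
                    best_end = i
        previous = current
    return a[best_end - best_len:best_end]
```

Even this small version carries state (the matrix rows and the best match so far), which hints at why a fuller implementation tends to grow into its own classes rather than a single String method.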
|
"No. Those algorithms are spread over different packages in repositories like SqueakSource, Cincom Store, etc"
Can you tell me where to find code for longest common substring? I would appreciate the detailed location. Thanks, Aik-Siong Koh |
In reply to this post by hernanmd
What fuzzy-string matching tools & packages are available today? -cam On Wed, Feb 26, 2014 at 9:09 AM, Hernán Morales Durand <[hidden email]> wrote:
|
Hi,
I know that Olivier Auverlot has all kinds of string distances. Hernán Morales also has a package extending String. Stef |
In reply to this post by Pharo Smalltalk Users mailing list
Thank You! -cam On Tue, Jul 28, 2015 at 11:39 AM, Cameron Sanders via Pharo-users <[hidden email]> wrote:
|