Ropes (functional strings)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

Ropes (functional strings)

KenDickey
Juan wrote:

> Yesterday I played a bit to see how quickly the system would break if
> #at: and #at:put: were banned for Strings.

I did a simple, prototype implementation of Ropes, functional strings, for Cuis.

Ropes are _immutable_ (constant) strings, and #at:put: returns a new Rope.  Internal data structures hide the details.

The basic idea is that I had a utf-8 Rope and #at:put: a utf-32 Chinese character into it, then we would NOT have to change the utf-8 part just because of one character.  We could have separate 8/16/32 bit char-arrays supporting Unicode and do mix 'n match as we needed.

        https://github.com/KenDickey/Cuis-Ropes

Invoking
        Rope openTextEditor
brings up a simple text editor with a rope instead of a string.

This seems to work OK for me, but more knowledgeable people should take a look.

As this is a prototype, Rope>>doesNotUnderstand writes a message to the Transcript and re-sends it to its stringRepresentation.

So far, I have only seen sends to #encompassParagraph, which I don't yet understand and did not implement.

Anyway, feel free to take a look and let me know if you think it is useful.

Cheers,
-KenD

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

Juan Vuletich-4
Wow! This is really cool!

I must admit that I don't know when it makes sense to use it, but it is
really interesting.

My idea of maybe using uft-8 bytearrays for Strings everywhere was based
on the intuition that maybe #at: and #at:put: are (maybe almost) not
needed by the system. Now this gives a third alternative, with its own
distinct balance between cost of different operations and memory usage.
I guess the first question to be answered remains the same: Do we need
modifiable Strings? Or immutable Strings are enough?

Food for though...

BTW, I find it totally amazing that you could hook this into the
existing TextModel and text editor _without modifying a single method_!

WRT #encompassParagraph:, the idea is pretty simple. On many text
editors (like, for instance, this email client), doing triple click
selects the whole paragraph. #encompassParagraph: answers just that. You
pass an interval, i.e. start and end indexes of some substring, and the
method answers whole paragraphs containing it. This is used, for
instance, to ensure that a TextAlignment attribute is applied to whole
paragraphs, as for instance, it doesn't make sense to have part of a
paragraph justified and part of it centered. The method has a reasonable
comments, and in senders you'll find some tests. HTH.

Cheers,
Juan Vuletich

On 12/02/2013 01:08 a.m., Ken Dickey wrote:

> Juan wrote:
>
>> Yesterday I played a bit to see how quickly the system would break if
>> #at: and #at:put: were banned for Strings.
> I did a simple, prototype implementation of Ropes, functional strings, for Cuis.
>
> Ropes are _immutable_ (constant) strings, and #at:put: returns a new Rope.  Internal data structures hide the details.
>
> The basic idea is that I had a utf-8 Rope and #at:put: a utf-32 Chinese character into it, then we would NOT have to change the utf-8 part just because of one character.  We could have separate 8/16/32 bit char-arrays supporting Unicode and do mix 'n match as we needed.
>
> https://github.com/KenDickey/Cuis-Ropes
>
> Invoking
> Rope openTextEditor
> brings up a simple text editor with a rope instead of a string.
>
> This seems to work OK for me, but more knowledgeable people should take a look.
>
> As this is a prototype, Rope>>doesNotUnderstand writes a message to the Transcript and re-sends it to its stringRepresentation.
>
> So far, I have only seen sends to #encompassParagraph, which I don't yet understand and did not implement.
>
> Anyway, feel free to take a look and let me know if you think it is useful.
>
> Cheers,
> -KenD
>
> _______________________________________________
> Cuis mailing list
> [hidden email]
> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>
>


_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

KenDickey
On Wed, 13 Feb 2013 00:37:31 -0300
Juan Vuletich <[hidden email]> wrote:

> I must admit that I don't know when it makes sense to use it, but it is
> really interesting.

The Cedar/Mesa language and IDE used only ropes.

At least one Python implementation uses ropes internally for Unicode strings:

        http://morepypy.blogspot.com/2007/11/ropes-branch-merged.html


The IBM Java Ropes performance is also interesting, particularly with respect to regular expression search.  [We could probably speed this up with Zippers, but I hate to get too functional here.  ;^].

        http://www.ibm.com/developerworks/java/library/j-ropes/index.html


> BTW, I find it totally amazing that you could hook this into the
> existing TextModel and text editor _without modifying a single method_!

I thought that it was pretty cool myself.

Hey, that is the way good Smalltalk should be written.

Cheers,
-KenD

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

Hannes Hirzel
Congrats to this implementation, Ken

In particular I like how you did

Rope>>doesNotUnderstand:

doesNotUnderstand: aMessage

        "See what is missing from Ropes"
        Transcript log: (String streamContents: [:s | aMessage storeOn: s]).
       
        "Do what a String would do"
        aMessage sendTo: (self asString)


and then


Rope class>>openTextEditor


openTextEditor

"
        Rope openTextEditor.
"
        SystemWindow
                editText: (TextModel
                        withText: (FlatRope fromString:
                               'Let us see what these Rope things can do.'))
                        label: 'Text Editor'
                        wrap: true


Kind regards

Hannes

On 2/13/13, Ken Dickey <[hidden email]> wrote:

> On Wed, 13 Feb 2013 00:37:31 -0300
> Juan Vuletich <[hidden email]> wrote:
>
>> I must admit that I don't know when it makes sense to use it, but it is
>> really interesting.
>
> The Cedar/Mesa language and IDE used only ropes.
>
> At least one Python implementation uses ropes internally for Unicode
> strings:
>
> http://morepypy.blogspot.com/2007/11/ropes-branch-merged.html
>
>
> The IBM Java Ropes performance is also interesting, particularly with
> respect to regular expression search.  [We could probably speed this up with
> Zippers, but I hate to get too functional here.  ;^].
>
> http://www.ibm.com/developerworks/java/library/j-ropes/index.html
>
>
>> BTW, I find it totally amazing that you could hook this into the
>> existing TextModel and text editor _without modifying a single method_!
>
> I thought that it was pretty cool myself.
>
> Hey, that is the way good Smalltalk should be written.
>
> Cheers,
> -KenD
>
> _______________________________________________
> Cuis mailing list
> [hidden email]
> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
>

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

KenDickey
On Wed, 13 Feb 2013 17:49:26 +0000
"H. Hirzel" <[hidden email]> wrote:

> Congrats to this implementation, Ken

Glad you enjoyed it.  

Nice to have multithreaded data structure access which does not need locking.

BTW, I updated with a version which does more optimization (could be still further improved) and rebalances large trees.  No change in interface.

Cheers,
-KenD

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

Guido Stepken
In reply to this post by Hannes Hirzel


doesNotUnderstand: aMessage

Is the most stupid answer, a intelligent system can give. Humans can tell not only that they didn'nt understand, but also why.

In a full reflecting language it should be totally easy to cache the last message a object received and backtracking the line of last sends.

Just add a 'last sent' cache to Object and trace.

Why doesn't CUIS do that?

Tnx in advance

>
>         "See what is missing from Ropes"
>         Transcript log: (String streamContents: [:s | aMessage storeOn: s]).
>
>         "Do what a String would do"
>         aMessage sendTo: (self asString)
>
>
> and then
>
>
> Rope class>>openTextEditor
>
>
> openTextEditor
>
> "
>         Rope openTextEditor.
> "
>         SystemWindow
>                 editText: (TextModel
>                         withText: (FlatRope fromString:
>                                'Let us see what these Rope things can do.'))
>                         label: 'Text Editor'
>                         wrap: true
>
>
> Kind regards
>
> Hannes
>
> On 2/13/13, Ken Dickey <[hidden email]> wrote:
> > On Wed, 13 Feb 2013 00:37:31 -0300
> > Juan Vuletich <[hidden email]> wrote:
> >
> >> I must admit that I don't know when it makes sense to use it, but it is
> >> really interesting.
> >
> > The Cedar/Mesa language and IDE used only ropes.
> >
> > At least one Python implementation uses ropes internally for Unicode
> > strings:
> >
> >       http://morepypy.blogspot.com/2007/11/ropes-branch-merged.html
> >
> >
> > The IBM Java Ropes performance is also interesting, particularly with
> > respect to regular expression search.  [We could probably speed this up with
> > Zippers, but I hate to get too functional here.  ;^].
> >
> >       http://www.ibm.com/developerworks/java/library/j-ropes/index.html
> >
> >
> >> BTW, I find it totally amazing that you could hook this into the
> >> existing TextModel and text editor _without modifying a single method_!
> >
> > I thought that it was pretty cool myself.
> >
> > Hey, that is the way good Smalltalk should be written.
> >
> > Cheers,
> > -KenD
> >
> > _______________________________________________
> > Cuis mailing list
> > [hidden email]
> > http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
> >
>
> _______________________________________________
> Cuis mailing list
> [hidden email]
> http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org


_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

Casey Ransberger-2
In reply to this post by KenDickey
Hey cool. Just got around to reading this. I dig it. I've had some moments staring out windows and daydreaming about what a more functional Smalltalk might look like. I think we err on the side of mutable designs too often.

I'm curious: what's the difference WRT performance? Has anyone looked? Wouldn't it be a kick to find out that object allocation and storage ended up being faster/smaller than what we have with the mutable structures? I kind of want to see the difference in performance between the interpreter, the stack VM and Cog. I missed the original message, and it looks like the thread got split (at least from my POV) so can you point me at the code?

TIA!

On Mon, Feb 11, 2013 at 8:08 PM, Ken Dickey <[hidden email]> wrote:
Juan wrote:

> Yesterday I played a bit to see how quickly the system would break if
> #at: and #at:put: were banned for Strings.

I did a simple, prototype implementation of Ropes, functional strings, for Cuis.

Ropes are _immutable_ (constant) strings, and #at:put: returns a new Rope.  Internal data structures hide the details.

The basic idea is that I had a utf-8 Rope and #at:put: a utf-32 Chinese character into it, then we would NOT have to change the utf-8 part just because of one character.  We could have separate 8/16/32 bit char-arrays supporting Unicode and do mix 'n match as we needed.

        https://github.com/KenDickey/Cuis-Ropes

Invoking
        Rope openTextEditor
brings up a simple text editor with a rope instead of a string.

This seems to work OK for me, but more knowledgeable people should take a look.

As this is a prototype, Rope>>doesNotUnderstand writes a message to the Transcript and re-sends it to its stringRepresentation.

So far, I have only seen sends to #encompassParagraph, which I don't yet understand and did not implement.

Anyway, feel free to take a look and let me know if you think it is useful.

Cheers,
-KenD

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org



--
Casey Ransberger

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

KenDickey
On Wed, 13 Feb 2013 21:01:01 -0800
Casey Ransberger <[hidden email]> wrote:

> I'm curious: what's the difference WRT performance? Has anyone looked?

Two sets of performance measurements have been published:

IBM implementation of Ropes in Java:

        http://www.ibm.com/developerworks/java/library/j-ropes/index.html

C implementation of Ropes in the paper 'Ropes: an Alternative to Strings':

http://citeseer.ist.psu.edu/viewdoc/downloaddoi=10.1.1.14.9450&rep=rep1&type=pdf

The Java data is probably closer to what we might expect.

I have implemented the optimizations suggested in the Alternative to Strings paper, but more optimizations could be done.

I consider the current implementation a proof of concept that we could replace strings with ropes in the current editor.

Once we have immutable strings, it is (IMHO) easier to support multiple sizes of encodings for UTF8/16/32 characters & compact arrays.

The larger pieces of work would be to integrate the entire String protocol and then do the speed optimizations for parsing and regular expressions.  The IBM paper indicates an easy factor of 10 speedup in large regular expression tests.

I don't expect rope operations to be as fast as strings.  I do expect the code would be easier to get correct and to test.

$0.02
-KenD

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

KenDickey
In reply to this post by Casey Ransberger-2
On Wed, 13 Feb 2013 21:01:01 -0800
Casey Ransberger <[hidden email]> wrote:

> so can you point me at the code?

Casey,

I just noticed the above line.

We use GIT and the GitHub repository for sharing Cuis code:

  https://github.com/jvuletich/Cuis
  https://github.com/KenDickey/Cuis-Ropes

There are details on the Cuis README.md file.

Did I answer your question(s)?

-KenD
--
Ken Dickey <[hidden email]>

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD
Reply | Threaded
Open this post in threaded view
|

Re: Ropes (functional strings)

KenDickey
In reply to this post by Hannes Hirzel
On Wed, 13 Feb 2013 17:49:26 +0000
"H. Hirzel" <[hidden email]> wrote:

> Congrats to this implementation, Ken
>
> In particular I like how you did
>
> Rope>>doesNotUnderstand:

I don't like to over-use #doesNotUnderstand, but it is totally cool.  I don't know of another major language that lets me do thin.

In case you missed it, here is a new method in Color, which helps when changing color dictionaries:

Color>>doesNotUnderstand: aMessage
        "Some code takes
                 Color colorNames
        and does
                Color perform: aColorname.
               
        Make this work."

        ^(Color colorNamesDict)
                at: (aMessage selector)
                ifAbsent: [super doesNotUnderstand: aMessage]

8^)

Cheers,
-KenD
--
Ken Dickey <[hidden email]>

_______________________________________________
Cuis mailing list
[hidden email]
http://jvuletich.org/mailman/listinfo/cuis_jvuletich.org
-KenD