join

Re: join

Andreas.Raab
 > Is anyone convinced?

I'm sure someone is convinced but probably not everyone ;-) Couple of notes:

> For this reason I think that split: should depend on VB-Regex, unless
> someone wants to implement one of the modern algorithms
>
> I propose that: String>>split: ^ regexString asRegex split: self

I don't like this too much as it means there is special meaning to the
search pattern (wildcards and the like) and it requires a non-optional
dependency on a regex package. Perhaps the simple string split should
remain the simple string split and the regex string split should remain
in regex?

> On join:
> - join: is the conceptual inverse of split: (see the tests in
> http://squeaksource.com/RubyShards/)
> - join: obviously works for Sequenceables as well as Strings
>
> I propose adding the following method to either SequenceableCollection
> or OrderedCollection [the tradeoff is not clear to me].

OrderedCollection makes little sense since you couldn't join: arrays or
strings in that case so it should go into SequenceableCollection.

> join: anOrderedCollection

It would be good to give this a role instead of a type name. From the
type name it's not immediately obvious whether:
   'abc' join: 'xyz'
results in 'axyzbxyzc' or 'xabcyabcz'.
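
As a rough illustration of the two readings, here is a minimal sketch in Python rather than Smalltalk (chosen only because its built-in `str.join` makes both orders easy to run). The two helper names are hypothetical and exist purely to label which operand acts as the separator.

```python
def receiver_is_separator(receiver: str, argument: str) -> str:
    """Reading 1: the receiver is the glue placed between the argument's elements."""
    return receiver.join(argument)


def argument_is_separator(receiver: str, argument: str) -> str:
    """Reading 2: the argument is the glue placed between the receiver's elements."""
    return argument.join(receiver)


# The same two operands give different results depending on the reading.
print(argument_is_separator("abc", "xyz"))  # axyzbxyzc
print(receiver_is_separator("abc", "xyz"))  # xabcyabcz
```

A parameter name that states the role (for example "aSeparator") would tell the reader which of the two is meant.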

>     "Implicit precondition: my elements are all OrderedCollections"

The precondition should probably be that "receiver species = argument
species" and consequently determine the result species.

Cheers,
   - Andreas


Re: join

Yanni Chiu
In reply to this post by Oscar Nierstrasz
Oscar Nierstrasz wrote:
>     eg := 'Now is the time for all good men to come to the aid of the  party'.
>
>     self assert: ((eg split: 'the') join: 'the') = eg.

In my viewer I saw two spaces at "the  party".
So what should happen for #split: on a single space.
Should there be an extra item with value nil or an
empty string? Or should there be no extra item?



Re: join

keith1y
In reply to this post by Yanni Chiu
I think that join: should be more than a simple string generator. Using
the power of Smalltalk collections you will find all sorts of weird and
wonderful operators that work on all manner of collections, and I think
that join: should demonstrate some of this power.

avi's joinTokens: aStringToken does just what it says it does: it just
creates a string. Simple and unambiguous.

My proposed join: implementation is able to do much more, but there are
some ambiguities as to what behaviour is desired/obtained. I have no
idea what other Smalltalks do but here are my ideas so far:

Character-useToJoin: result will be a string.
String-useToJoin: result will be a string.
$/ useToJoin: #('hello' 'my' 'world')   -> 'hello/my/world'
', ' useToJoin: #('hello' 'my' 'world') -> 'hello, my, world'

SequenceableCollection-useToJoin: result will be a SequenceableCollection.
#(1 2) useToJoin: #(3 4 5) ->  #(3 1 2 4 1 2 5)

The double dispatch approach can be used.

#('hello' 'my' 'world') joinUsing: $/

SequenceableCollection>>joinUsing: joiner
^ joiner useToJoin: self
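
The behaviour listed above can be sketched in Python as a hedged illustration, not as the proposed Squeak implementation. The function names `use_to_join` and `join_using` are hypothetical stand-ins for useToJoin: and joinUsing:, and an `isinstance` check stands in for Smalltalk's double dispatch.

```python
def use_to_join(joiner, items):
    """Interleave `joiner` between `items`; the joiner decides the result type."""
    if isinstance(joiner, str):
        # Character or String joiner: the result is a string.
        return joiner.join(items)
    # Any other sequence joiner: splice its elements between the items.
    result = []
    for index, item in enumerate(items):
        if index > 0:
            result.extend(joiner)
        result.append(item)
    return result


def join_using(items, joiner):
    """Collection-first spelling that simply delegates to the joiner."""
    return use_to_join(joiner, items)


print(use_to_join("/", ["hello", "my", "world"]))   # hello/my/world
print(use_to_join(", ", ["hello", "my", "world"]))  # hello, my, world
print(use_to_join([1, 2], [3, 4, 5]))               # [3, 1, 2, 4, 1, 2, 5]
print(join_using(["hello", "my", "world"], "/"))    # hello/my/world
```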

----
ideas:

useToJoin:  <-> joinUsing:
join: <-> joinWith:

Keith










               


Re: split:

keith1y
The question of whether split: should work with Strings or Regexes is
something that should be handled with double dispatch. If you feed it
a String it should split on a string; if you feed it a Regex then it will
split on a regex.

eg := 'Now is the time for all good men to come to the aid of the  party'.

eg splitOnEvery: 'the'
eg splitOnEvery: ( '[\s]+' asRegex ).
eg splitOnEvery: $e.
#(1 2 3 4 5 6 7 8 9) splitOnEvery: #(4).
#(1 2 3 4 5 6 7 8 9) splitOnEvery: #(4 5 6).
#(1 2 3 4 5 6 7 8 9) splitOnEvery: [ :n | n isPrimeNumber ]

SequenceableCollection>>splitOnEvery: aSplitter
^ aSplitter splitUp: self

aRegex>>splitUp: aString
... regex implementation
aString>>splitUp: aString
... string implementation
aCharacter>>splitUp: aString
... character implementation
aSequenceableCollection>>splitUp: aSequenceableCollection
... generic collection implementation
aMonadicBlock>>splitUp: aCollection
^ aCollection splitUsing: self
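
To make the dispatch idea concrete, here is a minimal Python sketch that uses `functools.singledispatch` as a stand-in for Smalltalk double dispatch. Everything in it is an assumption for illustration: the names `split_up` and `split_on_every` mirror the selectors above, and the sub-list and predicate semantics (drop the matched elements, keep empty pieces) are one plausible choice, not something the thread has settled.

```python
import re
from functools import singledispatch
from types import FunctionType


@singledispatch
def split_up(splitter, collection):
    """Fallback when no rule is registered for the splitter's type."""
    raise TypeError(f"no split rule for {type(splitter).__name__}")


@split_up.register(str)
def _split_on_string(splitter, collection):
    # String (or single character) splitter: plain substring split.
    return collection.split(splitter)


@split_up.register(type(re.compile("")))
def _split_on_regex(splitter, collection):
    # Compiled regex splitter: let the pattern do the splitting.
    return splitter.split(collection)


@split_up.register(list)
def _split_on_sublist(splitter, collection):
    # List splitter: cut wherever the whole sub-list occurs.
    parts, current, i = [], [], 0
    while i < len(collection):
        if splitter and collection[i:i + len(splitter)] == splitter:
            parts.append(current)
            current = []
            i += len(splitter)
        else:
            current.append(collection[i])
            i += 1
    parts.append(current)
    return parts


@split_up.register(FunctionType)
def _split_on_predicate(splitter, collection):
    # One-argument block: cut at every element the predicate accepts.
    parts, current = [], []
    for element in collection:
        if splitter(element):
            parts.append(current)
            current = []
        else:
            current.append(element)
    parts.append(current)
    return parts


def split_on_every(collection, splitter):
    """Collection-first entry point; the splitter's type picks the rule."""
    return split_up(splitter, collection)


numbers = list(range(1, 10))
print(split_on_every("a the b", "the"))                 # ['a ', ' b']
print(split_on_every("a  b c", re.compile(r"\s+")))     # ['a', 'b', 'c']
print(split_on_every(numbers, [4, 5, 6]))               # [[1, 2, 3], [7, 8, 9]]
print(split_on_every(numbers, lambda n: n % 3 == 0))    # [[1, 2], [4, 5], [7, 8], []]
```

The caller only ever writes `split_on_every`; adding a new kind of splitter means registering one more rule rather than editing the collection classes.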

more ideas

Keith




 >In my viewer I saw two spaces at "the  party".
 >So what should happen for #split: on a single space.
 >Should there be an extra item with value nil or an
 >empty string? Or should there be no extra item?

I think that the convention is to return an empty string. -> #('the' '' 'party')
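
For comparison, Python's built-in `str.split` with an explicit separator follows this empty-string convention, which is also what keeps split and join exact inverses. The snippet below is only an illustration of that convention in another language; it says nothing about what Squeak does today.

```python
eg = "Now is the time for all good men to come to the aid of the  party"

# Splitting on a single space keeps an empty string between the two spaces.
pieces = eg.split(" ")
print(pieces[-3:])  # ['the', '', 'party']

# Because the empty piece is kept, joining restores the original exactly.
assert " ".join(pieces) == eg
assert "the".join(eg.split("the")) == eg
```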


       
       
               


Re: join

Yanni Chiu
In reply to this post by keith1y
Keith Hodges wrote:
> I think that join: should be more than a simple string generator. Using
> the power of Smalltalk collections you will find all sorts of weird and
> wonderful operators that work on all manner of collections, and I think
> that join: should demonstrate some of this power.

A potential problem is that *if* the results depend on particular
quirks of Squeak, then trying to do the equivalent in another
Smalltalk becomes difficult. Seaside runs in VW (and Dolphin) too,
so portability is important. We don't want to diverge in a way
that makes frameworks & components built for Seaside in one Smalltalk,
difficult to port to other Smalltalks.

Having said that, I'll be happy with whatever covers my use case,
and isn't so general or complicated that we can't agree on what
the correct result ought to be. Otherwise, I can always rename my
methods.



Re: split:

stephane ducasse-2
In reply to this post by keith1y
keep trying to find a fix point with a nice solution and tests ;)

Stef

On 16 sept. 06, at 03:20, Keith Hodges wrote:

> The question of whether split: should work with Strings or
> Regexes is something that should be handled with double dispatch.
> If you feed it a String it should split on a string; if you feed it
> a Regex then it will split on a regex.
>
> eg := 'Now is the time for all good men to come to the aid of the  party'.
>
> eg splitOnEvery: 'the'
> eg splitOnEvery: ( '[\s]+' asRegex ).
> eg splitOnEvery: $e.
> #(1 2 3 4 5 6 7 8 9) splitOnEvery: #(4).
> #(1 2 3 4 5 6 7 8 9) splitOnEvery: #(4 5 6).
> #(1 2 3 4 5 6 7 8 9) splitOnEvery: [ :n | n isPrimeNumber ]
>
> SequenceableCollection>>splitOnEvery: aSplitter
> ^ aSplitter splitUp: self
>
> aRegex>>splitUp: aString
> ... regex implementation
> aString>>splitUp: aString
> ... string implementation
> aCharacter>>splitUp: aString
> ... character implementation
> aSequenceableCollection>>splitUp: aSequenceableCollection
> ... generic collection implementation
> aMonadicBlock>>splitUp: aCollection
> ^ aCollection splitUsing: self
>
> more ideas
>
> Keith
>
> > In my viewer I saw two spaces at "the  party".
> > So what should happen for #split: on a single space.
> > Should there be an extra item with value nil or an
> > empty string? Or should there be no extra item?
>
> I think that the convention is to return an empty string. -> #('the' '' 'party')



RE: join

J J-6
In reply to this post by keith1y
Forgive my newbieness, but doesn't smalltalk already have a join called ","  
(e.g. #(1 2 3) , #(4 5 6) ----> #(1 2 3 4 5 6) )?


>From: Keith Hodges <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: The general-purpose Squeak developers
>list<[hidden email]>
>Subject: join
>Date: Fri, 15 Sep 2006 00:57:52 +0100
>
>Entirely against my will, I once worked in a perl shop, and I noticed that
>perlites liked joining things.
>
>Whenever I find myself wanting to join a bunch of items together (e.g. to
>make a path) I am never satisfied with the result and so I took a look at
>the Collections/String classes to see whether anything fitted nicely.
>
>
>I came up with
>
>SequenceableCollection>>join: aCollection
>
>    ^ self class streamContents: [ :stream |
>        aCollection
>            do: [ :each | stream nextPut: each ]
>            separatedBy: [ stream nextPutAll: self ] ]
>
>
>and
>
>Character>>join: aCollectionOfStrings
>
>    ^ String streamContents: [ :stream |
>        aCollectionOfStrings
>            do: [ :each | stream nextPutAll: each ]
>            separatedBy: [ stream nextPut: self ] ]
>
>and
>
>Collection>>joinWith: aCollection
>
>    ^ aCollection join: self
>
>----
>This now allows
>
>(Array with: 1 with: 2) join: (Array with: 3 with: 4 with: 5)
>$/ join: (Array with: 'Hello' with: 'my' with: 'World').
>
>any thoughts? I am curious as to why #join: hasn't made it into the core
>image, and if it were to how would it happen?
>
>Keith
>
>
>




Re: join

Klaus D. Witzel
Hi J J, you wrote:

> Forgive my newbieness, but doesn't smalltalk already have a join called  
> ","  (e.g. #(1 2 3) , #(4 5 6) ----> #(1 2 3 4 5 6) )?

No, #, is concatenation of collections.

#(1 2 3) useToJoin: #(4 5 6) is not the same as #, (see Keith's postings).

BTW Keith you get my +1 for the predictable species of the result, very  
non-confusing!

/Klaus



Re: join

J J-6
*bonk self*


>From: "Klaus D. Witzel" <[hidden email]>
>Reply-To: The general-purpose Squeak developers
>list<[hidden email]>
>To: [hidden email]
>Subject: Re: join
>Date: Sat, 16 Sep 2006 15:34:00 +0200
>
>Hi J J, you wrote:
>
>>Forgive my newbieness, but doesn't smalltalk already have a join called  
>>","  (e.g. #(1 2 3) , #(4 5 6) ----> #(1 2 3 4 5 6) )?
>
>No, #, is concatenation of collections.
>
>#(1 2 3) useToJoin: #(4 5 6) is not the same as #, (see Keith's postings).
>
>BTW Keith you get my +1 for the predictable species of the result, very  
>non-confusing!
>
>/Klaus
>
>




Re: join

keith1y
My running image is displaying a cursor like a pair of spectacles and it
appears to be stuck/busy. How can I interrupt it and get control back?

many thanks

Keith

               


Re: join

Lex Spoon
In reply to this post by Avi Bryant
Avi Bryant <[hidden email]> writes:
> Sure we need it.  At Smallthought, we have a package of utility
> methods that includes:
>
> SequencableCollection>>joinTokens: aString


Scala's library has a similar method.  I thought it was the silliest,
overly specific method I've seen, but then I keep finding myself using
it!



> Looking through my sends to it, the argument is always either ' ' or
> ', ', so #joinedWithSpaces and #joinedWithCommas would probably be
> sensible methods to have too.

This is my experience, too.

It seems people just cannot resist the urge to generalize this method
more than is useful.  We've all taken too *many* programming classes,
and are overlooking some pragmatics!

Scala's method is actually more general than the join: methods being
passed around -- it takes in an initial string and a final string as
arguments -- but I find myself always setting those to the empty
string...


-Lex





Re: join

Lex Spoon
In reply to this post by Andreas.Raab
Andreas Raab <[hidden email]> writes:
> > join: anOrderedCollection
>
> It would be good to give this a role instead of a type name. From the
> type name it's not immediately obvious whether:
>    'abc' join: 'xyz'
> results in 'axyzbxyzc' or 'xabcyabcz'.

Yeah, you can only learn the behavior by experience.  It would be
better if the name just gave away the behavior.



> >     "Implicit precondition: my elements are all OrderedCollections"
>
> The precondition should probably be that "receiver species = argument
> species" and consequently determine the result species.


Actually, it might be nice to make this produce *strings*.  That's
what most people seem to want the method for.  It's hard to think of a
good name that is shorter than the implementation, but maybe:


   joinStringsWith:
   printStrings:
   makeString:


Join is popular for people who use Perl, but it seems weird if there's
a print involved in the method.  printStrings: can be misread in
several ways, e.g. does it return the print strings?  Does it print
the argument's elements, which are expected to be strings?

I lean towards makeString:, though I admit I have different and warped
experience that makes this one look familiar to me.  That said,
makeString: does tell you the most important thing about the method--
it flattens the receiver into a string.  It is vague about how it does
it, but in most use cases the reader can guess (foo makeString: ', ').
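
A string-producing method of this kind might look like the following Python sketch. The name `make_string` and its optional `start` and `end` parameters are assumptions, loosely modeled on the Scala method described earlier in the thread that also takes an initial and a final string; converting every element with `str()` is likewise an illustrative choice.

```python
def make_string(items, separator, start="", end=""):
    """Flatten `items` into one string, converting each element with str()."""
    return start + separator.join(str(item) for item in items) + end


# Typical use: only the separator is supplied.
print(make_string([1, 2, 3], ", "))              # 1, 2, 3

# The optional start and end strings wrap the joined result.
print(make_string(["a", "b"], ", ", "(", ")"))   # (a, b)
```

Whatever the receiver holds, the result is always a string, which is the property the name is meant to advertise.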



-Lex




RE: join

Ramon Leon-5
> Join is popular for people who use Perl, but it seems weird

And Python, Ruby, JavaScript, CSharp, and Visual Basic.Net.  Split and join
are the common names in most languages that use them.

> if there's a print involved in the method.  printStrings: can
> be misread in several ways, e.g. does it return the print
> strings?  Does it print the argument's elements, which are
> expected to be strings?
>
> I lean towards makeString:, though I admit I have different
> and warped experience that makes this one look familiar to
> me.  That said,
> makeString: does tell you the most important thing about the
> method-- it flattens the receiver into a string.  It is vague
> about how it does it, but in most use cases the reader can
> guess (foo makeString: ', ').
> -Lex

They should be called #split: and #join:, no need to hide them under other
names when all newcomers are going to be looking for #split: and #join:.
Let's not be different, just to be different.



Re: join

stephane ducasse-2
+ 1

Stef

On 18 sept. 06, at 17:38, Ramon Leon wrote:

>> Join is popular for people who use Perl, but it seems weird
>
> And Python, Ruby, JavaScript, CSharp, and Visual Basic.Net.  Split  
> and join
> are the common names in most languages that use them.
>
>> if there's a print involved in the method.  printStrings: can
>> be misread in several ways, e.g. does it return the print
>> strings?  Does it print the argument's elements, which are
>> expected to be strings?
>>
>> I lean towards makeString:, though I admit I have different
>> and warped experience that makes this one look familiar to
>> me.  That said,
>> makeString: does tell you the most important thing about the
>> method-- it flattens the receiver into a string.  It is vague
>> about how it does it, but in most use cases the reader can
>> guess (foo makeString: ', ').
>> -Lex
>
> They should be called #split: and #join:, no need to hide them  
> under other
> names when all newcomers are going to be looking for #split: and  
> #join:.
> Let's not be different, just to be different.
>
>



Re: join

Lex Spoon
In reply to this post by Ramon Leon-5
"Ramon Leon" <[hidden email]> writes:
> > Join is popular for people who use Perl, but it seems weird
>
> And Python, Ruby, JavaScript, CSharp, and Visual Basic.Net.  Split and join
> are the common names in most languages that use them.

OK, that's a good reason to call it "join".

Just to be sure, though, which "join" do these languages have?  "join"
sounds right for the method posted initially in this thread, but
sounds wrong for the method that creates a string regardless of the
initial collection types.


-Lex



Re: join

Ramon Leon-4
Lex Spoon wrote:

> "Ramon Leon" <[hidden email]> writes:
>
>>>Join is popular for people who use Perl, but it seems weird
>>
>>And Python, Ruby, JavaScript, CSharp, and Visual Basic.Net.  Split and join
>>are the common names in most languages that use them.
>
>
> OK, that's a good reason to call it "join".
>
> Just to be sure, though, which "join" do these languages have?  "join"
> sounds right for the method posted initially in this thread, but
> sounds wrong for the method that creates a string regardless of the
> initial collection types.
>
>
> -Lex

As far as I know, join creates a string in each of these languages.



Re: join

keith1y
There appears to be a discrepancy between the Smalltalk way and these
'other' languages.

in ruby

[ 'a', 'b', 'c' ].join(', ')

in smalltalk to achieve the same thing, including the specification that
the result should be a String, arguably could be

', ' join: #('a' 'b' 'c')

but I can bet that if you are trying to satisfy the aesthetic
requirements of users of 'other' languages (those languages with
supposedly fewer brackets), they will take one look at this Smalltalk
version and say that it is the wrong way around.

so... if you do want the wrong way around for them, we need a right way
around for us.
how about #joining:

aCollection>>join: bCollection
^ bCollection joining: aCollection

#('a' 'b' 'c') join: ', '.

', ' joining: #('a' 'b' 'c')


Keith



               


RE: join

Ramon Leon-5
> in ruby
>
> [ 'a', 'b', 'c' ].join(', ')
>
> in smalltalk to achieve the same thing, including the
> specification that the result should be a String, arguably could be
>
> ', ' join: #('a' 'b' 'c')

What?  Why would you do this?  Join isn't a method on String; it produces a
string, but that doesn't mean it belongs there.  Join belongs on Collection or
SequencedCollection.  Split and join are partners: join makes a list into a
string and split does the opposite, so join belongs to lists and split belongs
to String.

> but I can bet that if you are trying to satisfy the aesthetic
> requirements of users of 'other' languages (those languages
> with supposedly less brackets), they will take one look at
> this smalltalk version and say that it is the wrong way around.
>
> so... if you do want the wrong way around for them, we need a
> right way around for us.
> how about #joining:

Why do you consider their way the wrong way around?




Re: join

Tony Garnock-Jones-2
Ramon Leon wrote:
> Split and Join are partners, join makes a list into a
> string and split does the opposite, join belongs to lists, split belongs to
> String.

You could see it as primarily oriented on the collection being split or
joined, or on the separator interleaving the collection. Python's join
is separator-as-receiver. Python isn't symmetric though: it expects the
string /to be split/ as the receiver of the split() message.
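
The asymmetry is easy to see in a few lines of Python, using only the built-in `str.join` and `str.split`:

```python
separator = ", "
items = ["a", "b", "c"]

# join: the separator is the receiver, the collection is the argument.
joined = separator.join(items)
print(joined)  # a, b, c

# split: the string being split is the receiver, the separator is the argument.
parts = joined.split(separator)
print(parts)   # ['a', 'b', 'c']
```

So in Python the separator changes sides between the two calls, even though the pair still round-trips.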

Tony



Re: join

Stéphane Rollandin
In reply to this post by Ramon Leon-5
Ramon Leon wrote:
>> ', ' join: #('a' 'b' 'c')
>
> What?  Why would you do this,

that seems obvious to me:

it's ', ' that joins the items in #('a' 'b' 'c'), acting as a glue,
while the expression

#('a' 'b' 'c') join: ', '

does not make any sense when you read it.
we do "small talk" after all... not "reverse talk"


just my 2 cents

Stef

