[ENH] Problem with CharacterArray>>equalsAcrossPlatforms:

[ENH] Problem with CharacterArray>>equalsAcrossPlatforms:

Paul Baumann
CharacterArray>>#equalsAcrossPlatforms: is used to compare code
definition strings from different platforms without having to break up
and compare smaller parts of the string. Windows lines end with two
characters, Unix and Mac with one character. The method currently checks
the size of the string--which won't be equal across platforms. This is
one way to compare two strings that may contain line endings from
different platforms:

gbcEqualsAcrossPlatforms: aCharacterArray
        "Added by plb 2006.05.18: This is similar to #equalsAcrossPlatforms:
        but this one works comparing with UNIX and DOS line endings."

        | eols s1 c1 s2 c2 |
        aCharacterArray isCharacters ifFalse: [ ^false ].
        eols := Array
                with: Character lf
                with: Character cr
                with: (Character value: 11).
        s1 := ReadStream on: self.
        s2 := ReadStream on: aCharacterArray.
        [
                [s1 atEnd
                        ifTrue: [c1 := eols first. false]
                        ifFalse: [eols includes: (c1 := s1 next)]]
                                whileTrue: [c1 := eols first].
                [s2 atEnd
                        ifTrue: [c2 := eols first. false]
                        ifFalse: [eols includes: (c2 := s2 next)]]
                                whileTrue: [c2 := eols first].
                c1 = c2 and: [(s1 atEnd and: [s2 atEnd]) not]
        ] whileTrue: [].
        ^c1 = c2

It could be made faster using index positions rather than streams but
this was easier to implement and read.
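
For illustration, a minimal and untested sketch of what an index-based
variant might look like. It assumes the same rule as the stream version
(characters in eols are skipped on both sides before comparing). The
selector gbcEqualsAcrossPlatformsIndexed: is made up for this example
and is not part of any shipped code.

```smalltalk
gbcEqualsAcrossPlatformsIndexed: aCharacterArray
        "Hypothetical index-based variant of #gbcEqualsAcrossPlatforms:.
        Like the stream version, it skips line-end characters on both sides."

        | eols i j n m |
        aCharacterArray isCharacters ifFalse: [ ^false ].
        eols := Array
                with: Character lf
                with: Character cr
                with: (Character value: 11).
        i := 1.
        j := 1.
        n := self size.
        m := aCharacterArray size.
        [
                "Advance each index past any run of line-end characters."
                [i <= n and: [eols includes: (self at: i)]]
                        whileTrue: [i := i + 1].
                [j <= m and: [eols includes: (aCharacterArray at: j)]]
                        whileTrue: [j := j + 1].
                i <= n and: [j <= m]
        ] whileTrue: [
                "Both indexes now point at ordinary characters."
                (self at: i) = (aCharacterArray at: j) ifFalse: [ ^false ].
                i := i + 1.
                j := j + 1].
        "Equal only if both strings ran out at the same time."
        ^i > n and: [j > m]
```

This avoids creating the two ReadStreams but is otherwise meant to give
the same answers as the stream version.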

Paul Baumann
IntercontinentalExchange | ICE
2100 RiverEdge Pkwy | 5th Floor | Atlanta, GA 30328
Tel: 770.738.2137 | Fax: 770.951.1307 | Cel: 505.780.1470
[hidden email]

24-hour ice helpdesk 770.738.2101
www.theice.com
 
 
 
--------------------------------------------------------
This message may contain confidential information and is intended for specific recipients unless explicitly noted otherwise. If you have reason to believe you are not an intended recipient of this message, please delete it and notify the sender. This message may not represent the opinion of IntercontinentalExchange, Inc. (ICE), its subsidiaries or affiliates, and does not constitute a contract or guarantee. Unencrypted electronic mail is not secure and the recipient of this message is expected to provide safeguards from viruses and pursue alternate means of communication where privacy or a binding message is desired.  
 

Re: [ENH] Problem with CharacterArray>>equalsAcrossPlatforms:

Alan Knight-2
Thanks. I'm a bit confused about why the source wouldn't be stored internally in Smalltalk line-end convention, and who it is that uses character value 11 for line-end, but I've created AR 50970 for this.

At 09:54 AM 15/06/2006, Paul Baumann wrote:

>CharacterArray>>#equalsAcrossPlatforms: is used to compare code
>definition strings from different platforms without having to break up
>and compare smaller parts of the string. Windows lines end with two
>characters, Unix and Mac with one character. The method currently checks
>the size of the string--which won't be equal across platforms. This is
>one way to compare two strings that may contain line endings from
>different platforms:
>
>gbcEqualsAcrossPlatforms: aCharacterArray
>        "Added by plb 2006.05.18: This is similar to
>#equalsAcrossPlatforms: but this one works comparing with UNIX and DOS
>line endings."
>
>        | eols s1 c1 s2 c2 |
>        aCharacterArray isCharacters ifFalse: [ ^false ].
>        eols := Array with: Character lf with: Character cr with:
>(Character value: 11).
>        s1 := ReadStream on: self.
>        s2 := ReadStream on: aCharacterArray.
>        [
>                [s1 atEnd ifTrue: [c1 := eols first. false] ifFalse:
>[eols includes: (c1 := s1 next)]] whileTrue: [c1 := eols first].
>                [s2 atEnd ifTrue: [c2 := eols first. false] ifFalse:
>[eols includes: (c2 := s2 next)]] whileTrue: [c2 := eols first].
>                c1 = c2 and: [(s1 atEnd and: [s2 atEnd]) not]
>        ] whileTrue: [].
>        ^c1 = c2
>
>It could be made faster using index positions rather than streams but
>this was easier to implement and read.

--
Alan Knight [|], Cincom Smalltalk Development
[hidden email]
[hidden email]
http://www.cincom.com/smalltalk

"The Static Typing Philosophy: Make it fast. Make it right. Make it run." - Niall Ross

Re: [ENH] Problem with CharacterArray>>equalsAcrossPlatforms:

Reinout Heeck

On Jul 7, 2006, at 9:31 PM, Alan Knight wrote:

> Thanks. I'm a bit confused about why the source wouldn't be stored  
> internally in Smalltalk line-end convention,

Me too. This sounds like condoning broken code elsewhere and introducing
other models into an otherwise clean system.


> and who it is that uses character value 11 for line-end, but I've  
> created AR 50970 for this.

I haven't tested, but the code below does not seem to count line-ends,
so an empty string and one containing only a couple of CRs would compare
as equal.
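
One way to make line-ends count, as a rough and untested sketch. It
assumes that CR, LF, and the pair CR LF each stand for exactly one
line-end, and it leaves character value 11 out. Both selectors are
hypothetical names for this example; each would be a separate method on
CharacterArray.

```smalltalk
gbcEqualsCountingLineEnds: aCharacterArray
        "Hypothetical variant: line-ends are compared rather than skipped."

        | s1 s2 |
        aCharacterArray isCharacters ifFalse: [ ^false ].
        s1 := ReadStream on: self.
        s2 := ReadStream on: aCharacterArray.
        [s1 atEnd or: [s2 atEnd]] whileFalse: [
                (self gbcNextNormalizedFrom: s1)
                        = (self gbcNextNormalizedFrom: s2)
                                ifFalse: [ ^false ]].
        "Equal only if both streams are exhausted together."
        ^s1 atEnd and: [s2 atEnd]

gbcNextNormalizedFrom: aStream
        "Hypothetical helper: answer the next character, with LF, CR,
        and CR LF all answered as a single CR."

        | c |
        c := aStream next.
        c = Character lf ifTrue: [ ^Character cr ].
        c = Character cr ifTrue: [
                "Swallow the LF of a CR LF pair, if there is one."
                aStream peekFor: Character lf.
                ^Character cr ].
        ^c
```

With this, an empty string and a string of two CRs no longer compare as
equal, because the loop stops with only one stream at its end.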

R
-

>
> At 09:54 AM 15/06/2006, Paul Baumann wrote:
>> CharacterArray>>#equalsAcrossPlatforms: is used to compare code
>> definition strings from different platforms without having to break up
>> and compare smaller parts of the string. Windows lines end with two
>> characters, Unix and Mac with one character. The method currently  
>> checks
>> the size of the string--which won't be equal across platforms.  
>> This is
>> one way to compare two strings that may contain line endings from
>> different platforms:
>>
>> gbcEqualsAcrossPlatforms: aCharacterArray
>>        "Added by plb 2006.05.18: This is similar to
>> #equalsAcrossPlatforms: but this one works comparing with UNIX and  
>> DOS
>> line endings."
>>
>>        | eols s1 c1 s2 c2 |
>>        aCharacterArray isCharacters ifFalse: [ ^false ].
>>        eols := Array with: Character lf with: Character cr with:
>> (Character value: 11).
>>        s1 := ReadStream on: self.
>>        s2 := ReadStream on: aCharacterArray.
>>        [
>>                [s1 atEnd ifTrue: [c1 := eols first. false] ifFalse:
>> [eols includes: (c1 := s1 next)]] whileTrue: [c1 := eols first].
>>                [s2 atEnd ifTrue: [c2 := eols first. false] ifFalse:
>> [eols includes: (c2 := s2 next)]] whileTrue: [c2 := eols first].
>>                c1 = c2 and: [(s1 atEnd and: [s2 atEnd]) not]
>>        ] whileTrue: [].
>>        ^c1 = c2
>>
>> It could be made faster using index positions rather than streams but
>> this was easier to implement and read.