|
thanks this is interesting.
S.
For what it's worth, here's a moderately thorough examination of several different Smalltalk systems. Let A = astc, D = Dolphin, G = GNU Smalltalk, Q = Squeak, S = ANSI standard, T = Strongtalk, V = VisualWorks, X = ST/X #asString ADGPQST!X Character => String with: self ADGPQSTVX String => self ADGPQSTVX Symbol => String withAll: self ......... Text/Paragraph, where provided, => underlying string. A-G-----X Collection => String withAll: self ---PQ---- Collection => self printString A........ ByteArray => "decoded as Unicode" .DGPQ-TVX ByteArray => "decoded as Latin1" ADG.Q..VX Filename => "unparse (Filename, FilePath, whatever) AdgpQ.TVX URI => "unparse (ZnUrl in P)Ad.PQ...X UUID => "unparse (d, GUID equivalent)" .DGpq...X Number => self displayString "or self printString, maybe via Object" ...PQ...X Object => self printString There's a rough consensus here: - if something *is* textual (Character, String, Symbol, Text, Paragraph) #asString returns it as a normal string - if something is a parsed form of structured text (file name, URI, UUID) its unparsed form is returned, converting it back should yield an equal parsed form - if something is a byte array, it may be converted to a string (the rule is to preserve the bytes, my library presumes byte arrays are UTF8-encoded) - if numbers are accepted at all, their #printString is returned - if is none of the above, but has a #name or #title, that is commonly the result.
The open question is what converting an array of characters using #asString will do. There is at least one Smalltalk out there where #(65 66 67) asString => 'ABC', but that's a step further than I personally want, though it is consistent with ByteArray.
Hi all,
I’ve been lurking so far, but I must add my voice here and agree with Richard.
The malleability of Smalltalk tempts people into implementing #asString, #name, and similar semantically ambiguous method names. Like Richard, I regretted every single time I (or someone else on my team before me) decided to use these. It goes back to ’Smalltalk with Style’ (probably the best little red book you could ever own): write intention revealing code whenever you can, and comment liberally when you can’t.
Jerry Kott This message has been digitally signed. PGP Fingerprint: A9181736DD2F1B6CC7CF9E51AC8514F48C0979A5
On May 11, 2020 2:19:49 PM PDT, Richard O'Keefe < [hidden email]> wrote: I was saying that I expected #($a $b $c) asString ==> 'abc'.
Over the years, I found myself being opposed to the idea that all objects can sensibly have an #asString implementation. When it's been done, it ultimately caused more problems than it solved. Consider $(48 49 50) asString. Do you expect it to give you a string with all the digits? Or perhaps it's meant to interpret the elements as byte-like things, as you would need for "String withAll: aCollection". So, the numbers could be interpreted as codepoints, as they are in a ByteArray. But, what does "(Array with: Object new with: ProcessScheduler) asString" mean? It seems to me that having all objects understand #asString leads to confusion. If you want an array to print as its literal representation, implement #printAsLiteral, so that your intention is clear. If you want something that can be read back, that's what #storeString is for,
On Tue, 12 May 2020 at 01:28, Stéphane Ducasse <[hidden email]> wrote:
On 5 May 2020, at 16:16, Richard O'Keefe <[hidden email]> wrote:
By the way, while playing with this problem, I ran into a moderately painful issue.
There is a reason that Smalltalk has both #printString (to get a printable representation of an object) and #asString (to convert a sequence to another kind of sequence with the same elements.) If I *want* #printString, I know where to find it. The definition in my Smalltalk no reads
asString "What should #($a $b $c) do? - Blue Book, Inside Smalltalk, Apple Smalltalk-80: there is no #asString. - ANSI, VW, Dolphin, CSOM: #asString is defined on characters and strings (and things like file names and URIs that are sort of
strings),
so expect an error report. - VisualAge Smalltalk: '($a $b $c)' - Squeak and Pharo: '#($a $b $c)' - GNU Smalltalk, Smalltalk/X, and astc: 'abc' I don't intend any gratuitous incompatibility, but when there is no consensus to be compatible with, one must pick something, and this seems most useful. " ^String withAll: self
Does anyone here know WHY Squeak and Pharo do what they do here?
Oops I did not see the quotes on my screen..
#( a b c) asString
'#(#a #b #c)’
this is unclear to me why this is not good but I have no strong
opinion
that this is good.
I worked on printString for literals because I wanted to have self evaluating properties for basic literal like in Scheme and
others.
where #t
#t
And I payed attention that we get the same for literal arrays. Now the conversion is open to me.
#($a $b $c) asString
'#($a $b $c)’
In fact I do not really understand why a string
#($a $b $c) asString would be '(a b c)’ and its use if this is to nicely display in the ui I would have displayString doing it.
S.
On Wed, 6 May 2020 at 01:20, Richard O'Keefe <[hidden email]>
wrote:
The irony is that the code I was responding to ISN'T obviously
correct.
Indeed, I found it rather puzzling. The problem specification says that the input string may contain
digits
AND SPACES. The original message includes this:
Strings of length 1 or less are not valid. Spaces are allowed in the input, but they should be stripped before checking. All other non-digit characters are disallowed.
Now it isn't clear what "disallowed" means. I took it to mean "may
occur and
should simply mean the input is rejected as invalid." Perhaps "may
not occur"
was the intention. So we shall not quibble about such characters.
But I can't for the life of me figure out how Trygve's code checks
for spaces.
One reason this is an issue is that the behaviour of #digitValue is
not
consistent between systems. Character space digitValue does not exist in the ANSI standard answers -1 in many Smalltalks (which is a pain) answers a positive integer that can't be mistake for a digit in my
Smalltalk
raises an exception in some Smalltalks.
This is a comment I now have in my Smalltalk library for #digitValue "This is in the Blue Book, but unspecified on non-digits. Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80 answer -1 for characters that are not digits (or ASCII
letters),
which is unfortunate but consistent with Inside Smalltalk which specifies this result for non-digits. ST/X and GST raise an exception which is worse. Digitalk ST/V documentation doesn't specify the result. This selector is *much* easier to use safely if it returns a 'large' (>= 36) value for non-digits."
Let's compare three versions, the two I compared last time, and the "version A" code I discussed before, which to my mind is fairly readable.
"Don't add slowness": 1 (normalised time) "Trygve's code": 6.5 "High level code": 30.6 (or 4.7 times slower than Trygve's)
Here's the "High level code". ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]])
and: [
|digitsReverse| digitsReverse := (aString select: [:each | each isDigit])
reverse.
digitsReverse size > 1 and: [ |evens odds evenSum oddSum| odds := digitsReverse withIndexSelect: [:y :i | i odd]. evens := digitsReverse withIndexSelect: [:x :i | i even]. oddSum := odds detectSum: [:y | y digitValue]. evenSum := evens detectSum: [:x | #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1]. (oddSum + evenSum) \\ 10 = 0]]
This is the kind of code I was recommending that Roelof write.
As a rough guide, by counting traversals (including ones inside
existing
methods), I'd expect the "high level" code to be at least 10 times
slower
than the "no added slowness" code.
We are in vehement agreement that there is a time to write high level really obvious easily testable and debuggable code, and that's most of the time, especially with programming exercises.
I hope that we are also in agreement that factors of 30 (or even 6) *can* be a serious problem. I mean, if I wanted something that slow, I'd use Ruby.
I hope we are also agreed that (with the exception of investigations like this one) the time to hack on something to make it faster is
AFTER
you have profiled it and determined that you have a problem.
But I respectfully suggest that there is a difference taking slowness
OUT
and simply not going out of your way to add slowness in the first
place.
I'd also like to remark that my preference for methods that traverse
a
sequence exactly once has more to do with Smalltalk protocols than with efficiency. If the only method I perform on an object is #do: the method will work just as well for readable streams as for collections. If the only method I perform on an object is
#reverseDo:
the method will work just as well for Read[Write]Streams as for SequenceReadableCollections, at least in my library. It's just like trying to write #mean so that it works for Durations as well as
Numbers.
Oh heck, I suppose I should point out that much of the overheads in this case could be eliminated by a Self-style compiler doing dynamic inlining + loop fusion. There's no reason *in principle*, given
enough
people, money, and time, that the differences couldn't be greatly reduced in Pharo.
On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <[hidden email]>
wrote:
Richard,
Thank you for looking at the code. It is comforting to learn that the
code has been executed for a large number of examples without breaking. The code is not primarily written for execution but for being read and checked by the human end user. It would be nice if we could also check that it gave the right answers, but I don't know how to do that.
The first question is: Can a human domain expert read the code and
sign their name for its correctness?
When this is achieved, a programming expert will transcribe the first
code to a professional quality program. This time, the second code should be reviewed by an independent programmer who signs their name for its correct transcription from the first version.
--Trygve
PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two
ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."
--Trygve
On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:
As a coding experiment, I adapted Trygve Reenskoug's code to my Smalltalk compiler, put in my code slightly tweaked, and benchmarked them on randomly generated data.
Result: a factor of 6.3.
In Squeak it was a factor of ten.
I had not, in all honesty, expected it to to be so high.
On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <[hidden email]>
wrote:
A coding experiment. Consider a Scrum development environment. Every programming team has
an end user as a member.
The team's task is to code a credit card validity check. A first goal is that the user representative shall read the code and
agree that it is a correct rendering of their code checker:
luhnTest: trialNumber | s1 odd s2 even charValue reverse | ----------------------------------------------- " Luhn test according to Rosetta" "Reverse the order of the digits in the number." reverse := trialNumber reversed. "Take the first, third, ... and every other odd digit in the reversed
digits and sum them to form the partial sum s1"
s1 := 0. odd := true. reverse do: [:char | odd ifTrue: [ s1 := s1 + char digitValue. ]. odd := odd not ]. "Taking the second, fourth ... and every other even digit in the
reversed digits:
Multiply each digit by two and sum the digits if the answer is
greater than nine to form partial sums for the even digits"
"The subtracting 9 gives the same answer. " "Sum the partial sums of the even digits to form s2" s2 := 0. even := false. reverse do: [:char | even ifTrue: [ charValue := char digitValue * 2. charValue > 9 ifTrue: [charValue := charValue -
9].
s2 := s2 + charValue ]. even := even not ]. "If s1 + s2 ends in zero then the original number is in the form of a
valid credit card number as verified by the Luhn test."
^(s1 + s2) asString last = $0 --------------------------------- Once this step is completed, the next step will be to make the code
right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.
P.S. code attached.
--
The essence of object orientation is that objects collaborate to
achieve a goal.
Trygve Reenskaug mailto: [hidden email] Morgedalsvn. 5A http://folk.uio.no/trygver/ N-0378 Oslo http://fullOO.info Norway Tel: (+47) 468 58 625
-------------------------------------------- Stéphane Ducasse http://stephane.ducasse.free.fr / http://www.pharo.org 03 59 35 87 52 Assistant: Julie Jonas FAX 03 59 57 78 50 TEL 03 59 35 86 16 S. Ducasse - Inria 40, avenue Halley, Parc Scientifique de la Haute Borne, Bât.A, Park Plaza Villeneuve d'Ascq 59650 France
-------------------------------------------- Stéphane Ducasse 03 59 35 87 52 Assistant: Julie Jonas FAX 03 59 57 78 50 TEL 03 59 35 86 16 S. Ducasse - Inria 40, avenue Halley, Parc Scientifique de la Haute Borne, Bât.A, Park Plaza Villeneuve d'Ascq 59650 France
|