mentor question 4

Re: mentor question 4

Stéphane Ducasse


On 11 May 2020, at 23:19, Richard O'Keefe <[hidden email]> wrote:

I was saying that I expected #($a $b $c) asString ==> 'abc'.

To me that makes no sense.
In fact, I do not understand what asString is meant to be.


If you want something that can be read back, that's what #storeString is for.

On Tue, 12 May 2020 at 01:28, Stéphane Ducasse
<[hidden email]> wrote:



On 5 May 2020, at 16:16, Richard O'Keefe <[hidden email]> wrote:

By the way, while playing with this problem, I ran into a moderately
painful issue.

There is a reason that Smalltalk has both #printString (to get a
printable representation of an object) and #asString (to convert a
sequence to another kind of sequence with the same elements.)  If I
*want* #printString, I know where to find it.  The definition in my
Smalltalk now reads

  asString
    "What should #($a $b $c) do?
    - Blue Book, Inside Smalltalk, Apple Smalltalk-80:
      there is no #asString.
    - ANSI, VW, Dolphin, CSOM:
      #asString is defined on characters and strings
      (and things like file names and URIs that are sort of strings),
      so expect an error report.
    - VisualAge Smalltalk:
      '($a $b $c)'
    - Squeak and Pharo:
      '#($a $b $c)'
    - GNU Smalltalk, Smalltalk/X, and astc:
      'abc'
     I don't intend any gratuitous incompatibility, but when there
     is no consensus to be compatible with, one must pick something,
     and this seems most useful.
    "
    ^String withAll: self

Does anyone here know WHY Squeak and Pharo do what they do here?
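
To make the distinction above concrete, here is a minimal workspace sketch, assuming a Pharo-style image with a Transcript. It only contrasts printing the array with building a string from its elements; the variable name chars is purely illustrative.

```smalltalk
| chars |
chars := #($a $b $c).

"A printable representation of the array itself.
 The thread reports '#($a $b $c)' for Squeak and Pharo."
Transcript show: chars printString; cr.

"A string containing the same elements as the array: 'abc'."
Transcript show: (String withAll: chars); cr.
```

Spelling out String withAll: at the call site leaves no doubt which of the two meanings is intended, whatever #asString happens to do in a given dialect.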


Oops, I did not see the quotes on my screen.

#( a b c) asString
'#(#a #b #c)'

It is unclear to me why this is not good, but I have no strong opinion
that it is good either.

I worked on printString for literals because I wanted basic literals to
be self-evaluating, as in Scheme and other languages,
where
#t

#t

And I paid attention that we get the same for literal arrays.
The conversion question, though, is still open for me.

#($a $b $c) asString

'#($a $b $c)'

In fact, I do not really understand why

#($a $b $c) asString would be '(a b c)'
or what it would be used for.
If this is to display nicely in the UI, I would have
displayString do it.

S.



On Wed, 6 May 2020 at 01:20, Richard O'Keefe <[hidden email]> wrote:


The irony is that the code I was responding to ISN'T obviously correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain digits
AND SPACES.  The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means.  I took it to mean "may occur and
should simply mean the input is rejected as invalid."  Perhaps "may not occur"
was the intention.  So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks for spaces.
One reason this is an issue is that the behaviour of #digitValue is not
consistent between systems.
Character space digitValue
  does not exist in the ANSI standard
  answers -1 in many Smalltalks (which is a pain)
  answers a positive integer that can't be mistaken for a digit in my Smalltalk
  raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
    "This is in the Blue Book, but unspecified on non-digits.
     Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
     answer -1 for characters that are not digits (or ASCII letters),
     which is unfortunate but consistent with Inside Smalltalk
     which specifies this result for non-digits.
     ST/X and GST raise an exception which is worse.
     Digitalk ST/V documentation doesn't specify the result.
     This selector is *much* easier to use safely if it
     returns a 'large' (>= 36) value for non-digits."
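
Given the differences listed in that comment, one defensive option is to avoid sending #digitValue to anything that has not already been checked. The following is a sketch only: safeDigitValue is a hypothetical helper, not something from the thread or from any library, and it assumes Character supports #between:and: and #asInteger, as in Pharo and Squeak.

```smalltalk
| safeDigitValue |
"Hypothetical helper: answer 0..9 for an ASCII decimal digit and nil
 for anything else, without ever sending #digitValue to a non-digit."
safeDigitValue := [:aCharacter |
    (aCharacter between: $0 and: $9)
        ifTrue: [aCharacter asInteger - $0 asInteger]
        ifFalse: [nil]].

safeDigitValue value: $7.                "a digit: answers 7"
safeDigitValue value: Character space.   "not a digit: answers nil"
```

Answering nil forces the caller to decide what a non-digit means; answering a large integer instead, as the comment suggests, would be an equally reasonable design.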

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code":  6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
    ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]]) and: [
      |digitsReverse|
      digitsReverse := (aString select: [:each | each isDigit]) reverse.
      digitsReverse size > 1 and: [
        |evens odds evenSum oddSum|
        odds  := digitsReverse withIndexSelect: [:y :i | i odd].
        evens := digitsReverse withIndexSelect: [:x :i | i even].
        oddSum  := odds  detectSum: [:y | y digitValue].
        evenSum := evens detectSum: [:x |
                     #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
        (oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside existing
methods), I'd expect the "high level" code to be at least 10 times slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem.  I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference between taking slowness OUT
and simply not going out of your way to add slowness in the first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency.  If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections.  If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library.   It's just like
trying to write #mean so that it works for Durations as well as Numbers.
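
As an illustration of the single-traversal style described above, here is a sketch of a Luhn check that sends only #reverseDo: to its argument. This is NOT the "no added slowness" code that was benchmarked (that code does not appear in this thread); the block name luhn and the decision to treat any character other than a digit or a space as invalid are assumptions based on the problem statement quoted earlier.

```smalltalk
| luhn |
"Sketch only: a single right-to-left pass over the input."
luhn := [:aString | | sum count valid |
    sum := 0.
    count := 0.
    valid := true.
    aString reverseDo: [:ch |
        (ch between: $0 and: $9)
            ifTrue: [ | d |
                d := ch asInteger - $0 asInteger.
                count := count + 1.
                "Digits at odd positions from the right count as they are;
                 digits at even positions are doubled, minus 9 if above 9."
                sum := sum + (count odd
                    ifTrue: [d]
                    ifFalse: [#(0 2 4 6 8 1 3 5 7 9) at: d + 1])]
            ifFalse: [
                "Spaces are skipped; anything else makes the input invalid."
                ch = Character space ifFalse: [valid := false]]].
    valid and: [count > 1 and: [sum \\ 10 = 0]]].

luhn value: '4539 1488 0343 6467'.   "true"
luhn value: '8273 1232 7352 0569'.   "false"
luhn value: '0'.                     "false: fewer than two digits"
```

Because the receiver only needs to understand #reverseDo:, the same block would work for any object offering that protocol, which is the point being made about protocols rather than speed.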

Oh heck, I suppose I should point out that much of the overhead in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion.    There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <[hidden email]> wrote:


Richard,

Thank you for looking at the code. It is comforting to learn that the code has been executed for a large number of examples without breaking. The code is not primarily written for execution but for being read and checked by the human end user. It would be nice if we could also check that it gave the right answers, but I don't know how to do that.

The first question is: Can a human domain expert read the code and sign their name for its correctness?


When this is achieved, a programming expert will transcribe the first code to a professional quality program. This time, the second code should be reviewed by an independent programmer who signs their name for its correct transcription from the first version.

--Trygve

PS: In his 1980 Turing Award Lecture, Tony Hoare said: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskaug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <[hidden email]> wrote:

A coding experiment.
Consider a Scrum development environment. Every programming team has an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and agree that it is a correct rendering of their code checker:

  luhnTest: trialNumber
      | s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
  reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed digits and sum them to form the partial sum s1"
  s1 := 0.
  odd := true.
  reverse do:
      [:char |
          odd
              ifTrue: [
                  s1 := s1 + char digitValue.
              ].
              odd := odd not
      ].
"Taking the second, fourth ... and every other even digit in the reversed digits:
Multiply each digit by two and sum the digits if the answer is greater than nine to form partial sums for the even digits"
  "Subtracting 9 gives the same answer."
"Sum the partial sums of the even digits to form s2"
  s2 := 0.
  even := false.
  reverse do:
      [:char |
          even
              ifTrue: [
                  charValue := char digitValue * 2.
                  charValue > 9 ifTrue: [charValue := charValue - 9].
                  s2 := s2 + charValue
              ].
              even := even not
      ].
"If s1 + s2 ends in zero then the original number is in the form of a valid credit card number as verified by the Luhn test."
  ^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.
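
The comments in the method above say that doubling a digit and subtracting 9 when the result exceeds 9 gives the same answer as summing the digits of the doubled value. A small workspace sketch (not part of the attached code) can confirm that this rule produces exactly the lookup table #(0 2 4 6 8 1 3 5 7 9) used in the "high level" version discussed elsewhere in this thread:

```smalltalk
| table computed |
table := #(0 2 4 6 8 1 3 5 7 9).

"Apply the 'double, then subtract 9 if greater than 9' rule to 0..9."
computed := (0 to: 9) collect: [:digit | | doubled |
    doubled := digit * 2.
    doubled > 9 ifTrue: [doubled - 9] ifFalse: [doubled]].

"Compare element by element with the literal table."
Transcript show: (computed asArray = table) printString; cr.
```

For example, 7 doubles to 14, whose digits sum to 1 + 4 = 5, and 14 - 9 is also 5, which is the eighth entry of the table.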


P.S. code attached.


--

The essence of object orientation is that objects collaborate  to achieve a goal.
Trygve Reenskaug      mailto: [hidden email]
Morgedalsvn. 5A       http://folk.uio.no/trygver/
N-0378 Oslo             http://fullOO.info
Norway                     Tel: (+47) 468 58 625



--------------------------------------------
Stéphane Ducasse
http://stephane.ducasse.free.fr / http://www.pharo.org
03 59 35 87 52
Assistant: Julie Jonas
FAX 03 59 57 78 50
TEL 03 59 35 86 16
S. Ducasse - Inria
40, avenue Halley,
Parc Scientifique de la Haute Borne, Bât.A, Park Plaza
Villeneuve d'Ascq 59650
France




Re: mentor question 4

Stéphane Ducasse
In reply to this post by Richard Sargent (again)


On 11 May 2020, at 23:48, Richard Sargent <[hidden email]> wrote:



On May 11, 2020 2:19:49 PM PDT, Richard O'Keefe <[hidden email]> wrote:
I was saying that I expected #($a $b $c) asString ==> 'abc'.

Over the years, I found myself being opposed to the idea that all objects can sensibly have an #asString implementation. When it's been done, it ultimately caused more problems than it solved.

I agree and I would like to clean up this situation.
I would like to use displayString when you want to show something in a UI element,
and printString for a limited UI for debugging purposes.

Consider #(48 49 50) asString. Do you expect it to give you a string with all the digits? Or perhaps it's meant to interpret the elements as byte-like things, as you would need for "String withAll: aCollection". So, the numbers could be interpreted as codepoints, as they are in a ByteArray.

But, what does "(Array with: Object new with: ProcessScheduler) asString" mean?

It seems to me that having all objects understand #asString leads to confusion.

If you want an array to print as its literal representation, implement #printAsLiteral, so that your intention is clear.

+ 1000

S


Re: mentor question 4

Stéphane Ducasse
In reply to this post by Richard O'Keefe
Thanks, this is interesting.

S. 

On 12 May 2020, at 07:12, Richard O'Keefe <[hidden email]> wrote:

For what it's worth, here's a moderately thorough examination of several
different Smalltalk systems.
Let A = astc, D = Dolphin, G = GNU Smalltalk, P = Pharo, Q = Squeak,
    S = ANSI standard, T = Strongtalk, V = VisualWorks, X = ST/X

#asString

ADGPQST!X  Character => String with: self
ADGPQSTVX  String => self
ADGPQSTVX  Symbol => String withAll: self
.........  Text/Paragraph, where provided, => underlying string.
A-G-----X  Collection => String withAll: self
---PQ----  Collection => self printString
A........  ByteArray => "decoded as Unicode"
.DGPQ-TVX  ByteArray => "decoded as Latin1"
ADG.Q..VX  Filename => "unparse (Filename, FilePath, whatever)"
AdgpQ.TVX  URI => "unparse (ZnUrl in P)"
Ad.PQ...X  UUID => "unparse (d, GUID equivalent)"
.DGpq...X  Number => self displayString "or self printString, maybe via Object"
...PQ...X  Object => self printString

There's a rough consensus here:
 - if something *is* textual (Character, String, Symbol, Text,
   Paragraph) #asString returns it as a normal string
 - if something is a parsed form of structured text (file name,
   URI, UUID) its unparsed form is returned; converting it back
   should yield an equal parsed form
 - if something is a byte array, it may be converted to a string
   (the rule is to preserve the bytes; my library presumes byte
   arrays are UTF8-encoded)
 - if numbers are accepted at all, their #printString is returned
 - if it is none of the above, but has a #name or #title, that is
   commonly the result.

The open question is what converting an array of characters using
#asString will do.  There is at least one Smalltalk out there
where #(65 66 67) asString => 'ABC', but that's a step further
than I personally want, though it is consistent with ByteArray.
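
If the 'ABC' reading of an array of integers is what is wanted, it can be spelled out so that nothing depends on how a particular dialect defines #asString for collections. A minimal sketch, assuming only Character class >> value: and String class >> withAll:; the variable names are illustrative.

```smalltalk
| codePoints characters |
codePoints := #(65 66 67).

"Convert each code point to a Character explicitly..."
characters := codePoints collect: [:each | Character value: each].

"...then build the string from those characters: 'ABC'."
Transcript show: (String withAll: characters); cr.
```

Writing the conversion this way states the intention (integers are code points) at the call site instead of relying on an implicit convention.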




On Tue, 12 May 2020 at 11:16, Jerry Kott <[hidden email]> wrote:
Hi all,

I’ve been lurking so far, but I must add my voice here and agree with Richard.

The malleability of Smalltalk tempts people into implementing #asString, #name, and similar semantically ambiguous method names. Like Richard, I regretted every single time I (or someone else on my team before me) decided to use these. It goes back to 'Smalltalk with Style' (probably the best little red book you could ever own): write intention-revealing code whenever you can, and comment liberally when you can't.

Jerry Kott
This message has been digitally signed. 
PGP Fingerprint:
A9181736DD2F1B6CC7CF9E51AC8514F48C0979A5



On 11-05-2020, at 2:48 PM, Richard Sargent <[hidden email]> wrote:



On May 11, 2020 2:19:49 PM PDT, Richard O'Keefe <[hidden email]> wrote:
I was saying that I expected #($a $b $c) asString ==> 'abc'.

Over the years, I found myself being opposed to the idea that all objects can sensibly have an #asString implementation. When it's been done, it ultimately caused more problems than it solved.

Consider $(48 49 50) asString. Do you expect it to give you a string with all the digits? Or perhaps it's meant to interpret the elements as byte-like things, as you would need for "String withAll: aCollection". So, the numbers could be interpreted as codepoints, as they are in a ByteArray.

But, what does "(Array with: Object new with: ProcessScheduler) asString" mean?

It seems to me that having all objects understand #asString leads to confusion.

If you want an array to print as its literal representation, implement #printAsLiteral, so that your intention is clear.


If you want something that can be read back, that's what #storeString
is for,

On Tue, 12 May 2020 at 01:28, Stéphane Ducasse
<[hidden email]> wrote:



On 5 May 2020, at 16:16, Richard O'Keefe <[hidden email]> wrote:

By the way, while playing with this problem, I ran into a moderately
painful issue.

There is a reason that Smalltalk has both #printString (to get a
printable representation of an object) and #asString (to convert a
sequence to another kind of sequence with the same elements.)  If I
*want* #printString, I know where to find it.  The definition in my
Smalltalk no reads

  asString
    "What should #($a $b $c) do?
    - Blue Book, Inside Smalltalk, Apple Smalltalk-80:
      there is no #asString.
    - ANSI, VW, Dolphin, CSOM:
      #asString is defined on characters and strings
      (and things like file names and URIs that are sort of
strings),
      so expect an error report.
    - VisualAge Smalltalk:
      '($a $b $c)'
    - Squeak and Pharo:
      '#($a $b $c)'
    - GNU Smalltalk, Smalltalk/X, and astc:
      'abc'
     I don't intend any gratuitous incompatibility, but when there
     is no consensus to be compatible with, one must pick something,
     and this seems most useful.
    "
    ^String withAll: self

Does anyone here know WHY Squeak and Pharo do what they do here?


Oops I did not see the quotes on my screen..

#( a b c) asString
'#(#a #b #c)’

this is unclear to me why this is not good but I have no strong
opinion
that this is good.

I worked on printString for literals because I wanted to have
self evaluating properties for basic literal like in Scheme and
others.
where
#t

#t

And I payed attention that we get the same for literal arrays.
Now the conversion is open to me.

#($a $b $c) asString

'#($a $b $c)’

In fact I do not really understand why a string

#($a $b $c) asString would be '(a b c)’
and its use
if this is to nicely display in the ui I would have
displayString doing it.

S.



On Wed, 6 May 2020 at 01:20, Richard O'Keefe <[hidden email]>
wrote:


The irony is that the code I was responding to ISN'T obviously
correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain
digits
AND SPACES.  The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means.  I took it to mean "may
occur and
should simply mean the input is rejected as invalid."  Perhaps "may
not occur"
was the intention.  So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks
for spaces.
One reason this is an issue is that the behaviour of #digitValue is
not
consistent between systems.
Character space digitValue
  does not exist in the ANSI standard
  answers -1 in many Smalltalks (which is a pain)
  answers a positive integer that can't be mistake for a digit in my
Smalltalk
  raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
    "This is in the Blue Book, but unspecified on non-digits.
     Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
     answer -1 for characters that are not digits (or ASCII
letters),
     which is unfortunate but consistent with Inside Smalltalk
     which specifies this result for non-digits.
     ST/X and GST raise an exception which is worse.
     Digitalk ST/V documentation doesn't specify the result.
     This selector is *much* easier to use safely if it
     returns a 'large' (>= 36) value for non-digits."

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code":  6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
    ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]])
and: [
      |digitsReverse|
      digitsReverse := (aString select: [:each | each isDigit])
reverse.
      digitsReverse size > 1 and: [
        |evens odds evenSum oddSum|
        odds  := digitsReverse withIndexSelect: [:y :i | i odd].
        evens := digitsReverse withIndexSelect: [:x :i | i even].
        oddSum  := odds  detectSum: [:y | y digitValue].
        evenSum := evens detectSum: [:x |
                     #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
        (oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside
existing
methods), I'd expect the "high level" code to be at least 10 times
slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem.  I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is
AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference between taking
slowness OUT and simply not going out of your way to add slowness in the
first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency.  If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections.  If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library.   It's just like
trying to write #mean so that it works for Durations as well as Numbers.
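
To illustrate the single-traversal idea, here is a minimal sketch of a Luhn check that sends only #reverseDo: to its argument. It is not the "Don't add slowness" code benchmarked above, which is not shown in this thread; the block name luhnCheck and its temporaries are made up for the example, and a Pharo-like image is assumed. Spaces are skipped, any other non-digit rejects the input, and fewer than two digits is invalid, following the problem statement quoted in this thread.

```smalltalk
| luhnCheck |
luhnCheck := [:aString |
    | sum digitCount doubleIt valid |
    sum := 0.
    digitCount := 0.
    doubleIt := false.      "the rightmost digit is not doubled"
    valid := true.
    aString reverseDo: [:ch |
        ch isDigit
            ifTrue: [
                | d |
                d := ch digitValue.
                doubleIt ifTrue: [
                    d := d * 2.
                    d > 9 ifTrue: [d := d - 9]].
                sum := sum + d.
                digitCount := digitCount + 1.
                doubleIt := doubleIt not]
            ifFalse: [
                "spaces are ignored; anything else makes the input invalid"
                ch = Character space ifFalse: [valid := false]]].
    valid and: [digitCount > 1 and: [sum \\ 10 = 0]]].

luhnCheck value: '49927398716'.        "true"
luhnCheck value: '4992 7398 716'.      "true"
luhnCheck value: '49927398717'.        "false"
luhnCheck value: '1234567812345670'.   "true"
luhnCheck value: '0'.                  "false (only one digit)"
```

The sample numbers are the well-known Rosetta Code test values for the Luhn test. Testing isDigit before sending #digitValue sidesteps the dialect differences in what #digitValue answers for non-digits.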

Oh heck, I suppose I should point out that much of the overhead in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion.    There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <[hidden email]> wrote:


Richard,

Thank you for looking at the code. It is comforting to learn that the
code has been executed for a large number of examples without breaking.
The code is not primarily written for execution but for being read and
checked by the human end user. It would be nice if we could also check
that it gave the right answers, but I don't know how to do that.

The first question is: Can a human domain expert read the code and
sign their name for its correctness?


When this is achieved, a programming expert will transcribe the first
code to a professional quality program. This time, the second code
should be reviewed by an independent programmer who signs their name
for its correct transcription from the first version.

--Trygve

PS: In his 1980 Turing Award Lecture, Tony Hoare said: "There are two
ways of constructing a software design: One way is to make it so simple
that there are obviously no deficiencies and the other is to make it so
complicated that there are no obvious deficiencies. The first method is
far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskaug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <[hidden email]> wrote:

A coding experiment.
Consider a Scrum development environment. Every programming team has
an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and
agree that it is a correct rendering of their code checker:

  luhnTest: trialNumber
      | s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
  reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed
digits and sum them to form the partial sum s1"
  s1 := 0.
  odd := true.
  reverse do:
      [:char |
          odd
              ifTrue: [
                  s1 := s1 + char digitValue.
              ].
              odd := odd not
      ].
"Taking the second, fourth ... and every other even digit in the
reversed digits:
Multiply each digit by two and sum the digits if the answer is
greater than nine to form partial sums for the even digits"
  "The subtracting 9 gives the same answer. "
"Sum the partial sums of the even digits to form s2"
  s2 := 0.
  even := false.
  reverse do:
      [:char |
          even
              ifTrue: [
                  charValue := char digitValue * 2.
                  charValue > 9 ifTrue: [charValue := charValue - 9].
                  s2 := s2 + charValue
              ].
              even := even not
      ].
"If s1 + s2 ends in zero then the original number is in the form of a
valid credit card number as verified by the Luhn test."
  ^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code
right without altering the algorithm (refactoring). The result should
be readable and follow the team's conventions.


P.S. code attached.


--

The essence of object orientation is that objects collaborate  to
achieve a goal.
Trygve Reenskaug      mailto: [hidden email]
Morgedalsvn. 5A       http://folk.uio.no/trygver/
N-0378 Oslo             http://fullOO.info
Norway                     Tel: (+47) 468 58 625



--------------------------------------------
Stéphane Ducasse
http://stephane.ducasse.free.fr / http://www.pharo.org
03 59 35 87 52
Assistant: Julie Jonas
FAX 03 59 57 78 50
TEL 03 59 35 86 16
S. Ducasse - Inria
40, avenue Halley,
Parc Scientifique de la Haute Borne, Bât.A, Park Plaza
Villeneuve d'Ascq 59650
France






Re: mentor question 4

Richard O'Keefe
In reply to this post by Stéphane Ducasse
Why does it make no sense?
To a first approximation,
 aCollection asWHATEVER
=
 WHATEVER withAll: aCollection

Consider
 'CAB' asSet
 'CAB' asArray
 'CAB' asSortedCollection
 'CAB' asBag
and so on.  They all mean "an instance of the class named
in the selector with the elements of the receiver."
Why should #asString be any different?

Why on earth does #($a $b $c) asString make no sense
when #($a $b $c) as: String
-- which we expect to do the same thing -- DOES apparently
make sense?

Note that when I'm talking about #asString, I haven't the slightest
thought in my mind about user interfaces.  I am talking about a
selector in the 'converting' category which I expect to act just
like other methods in the same category.  I am talking about
expecting _ as: String and _ asString to be consistent the way
that _ as: Array and _ asArray are.  I am talking about "if it
is OK to convert a String to an Array, why can't I convert that
Array back to a String using the same pattern of code"?

#printString exists.
#storeString exists.
#displayString exists -- although I have never seen a clear description of exactly what it is supposed to do and how it is supposed to differ from #printString, and it is certainly not portable.

If I want the effect of any one of these, I am going to use it and NOT #asString.  If I want to convert an Array or a Set to a String with the same elements, I am going to expect #asString to do it.  When #asString is inconsistent with all the other #asWHATEVER methods, I am going to regard it as a bug.

How about a compromise?

Collection>>asString
  ^(self allSatisfy: [:each | each isCharacter])
     ifTrue: [String withAll: self]
     ifFalse: [super asString]
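
For readers following along, a small workspace sketch of the expressions under discussion. The results in the comments are the ones reported or expected in this thread, not output checked against any particular image, and the last two lines assume the compromise method above has been installed.

```smalltalk
"Element-wise conversions that follow the asWHATEVER pattern:"
'CAB' asArray.                    "an Array holding $C $A $B"
String withAll: #($a $b $c).      "'abc'"
#($a $b $c) as: String.           "expected to answer 'abc'"

"The disputed case. Per the survey quoted in this thread,
 Squeak and Pharo answer '#($a $b $c)', while GNU Smalltalk,
 Smalltalk/X and astc answer 'abc'."
#($a $b $c) asString.

"With the compromise installed (an assumption of this sketch):"
#($a $b $c) asString.             "'abc', since every element is a Character"
#(1 2 3) asString.                "falls back to whatever the inherited #asString answers"
```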







Re: mentor question 4

Richard O'Keefe
PS: I should make it clear that I do not expect anArray asString
to work for *all* arrays.  In my Smalltalk, #(true false) asString
raises an exception, and so does #(65 66 67) asString.



Re: mentor question 4

Ben Coman
In reply to this post by Richard O'Keefe


On Fri, 15 May 2020 at 20:43, Richard O'Keefe <[hidden email]> wrote:
Why does it make no sense?
To a first approximation,
 aCollection asWHATEVER
=
 WHATEVER withAll: aCollection

Consider
 'CAB' asSet
 'CAB' asArray
 'CAB' asSortedCollection
 'CAB' asBag
and so on.  They all mean "an instance of the class named
in the selector with the elements of the receiver."
Why should #asString be any different?

Good point.  

There may be counter arguments (I've only considered this briefly)
but where possible it seems reasonable that the asXXXX selectors should be able to round-trip.
i.e... 'abc' asByteArray asString ==> 'abc'
in which case... 'abc' asArray asString ==> '#($a $b $c)' seems wrong.
That result seems more suitable to #printString or #displayString,
particularly considering Richard's next insight...
 
Why on earth does #($a $b $c) asString make no sense
when #($a $b $c) as: String
-- which we expect to do the same thing -- DOES apparently
make sense?

cheers -ben
 