Smalltalk › Squeak › Squeak - Dev

Re: [Pharo-users] mentor question 4

_‹ Previous Topic Next Topic _›

Classic

List

Threaded

5 messages Options

Trygve

Re: [Pharo-users] mentor question 4

Richard,

Thank you for looking at the code. It is comforting to learn that the code has been executed for a large number of examples without breaking. The code is not primarily written for execution but for being read and checked by the human end user. It would be nice if we could also check that it gave the right answers, but I don't know how to do that.

The first question is: Can a human domain expert read the code and sign their name for its correctness?

When this is achieved, a programming expert will transcribe the first code to a professional quality program. This time, the second code should be reviewed by an independent programmer who signs their name for its correct transcription from the first version.

--Trygve

PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve  Reenskoug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug [hidden email] wrote:

 A coding experiment.
Consider a Scrum development environment. Every programming team has an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and agree that it is a correct rendering of their code checker:

    luhnTest: trialNumber
        | s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
    reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed digits and sum them to form the partial sum s1"
    s1 := 0.
    odd := true.
    reverse do:
        [:char |
            odd
                ifTrue: [
                    s1 := s1 + char digitValue.
                ].
                odd := odd not
        ].
"Taking the second, fourth ... and every other even digit in the reversed digits:
Multiply each digit by two and sum the digits if the answer is greater than nine to form partial sums for the even digits"
    "The subtracting 9 gives the same answer. "
"Sum the partial sums of the even digits to form s2"
    s2 := 0.
    even := false.
    reverse do:
        [:char |
            even
                ifTrue: [
                    charValue := char digitValue * 2.
                    charValue > 9 ifTrue: [charValue := charValue - 9].
                    s2 := s2 + charValue
                ].
                even := even not
        ].
"If s1 + s2 ends in zero then the original number is in the form of a valid credit card number as verified by the Luhn test."
    ^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.


P.S. code attached.

The essence of object orientation is that objects collaborate to achieve a goal.
Trygve Reenskaug      [hidden email]
Morgedalsvn. 5A       http://folk.uio.no/trygver/
N-0378 Oslo             http://fullOO.info
Norway                     Tel: (+47) 468 58 625

Trygve

Re: [Pharo-users] mentor question 4

Unfortunately, I have lost the original problem specification, so I have answered a different question. Could you please help me by repeating the original problem specification in full?

I agree that my solution is not 'obviously correct' since it doesn't answer the right question. In one version of the solution, I had a prologue that catered for spaces, etc. Not having a textual form of the algorithm that provided for the extensions, I chose to omit them.

On tirsdag.05.05.2020 15:20, Richard O'Keefe wrote:

The irony is that the code I was responding to ISN'T obviously correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain digits
AND SPACES.  The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means.  I took it to mean "may occur and
should simply mean the input is rejected as invalid."  Perhaps "may not occur"
was the intention.  So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks for spaces.
One reason this is an issue is that the behaviour of #digitValue is not
consistent between systems.
  Character space digitValue
    does not exist in the ANSI standard
    answers -1 in many Smalltalks (which is a pain)
    answers a positive integer that can't be mistake for a digit in my Smalltalk
    raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
      "This is in the Blue Book, but unspecified on non-digits.
       Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
       answer -1 for characters that are not digits (or ASCII letters),
       which is unfortunate but consistent with Inside Smalltalk
       which specifies this result for non-digits.
       ST/X and GST raise an exception which is worse.
       Digitalk ST/V documentation doesn't specify the result.
       This selector is *much* easier to use safely if it
       returns a 'large' (>= 36) value for non-digits."

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code":  6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
      ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]]) and: [
        |digitsReverse|
        digitsReverse := (aString select: [:each | each isDigit]) reverse.
        digitsReverse size > 1 and: [
          |evens odds evenSum oddSum|
          odds  := digitsReverse withIndexSelect: [:y :i | i odd].
          evens := digitsReverse withIndexSelect: [:x :i | i even].
          oddSum  := odds  detectSum: [:y | y digitValue].
          evenSum := evens detectSum: [:x |
                       #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
          (oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside existing
methods), I'd expect the "high level" code to be at least 10 times slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem.  I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference taking slowness OUT
and simply not going out of your way to add slowness in the first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency.  If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections.  If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library.   It's just like
trying to write #mean so that it works for Durations as well as Numbers.

Oh heck, I suppose I should point out that much of the overheads in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion.    There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug [hidden email] wrote:

Richard,

The first question is: Can a human domain expert read the code and sign their name for its correctness?

--Trygve

PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskoug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug [hidden email] wrote:

A coding experiment.
Consider a Scrum development environment. Every programming team has an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and agree that it is a correct rendering of their code checker:

luhnTest: trialNumber
| s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed digits and sum them to form the partial sum s1"
s1 := 0.
odd := true.
reverse do:
[:char |
odd
ifTrue: [
s1 := s1 + char digitValue.
].
odd := odd not
].
"Taking the second, fourth ... and every other even digit in the reversed digits:
Multiply each digit by two and sum the digits if the answer is greater than nine to form partial sums for the even digits"
"The subtracting 9 gives the same answer. "
"Sum the partial sums of the even digits to form s2"
s2 := 0.
even := false.
reverse do:
[:char |
even
ifTrue: [
charValue := char digitValue * 2.
charValue > 9 ifTrue: [charValue := charValue - 9].
s2 := s2 + charValue
].
even := even not
].
"If s1 + s2 ends in zero then the original number is in the form of a valid credit card number as verified by the Luhn test."
^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.

P.S. code attached.

The essence of object orientation is that objects collaborate to achieve a goal.
Trygve Reenskaug mailto: [hidden email]
Morgedalsvn. 5A http://folk.uio.no/trygver/
N-0378 Oslo http://fullOO.info
Norway Tel: (+47) 468 58 625

Stéphane Ducasse

Re: [Pharo-users] mentor question 4

In reply to this post by Trygve

By the way, while playing with this problem, I ran into a moderately
painful issue.

There is a reason that Smalltalk has both #printString (to get a
printable representation of an object) and #asString (to convert a
sequence to another kind of sequence with the same elements.) If I
*want* #printString, I know where to find it. The definition in my
Smalltalk no reads

   asString
     "What should #($a $b $c) do?
     - Blue Book, Inside Smalltalk, Apple Smalltalk-80:
       there is no #asString.
     - ANSI, VW, Dolphin, CSOM:
       #asString is defined on characters and strings
       (and things like file names and URIs that are sort of strings),
       so expect an error report.
     - VisualAge Smalltalk:
       '($a $b $c)'
     - Squeak and Pharo:
       '#($a $b $c)'
     - GNU Smalltalk, Smalltalk/X, and astc:
       'abc'
      I don't intend any gratuitous incompatibility, but when there
      is no consensus to be compatible with, one must pick something,
      and this seems most useful.
     "
     ^String withAll: self

The asString looks wrong to me. I will report it as a bug.

Now in Pharo we should really introduce display string

#($a $b $c)

Does anyone here know WHY Squeak and Pharo do what they do here?

On Wed, 6 May 2020 at 01:20, Richard O'Keefe <[hidden email]> wrote:

The irony is that the code I was responding to ISN'T obviously correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain digits
AND SPACES. The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means. I took it to mean "may occur and
should simply mean the input is rejected as invalid." Perhaps "may not occur"
was the intention. So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks for spaces.
One reason this is an issue is that the behaviour of #digitValue is not
consistent between systems.
Character space digitValue
   does not exist in the ANSI standard
   answers -1 in many Smalltalks (which is a pain)
   answers a positive integer that can't be mistake for a digit in my Smalltalk
   raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
     "This is in the Blue Book, but unspecified on non-digits.
      Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
      answer -1 for characters that are not digits (or ASCII letters),
      which is unfortunate but consistent with Inside Smalltalk
      which specifies this result for non-digits.
      ST/X and GST raise an exception which is worse.
      Digitalk ST/V documentation doesn't specify the result.
      This selector is *much* easier to use safely if it
      returns a 'large' (>= 36) value for non-digits."

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code": 6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
     ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]]) and: [
       |digitsReverse|
       digitsReverse := (aString select: [:each | each isDigit]) reverse.
       digitsReverse size > 1 and: [
         |evens odds evenSum oddSum|
         odds := digitsReverse withIndexSelect: [:y :i | i odd].
         evens := digitsReverse withIndexSelect: [:x :i | i even].
         oddSum := odds detectSum: [:y | y digitValue].
         evenSum := evens detectSum: [:x |
                      #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
         (oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside existing
methods), I'd expect the "high level" code to be at least 10 times slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem. I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference taking slowness OUT
and simply not going out of your way to add slowness in the first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency. If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections. If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library.   It's just like
trying to write #mean so that it works for Durations as well as Numbers.

Oh heck, I suppose I should point out that much of the overheads in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion.    There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <[hidden email]> wrote:

Richard,

Thank you for looking at the code. It is comforting to learn that the code has been executed for a large number of examples without breaking. The code is not primarily written for execution but for being read and checked by the human end user. It would be nice if we could also check that it gave the right answers, but I don't know how to do that.

The first question is: Can a human domain expert read the code and sign their name for its correctness?

When this is achieved, a programming expert will transcribe the first code to a professional quality program. This time, the second code should be reviewed by an independent programmer who signs their name for its correct transcription from the first version.

--Trygve

PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskoug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <[hidden email]> wrote:

A coding experiment.
Consider a Scrum development environment. Every programming team has an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and agree that it is a correct rendering of their code checker:

   luhnTest: trialNumber
       | s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
   reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed digits and sum them to form the partial sum s1"
   s1 := 0.
   odd := true.
   reverse do:
       [:char |
           odd
               ifTrue: [
                   s1 := s1 + char digitValue.
               ].
               odd := odd not
       ].
"Taking the second, fourth ... and every other even digit in the reversed digits:
Multiply each digit by two and sum the digits if the answer is greater than nine to form partial sums for the even digits"
   "The subtracting 9 gives the same answer. "
"Sum the partial sums of the even digits to form s2"
   s2 := 0.
   even := false.
   reverse do:
       [:char |
           even
               ifTrue: [
                   charValue := char digitValue * 2.
                   charValue > 9 ifTrue: [charValue := charValue - 9].
                   s2 := s2 + charValue
               ].
               even := even not
       ].
"If s1 + s2 ends in zero then the original number is in the form of a valid credit card number as verified by the Luhn test."
   ^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.

P.S. code attached.

--

The essence of object orientation is that objects collaborate to achieve a goal.
Trygve Reenskaug      mailto: [hidden email]
Morgedalsvn. 5A       http://folk.uio.no/trygver/
N-0378 Oslo             http://fullOO.info
Norway                     Tel: (+47) 468 58 625

--------------------------------------------

Stéphane Ducasse

http://stephane.ducasse.free.fr / http://www.pharo.org

03 59 35 87 52

Assistant: Julie Jonas

FAX 03 59 57 78 50

TEL 03 59 35 86 16

S. Ducasse - Inria

40, avenue Halley,

Parc Scientifique de la Haute Borne, Bât.A, Park Plaza

Villeneuve d'Ascq 59650

France

Stéphane Ducasse

Re: [Pharo-users] mentor question 4

In reply to this post by Trygve

On 5 May 2020, at 16:16, Richard O'Keefe <[hidden email]> wrote:

By the way, while playing with this problem, I ran into a moderately
painful issue.

There is a reason that Smalltalk has both #printString (to get a
printable representation of an object) and #asString (to convert a
sequence to another kind of sequence with the same elements.) If I
*want* #printString, I know where to find it. The definition in my
Smalltalk no reads

   asString
     "What should #($a $b $c) do?
     - Blue Book, Inside Smalltalk, Apple Smalltalk-80:
       there is no #asString.
     - ANSI, VW, Dolphin, CSOM:
       #asString is defined on characters and strings
       (and things like file names and URIs that are sort of strings),
       so expect an error report.
     - VisualAge Smalltalk:
       '($a $b $c)'
     - Squeak and Pharo:
       '#($a $b $c)'
     - GNU Smalltalk, Smalltalk/X, and astc:
       'abc'
      I don't intend any gratuitous incompatibility, but when there
      is no consensus to be compatible with, one must pick something,
      and this seems most useful.
     "
     ^String withAll: self

Does anyone here know WHY Squeak and Pharo do what they do here?

Oops I did not see the quotes on my screen..

#( a b c) asString

>>> '#(#a #b #c)’

this is unclear to me why this is not good but I have no strong opinion

that this is good.

I worked on printString for literals because I wanted to have

self evaluating properties for basic literal like in Scheme and others.

where

>>>

And I payed attention that we get the same for literal arrays.

Now the conversion is open to me.

#($a $b $c) asString

>>>

'#($a $b $c)’

In fact I do not really understand why a string

#($a $b $c) asString would be '(a b c)’

and its use

if this is to nicely display in the ui I would have

displayString doing it.

On Wed, 6 May 2020 at 01:20, Richard O'Keefe <[hidden email]> wrote:

The irony is that the code I was responding to ISN'T obviously correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain digits
AND SPACES. The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means. I took it to mean "may occur and
should simply mean the input is rejected as invalid." Perhaps "may not occur"
was the intention. So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks for spaces.
One reason this is an issue is that the behaviour of #digitValue is not
consistent between systems.
Character space digitValue
   does not exist in the ANSI standard
   answers -1 in many Smalltalks (which is a pain)
   answers a positive integer that can't be mistake for a digit in my Smalltalk
   raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
     "This is in the Blue Book, but unspecified on non-digits.
      Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
      answer -1 for characters that are not digits (or ASCII letters),
      which is unfortunate but consistent with Inside Smalltalk
      which specifies this result for non-digits.
      ST/X and GST raise an exception which is worse.
      Digitalk ST/V documentation doesn't specify the result.
      This selector is *much* easier to use safely if it
      returns a 'large' (>= 36) value for non-digits."

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code": 6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
     ^(aString allSatisfy: [:each | each isSpace or: [each isDigit]]) and: [
       |digitsReverse|
       digitsReverse := (aString select: [:each | each isDigit]) reverse.
       digitsReverse size > 1 and: [
         |evens odds evenSum oddSum|
         odds := digitsReverse withIndexSelect: [:y :i | i odd].
         evens := digitsReverse withIndexSelect: [:x :i | i even].
         oddSum := odds detectSum: [:y | y digitValue].
         evenSum := evens detectSum: [:x |
                      #(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
         (oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside existing
methods), I'd expect the "high level" code to be at least 10 times slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem. I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference taking slowness OUT
and simply not going out of your way to add slowness in the first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency. If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections. If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library.   It's just like
trying to write #mean so that it works for Durations as well as Numbers.

Oh heck, I suppose I should point out that much of the overheads in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion.    There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug <[hidden email]> wrote:

Richard,

Thank you for looking at the code. It is comforting to learn that the code has been executed for a large number of examples without breaking. The code is not primarily written for execution but for being read and checked by the human end user. It would be nice if we could also check that it gave the right answers, but I don't know how to do that.

The first question is: Can a human domain expert read the code and sign their name for its correctness?

When this is achieved, a programming expert will transcribe the first code to a professional quality program. This time, the second code should be reviewed by an independent programmer who signs their name for its correct transcription from the first version.

--Trygve

PS: In his 1991 Turing Award Lecture, Tony Hoare said: "There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies and the other is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult."

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskoug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug <[hidden email]> wrote:

A coding experiment.
Consider a Scrum development environment. Every programming team has an end user as a member.
The team's task is to code a credit card validity check.
A first goal is that the user representative shall read the code and agree that it is a correct rendering of their code checker:

   luhnTest: trialNumber
       | s1 odd s2 even charValue reverse |
-----------------------------------------------
" Luhn test according to Rosetta"
"Reverse the order of the digits in the number."
   reverse := trialNumber reversed.
"Take the first, third, ... and every other odd digit in the reversed digits and sum them to form the partial sum s1"
   s1 := 0.
   odd := true.
   reverse do:
       [:char |
           odd
               ifTrue: [
                   s1 := s1 + char digitValue.
               ].
               odd := odd not
       ].
"Taking the second, fourth ... and every other even digit in the reversed digits:
Multiply each digit by two and sum the digits if the answer is greater than nine to form partial sums for the even digits"
   "The subtracting 9 gives the same answer. "
"Sum the partial sums of the even digits to form s2"
   s2 := 0.
   even := false.
   reverse do:
       [:char |
           even
               ifTrue: [
                   charValue := char digitValue * 2.
                   charValue > 9 ifTrue: [charValue := charValue - 9].
                   s2 := s2 + charValue
               ].
               even := even not
       ].
"If s1 + s2 ends in zero then the original number is in the form of a valid credit card number as verified by the Luhn test."
   ^(s1 + s2) asString last = $0
---------------------------------
Once this step is completed, the next step will be to make the code right without altering the algorithm (refactoring). The result should be readable and follow the team's conventions.

P.S. code attached.

--

The essence of object orientation is that objects collaborate to achieve a goal.
Trygve Reenskaug      mailto: [hidden email]
Morgedalsvn. 5A       http://folk.uio.no/trygver/
N-0378 Oslo             http://fullOO.info
Norway                     Tel: (+47) 468 58 625

--------------------------------------------

Stéphane Ducasse

http://stephane.ducasse.free.fr / http://www.pharo.org

03 59 35 87 52

Assistant: Julie Jonas

FAX 03 59 57 78 50

TEL 03 59 35 86 16

S. Ducasse - Inria

40, avenue Halley,

Parc Scientifique de la Haute Borne, Bât.A, Park Plaza

Villeneuve d'Ascq 59650

France

Trygve

Re: [Pharo-users] mentor question 4

In reply to this post by Trygve

Hi Richard,
Many thanks for the copy of your question. Your problem fascinates me because it is an excellent example of an important class of problems: Translate an exact rendering of an algorithm in text form into the same algorithm in code form. I am using Squeak, but my code may also work in Pharo. The problem is: "Given a number determine whether or not it is valid per the Luhn formula. The task is to check if a given string is valid.".

I like that you start with a precise rendering of the algorithm in an exact textual form, and also that you have augmented it with the intermediate results of an example execution. I lost you when you, in a later version, unnecessarily reversed the number. Maybe the reason for the reversal is that while you are aware of the aString do: [] method; you may have missed its companion: aString reverseDo: []. Luhn starts from the right and moves left, doubling the character every second digit. This part of the algorithm can be directly expressed in Squeak:

    isSecond := false.
    Number reverseDo: [:char |
        isSecond
            ifTrue: [“double”]
            ifFalse: [“do not double”]
        isSecond := isSecond not].

I have written 3 example methods where I first include the algorithm text as a comment. I then interlace the text with the corresponding Squeak statements to create an executable method that is an exact translation from text to Squeak. I have attached my program in BabyExperiments.st to this mail:

LuhnTest>>luhnTest1: aNumber
    " A method with a worked example inline. "
    " This method translates the problem text to Squeak and checks the given intermediate results."
    “ It only works for aNumber = '4539 1488 0343 6467' “

LuhnTest>>luhnTest2: aNumber
    " luhnTest1: with inline example removed. "
" Validates aNumber and returns true or false.
“ aNumber can be any String of characters. “
    " This method is identical to luhnTest1:, but without the intermediate specs and computations."

LuhnTest>>luhnTest3: aNumber
    “ The Squeak code from luhnTest2: now becomes a comment to be transformed into reasonable Squeak code.

I have no occasion to claim that my three methods are ‘obviously correct’. The writer of a text, be it a love letter or a piece of code, is notoriously bad at proofing their own text. An independent reviewer is therefore needed to check it. (I am a great believer in peer review as a tool for getting it right the first time).

LuhnTest>>testLuhnTest
    “ This method runs luhnTest2: with 11 different numbers; some valid and some not. ”
    “ Some border cases are especially interesting because they can easily be validated mentally: '3', '34', '35', '356, '357'.

This experiment has been a learning experience for me. Again, thank you for sharing it, Richard
--Trygve

On 2020.05.06 06:36, Richard O'Keefe wrote:

I have forwarded a copy of the original message privately.
The problem is also described in RosettaCode.
One issue is that the input is described in the original message and
in the code I was responding to as "a number", but is in fact a string.


On Wed, 6 May 2020 at 04:04, Trygve Reenskaug [hidden email] wrote:

Unfortunately, I have lost the original problem specification, so I have answered a different question. Could you please help me by repeating the original problem specification in full?

I agree that my solution is not 'obviously correct' since it doesn't answer the right question. In one version of the solution, I had a prologue that catered for spaces, etc. Not having a textual form of the algorithm that provided for the extensions, I chose to omit them.

On tirsdag.05.05.2020 15:20, Richard O'Keefe wrote:

The irony is that the code I was responding to ISN'T obviously correct.
Indeed, I found it rather puzzling.
The problem specification says that the input string may contain digits
AND SPACES. The original message includes this:

Strings of length 1 or less are not valid. Spaces are allowed in the
input, but they should be stripped before checking. All other
non-digit characters are disallowed.

Now it isn't clear what "disallowed" means. I took it to mean "may occur and
should simply mean the input is rejected as invalid." Perhaps "may not occur"
was the intention. So we shall not quibble about such characters.

But I can't for the life of me figure out how Trygve's code checks for spaces.
One reason this is an issue is that the behaviour of #digitValue is not
consistent between systems.
Character space digitValue
does not exist in the ANSI standard
answers -1 in many Smalltalks (which is a pain)
answers a positive integer that can't be mistake for a digit in my Smalltalk
raises an exception in some Smalltalks.

This is a comment I now have in my Smalltalk library for #digitValue
"This is in the Blue Book, but unspecified on non-digits.
Squeak, Pharo, Dolphin, VW, VAST, and Apple Smalltalk-80
answer -1 for characters that are not digits (or ASCII letters),
which is unfortunate but consistent with Inside Smalltalk
which specifies this result for non-digits.
ST/X and GST raise an exception which is worse.
Digitalk ST/V documentation doesn't specify the result.
This selector is *much* easier to use safely if it
returns a 'large' (>= 36) value for non-digits."

Let's compare three versions, the two I compared last time,
and the "version A" code I discussed before, which to my mind
is fairly readable.

"Don't add slowness": 1 (normalised time)
"Trygve's code": 6.5
"High level code": 30.6 (or 4.7 times slower than Trygve's)

Here's the "High level code".
^(aString allSatisfy: [:each | each isSpace or: [each isDigit]]) and: [
|digitsReverse|
digitsReverse := (aString select: [:each | each isDigit]) reverse.
digitsReverse size > 1 and: [
|evens odds evenSum oddSum|
odds := digitsReverse withIndexSelect: [:y :i | i odd].
evens := digitsReverse withIndexSelect: [:x :i | i even].
oddSum := odds detectSum: [:y | y digitValue].
evenSum := evens detectSum: [:x |
#(0 2 4 6 8 1 3 5 7 9) at: x digitValue + 1].
(oddSum + evenSum) \\ 10 = 0]]

This is the kind of code I was recommending that Roelof write.

As a rough guide, by counting traversals (including ones inside existing
methods), I'd expect the "high level" code to be at least 10 times slower
than the "no added slowness" code.

We are in vehement agreement that there is a time to write high level
really obvious easily testable and debuggable code, and that's most
of the time, especially with programming exercises.

I hope that we are also in agreement that factors of 30 (or even 6)
*can* be a serious problem. I mean, if I wanted something that slow,
I'd use Ruby.

I hope we are also agreed that (with the exception of investigations
like this one) the time to hack on something to make it faster is AFTER
you have profiled it and determined that you have a problem.

But I respectfully suggest that there is a difference taking slowness OUT
and simply not going out of your way to add slowness in the first place.

I'd also like to remark that my preference for methods that traverse a
sequence exactly once has more to do with Smalltalk protocols than
with efficiency. If the only method I perform on an object is #do:
the method will work just as well for readable streams as for
collections. If the only method I perform on an object is #reverseDo:
the method will work just as well for Read[Write]Streams as for
SequenceReadableCollections, at least in my library. It's just like
trying to write #mean so that it works for Durations as well as Numbers.

Oh heck, I suppose I should point out that much of the overheads in
this case could be eliminated by a Self-style compiler doing dynamic
inlining + loop fusion. There's no reason *in principle*, given enough
people, money, and time, that the differences couldn't be greatly
reduced in Pharo.

On Tue, 5 May 2020 at 21:50, Trygve Reenskaug [hidden email] wrote:

Richard,

The first question is: Can a human domain expert read the code and sign their name for its correctness?

--Trygve

On tirsdag.05.05.2020 04:41, Richard O'Keefe wrote:

As a coding experiment, I adapted Trygve Reenskoug's code to my
Smalltalk compiler, put in my code slightly tweaked, and benchmarked
them on randomly generated data.

Result: a factor of 6.3.

In Squeak it was a factor of ten.

I had not, in all honesty, expected it to to be so high.

On Tue, 5 May 2020 at 02:00, Trygve Reenskaug [hidden email] wrote:

P.S. code attached.

BabyExperiments.st (13K) Download Attachment