Do we want AST-Debugger?

Denis Kudriashov

Do we want AST-Debugger?

Hi.

Take a look at the AST-Interpreter project, which includes AST-Debugger: http://smalltalkhub.com/#!/~Pharo/AST-Interpreter. The code is so simple compared to the "bytecode simulation" approach of the current debugger.

The AST debugger is independent of the bytecode set. It just visits AST nodes to simulate code execution.
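To make "visiting AST nodes" concrete, here is a minimal Playground sketch. It is not code from the AST-Interpreter project, only an illustration under stated assumptions: it assumes Pharo's RBParser, it handles only literals and message sends, and `eval` is a hypothetical name chosen for this example.

```smalltalk
"Hypothetical sketch: evaluate an expression by walking its syntax tree.
 Only literal nodes and message nodes are handled."
| eval ast |
eval := nil.
eval := [:node |
    node isLiteralNode
        ifTrue: [node value]
        ifFalse: ["assumed to be a message node: evaluate receiver and arguments, then send"
            (eval value: node receiver)
                perform: node selector
                withArguments: (node arguments collect: [:each | eval value: each]) asArray]].
ast := RBParser parseExpression: '3 + 4 * 2'.
eval value: ast.
```

Since binary messages are evaluated left to right, this should answer 14. No bytecode is involved at any point; the walk only depends on the shape of the tree.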

But the current debugger simulates bytecode, so it depends on the current bytecode set. There are several kinds of BytecodeEncoders. To understand how the current debugger works you need to know many VM details.

So my question is: why are our debuggers not based on the AST interpreter? Why is it a bad idea (if it is bad)?

Best regards,

Denis

Andrei Chis

Re: Do we want AST-Debugger?

Hi Denis,

No idea why the current debugger is not based on the AST interpreter or if there are any plans to move it.

However, I think it will help to make the ASTDebugger compatible with the current debugger from Pharo.

From what I see the code from ASTDebugger is indeed very simple. One option could be to create a subclass of DebugSession that overrides those methods that control the execution. If this goes easy then we could just reuse both the Spec and the Glamour UIs.

Do you want to give it a try or sync more on this?

Cheers,

Andrei

Eliot Miranda-2

Re: Do we want AST-Debugger?

In reply to this post by Denis Kudriashov

Hi Denis,

On May 25, 2016, at 3:12 AM, Denis Kudriashov <[hidden email]> wrote:

> Take a look at the AST-Interpreter project, which includes AST-Debugger: http://smalltalkhub.com/#!/~Pharo/AST-Interpreter. The code is so simple compared to the "bytecode simulation" approach of the current debugger.
> The AST debugger is independent of the bytecode set. It just visits AST nodes to simulate code execution.
>
> But the current debugger simulates bytecode, so it depends on the current bytecode set. There are several kinds of BytecodeEncoders. To understand how the current debugger works you need to know many VM details.

That's simply not true. In fact, the bytecode debugger and the AST debugger need the same amount of knowledge about the VM; both use VM primitives that aren't specified at the image level.

For both the AST interpreter and the bytecode interpreter, the interpreter itself is written in Smalltalk and lives in the image. Look at Context and InstructionStream to find the Smalltalk definitions of both the interpreter routines and the implementations of the bytecodes.
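For anyone who wants to look at this from the image side, a small Playground sketch follows. It assumes an image in which CompiledMethod responds to #symbolic (as in Squeak and Pharo images of that era); Integer>>#factorial is only an arbitrary example method.

```smalltalk
"Answer a human-readable listing of the bytecodes of one method.
 Inspect or print the result in a Playground."
(Integer >> #factorial) symbolic.
```

Each line of the listing corresponds to one instruction that InstructionStream would decode and hand to its client.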

> So my question is: why are our debuggers not based on the AST interpreter? Why is it a bad idea (if it is bad)?

One area where you'll find the AST debugger worse is performance. The bytecode debugger uses perform: to execute at full speed when possible (it does interpret when doing step into or through, but not when doing over). In fact this part of the system is the most complex, because it has to catch non-local returns and exceptions. But that's not because of VM details; it's because perform: is used to run the code being debugged at full speed.

Not to mention, if it isn't broke don't fix it.

But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn. Why not put in the effort of learning it?

Criticizing something in ignorance is not wise IMO.

_,,,^..^,,,_ (phone)

Denis Kudriashov

Re: Do we want AST-Debugger?

Hi Eliot.

2016-06-06 1:13 GMT+02:00 Eliot Miranda <[hidden email]>:

> But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn. Why not put in the effort of learning it?
>
> Criticizing something in ignorance is not wise IMO.

I just put this method here:

InstructionStream >> interpretNextV3PlusClosureInstructionFor: client
    "Send to the argument, client, a message that specifies the type of the
    next instruction."
    | byte type offset method |
    method := self method.
    byte := method at: pc.
    type := byte // 16.
    offset := byte \\ 16.
    pc := pc+1.
    "We do an inline binary search on each of the possible 16 values of type:
    The old, cleaner but slower code is retained as a comment below"
    type < 8
        ifTrue: [type < 4
            ifTrue: [type < 2
                ifTrue: [type < 1
                    ifTrue: ["type = 0"
                        ^ client pushReceiverVariable: offset]
                    ifFalse: ["type = 1"
                        ^ client pushTemporaryVariable: offset]]
                ifFalse: [type < 3
                    ifTrue: ["type = 2"
                        ^ client
                            pushConstant: (method literalAt: offset + 1)]
                    ifFalse: ["type = 3"
                        ^ client
                            pushConstant: (method literalAt: offset + 17)]]]
            ifFalse: [type < 6
                ifTrue: [type < 5
                    ifTrue: ["type = 4"
                        ^ client
                            pushLiteralVariable: (method literalAt: offset + 1)]
                    ifFalse: ["type = 5"
                        ^ client
                            pushLiteralVariable: (method literalAt: offset + 17)]]
                ifFalse: [type < 7
                    ifTrue: ["type = 6"
                        offset < 8
                            ifTrue: [^ client popIntoReceiverVariable: offset]
                            ifFalse: [^ client popIntoTemporaryVariable: offset - 8]]
                    ifFalse: ["type = 7"
                        offset = 0
                            ifTrue: [^ client pushReceiver].
                        offset < 8
                            ifTrue: [^ client
                                pushConstant: (SpecialConstants at: offset)].
                        offset = 8
                            ifTrue: [^ client methodReturnReceiver].
                        offset < 12
                            ifTrue: [^ client
                                methodReturnConstant: (SpecialConstants at: offset - 8)].
                        offset = 12
                            ifTrue: [^ client methodReturnTop].
                        offset = 13
                            ifTrue: [^ client blockReturnTop].
                        offset > 13
                            ifTrue: [^ self unusedBytecode: client at: pc - 1 ]]]]]
        ifFalse: [type < 12
            ifTrue: [type < 10
                ifTrue: [type < 9
                    ifTrue: ["type = 8"
                        ^ self
                            interpretV3PlusClosureExtension: offset
                            in: method
                            for: client]
                    ifFalse: ["type = 9 (short jumps)"
                        offset < 8
                            ifTrue: [^ client jump: offset + 1].
                        ^ client jump: offset - 8 + 1 if: false]]
                ifFalse: [type < 11
                    ifTrue: ["type = 10 (long jumps)"
                        byte := method at: pc.
                        pc := pc + 1.
                        offset < 8
                            ifTrue: [^ client jump: offset - 4 * 256 + byte].
                        ^ client jump: (offset bitAnd: 3)
                            * 256 + byte if: offset < 12]
                    ifFalse: ["type = 11"
                        ^ client
                            send: (Smalltalk specialSelectorAt: offset + 1)
                            super: false
                            numArgs: (Smalltalk specialNargsAt: offset + 1)]]]
            ifFalse: [type = 12
                ifTrue: [^ client
                    send: (Smalltalk specialSelectorAt: offset + 17)
                    super: false
                    numArgs: (Smalltalk specialNargsAt: offset + 17)]
                ifFalse: ["type = 13, 14 or 15"
                    ^ client
                        send: (method literalAt: offset + 1)
                        super: false
                        numArgs: type - 13]]].

    " old code
    type=0 ifTrue: [^client pushReceiverVariable: offset].
    type=1 ifTrue: [^client pushTemporaryVariable: offset].
    type=2 ifTrue: [^client pushConstant: (method literalAt: offset+1)].
    type=3 ifTrue: [^client pushConstant: (method literalAt: offset+17)].
    type=4 ifTrue: [^client pushLiteralVariable: (method literalAt: offset+1)].
    type=5 ifTrue: [^client pushLiteralVariable: (method literalAt: offset+17)].
    type=6
        ifTrue: [offset<8
            ifTrue: [^client popIntoReceiverVariable: offset]
            ifFalse: [^client popIntoTemporaryVariable: offset-8]].
    type=7
        ifTrue: [offset=0 ifTrue: [^client pushReceiver].
            offset<8 ifTrue: [^client pushConstant: (SpecialConstants at: offset)].
            offset=8 ifTrue: [^client methodReturnReceiver].
            offset<12 ifTrue: [^client methodReturnConstant:
                (SpecialConstants at: offset-8)].
            offset=12 ifTrue: [^client methodReturnTop].
            offset=13 ifTrue: [^client blockReturnTop].
            offset>13 ifTrue: [^self error: 'unusedBytecode']].
    type=8 ifTrue: [^self interpretExtension: offset in: method for: client].
    type=9
        ifTrue: short jumps
            [offset<8 ifTrue: [^client jump: offset+1].
            ^client jump: offset-8+1 if: false].
    type=10
        ifTrue: long jumps
            [byte:= method at: pc. pc:= pc+1.
            offset<8 ifTrue: [^client jump: offset-4*256 + byte].
            ^client jump: (offset bitAnd: 3)*256 + byte if: offset<12].
    type=11
        ifTrue:
            [^client
                send: (Smalltalk specialSelectorAt: offset+1)
                super: false
                numArgs: (Smalltalk specialNargsAt: offset+1)].
    type=12
        ifTrue:
            [^client
                send: (Smalltalk specialSelectorAt: offset+17)
                super: false
                numArgs: (Smalltalk specialNargsAt: offset+17)].
    type>12
        ifTrue:
            [^client send: (method literalAt: offset+1)
                super: false
                numArgs: type-13]"

Denis Kudriashov

Re: Do we want AST-Debugger?

2016-06-06 10:45 GMT+02:00 Denis Kudriashov <[hidden email]>:

> > But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn. Why not put in the effort of learning it?
> >
> > Criticizing something in ignorance is not wise IMO.
>
> I just put this method here:

So to learn how the #step method works, people have to understand this method. And it is not the only one.

Denis Kudriashov

Re: Do we want AST-Debugger?

In reply to this post by Denis Kudriashov

2016-06-06 10:45 GMT+02:00 Denis Kudriashov <[hidden email]>:

> > One area where you'll find the AST debugger worse is performance. The bytecode debugger uses perform: to execute at full speed when possible (it does interpret when doing step into or through, but not when doing over). In fact this part of the system is the most complex, because it has to catch non-local returns and exceptions. But that's not because of VM details; it's because perform: is used to run the code being debugged at full speed.

The same approach could be applied to the AST interpreter too.

Eliot Miranda-2

Re: Do we want AST-Debugger?

In reply to this post by Denis Kudriashov

Hi Denis,

On Mon, Jun 6, 2016 at 1:45 AM, Denis Kudriashov <[hidden email]> wrote:

> I just put this method here:

And what's your issue with it? It simply dispatches the standard bytecode set, as described in the class comments of EncoderForV3 and EncoderForV3PlusClosure:

0-15     0000iiii                             Push Receiver Variable #iiii
16-31    0001iiii                             Push Temporary Location #iiii
32-63    001iiiii                             Push Literal Constant #iiiii
64-95    010iiiii                             Push Literal Variable #iiiii
96-103   01100iii                             Pop and Store Receiver Variable #iii
104-111  01101iii                             Pop and Store Temporary Location #iii
112-119  01110iii                             Push (receiver, true, false, nil, -1, 0, 1, 2) [iii]
120-123  011110ii                             Return (receiver, true, false, nil) [ii] From Message
124-125  0111110i                             Return Stack Top From (Message, Block) [i]
(126-127 unassigned)
128      10000000 jjkkkkkk                    Push (Receiver Variable, Temporary Location, Literal Constant, Literal Variable) [jj] #kkkkkk
129      10000001 jjkkkkkk                    Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk
130      10000010 jjkkkkkk                    Pop and Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk
131      10000011 jjjkkkkk                    Send Literal Selector #kkkkk With jjj Arguments
132      10000100 iiijjjjj kkkkkkkk           (Send, Send Super, Push Receiver Variable, Push Literal Constant, Push Literal Variable, Store Receiver Variable, Store-Pop Receiver Variable, Store Literal Variable)[iii] #kkkkkkkk jjjjj (for sends jjjjj = numArgs)
133      10000101 jjjkkkkk                    Send Literal Selector #kkkkk To Superclass With jjj Arguments
134      10000110 jjkkkkkk                    Send Literal Selector #kkkkkk With jj Arguments
135      10000111                             Pop Stack Top
136      10001000                             Duplicate Stack Top
137      10001001                             Push Active Context
138      10001010 jkkkkkkk                    Push (Array new: kkkkkkk) (j = 0)
                                              or Pop kkkkkkk elements into: (Array new: kkkkkkk) (j = 1)
139      10001011 kkkkkkkk jjjjjjjj           Invoke primitive number jjjjjjjjkkkkkkkk
140      10001100 kkkkkkkk jjjjjjjj           Push Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
141      10001101 kkkkkkkk jjjjjjjj           Store Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
142      10001110 kkkkkkkk jjjjjjjj           Pop and Store Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
143      10001111 llllkkkk jjjjjjjj iiiiiiii  Push Closure Num Copied llll Num Args kkkk BlockSize jjjjjjjjiiiiiiii
144-151  10010iii                             Jump iii + 1 (i.e., 1 through 8)
152-159  10011iii                             Pop and Jump On False iii + 1 (i.e., 1 through 8)
160-167  10100iii jjjjjjjj                    Jump (iii - 4) * 256 + jjjjjjjj
168-171  101010ii jjjjjjjj                    Pop and Jump On True ii * 256 + jjjjjjjj
172-175  101011ii jjjjjjjj                    Pop and Jump On False ii * 256 + jjjjjjjj
176-191  1011iiii                             Send Arithmetic Message #iiii
192-207  1100iiii                             Send Special Message #iiii
208-223  1101iiii                             Send Literal Selector #iiii With No Arguments
224-239  1110iiii                             Send Literal Selector #iiii With 1 Argument
240-255  1111iiii                             Send Literal Selector #iiii With 2 Arguments

The method uses binary search to dispatch efficiently (a modification made by Stephane Ducasse a while back). It isn't as readable as the old method but it is quite straightforward. Further, it isn't specific to the VM. It is used by numerous facilities: to interpret bytecode in the image (see Context's implementations of pushReceiverVariable: et al), to decompile bytecode methods, to print bytecode, to scan for the selectors a method sends, etc. Would it help if the method included the information in the class comments? Or would it be good enough for the method's comment to reference the class comments?
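As a rough illustration of the first step of that dispatch, here is a Playground sketch of how a single opcode byte splits into the `type` and `offset` values the method works with. The `decode` block is a hypothetical helper written for this example, not part of InstructionStream; it only repeats the `// 16` and `\\ 16` arithmetic from the method.

```smalltalk
"Hypothetical helper: split an opcode byte into its high and low 4 bits,
 as the method does with  type := byte // 16  and  offset := byte \\ 16."
| decode |
decode := [:byte | Array with: byte // 16 with: byte \\ 16].

decode value: 16r10.   "type 1, offset 0: Push Temporary Location 0"
decode value: 16r78.   "type 7, offset 8: Return receiver From Message (opcode 120)"
decode value: 16rB0.   "type 11, offset 0: Send Arithmetic Message 0 (opcode 176)"
```

The nested `type < 8`, `type < 4`, ... tests then pick one of the 16 type values in at most four comparisons, where the older commented-out code tests them one after another.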

BTW, there's a good article on bytecode on Wikipedia. Note that the VM executes bytecode, not syntax trees, and so the VM and indeed all the execution machinery, including the debugger, are not limited to debugging syntax trees. One can express many useful things in bytecode that can't be expressed by our syntax trees. And so if you do move towards the AST debugger you'll find that you've drastically restricted the flexibility and generality of the system for no benefit that I can see.

_,,,^..^,,,_

best, Eliot