Do we want AST-Debugger?

Denis Kudriashov
Hi.

Look at the AST-Interpreter project, which includes AST-Debugger: http://smalltalkhub.com/#!/~Pharo/AST-Interpreter. Its code is much simpler than the "bytecode simulation" approach of the current debugger.
The AST debugger is independent of the bytecode set. It just visits AST nodes to simulate code execution.

The current debugger, in contrast, simulates bytecode, so it depends on the current bytecode set. There are a few kinds of BytecodeEncoders. To understand how the current debugger works you need to know many VM details.

So my question is: why are our debuggers not based on the AST interpreter? Why is it a bad idea (if it is one)?
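
To make the contrast concrete, here is a minimal sketch in Python (not Pharo code). The node classes, the opcode names and both evaluators are hypothetical illustrations; they do not mirror the real AST-Interpreter or the Context/InstructionStream classes. The sketch evaluates (3 + 4) * 2 once by visiting tree nodes and once by simulating a made-up stack machine.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class BinOp:
    op: str
    left: object
    right: object


OPS = {"+": lambda a, b: a + b, "*": lambda a, b: a * b}


def eval_ast(node):
    """Tree-walking evaluation: recurse over nodes, no instruction set involved."""
    if isinstance(node, Num):
        return node.value
    return OPS[node.op](eval_ast(node.left), eval_ast(node.right))


def compile_ast(node, code=None):
    """Flatten the tree into instructions for a made-up stack machine."""
    code = [] if code is None else code
    if isinstance(node, Num):
        code.append(("PUSH", node.value))
    else:
        compile_ast(node.left, code)
        compile_ast(node.right, code)
        code.append(("APPLY", node.op))
    return code


def run_bytecode(code):
    """Bytecode simulation: the evaluator must know the instruction encoding."""
    stack = []
    for opcode, operand in code:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "APPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(OPS[operand](left, right))
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack.pop()


tree = BinOp("*", BinOp("+", Num(3), Num(4)), Num(2))
ast_result = eval_ast(tree)
bytecode_result = run_bytecode(compile_ast(tree))
```

Both evaluators give the same answer; the difference is that only the second one has to change when the instruction encoding changes.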


Best regards,
Denis

Re: Do we want AST-Debugger?

Andrei Chis
Hi Denis,

I have no idea why the current debugger is not based on the AST interpreter, or whether there are any plans to move it.
However, I think it would help to make the ASTDebugger compatible with Pharo's current debugger.
From what I see, the ASTDebugger code is indeed very simple. One option could be to create a subclass
of DebugSession that overrides the methods that control execution. If this goes smoothly, we could
reuse both the Spec and the Glamour UIs.
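
The subclassing idea can be sketched roughly as follows, in Python rather than Pharo. All names here (the step methods, ToyASTInterpreter, press_buttons) are hypothetical stand-ins and are not the real DebugSession protocol; the point is only that a UI written against the base class keeps working when the execution-control methods are overridden.

```python
class DebugSession:
    """Hypothetical stand-in for the session object the debugger UI talks to."""

    def __init__(self):
        self.trace = []

    def step_into(self):
        self.trace.append("bytecode simulation: step into")

    def step_over(self):
        self.trace.append("bytecode simulation: step over")


class ToyASTInterpreter:
    """Hypothetical interpreter that walks a fixed list of node names."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.position = 0

    def visit_next(self):
        node = self.nodes[self.position]
        self.position += 1
        return node


class ASTDebugSession(DebugSession):
    """Overrides only the execution-control methods; the UI stays unchanged."""

    def __init__(self, interpreter):
        super().__init__()
        self.interpreter = interpreter

    def step_into(self):
        self.trace.append(f"AST: visited {self.interpreter.visit_next()}")

    def step_over(self):
        # A real implementation would skip the whole send; the toy just advances.
        self.trace.append(f"AST: stepped over {self.interpreter.visit_next()}")


def press_buttons(session):
    """Plays the role of a UI: it only knows the DebugSession protocol."""
    session.step_into()
    session.step_over()
    return session.trace


bytecode_trace = press_buttons(DebugSession())
ast_trace = press_buttons(ASTDebugSession(ToyASTInterpreter(["message send", "return"])))
```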

Do you want to give it a try or sync more on this?

Cheers,
Andrei


Re: Do we want AST-Debugger?

Eliot Miranda-2
Hi Denis,

On May 25, 2016, at 3:12 AM, Denis Kudriashov <[hidden email]> wrote:

> Look at the AST-Interpreter project, which includes AST-Debugger: http://smalltalkhub.com/#!/~Pharo/AST-Interpreter. Its code is much simpler than the "bytecode simulation" approach of the current debugger.
> The AST debugger is independent of the bytecode set. It just visits AST nodes to simulate code execution.
>
> The current debugger, in contrast, simulates bytecode, so it depends on the current bytecode set. There are a few kinds of BytecodeEncoders. To understand how the current debugger works you need to know many VM details.

That's simply not true.  In fact, the bytecode debugger and the AST debugger need to know the same amount about the VM; both use VM primitives that aren't specified at the image level.

Both the AST interpreter and the bytecode interpreter are written in Smalltalk and live in the image.  Look at Context and InstructionStream to find the Smalltalk definitions of both the interpreter routines and the implementations of the bytecodes.

> So my question is: why are our debuggers not based on the AST interpreter? Why is it a bad idea (if it is one)?

One area where you'll find the AST debugger worse is performance.  The bytecode debugger uses perform: to execute at full speed when possible (it does interpret when doing step into or through, but not when doing over).  In fact this part of the system, because it has to catch non-local returns and exceptions, is the most complex, but that's not because of VM details; it's because perform: is used to run the code being debugged at full speed.
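
As a rough illustration of that trade-off, here is a toy sketch in Python (not the actual debugger code). The function names and the two-opcode instruction format are hypothetical. Stepping over runs the callee natively, the analogue of perform:, inside a wrapper that hands control back to the debugger however the call ends; stepping into interprets one instruction at a time. Python has no non-local returns, so exceptions stand in for every unusual exit here.

```python
def step_over(callee, *args):
    """Run the callee natively (the analogue of perform:), but keep control.

    The wrapper has to intercept every way the call can end, which is where
    the complexity described above comes from.
    """
    try:
        return ("returned", callee(*args))
    except Exception as error:  # the debugger must see exceptions too
        return ("raised", type(error).__name__)


def step_into(instructions, stack):
    """Interpret a callee one instruction at a time (slow, but inspectable)."""
    trace = []
    for opcode, operand in instructions:
        if opcode == "PUSH":
            stack.append(operand)
        elif opcode == "ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown opcode {opcode}")
        trace.append((opcode, list(stack)))  # a stop point after each step
    return trace


def add(a, b):
    return a + b


def fail(a, b):
    return a // b


over_ok = step_over(add, 3, 4)
over_err = step_over(fail, 1, 0)
into_trace = step_into([("PUSH", 3), ("PUSH", 4), ("ADD", None)], [])
```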

Not to mention, if it isn't broke don't fix it.

But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn.  Why not put in the effort of learning it?

Criticizing something in ignorance is not wise IMO.


_,,,^..^,,,_ (phone)

Re: Do we want AST-Debugger?

Denis Kudriashov
Hi Eliot.

2016-06-06 1:13 GMT+02:00 Eliot Miranda <[hidden email]>:
> But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn.  Why not put in the effort of learning it?
>
> Criticizing something in ignorance is not wise IMO.

I just put this method here:

InstructionStream >> interpretNextV3PlusClosureInstructionFor: client
    "Send to the argument, client, a message that specifies the type of the
    next instruction."
    | byte type offset method |
    method := self method.
    byte := method at: pc.
    type := byte // 16.
    offset := byte \\ 16.
    pc := pc + 1.
    "We do an inline binary search on each of the possible 16 values of type:
    The old, cleaner but slower code is retained as a comment below"
    type < 8
        ifTrue: [type < 4
            ifTrue: [type < 2
                ifTrue: [type < 1
                    ifTrue: ["type = 0"
                        ^ client pushReceiverVariable: offset]
                    ifFalse: ["type = 1"
                        ^ client pushTemporaryVariable: offset]]
                ifFalse: [type < 3
                    ifTrue: ["type = 2"
                        ^ client
                            pushConstant: (method literalAt: offset + 1)]
                    ifFalse: ["type = 3"
                        ^ client
                            pushConstant: (method literalAt: offset + 17)]]]
            ifFalse: [type < 6
                ifTrue: [type < 5
                    ifTrue: ["type = 4"
                        ^ client
                            pushLiteralVariable: (method literalAt: offset + 1)]
                    ifFalse: ["type = 5"
                        ^ client
                            pushLiteralVariable: (method literalAt: offset + 17)]]
                ifFalse: [type < 7
                    ifTrue: ["type = 6"
                        offset < 8
                            ifTrue: [^ client popIntoReceiverVariable: offset]
                            ifFalse: [^ client popIntoTemporaryVariable: offset - 8]]
                    ifFalse: ["type = 7"
                        offset = 0
                            ifTrue: [^ client pushReceiver].
                        offset < 8
                            ifTrue: [^ client
                                pushConstant: (SpecialConstants at: offset)].
                        offset = 8
                            ifTrue: [^ client methodReturnReceiver].
                        offset < 12
                            ifTrue: [^ client
                                methodReturnConstant: (SpecialConstants at: offset - 8)].
                        offset = 12
                            ifTrue: [^ client methodReturnTop].
                        offset = 13
                            ifTrue: [^ client blockReturnTop].
                        offset > 13
                            ifTrue: [^ self unusedBytecode: client at: pc - 1 ]]]]]
        ifFalse: [type < 12
            ifTrue: [type < 10
                ifTrue: [type < 9
                    ifTrue: ["type = 8"
                        ^ self
                            interpretV3PlusClosureExtension: offset
                            in: method
                            for: client]
                    ifFalse: ["type = 9 (short jumps)"
                        offset < 8
                            ifTrue: [^ client jump: offset + 1].
                        ^ client jump: offset - 8 + 1 if: false]]
                ifFalse: [type < 11
                    ifTrue: ["type = 10 (long jumps)"
                        byte := method at: pc.
                        pc := pc + 1.
                        offset < 8
                            ifTrue: [^ client jump: offset - 4 * 256 + byte].
                        ^ client jump: (offset bitAnd: 3) * 256 + byte if: offset < 12]
                    ifFalse: ["type = 11"
                        ^ client
                            send: (Smalltalk specialSelectorAt: offset + 1)
                            super: false
                            numArgs: (Smalltalk specialNargsAt: offset + 1)]]]
            ifFalse: [type = 12
                ifTrue: [^ client
                    send: (Smalltalk specialSelectorAt: offset + 17)
                    super: false
                    numArgs: (Smalltalk specialNargsAt: offset + 17)]
                ifFalse: ["type = 13, 14 or 15"
                    ^ client
                        send: (method literalAt: offset + 1)
                        super: false
                        numArgs: type - 13]]].

"    old code 
type=0 ifTrue: [^client pushReceiverVariable: offset].
type=1 ifTrue: [^client pushTemporaryVariable: offset].
type=2 ifTrue: [^client pushConstant: (method literalAt: offset+1)].
type=3 ifTrue: [^client pushConstant: (method literalAt: offset+17)].
type=4 ifTrue: [^client pushLiteralVariable: (method literalAt: offset+1)].
type=5 ifTrue: [^client pushLiteralVariable: (method literalAt: offset+17)].
type=6 
ifTrue: [offset<8
ifTrue: [^client popIntoReceiverVariable: offset]
ifFalse: [^client popIntoTemporaryVariable: offset-8]].
type=7
ifTrue: [offset=0 ifTrue: [^client pushReceiver].
offset<8 ifTrue: [^client pushConstant: (SpecialConstants at: offset)].
offset=8 ifTrue: [^client methodReturnReceiver].
offset<12 ifTrue: [^client methodReturnConstant: 
(SpecialConstants at: offset-8)].
offset=12 ifTrue: [^client methodReturnTop].
offset=13 ifTrue: [^client blockReturnTop].
offset>13 ifTrue: [^self error: 'unusedBytecode']].
type=8 ifTrue: [^self interpretExtension: offset in: method for: client].
type=9
ifTrue:  short jumps
[offset<8 ifTrue: [^client jump: offset+1].
^client jump: offset-8+1 if: false].
type=10 
ifTrue:  long jumps
[byte:= method at: pc.  pc:= pc+1.
offset<8 ifTrue: [^client jump: offset-4*256 + byte].
^client jump: (offset bitAnd: 3)*256 + byte if: offset<12].
type=11 
ifTrue: 
[^client 
send: (Smalltalk specialSelectorAt: offset+1) 
super: false
numArgs: (Smalltalk specialNargsAt: offset+1)].
type=12 
ifTrue: 
[^client 
send: (Smalltalk specialSelectorAt: offset+17) 
super: false
numArgs: (Smalltalk specialNargsAt: offset+17)].
type>12
ifTrue: 
[^client send: (method literalAt: offset+1) 
super: false
numArgs: type-13]" 



Re: Do we want AST-Debugger?

Denis Kudriashov

2016-06-06 10:45 GMT+02:00 Denis Kudriashov <[hidden email]>:

>> But seriously, the system is based on bytecode, and that bytecode is pretty straightforward and easy to learn.  Why not put in the effort of learning it?
>>
>> Criticizing something in ignorance is not wise IMO.
>
> I just put this method here:

So to learn how the #step method works, people have to understand this method. And it is not the only one.
 

Re: Do we want AST-Debugger?

Denis Kudriashov

2016-06-06 10:45 GMT+02:00 Denis Kudriashov <[hidden email]>:
>> One area where you'll find the AST debugger worse is performance.  The bytecode debugger uses perform: to execute at full speed when possible (it does interpret when doing step into or through, but not when doing over).  In fact this part of the system, because it has to catch non-local returns and exceptions, is the most complex, but that's not because of VM details; it's because perform: is used to run the code being debugged at full speed.

The same approach could be applied to the AST interpreter too.

Re: Do we want AST-Debugger?

Eliot Miranda-2
Hi Denis,

On Mon, Jun 6, 2016 at 1:45 AM, Denis Kudriashov <[hidden email]> wrote:
> I just put this method here:

and what's your issue with it?  It simply dispatches the standard byte code set as described in the EncoderForV3's and EncoderForV3PlusClosure's class comments:

0-15     0000iiii                             Push Receiver Variable #iiii
16-31    0001iiii                             Push Temporary Location #iiii
32-63    001iiiii                             Push Literal Constant #iiiii
64-95    010iiiii                             Push Literal Variable #iiiii
96-103   01100iii                             Pop and Store Receiver Variable #iii
104-111  01101iii                             Pop and Store Temporary Location #iii
112-119  01110iii                             Push (receiver, true, false, nil, -1, 0, 1, 2) [iii]
120-123  011110ii                             Return (receiver, true, false, nil) [ii] From Message
124-125  0111110i                             Return Stack Top From (Message, Block) [i]
126-127  (unassigned)
128      10000000 jjkkkkkk                    Push (Receiver Variable, Temporary Location, Literal Constant, Literal Variable) [jj] #kkkkkk
129      10000001 jjkkkkkk                    Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk
130      10000010 jjkkkkkk                    Pop and Store (Receiver Variable, Temporary Location, Illegal, Literal Variable) [jj] #kkkkkk
131      10000011 jjjkkkkk                    Send Literal Selector #kkkkk With jjj Arguments
132      10000100 iiijjjjj kkkkkkkk           (Send, Send Super, Push Receiver Variable, Push Literal Constant, Push Literal Variable, Store Receiver Variable, Store-Pop Receiver Variable, Store Literal Variable) [iii] #kkkkkkkk jjjjj (for sends jjjjj = numArgs)
133      10000101 jjjkkkkk                    Send Literal Selector #kkkkk To Superclass With jjj Arguments
134      10000110 jjkkkkkk                    Send Literal Selector #kkkkkk With jj Arguments
135      10000111                             Pop Stack Top
136      10001000                             Duplicate Stack Top
137      10001001                             Push Active Context
138      10001010 jkkkkkkk                    Push (Array new: kkkkkkk) (j = 0) or Pop kkkkkkk elements into: (Array new: kkkkkkk) (j = 1)
139      10001011 kkkkkkkk jjjjjjjj           Invoke primitive number jjjjjjjjkkkkkkkk
140      10001100 kkkkkkkk jjjjjjjj           Push Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
141      10001101 kkkkkkkk jjjjjjjj           Store Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
142      10001110 kkkkkkkk jjjjjjjj           Pop and Store Temp At kkkkkkkk In Temp Vector At: jjjjjjjj
143      10001111 llllkkkk jjjjjjjj iiiiiiii  Push Closure Num Copied llll Num Args kkkk BlockSize jjjjjjjjiiiiiiii
144-151  10010iii                             Jump iii + 1 (i.e., 1 through 8)
152-159  10011iii                             Pop and Jump On False iii + 1 (i.e., 1 through 8)
160-167  10100iii jjjjjjjj                    Jump (iii - 4) * 256 + jjjjjjjj
168-171  101010ii jjjjjjjj                    Pop and Jump On True ii * 256 + jjjjjjjj
172-175  101011ii jjjjjjjj                    Pop and Jump On False ii * 256 + jjjjjjjj
176-191  1011iiii                             Send Arithmetic Message #iiii
192-207  1100iiii                             Send Special Message #iiii
208-223  1101iiii                             Send Literal Selector #iiii With No Arguments
224-239  1110iiii                             Send Literal Selector #iiii With 1 Argument
240-255  1111iiii                             Send Literal Selector #iiii With 2 Arguments

The method uses binary search to dispatch efficiently (a modification made by Stephane Ducasse a while back).  It isn't as readable as the old method but it is quite straightforward.  Further, it isn't specific to the VM.  It is used by numerous facilities: to interpret byte code in the image (see Context's implementations of pushReceiverVariable: et al), to decompile byte code methods, to print byte code, to scan for the selectors a method sends, etc.  Would it help if the method included the information in the class comments?  Or would it be good enough for the method's comment to reference the class comments?
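
For readers who want to see the decoding step in isolation, here is a small Python sketch (not the Smalltalk method itself). It splits a byte into type and offset the same way, byte // 16 and byte \\ 16, and covers only a handful of single-byte cases from the table above. The tuples it returns and the SPECIAL_CONSTANTS list are illustrative assumptions; the real method sends messages to a client object instead.

```python
def decode(byte):
    """Split a bytecode into the high nibble (type) and low nibble (offset)."""
    return byte // 16, byte % 16


# Values pushed by bytecodes 113-119, per the table above; 112 pushes the receiver.
SPECIAL_CONSTANTS = [True, False, None, -1, 0, 1, 2]


def dispatch(byte):
    """Return the message a client would receive, for single-byte cases only."""
    kind, offset = decode(byte)
    if kind == 0:
        return ("pushReceiverVariable:", offset)
    if kind == 1:
        return ("pushTemporaryVariable:", offset)
    if kind == 6:
        if offset < 8:
            return ("popIntoReceiverVariable:", offset)
        return ("popIntoTemporaryVariable:", offset - 8)
    if kind == 7:
        if offset == 0:
            return ("pushReceiver",)
        if offset < 8:
            return ("pushConstant:", SPECIAL_CONSTANTS[offset - 1])
        if offset == 8:
            return ("methodReturnReceiver",)
        if offset < 12:
            return ("methodReturnConstant:", SPECIAL_CONSTANTS[offset - 9])
        if offset == 12:
            return ("methodReturnTop",)
        if offset == 13:
            return ("blockReturnTop",)
        return ("unusedBytecode",)
    if kind == 9:
        if offset < 8:
            return ("jump:", offset + 1)
        return ("jump:if:", offset - 8 + 1, False)
    return ("not covered by this sketch", byte)
```

For example, dispatch(105) decodes to type 6, offset 9, which the table lists under 104-111 as Pop and Store Temporary Location #1.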

BTW, there's a good article on byte code on Wikipedia.  Note that the VM executes byte code, not syntax trees, and so the VM and indeed all the execution machinery, including the debugger, are not limited to debugging syntax trees.  One can express many useful things in bytecode that can't be expressed by our syntax trees.  So if you do move towards the AST debugger you'll find that you've drastically restricted the flexibility and generality of the system for no benefit that I can see.

 
_,,,^..^,,,_
best, Eliot