Win32 beta test

Win32 beta test

Ian Piumarta

Folks,

I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and fixed several issues that prevented it from compiling on recent versions of Windows using the latest release of MinGW.  I have also attempted to simplify the build process.  It works fine for me, but that isn't saying much, so...

If you feel confident about building and/or test-driving a beta-quality Windows VM, please download 4.10.2-2612 from

    http://squeakvm.org/win32/

and let me know what needs fixing.  Many thanks!

(Extra thanks if you provide a patch at the same time. :)

Regards,
Ian


Re: Win32 beta test

David T. Lewis
 
On Sun, Sep 16, 2012 at 11:08:55AM +0900, Ian Piumarta wrote:

>
> Folks,
>
> I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and fixed several issues that prevented it from compiling on recent versions of Windows using the latest release of MinGW.  I have also attempted to simplify the build process.  It works fine for me, but that isn't saying much, so...
>
> If you feel confident about building and/or test-driving a beta-quality Windows VM, please download 4.10.2-2612 from
>
>     http://squeakvm.org/win32/
>
> and let me know what needs fixing.  Many thanks!
>
> (Extra thanks if you provide a patch at the same time. :)
>
> Regards,
> Ian

Yay! Having an updated interpreter VM that can read image format 6505 should
allow us to finally get rid of that annoying 'Images saved under Cog cannot
be opened on an interpreter again!  Really save?' warning message.
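
As a rough illustration of what "image format 6505" refers to, the sketch below reads the format number from the start of an image header. It assumes the number is stored in the first 32-bit word of the file, in either byte order; the table of formats and the function names are illustrative, not taken from the VM sources.

```python
import struct

# Format numbers as commonly described for 32-bit Squeak images.
# Assumption: this table is illustrative and not exhaustive.
KNOWN_FORMATS = {
    6502: "32-bit image, no closure bytecodes",
    6504: "32-bit image with closure bytecodes",
    6505: "32-bit image with closures and native-order floats (saved by Cog)",
}


def image_format(header: bytes) -> int:
    """Return the format number found in the first 32-bit word of a header."""
    if len(header) < 4:
        raise ValueError("need at least 4 header bytes")
    # The image may have been written on a machine of either endianness,
    # so try both byte orders and accept the one that yields a known number.
    for byte_order in ("<", ">"):
        (value,) = struct.unpack(byte_order + "I", header[:4])
        if value in KNOWN_FORMATS:
            return value
    raise ValueError("unrecognised image format word")


def describe(header: bytes) -> str:
    """Return a one-line description of the image format."""
    number = image_format(header)
    return f"{number}: {KNOWN_FORMATS[number]}"
```

Passing the first four bytes of an image file to `describe` would report which family the image belongs to, under the assumptions above.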

Thanks!


Re: Win32 beta test

Sungjin Chun
In reply to this post by Ian Piumarta

Is this VM compatible with old images?

Sent from my iPhone

On Sep 16, 2012, at 11:08, Ian Piumarta <[hidden email]> wrote:

>
> Folks,
>
> I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and fixed several issues that prevented it from compiling on recent versions of Windows using the latest release of MinGW.  I have also attempted to simplify the build process.  It works fine for me, but that isn't saying much, so...
>
> If you feel confident about building and/or test-driving a beta-quality Windows VM, please download 4.10.2-2612 from
>
>    http://squeakvm.org/win32/
>
> and let me know what needs fixing.  Many thanks!
>
> (Extra thanks if you provide a patch at the same time. :)
>
> Regards,
> Ian
>

Re: Win32 beta test

Ian Piumarta
 
On Sep 16, 2012, at 17:27 , Sungjin Chun wrote:

> Is this VM compatible with old images?

The oldest image I have at hand is a 3.9 from mid-2006, and it works fine.

Regards,
Ian


Re: Win32 beta test

David T. Lewis
 
On Sun, Sep 16, 2012 at 06:57:56PM +0900, Ian Piumarta wrote:
>  
> On Sep 16, 2012, at 17:27 , Sungjin Chun wrote:
>
> > Is this VM compatible with old images?
>
> The oldest image I have at hand is a 3.9 from mid-2006, and it works fine.
>

It should work with Squeak 3.6 and later, including 3.8 which is the
basis for a lot of important work.

Dave


Re: Win32 beta test

Levente Uzonyi-2
In reply to this post by Ian Piumarta
 
On Sun, 16 Sep 2012, Ian Piumarta wrote:

>
> Folks,
>
> I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and fixed several issues that prevented it from compiling on recent versions of Windows using the latest release of MinGW.  I have also attempted to simplify the build process.  It works fine for me, but that isn't saying much, so...

Thanks, this is great. This VM is slightly slower than the ones built with
the old 2.95 gcc (bytecodes ~5%, sends ~22% according to tinyBenchmarks).
Did you use the latest MinGW?

The VM also has support for all mirror primitives and the new network
code which is great, though I found some weakness in the new network code.


Levente

>
> If you feel confident about building and/or test-driving a beta-quality Windows VM, please download 4.10.2-2612 from
>
>    http://squeakvm.org/win32/
>
> and let me know what needs fixing.  Many thanks!
>
> (Extra thanks if you provide a patch at the same time. :)
>
> Regards,
> Ian
>
>

Re: Win32 beta test

Karl Ramberg
In reply to this post by Ian Piumarta

On Sun, Sep 16, 2012 at 4:08 AM, Ian Piumarta <[hidden email]> wrote:

>
> Folks,
>
> I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and fixed several issues that prevented it from compiling on recent versions of Windows using the latest release of MinGW.  I have also attempted to simplify the build process.  It works fine for me, but that isn't saying much, so...
>
> If you feel confident about building and/or test-driving a beta-quality Windows VM, please download 4.10.2-2612 from
>
>     http://squeakvm.org/win32/
>
> and let me know what needs fixing.  Many thanks!
>
> (Extra thanks if you provide a patch at the same time. :)
>
> Regards,
> Ian
>

I could build a VM with this setup.


Karl

Re: Win32 beta test

Ian Piumarta
In reply to this post by Levente Uzonyi-2

Levente,

On Sep 17, 2012, at 02:17 , Levente Uzonyi wrote:

> This VM is slightly slower than the ones built with the old 2.95 gcc (bytecodes ~5%, sends ~22% according to tinyBenchmarks). Did you use the latest MinGW?

Yes, I used the new combined MSYS+MinGW distribution.  The compiler is gcc 4.7.0.  (The infamously bad 386 code was produced by gcc 4.4 and earlier.)

I removed a -mpentium that is not understood by the current compiler.  Finding the right -m switch(es) might be all that's needed to restore the lost performance.

Regards,
Ian


Re: Win32 beta test

Ian Piumarta

On Sep 17, 2012, at 08:08 , Ian Piumarta wrote:
>
>> This VM is slightly slower than the ones built with the old 2.95 gcc (bytecodes ~5%, sends ~22% according to tinyBenchmarks).
>
> I removed a -mpentium that is not understood by the current compiler.

"gcc 4.7.0" 0 tinyBenchmarks
 '933029612 bytecodes/sec; 25242859 sends/sec'
 '927536231 bytecodes/sec; 25217697 sends/sec'
 '932180245 bytecodes/sec; 24614121 sends/sec'


"gcc 2.95 -mpentium" 0 tinyBenchmarks
 '911032028 bytecodes/sec; 22911061 sends/sec'
 '896280087 bytecodes/sec; 22931829 sends/sec'
 '948148148 bytecodes/sec; 22973478 sends/sec'

There are four kinds of lies: lies, damned lies, statistics, and benchmarks running on x86. :)
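
To put the six runs above on one scale, here is a small sketch that averages each set and reports the relative difference. The numbers are copied from the tinyBenchmarks output above; the helper names are made up for this example.

```python
from statistics import mean

# (bytecodes/sec, sends/sec) for each run, copied from the figures above.
runs = {
    "gcc 4.7.0": [
        (933029612, 25242859),
        (927536231, 25217697),
        (932180245, 24614121),
    ],
    "gcc 2.95 -mpentium": [
        (911032028, 22911061),
        (896280087, 22931829),
        (948148148, 22973478),
    ],
}


def averages(samples):
    """Return (mean bytecodes/sec, mean sends/sec) for a list of runs."""
    return mean(b for b, _ in samples), mean(s for _, s in samples)


def percent_change(new, old):
    """Relative change of new versus old, in percent."""
    return 100.0 * (new - old) / old


new_bytecodes, new_sends = averages(runs["gcc 4.7.0"])
old_bytecodes, old_sends = averages(runs["gcc 2.95 -mpentium"])
bytecode_change = percent_change(new_bytecodes, old_bytecodes)
send_change = percent_change(new_sends, old_sends)
print(f"bytecodes: {bytecode_change:+.1f}%  sends: {send_change:+.1f}%")
```

On these six runs the gcc 4.7.0 build averages roughly 1.4% more bytecodes/sec and 9.1% more sends/sec. The spread of the 2.95 bytecode figures (about 896M to 948M) is larger than the bytecode difference itself, so only the sends figure says much.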

Regards,
Ian


Re: Win32 beta test

David T. Lewis
In reply to this post by Levente Uzonyi-2
 
On Sun, Sep 16, 2012 at 07:17:03PM +0200, Levente Uzonyi wrote:

>
> On Sun, 16 Sep 2012, Ian Piumarta wrote:
>
> >
> >Folks,
> >
> >I have updated the Win32 VM to use the latest 4.10.2 VMMaker sources and
> >fixed several issues that prevented it from compiling on recent versions
> >of Windows using the latest release of MinGW.  I have also attempted to
> >simplify the build process.  It works fine for me, but that isn't saying
> >much, so...
>
> Thanks, this is great. This VM is slightly slower than the ones built with
> the old 2.95 gcc (bytecodes ~5%, sends ~22% according to tinyBenchmarks).
> Did you use the latest MinGW?
>
> The VM also has support for all mirror primitives and the new network
> code which is great, though I found some weakness in the new network code.

Levente,

Can you say anything more about what weakness you found in the
network code?

Thanks,
Dave


Re: Win32 beta test

Bob Arning-2
In reply to this post by David T. Lewis
 
It would be really cool if there were a list somewhere pointing to the most modern VM capable of running various image versions. E.g. what's the most modern Windows VM that will run a 3.4 image?

Cheers,
Bob

On 9/16/12 7:40 AM, David T. Lewis wrote:

> On Sun, Sep 16, 2012 at 06:57:56PM +0900, Ian Piumarta wrote:
>
>> On Sep 16, 2012, at 17:27 , Sungjin Chun wrote:
>>
>>> Is this VM compatible with old images?
>>
>> The oldest image I have at hand is a 3.9 from mid-2006, and it works fine.
>
> It should work with Squeak 3.6 and later, including 3.8 which is the
> basis for a lot of important work.
>
> Dave




Re: Win32 beta test

Levente Uzonyi-2
In reply to this post by Ian Piumarta
 
On Mon, 17 Sep 2012, Ian Piumarta wrote:

>
> On Sep 17, 2012, at 08:08 , Ian Piumarta wrote:
>>
>>> This VM is slightly slower than the ones built with the old 2.95 gcc (bytecodes ~5%, sends ~22% according to tinyBenchmarks).
>>
>> I removed a -mpentium that is not understood by the current compiler.
>
> "gcc 4.7.0" 0 tinyBenchmarks
> '933029612 bytecodes/sec; 25242859 sends/sec'
> '927536231 bytecodes/sec; 25217697 sends/sec'
> '932180245 bytecodes/sec; 24614121 sends/sec'
>
>
> "gcc 2.95 -mpentium" 0 tinyBenchmarks
> '911032028 bytecodes/sec; 22911061 sends/sec'
> '896280087 bytecodes/sec; 22931829 sends/sec'
> '948148148 bytecodes/sec; 22973478 sends/sec'
>
> There are four kinds of lies: lies, damned lies, statistics, and benchmarks running on x86. :)

So something has changed in the code too. I compared the VM you built
with the VM I built last year using the old 2.95 gcc:
http://leves.web.elte.hu/squeak/SqueakVM-Win32-4.4.9-2358-non-official-bin.zip


Levente

>
> Regards,
> Ian
>
>

SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test)

Levente Uzonyi-2
In reply to this post by David T. Lewis
 
On Mon, 17 Sep 2012, David T. Lewis wrote:

> Levente,
>
> Can you say anything more about what weakness you found in the
> network code?

All issues I found are related to name lookup. To do a name lookup the new
code requires multiple primitive calls (see SocketAddressInformation >>
#forHost:service:flags:addressFamily:socketType:protocol:). The plugin
uses static variables to store the result of the name lookup
(hostNameInfo, servNameInfo and nameInfoValid). This means that only one
name can be looked up at a time.

The image-side code doesn't prevent simultaneous access to these static
variables, so they can get into an unexpected state (see SocketAddress >>
#hostName).

Another issue is that the plugin doesn't allocate objects (strings), so 2
primitive calls have to be done to fetch a string (see SocketAddress >>
#hostName again). One requests the size of the string, the other copies
the data to a string it receives as argument.
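
The interaction between the single static result and the size-then-copy protocol can be sketched as follows. This is a simplified model, not the SocketPlugin code: `SingleSlotResolver`, its methods, and the address table are hypothetical stand-ins, and the "other process" is simulated by a callback so the interleaving is deterministic.

```python
class SingleSlotResolver:
    """Hypothetical stand-in for a plugin that keeps one lookup result
    in static storage (compare hostNameInfo / nameInfoValid)."""

    def __init__(self, table):
        self._table = table      # fake DNS data: address -> host name
        self._host_name = b""    # the single shared result slot
        self._valid = False

    def start_lookup(self, address):
        # Overwrites whatever result was stored before.
        self._host_name = self._table[address].encode("ascii")
        self._valid = True

    def host_name_size(self):
        if not self._valid:
            raise RuntimeError("no lookup result available")
        return len(self._host_name)

    def copy_host_name_into(self, buffer):
        if len(buffer) != len(self._host_name):
            raise ValueError("buffer size does not match the stored result")
        buffer[:] = self._host_name


def host_name(resolver, address, interleave=None):
    """Size-then-copy fetch; `interleave` simulates another process
    running between the two calls."""
    resolver.start_lookup(address)
    buffer = bytearray(resolver.host_name_size())
    if interleave is not None:
        interleave()
    resolver.copy_host_name_into(buffer)
    return buffer.decode("ascii")


TABLE = {"192.0.2.1": "alpha.example", "192.0.2.2": "b.example"}
resolver = SingleSlotResolver(TABLE)


def clobbered_lookup():
    """A second lookup overwrites the shared slot between size and copy."""
    try:
        return host_name(
            resolver,
            "192.0.2.1",
            interleave=lambda: resolver.start_lookup("192.0.2.2"),
        )
    except ValueError as error:
        return f"failed: {error}"
```

Run alone, `host_name(resolver, "192.0.2.1")` returns the expected name; with the interleaved lookup the buffer sized for the first result no longer matches the stored one. Holding a lock across all three calls, or returning the string from a single call, would close that window.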

I could reproduce a deadlock-like state by evaluating:

NetNameResolver addressesForName: 'amazon.com'

It's sometimes possible to interrupt the process to get a debugger, but
since the primitives are called by the debugger too (see SocketAddress >>
#printOn:), the image will hang if you try to use it.


Levente

>
> Thanks,
> Dave
>
>

Re: Win32 beta test

David T. Lewis
In reply to this post by Bob Arning-2
 
I do not have this information, but I agree that it would be really nice
to have.

Dave

On Mon, Sep 17, 2012 at 12:32:36PM -0400, Bob Arning wrote:

>  
> It would be really cool if there were a list somewhere pointing to the
> most modern VM capable of running various image versions. E.g. what's
> the most modern Windows VM that will run a 3.4 image?
>
> Cheers,
> Bob
>
> On 9/16/12 7:40 AM, David T. Lewis wrote:
> >  
> >On Sun, Sep 16, 2012 at 06:57:56PM +0900, Ian Piumarta wrote:
> >>  
> >>On Sep 16, 2012, at 17:27 , Sungjin Chun wrote:
> >>
> >>>Is this VM compatible with old images?
> >>The oldest image I have at hand is a 3.9 from mid-2006, and it works fine.
> >>
> >It should work with Squeak 3.6 and later, including 3.8 which is the
> >basis for a lot of important work.
> >
> >Dave
> >
> >
>


Re: SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test)

David T. Lewis
In reply to this post by Levente Uzonyi-2
 
Thanks Levente, this is very helpful.

It sounds like these are problems in the new network code on the image
side (and I'm responsible for causing that). The updated VM is probably
different only in that it provides the IPv6 primitives, which in turn
expose the bugs on the image side. So I expect that the problems you
describe should also happen with a unix interpreter VM (I'll check
and find out as soon as I can).

I note for the record that Andreas is fully entitled to say "I told
you so!" at this point ;)

To the extent that these are Squeak image problems, evaluating
"NetNameResolver useOldNetwork: true" should make the symptoms
go away.

Other comments in line below.

On Mon, Sep 17, 2012 at 07:01:47PM +0200, Levente Uzonyi wrote:

>
> On Mon, 17 Sep 2012, David T. Lewis wrote:
>
> >Levente,
> >
> >Can you say anything more about what weakness you found in the
> >network code?
>
> All issues I found are related to name lookup. To do a name lookup the new
> code requires multiple primitive calls (see SocketAddressInformation >>
> #forHost:service:flags:addressFamily:socketType:protocol:). The plugin
> uses static variables to store the result of the name lookup
> (hostNameInfo, servNameInfo and nameInfoValid). This means that only one
> name can be looked up at a time.

I noticed that, and attempted to provide some protection for it with
a semaphore (ResolverMutex) in class NetNameResolver. Apparently I did
not do a very good job of it though.

>
> The image side code doesn't prevent simultaneous access to these static
> variables, so they can get into an unexpected state (see SocketAddress >>
> #hostName).
>
> Another issue is that the plugin doesn't allocate objects (strings), so 2
> primitive calls have to be done to fetch a string (see SocketAddress >>
> #hostName again). One requests the size of the string, the other copies
> the data to a string it receives as argument.
>
> I could reproduce a deadlock-like state by evaluating:
>
> NetNameResolver addressesForName: 'amazon.com'
>
> It's sometimes possible to interrupt the process to get a debugger, but
> since the primitives are called by the debugger too (see SocketAddress >>
> #printOn:), the image will hang if you try to use it.
>

That seems likely to be a problem related to the semaphore in NetNameResolver.
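
One way such a hang could arise, offered as a guess and not a diagnosis: if a lookup is stuck while holding the mutex, anything else that needs the same mutex (for example, printing an address in the debugger) waits too. The sketch below models that with plain threads; `resolver_mutex` only plays the role of ResolverMutex, and the function names are hypothetical.

```python
import threading

resolver_mutex = threading.Lock()  # plays the role of ResolverMutex


def slow_lookup(started, release):
    """Hold the mutex for as long as the simulated primitive is stuck."""
    with resolver_mutex:
        started.set()
        release.wait()


def try_print_address(timeout=0.05):
    """Model a debugger-side print that needs the same mutex.

    A real caller without a timeout would simply block here.
    """
    if resolver_mutex.acquire(timeout=timeout):
        try:
            return "printed"
        finally:
            resolver_mutex.release()
    return "would hang"


def demo():
    started, release = threading.Event(), threading.Event()
    worker = threading.Thread(target=slow_lookup, args=(started, release))
    worker.start()
    started.wait()                    # the lookup now holds the mutex
    while_stuck = try_print_address()
    release.set()                     # let the simulated primitive return
    worker.join()
    after_release = try_print_address()
    return while_stuck, after_release
```

`demo()` reports that the print attempt cannot proceed while the lookup holds the mutex and succeeds once it is released.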

If this turns out to be a problem with the network support in Squeak
trunk, we can take the discussion back to squeak-dev for resolution.

Dave


Re: Win32 beta test

Sungjin Chun
In reply to this post by David T. Lewis
 
My MVC-only, 3.8.1-based images work well.

On Tue, Sep 18, 2012 at 4:13 AM, David T. Lewis <[hidden email]> wrote:

>
> I do not have this information, but I agree that it would be really nice
> to have.
>
> Dave
>
> On Mon, Sep 17, 2012 at 12:32:36PM -0400, Bob Arning wrote:
>>
>> It would be really cool if there were a list somewhere pointing to the
>> most modern VM capable of running various image versions. E.g. what's
>> the most modern Windows VM that will run a 3.4 image?
>>
>> Cheers,
>> Bob
>>
>> On 9/16/12 7:40 AM, David T. Lewis wrote:
>> >
>> >On Sun, Sep 16, 2012 at 06:57:56PM +0900, Ian Piumarta wrote:
>> >>
>> >>On Sep 16, 2012, at 17:27 , Sungjin Chun wrote:
>> >>
>> >>>Is this VM compatible with old images?
>> >>The oldest image I have at hand is a 3.9 from mid-2006, and it works fine.
>> >>
>> >It should work with Squeak 3.6 and later, including 3.8 which is the
>> >basis for a lot of important work.
>> >
>> >Dave
>> >
>> >
>>
>

Re: SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test)

David T. Lewis
In reply to this post by David T. Lewis
 
There seem to be some issues with the new networking code (in Squeak,
for IPv6 support) when running on a Windows VM that has the IPv6
primitives in the SocketPlugin.

I booted up Windows and tried Levente's deadlock test:

> > I could reproduce a deadlock-like state by evaluating:
> >
> > NetNameResolver addressesForName: 'amazon.com'
> >

I tried this both with Ian's beta interpreter VM, and one of Eliot's
recent Cog VMs. Both VMs have the IPv6 primitives now, and both of
them show similar issues. I did not see actual deadlocks, but what
I did see was extremely long primitive calls that make the image feel
like it is deadlocked. The #primitiveResolverGetNameInfo call is a
source of problems, and there may be others.

It looks to me like some of the new primitives are invoking some very
slow system functions on Windows, and the Squeak network code updates
cause these primitives to be called if they are available in the VM,
so the newer VMs are having problems. I have not seen these issues
on Linux, so it may reflect differences in the networking support
for different operating systems.

Levente, is this consistent with what you were seeing?

Dave

On Mon, Sep 17, 2012 at 03:37:45PM -0400, David T. Lewis wrote:

> Thanks Levente, this is very helpful.
>
> It sounds like these are problems in the new network code on the image
> side (and I'm responsible for causing that). The updated VM is probably
> different only in that it provides the IPv6 primitives, which in turn
> expose the bugs on the image side. So I expect that the problems you
> describe should also happen with a unix interpreter VM (I'll check
> and find out as soon as I can).
>
> I note for the record that Andreas is fully entitled to say "I told
> you so!" at this point ;)
>
> To the extent that these are Squeak image problems, evaluating
> "NetNameResolver useOldNetwork: true" should make the symptoms
> go away.
>
> Other comments in line below.
>
> On Mon, Sep 17, 2012 at 07:01:47PM +0200, Levente Uzonyi wrote:
> >
> > On Mon, 17 Sep 2012, David T. Lewis wrote:
> >
> > >Levente,
> > >
> > >Can you say anything more about what weakness you found in the
> > >network code?
> >
> > All issues I found are related to name lookup. To do a name lookup the new
> > code requires multiple primitive calls (see SocketAddressInformation >>
> > #forHost:service:flags:addressFamily:socketType:protocol:). The plugin
> > uses static variables to store the result of the name lookup
> > (hostNameInfo, servNameInfo and nameInfoValid). This means that only one
> > name can be looked up at a time.
>
> I noticed that, and attempted to provide some protection for it with
> a semaphore (ResolverMutex) in class NetNameResolver. Apparently I did
> not do a very good job of it though.
>
> >
> > The image side code doesn't prevent simultaneous access to these static
> > variables, so they can get into an unexpected state (see SocketAddress >>
> > #hostName).
> >
> > Another issue is that the plugin doesn't allocate objects (strings), so 2
> > primitive calls have to be done to fetch a string (see SocketAddress >>
> > #hostName again). One requests the size of the string, the other copies
> > the data to a string it receives as argument.
> >
> > I could reproduce a deadlock-like state by evaluating:
> >
> > NetNameResolver addressesForName: 'amazon.com'
> >
> > It's sometimes possible to interrupt the process to get a debugger, but
> > since the primitives are called by the debugger too (see SocketAddress >>
> > #printOn:), the image will hang if you try to use it.
> >
>
> That seems likely to be a problem related to the semaphore in NetNameResolver.
>
> If this turns out to be a problem with the network support in Squeak
> trunk, we can take the discussion back to squeak-dev for resolution.
>
> Dave

Re: SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test)

Levente Uzonyi-2
 
On Wed, 19 Sep 2012, David T. Lewis wrote:

> There seem to be some issues with the new networking code (in Squeak,
> for IPv6 support) when running on a Windows VM that has the IPv6
> primitives in the SocketPlugin.
>
> I booted up Windows and tried Levente's deadlock test:
>
>>> I could reproduce a deadlock-like state by evaluating:
>>>
>>> NetNameResolver addressesForName: 'amazon.com'
>>>
>
> I tried this both with Ian's beta interpreter VM, and one of Eliot's
> recent Cog VMs. Both VMs have the IPv6 primitives now, and both of
> them show similar issues. I did not see actual deadlocks, but what
> I did see was extremely long primitive calls that make the image feel
> like it is deadlocked. The #primitiveResolverGetNameInfo call is a
> source of problems, and there may be others.
>
> It looks to me like some of the new primitives are invoking some very
> slow system functions on Windows, and the Squeak network code updates
> cause these primitives to be called if they are available in the VM,
> so the newer VMs are having problems. I have not seen these issues
> on Linux, so it may reflect differences in the networking support
> for different operating systems.
>
> Levente, is this consistent with what you were seeing?

It is, though I didn't check which primitive takes too long. I checked the
implementation of #primitiveResolverGetNameInfo now on win32 and it seems
to be okay, pretty much the same as on unix/linux. I see no reason why it
would take so long to respond.

In the meantime I found another thing I don't like about the new
primitives. The timeout for the name lookup is ignored, so now we have to
wait until the primitive returns.
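
For comparison, this is the general pattern for honouring a caller's timeout around a blocking lookup: run the call on a worker thread and stop waiting at the deadline. It is only a sketch of the idea. A blocking primitive stalls the whole interpreter, so in the VM this would have to happen on the plugin side; the function names and the fake lookups are hypothetical.

```python
import concurrent.futures
import time


def lookup_with_timeout(lookup, name, timeout_seconds):
    """Run a blocking lookup in a worker thread and give up at a deadline.

    Returns the lookup result, or None if the deadline passes first.
    """
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(lookup, name)
    try:
        return future.result(timeout=timeout_seconds)
    except concurrent.futures.TimeoutError:
        return None
    finally:
        # Do not wait for the worker; an abandoned call finishes on its own.
        executor.shutdown(wait=False)


def slow_fake_lookup(name):
    """Stand-in for a resolver call that takes too long."""
    time.sleep(0.3)
    return ["192.0.2.10"]


def fast_fake_lookup(name):
    """Stand-in for a resolver call that answers immediately."""
    return ["192.0.2.10"]
```

The abandoned call still runs to completion in the background, so this bounds the caller's wait without cancelling the underlying lookup.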

Actually, I don't like the way the name lookup is implemented. I think
it might be worth moving the DNS lookup to Squeak. The only thing the VM
needs to provide then is a primitive which returns the IP addresses
of the nameservers to be used (though the system can work without that
too). Here are the pros and cons I came up with so far:
Pros:
- the code is in Smalltalk, so it's platform independent
- concurrent name lookups become possible
- no more long waits on the VM side
Cons:
- extra complexity, since a DNS client has to be implemented
- the OS's DNS cache can't be used
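
To give a sense of the "extra complexity" item, here is a minimal sketch of the first step an image-side DNS client would need: encoding a query in the RFC 1035 wire format. It builds a single-question recursive query and does no networking or response parsing; the function names are made up for this example.

```python
import struct


def encode_name(host: str) -> bytes:
    """Encode a host name as length-prefixed labels ending in a zero byte."""
    encoded = bytearray()
    for label in host.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"bad label length in {host!r}")
        encoded.append(len(raw))
        encoded += raw
    encoded.append(0)  # root label terminates the name
    return bytes(encoded)


def build_query(host: str, query_id: int, qtype: int = 1) -> bytes:
    """Build a standard recursive query (QTYPE 1 = A, QCLASS 1 = IN)."""
    flags = 0x0100  # only the RD (recursion desired) bit is set
    # Header: ID, flags, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    question = encode_name(host) + struct.pack("!HH", qtype, 1)
    return header + question
```

A full client would send these bytes over UDP to a nameserver, match the reply by ID, parse the answer records (including name compression), and handle retries and truncation, which is where most of the work lies.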


Levente

>
> Dave
>
> On Mon, Sep 17, 2012 at 03:37:45PM -0400, David T. Lewis wrote:
>> Thanks Levente, this is very helpful.
>>
>> It sounds like these are problems in the new network code on the image
>> side (and I'm responsible for causing that). The updated VM is probably
>> different only in that it provides the IPv6 primitives, which in turn
>> expose the bugs on the image side. So I expect that the problems you
>> describe should also happen with a unix interpreter VM (I'll check
>> and find out as soon as I can).
>>
>> I note for the record that Andreas is fully entitled to say "I told
>> you so!" at this point ;)
>>
>> To the extent that these are Squeak image problems, evaluating
>> "NetNameResolver useOldNetwork: true" should make the symptoms
>> go away.
>>
>> Other comments in line below.
>>
>> On Mon, Sep 17, 2012 at 07:01:47PM +0200, Levente Uzonyi wrote:
>>>
>>> On Mon, 17 Sep 2012, David T. Lewis wrote:
>>>
>>>> Levente,
>>>>
>>>> Can you say anything more about what weakness you found in the
>>>> network code?
>>>
>>> All issues I found are related to name lookup. To do a name lookup the new
>>> code requires multiple primitive calls (see SocketAddressInformation >>
>>> #forHost:service:flags:addressFamily:socketType:protocol:). The plugin
>>> uses static variables to store the result of the name lookup
>>> (hostNameInfo, servNameInfo and nameInfoValid). This means that only one
>>> name can be looked up at a time.
>>
>> I noticed that, and attempted to provide some protection for it with
>> a semaphore (ResolverMutex) in class NetNameResolver. Apparently I did
>> not do a very good job of it though.
>>
>>>
>>> The image side code doesn't prevent simultaneous access to these static
>>> variables, so they can get into an unexpected state (see SocketAddress >>
>>> #hostName).
>>>
>>> Another issue is that the plugin doesn't allocate objects (strings), so 2
>>> primitive calls have to be done to fetch a string (see SocketAddress >>
>>> #hostName again). One requests the size of the string, the other copies
>>> the data to a string it receives as argument.
>>>
>>> I could reproduce a deadlock-like state by evaluating:
>>>
>>> NetNameResolver addressesForName: 'amazon.com'
>>>
>>> It's sometimes possible to interrupt the process to get a debugger, but
>>> since the primitives are called by the debugger too (see SocketAddress >>
>>> #printOn:), the image will hang if you try to use it.
>>>
>>
>> That seems likely to be a problem related to the semaphore in NetNameResolver.
>>
>> If this turns out to be a problem with the network support in Squeak
>> trunk, we can take the discussion back to squeak-dev for resolution.
>>
>> Dave
>

Re: SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test)

David T. Lewis
 
We have an upcoming image release, so I think that we should consider
de-activating the new network support for that release. I think that
can be done by just setting UseOldNetwork to true. Right now it is
automatically set at image startup, depending on whether the IPv6
primitives are available in the VM. We could turn it back on in
trunk after the release, and deal with the issues then.

It would not be good to release with network issues that might affect
the entire Windows user base.

Dave


On Thu, Sep 20, 2012 at 07:50:42PM +0200, Levente Uzonyi wrote:

> On Wed, 19 Sep 2012, David T. Lewis wrote:
>
> >There seem to be some issues with the new networking code (in Squeak,
> >for IPv6 support) when running on a Windows VM that has the IPv6
> >primitives in the SocketPlugin.
> >
> >I booted up Windows and tried Levente's deadlock test:
> >
> >>>I could reproduce a deadlock-like state by evaluating:
> >>>
> >>>NetNameResolver addressesForName: 'amazon.com'
> >>>
> >
> >I tried this both with Ian's beta interpreter VM, and one of Eliot's
> >recent Cog VMs. Both VMs have the IPv6 primitives now, and both of
> >them show similar issues. I did not see actual deadlocks, but what
> >I did see was extremely long primitive calls that make the image feel
> >like it is deadlocked. The #primitiveResolverGetNameInfo call is a
> >source of problems, and there may be others.
> >
> >It looks to me like some of the new primitives are invoking some very
> >slow system functions on Windows, and the Squeak network code updates
> >cause these primitives to be called if they are available in the VM,
> >so the newer VMs are having problems. I have not seen these issues
> >on Linux, so it may reflect differences in the networking support
> >for different operating systems.
> >
> >Levente, is this consistent with what you were seeing?
>
> It is, though I didn't check which primitive takes too long. I checked the
> implementation of #primitiveResolverGetNameInfo now on win32 and it seems
> to be okay, pretty much the same as on unix/linux. I see no reason why it
> would take so long to respond.
>
> In the meantime I found another thing I don't like about the new
> primitives. The timeout for the name lookup is ignored, so now we have to
> wait till the primitive returns.
>
> Actually, I don't like the way the name lookup is implemented. I think
> it might be worth moving the DNS lookup to Squeak. The only thing the VM
> needs to provide then is a primitive which returns the IP addresses
> of the nameservers to be used (though the system can work without that
> too). Here are the pros and cons I came up with so far:
> Pros:
> - the code is in Smalltalk, so it's platform independent
> - concurrent name lookups become possible
> - no more long waits on the VM side
> Cons:
> - extra complexity, since a DNS client has to be implemented
> - the OS's DNS cache can't be used
>
>
> Levente
>
> >
> >Dave
> >
> >On Mon, Sep 17, 2012 at 03:37:45PM -0400, David T. Lewis wrote:
> >>Thanks Levente, this is very helpful.
> >>
> >>It sounds like these are problems in the new network code on the image
> >>side (and I'm responsible for causing that). The updated VM is probably
> >>different only in that it provides the IPv6 primitives, which in turn
> >>expose the bugs on the image side. So I expect that the problems you
> >>describe should also happen with a unix interpreter VM (I'll check
> >>and find out as soon as I can).
> >>
> >>I note for the record that Andreas is fully entitled to say "I told
> >>you so!" at this point ;)
> >>
> >>To the extent that these are Squeak image problems, evaluating
> >>"NetNameResolver useOldNetwork: true" should make the symptoms
> >>go away.
> >>
> >>Other comments in line below.
> >>
> >>On Mon, Sep 17, 2012 at 07:01:47PM +0200, Levente Uzonyi wrote:
> >>>
> >>>On Mon, 17 Sep 2012, David T. Lewis wrote:
> >>>
> >>>>Levente,
> >>>>
> >>>>Can you say anything more about what weakness you found in the
> >>>>network code?
> >>>
> >>>All issues I found are related to name lookup. To do a name lookup the
> >>>new
> >>>code requires multiple primitive calls (see SocketAddressInformation >>
> >>>#forHost:service:flags:addressFamily:socketType:protocol:). The plugin
> >>>uses static variables to store the result of the name lookup
> >>>(hostNameInfo, servNameInfo and nameInfoValid). This means that only one
> >>>name can be looked up at a time.
> >>
> >>I noticed that, and attempted to provide some protection for it with
> >>a semaphore (ResolverMutex) in class NetNameResolver. Apparently I did
> >>not do a very good job of it though.
> >>
> >>>
> >>>The image side code doesn't prevent simultaneous access to these static
> >>>variables, so they can get into an unexpected state (see SocketAddress >>
> >>>#hostName).
> >>>
> >>>Another issue is that the plugin doesn't allocate objects (strings), so 2
> >>>primitive calls have to be done to fetch a string (see SocketAddress >>
> >>>#hostName again). One requests the size of the string, the other copies
> >>>the data to a string it receives as argument.
> >>>
> >>>I could reproduce a deadlock-like state by evaluating:
> >>>
> >>>NetNameResolver addressesForName: 'amazon.com'
> >>>
> >>>It's sometimes possible to interrupt the process to get a debugger, but
> >>>since the primitives are called by the debugger too (see SocketAddress >>
> >>>#printOn:), the image will hang if you try to use it.
> >>>
> >>
> >>That seems likely to be a problem related to the semaphore in
> >>NetNameResolver.
> >>
> >>If this turns out to be a problem with the network support in Squeak
> >>trunk, we can take the discussion back to squeak-dev for resolution.
> >>
> >>Dave
> >

Issues in the new network support (was: SocketPlugin issues (was: Re: [Vm-dev] Win32 beta test))

David T. Lewis
 
Moving the discussion to squeak-dev. Follow-up related to the Squeak image
should be on squeak-dev, discussion of the primitives can stay on vm-dev.

Dave

On Thu, Sep 20, 2012 at 02:11:35PM -0400, David T. Lewis wrote:

>  
> We have an upcoming image release, so I think that we should consider
> de-activating the new network support for that release. I think that
> can be done by just setting UseOldNetwork to true. Right now it is
> automatically set at image startup, depending on whether the IPv6
> primitives are available in the image. We could turn it back on in
> trunk after the release, and deal with the issues then.
>
> It would not be good to release with network issues that might affect
> the entire Windows user base.
>
> Dave
>
>
> On Thu, Sep 20, 2012 at 07:50:42PM +0200, Levente Uzonyi wrote:
> > On Wed, 19 Sep 2012, David T. Lewis wrote:
> >
> > >There seem to be some issues with the new networking code (in Squeak,
> > >for IPv6 support) when running on a Windows VM that has the IPv6
> > >primitives in the SocketPlugin.
> > >
> > >I booted up Windows and tried Levente's deadlock test:
> > >
> > >>>I could reproduce a deadlock-like state by evaluating:
> > >>>
> > >>>NetNameResolver addressesForName: 'amazon.com'
> > >>>
> > >
> > >I tried this both with Ian's beta interpreter VM, and one of Eliot's
> > >recent Cog VMs. Both VMs have the IPv6 primitives now, and both of
> > >them show similar issues. I did not see actual deadlocks, but what
> > >I did see was extremely long primitive calls that make the image feel
> > >like it is deadlocked. The #primitiveResolverGetNameInfo call is a
> > >source of problems, and there may be others.
> > >
> > >It looks to me like some of the new primitives are invoking some very
> > >slow system functions on Windows, and the Squeak network code updates
> > >cause these primitives to be called if they are available in the VM,
> > >so the newer VMs are having problems. I have not seen these issues
> > >on Linux, so it may reflect differences in the networking support
> > >for different operating systems.
> > >
> > >Levente, is this consistent with what you were seeing?
> >
> > It is, though I didn't check which primitive takes too long. I checked the
> > implementation of #primitiveResolverGetNameInfo now on win32 and it seems
> > to be okay, pretty much the same as on unix/linux. I see no reason why it
> > would take so long to respond.
> >
> > In the meantime I found another thing I don't like about the new
> > primitives. The timeout for the name lookup is ignored, so now we have to
> > wait till the primitive returns.
> >
> > Actually I don't like the way the name lookup is implemented. I think
> > it might be worth moving the DNS lookup to Squeak. The only thing the VM
> > needs to provide then is a primitive which returns the IP addresses
> > of the nameservers to be used (though the system can work without that
> > too). Here are the pros and cons I came up with so far:
> > Pros:
> > - the code is in Smalltalk, so it's platform independent
> > - concurrent name lookups become possible
> > - no more long waits on the VM side
> > Cons:
> > - extra complexity, since a DNS client has to be implemented
> > - the OS's DNS cache can't be used
> >
> >
> > Levente
> >
> > >
> > >Dave
> > >
> > >On Mon, Sep 17, 2012 at 03:37:45PM -0400, David T. Lewis wrote:
> > >>Thanks Levente, this is very helpful.
> > >>
> > >>It sounds like these are problems in the new network code on the image
> > >>side (and I'm responsible for causing that). The updated VM is probably
> > >>different only in that it provides the IPv6 primitives, which in turn
> > >>expose the bugs on the image side. So I expect that the problems you
> > >>describe should also happen with a unix interpreter VM (I'll check
> > >>and find out as soon as I can).
> > >>
> > >>I note for the record that Andreas is fully entitled to say "I told
> > >>you so!" at this point ;)
> > >>
> > >>To the extent that these are Squeak image problems, evaluating
> > >>"NetNameResolver useOldNetwork: true" should make the symptoms
> > >>go away.
> > >>
> > >>Other comments in line below.
> > >>
> > >>On Mon, Sep 17, 2012 at 07:01:47PM +0200, Levente Uzonyi wrote:
> > >>>
> > >>>On Mon, 17 Sep 2012, David T. Lewis wrote:
> > >>>
> > >>>>Levente,
> > >>>>
> > >>>>Can you say anything more about what weakness you found in the
> > >>>>network code?
> > >>>
> > >>>All issues I found are related to name lookup. To do a name lookup, the
> > >>>new code requires multiple primitive calls (see SocketAddressInformation >>
> > >>>#forHost:service:flags:addressFamily:socketType:protocol:). The plugin
> > >>>uses static variables to store the result of the name lookup
> > >>>(hostNameInfo, servNameInfo and nameInfoValid). This means that only one
> > >>>name can be looked up at a time.
> > >>
> > >>I noticed that, and attempted to provide some protection for it with
> > >>a semaphore (ResolverMutex) in class NetNameResolver. Apparently I did
> > >>not do a very good job of it though.
> > >>
> > >>>
> > >>>The image side code doesn't prevent simultaneous access to these static
> > >>>variables, so they can get into an unexpected state (see SocketAddress >>
> > >>>#hostName).
> > >>>
> > >>>Another issue is that the plugin doesn't allocate objects (strings), so 2
> > >>>primitive calls have to be done to fetch a string (see SocketAddress >>
> > >>>#hostName again). One requests the size of the string, the other copies
> > >>>the data to a string it receives as argument.
> > >>>
> > >>>I could reproduce a deadlock-like state by evaluating:
> > >>>
> > >>>NetNameResolver addressesForName: 'amazon.com'
> > >>>
> > >>>It's sometimes possible to interrupt the process to get a debugger, but
> > >>>since the primitives are called by the debugger too (see SocketAddress >>
> > >>>#printOn:), the image will hang if you try to use it.
> > >>>
> > >>
> > >>That seems likely to be a problem related to the semaphore in
> > >>NetNameResolver.
> > >>
> > >>If this turns out to be a problem with the network support in Squeak
> > >>trunk, we can take the discussion back to squeak-dev for resolution.
> > >>
> > >>Dave
> > >
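
The shared-state problem described in the quoted messages (one lookup result held in static plugin variables and read back through several primitive calls) can be sketched outside Squeak. What follows is a minimal Python model, not the SocketPlugin code: FakeResolverPlugin, Resolver, and every method name are hypothetical stand-ins chosen for illustration. It only tries to show why the whole "look up, ask for size, copy result" sequence has to run under one lock.

```python
import threading


class FakeResolverPlugin:
    """Hypothetical stand-in for a plugin that keeps ONE lookup result
    in shared state, like the static hostNameInfo mentioned above."""

    def __init__(self):
        self._host = None

    def get_name_info(self, address):
        # Step 1: do the lookup and stash the result in shared state.
        self._host = "host-for-" + address

    def host_size(self):
        # Step 2: tell the caller how large a string to allocate.
        return len(self._host)

    def host_result(self, size):
        # Step 3: copy the stored result out for the caller.
        return self._host[:size]


class Resolver:
    """Wraps the three-step sequence in a single critical section."""

    def __init__(self, plugin):
        self._plugin = plugin
        self._lock = threading.Lock()

    def host_name(self, address):
        # Without the lock, another thread could run step 1 between our
        # steps 1 and 3, and we would read its result (or a wrong size).
        with self._lock:
            self._plugin.get_name_info(address)
            size = self._plugin.host_size()
            return self._plugin.host_result(size)
```

Note that the lock only keeps the results consistent. Lookups are still serialized, so a single slow call blocks every other caller, which is the limitation listed under the pros and cons above.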