mixing open and closed source within Pharo

Ben Coman

What ideas are floating around about mixing open source and closed
source using Pharo?  I am implementing an IEC Standard object model for
electrical power systems to provide a platform for developing electrical
applications.  I am considering the case where a company may maintain
the model of their electrical power distribution network in the open
source platform, but then buy various commercial plug-ins to perform
different calculations upon the shared model.  Here are the options I
can imagine...

1. Using Fuel to load binary packages into the one image without the
source.  This technology is available today, but viewing and decompiling
the bytecode is still possible.  To what degree this enables reverse
engineering I am not sure.

2. Having VM support for restricting the display and decompilation of
bytecode.  To make it harder to simply switch to another VM without this
restriction, the Fuel package could be encrypted with a key matching one
compiled into the required VM.

3. Running multiple images on a single VM, with the VM passing message
calls efficiently between the two images like an "enterprise bus" - one
image open source and one closed source.  Any common base objects
between the images might be shared on a copy-on-write basis.


What are your thoughts?

cheers -ben


Re: mixing open and closed source within Pharo

Igor Stasenko
On 30 May 2012 16:04, Ben Coman <[hidden email]> wrote:

>
> What ideas are floating around about mixing open source and closed source
> using Pharo?  I am implementing an IEC Standard object model for electrical
> power systems to provide a platform for developing electrical applications.
> I am considering the case where a company may maintain the model of their
> electrical power distribution network in the open source platform, but then
> buy various commercial plug-ins to perform different calculations upon the
> shared model.  Here are the options I can imagine...
>
> 1. Using Fuel to load binary packages into the one image without the
> source.  This technology is available today, but viewing and decompiling
> the bytecode is still possible.  To what degree this enables reverse
> engineering I am not sure.

The Decompiler can fully reproduce the source code of a method.
Only the variable names are lost; everything else is quite clear.

>
> 2. Having VM support for restricting the display and decompilation of
> bytecode.  To make it harder to simply switch to another VM without this
> restriction, the Fuel package could be encrypted with a key matching one
> compiled into the required VM.

This is not an option.
The current Debugger implementation relies on having access to CM bytecodes.

>
> 3. Running multiple images on a single VM, with the VM passing message
> calls efficiently between the two images like an "enterprise bus" - one
> image open source and one closed source.  Any common base objects between
> the images might be shared on a copy-on-write basis.
>
>
> What are your thoughts?
>

I think option 3 is the most viable:
the two images can communicate like two parties who don't trust each
other (so the images should handshake, use encryption key(s), and try
to log in to one another).  If that succeeds, one side can be allowed
to see source code, use a remote debugger, etc.
Nothing new under the sun...
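
The handshake between two mutually distrusting images could look roughly
like the challenge-response sketch below.  It is written in Python rather
than Pharo, and the shared secret, the function names and the choice of
HMAC-SHA256 are all assumptions made for illustration; the thread itself
only says "handshake, use encryption key(s)".

```python
import hashlib
import hmac
import secrets

# Hypothetical shared secret, provisioned to both images out of band.
SHARED_KEY = secrets.token_bytes(32)


def make_challenge() -> bytes:
    """Serving image: produce a fresh random nonce for each login attempt."""
    return secrets.token_bytes(16)


def answer_challenge(key: bytes, challenge: bytes) -> bytes:
    """Connecting image: prove knowledge of the key without sending it."""
    return hmac.new(key, challenge, hashlib.sha256).digest()


def verify_answer(key: bytes, challenge: bytes, answer: bytes) -> bool:
    """Serving image: recompute the expected answer, compare in constant time."""
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, answer)


challenge = make_challenge()
good = answer_challenge(SHARED_KEY, challenge)
bad = answer_challenge(b"wrong key", challenge)
print(verify_answer(SHARED_KEY, challenge, good))  # True
print(verify_answer(SHARED_KEY, challenge, bad))   # False
```

Only after a successful check would the closed-source image decide what,
if anything, to expose to the other side.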




--
Best regards,
Igor Stasenko.


Re: mixing open and closed source within Pharo

Frank Shearar-3
On 30 May 2012 18:36, Igor Stasenko <[hidden email]> wrote:

> The Decompiler can fully reproduce the source code of a method.
> Only the variable names are lost; everything else is quite clear.

I don't know the conditions, but Decompiler can certainly keep
variable names _sometimes_. Maybe Pharo's and Squeak's Decompiler have
diverged? Or maybe it's because the .changes file is available or
something?

frank



Re: mixing open and closed source within Pharo

Marcus Denker-4

On May 30, 2012, at 8:01 PM, Frank Shearar wrote:

> I don't know the conditions, but Decompiler can certainly keep
> variable names _sometimes_.

yes, when there is the original source.

> Maybe Pharo's and Squeak's Decompiler have
> diverged?

Not yet but soon :-)

> Or maybe it's because the .changes file is available or
> something?
>

yes.


--
Marcus Denker -- http://marcusdenker.de



Re: mixing open and closed source within Pharo

Ben Coman
In reply to this post by Igor Stasenko
Igor Stasenko wrote:

>> 2. Having VM support for restricting the display and decompilation of
>> bytecode.  To make it harder to simply switch to another VM without this
>> restriction, the Fuel package could be encrypted with a key matching one
>> compiled into the required VM.
>
> This is not an option.
> The current Debugger implementation relies on having access to CM bytecodes.

A necessary consequence of (2.) would be that you could not debug
"through" that package.  While this might be unfortunate from an
open-source debugging viewpoint, I would still consider it a step up
compared to having the whole application delivered closed-source with
the development tools stripped.  Ignoring the current implementation of
Debugger, could something like this be reasonably achievable?

To expand on this with a specific use case... The VM could internally
generate a public/private key pair.  When a plug-in is requested from an
App Store, the public key is sent along, and the App Store uses it to
encrypt the bytecode of the plug-in.  Once the plug-in is downloaded
into the image, upon execution the VM receives the encrypted bytecode,
decrypts it with its private key and caches the decrypted bytecode
internally, such that it is never visible to the image.
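
A rough sketch of that flow, in Python rather than Pharo.  Everything
here is hypothetical: ToyVM and store_encrypt are made-up names, and the
cipher is textbook RSA with tiny fixed primes applied byte by byte, which
is trivially breakable.  A real design would presumably use a vetted
crypto library and hybrid encryption; the sketch only shows who holds
which key and where the decrypted bytecode lives.

```python
# Toy textbook RSA with tiny fixed primes. Insecure; for illustration only.
P, Q = 61, 53
N = P * Q                    # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent, coprime with PHI
D = pow(E, -1, PHI)          # private exponent (needs Python 3.8+)


def store_encrypt(public_key, bytecode):
    """App Store side: encrypt each byte with the VM's public key."""
    e, n = public_key
    return [pow(byte, e, n) for byte in bytecode]


class ToyVM:
    """Hypothetical VM that keeps the private key and decrypted code to itself."""

    def __init__(self):
        self.public_key = (E, N)       # sent to the store with the request
        self._private_key = (D, N)     # never handed to the image
        self._decrypted = {}           # internal cache of decrypted bytecode

    def install(self, name, encrypted):
        """Decrypt a downloaded plug-in and cache the result internally."""
        d, n = self._private_key
        self._decrypted[name] = bytes(pow(c, d, n) for c in encrypted)

    def has_plugin(self, name):
        """The image may ask whether a plug-in is loaded, not read its code."""
        return name in self._decrypted


vm = ToyVM()
plugin_bytecode = b"\x10\x20\x76\x7c"            # made-up bytes
encrypted = store_encrypt(vm.public_key, plugin_bytecode)
vm.install("loadflow", encrypted)
print(vm.has_plugin("loadflow"))                 # True
# Peeking at the private cache, for the demo only:
print(vm._decrypted["loadflow"] == plugin_bytecode)  # True
```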


(btw, I'm assuming CM bytecodes was a typo meant to be VM bytecodes? or
otherwise what is CM?)



Re: mixing open and closed source within Pharo

Nicolas Cellier
2012/6/3 Ben Coman <[hidden email]>:

> A necessary consequence of (2.) would be that you could not debug "through"
> that package.  While this might be unfortunate from an open-source debugging
> viewpoint, I would still consider it a step up compared to having the whole
> application delivered closed-source with the development tools stripped.
> Ignoring the current implementation of Debugger, could something like this
> be reasonably achievable?
>
> To expand on this with a specific use case... The VM could internally
> generate a public/private key pair.  When a plug-in is requested from an App
> Store, the public key is sent along, and the App Store uses it to encrypt
> the bytecode of the plug-in.  Once the plug-in is downloaded into the image,
> upon execution the VM receives the encrypted bytecode, decrypts it with its
> private key and caches the decrypted bytecode internally, such that it is
> never visible to the image.
>

Then what prevents an attacker from dumping process memory and retrieving
the bytecodes using well-known image structure patterns, or a custom
image tracer?

Nicolas



Re: mixing open and closed source within Pharo

Mariano Martinez Peck
In reply to this post by Ben Coman


> (btw, I'm assuming CM bytecodes was a typo meant to be VM bytecodes? or otherwise what is CM?)


CompiledMethod ;)
 

 

   



 





--
Mariano
http://marianopeck.wordpress.com


Re: mixing open and closed source within Pharo

Ben Coman
In reply to this post by Nicolas Cellier
Nicolas Cellier wrote:

> Then what prevents an attacker from dumping process memory and retrieving
> the bytecodes using well-known image structure patterns, or a custom
> image tracer?
>
> Nicolas
>
>  

Ultimately, at that level there is no security.  This only raises the
level of difficulty by requiring a certain conjunction of skill,
motivation and ethics.  It is no worse than the same attack targeted at
option (3.).  However, there is now some incentive for commercial
companies to invest the effort into releasing an App to execute on
top of an otherwise open-source platform.  The advantage is that if
ten plugins were required to do proprietary calculations on one dataset
and integrate the results into a single screen, then running within the
one image seems more elegant than ten images running on ten VMs.




Re: mixing open and closed source within Pharo

Marcus Denker-4

On Jun 3, 2012, at 5:26 PM, Ben Coman wrote:
> Ultimately, at that level there is no security.  This only raises the level of difficulty by requiring a certain conjunction of skill, motivation and ethics.  It is no worse than the same attack targeted at option (3.).  However, there is now some incentive for commercial companies to invest the effort into releasing an App to execute on top of an otherwise open-source platform.  The advantage is that if ten plugins were required to do proprietary calculations on one dataset and integrate the results into a single screen, then running within the one image seems more elegant than ten images running on ten VMs.



How do people do it with Java? Java .jar files are as easy to decompile as Smalltalk bytecode...

        Marcus

--
Marcus Denker -- http://marcusdenker.de



Re: mixing open and closed source within Pharo

Phil B
On Jun 3, 2012, at 11:31 AM, Marcus Denker wrote:

>
> How do people do it with Java? Java .jar files are as easy to decompile as Smalltalk bytecode...
>

ProGuard (http://proguard.sourceforge.net/) is a popular tool for this.  In its obfuscation phase it renames most class and method names (i.e. anything that doesn't need to be externally visible) to meaningless names, so MyClass.MyMethod becomes something like a.a.  It can also generate a mapping file so that when bug reports / crashes occur, the developer can map the gibberish names back to the original names.  Before the obfuscation phase, it has optimization and shrinking phases to eliminate code that doesn't need to be included for release (debugging code etc.).
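
To illustrate the renaming-plus-mapping idea only, here is a small Python sketch.  It is not how ProGuard is implemented: the function names are made up, and it uses one flat namespace, whereas a real obfuscator names classes and the members of each class separately.

```python
import itertools
import string


def short_names():
    """Yield a, b, ..., z, aa, ab, ... as replacement identifiers."""
    for length in itertools.count(1):
        for letters in itertools.product(string.ascii_lowercase, repeat=length):
            yield "".join(letters)


def build_mapping(names, keep=()):
    """Map each name to a short one; names listed in `keep` stay visible."""
    gen = short_names()
    return {name: (name if name in keep else next(gen)) for name in names}


def retrace(obfuscated, mapping):
    """Developer side: turn a dotted obfuscated name back into the original."""
    reverse = {short: original for original, short in mapping.items()}
    return ".".join(reverse.get(part, part) for part in obfuscated.split("."))


# Hypothetical plug-in names; only "Plugin" must stay externally visible.
mapping = build_mapping(
    ["PowerFlowSolver", "solve", "jacobian", "Plugin"], keep=("Plugin",)
)
print(mapping["PowerFlowSolver"], mapping["solve"])  # a b
print(retrace("a.b", mapping))                       # PowerFlowSolver.solve
```

The mapping dictionary plays the role of the mapping file: it stays with the developer, while only the short names ship.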


Thanks,
Phil

Re: mixing open and closed source within Pharo

Igor Stasenko
On 3 June 2012 18:01, Phil (list) <[hidden email]> wrote:
> ProGuard (http://proguard.sourceforge.net/) is a popular tool for this.  In its obfuscation phase it renames most class and method names (i.e. anything that doesn't need to be externally visible) to meaningless names, so MyClass.MyMethod becomes something like a.a.  It can also generate a mapping file so that when bug reports / crashes occur, the developer can map the gibberish names back to the original names.  Before the obfuscation phase, it has optimization and shrinking phases to eliminate code that doesn't need to be included for release (debugging code etc.).
>

So what prevents others from writing a de-obfuscating tool and then
reading nice sources?

No matter what you do, as long as you distribute executable code to
the masses, it can be reverse-engineered. You can only make it harder.
But there is a good way to prevent this from happening: stop selling
binary files, start selling real support for your software, ship
updates regularly, and make customers happy.
Then you will be immune to piracy, because nobody will think of
pirating your software: it is useless without your support.





--
Best regards,
Igor Stasenko.


Re: mixing open and closed source within Pharo

Phil B
Igor,

On Jun 4, 2012, at 1:55 AM, Igor Stasenko wrote:

> So what prevents others from writing a de-obfuscating tool and then
> reading nice sources?

De-obfuscating what?  The reverse mapping resides with the developer.  So a determined hacker can see that class 'a' has methods 'a', 'b', 'c' and has instance vars 'a', 'b', 'c', 'd', and 'e'. (Literally, that's what the names get turned into)  Granted, there are some clues left behind (a few classes/methods that need to be externally referenced and classes/methods that your code calls out to), but *your* code has essentially been turned into high-level undocumented assembly code that won't be that much more obvious than what can be produced by a good disassembler.  One weakness of ProGuard obfuscation is that it is fairly static in its mapping (though even if they randomized it a bit, it probably wouldn't be that hard to do a structure-diff of the code to figure out how the latest release mapped to an older release using a tool like Moose)... i.e. you get a bit of naming drift as the number of classes and methods change, but the order of mapping is static.

>
> No matter what you do, as long as you distribute executable code to
> the masses, it can be reverse-engineered. You can only make it harder.
> But there is a good way to prevent this from happening: stop selling
> binary files, start selling real support for your software, ship
> updates regularly, and make customers happy.
> Then you will be immune to piracy, because nobody will think of
> pirating your software: it is useless without your support.
>

I agree that it does nothing to prevent reverse-engineering by someone who's dedicated, and that getting support provides significant value.  The situation is no different with languages that emit true binary code: if it can be run on the user's machine, it can be cracked/reverse-engineered.  However, it does make the effort quite a bit more tedious without intention-revealing names.  I've decompiled several of my own obfuscated Java apps and come to the conclusion that if I were attempting to reverse-engineer my own stuff without the knowledge of why things ended up as they did, I'd have a much easier time just reimplementing it, as knowing what I did doesn't help me at all with the 'why'.  Sure, if one had a highly valued application, a group of like-minded individuals could figure it out in short order... but ah, what a problem to have :-)

Anyway, I was just trying to address the question of how Java apps address the issue.


Thanks,
Phil