[ANN] FFICHeaderExtractor first milestone (for early code reviewers) [WAS] Can OSProcess functionality be implemented using FFI instead of plugin?


Mariano Martinez Peck
Hi guys, 

OK, I have a first working version and so I wanted to share it with you.

I have not yet had time to start writing the docs, since I just finished the first pass on the code. Tomorrow I will start on them. But I thought some of you might be interested in taking a look even without formal docs (and some feedback/iteration may avoid rewriting the docs).

If you have no clue what I am talking about, then this summary is for you:

----------
When we use FFI to call a certain library, it is quite common that we need to pass certain constants as arguments (for example, SIGKILL to kill()). These constants are defined in C header files, and their values can even differ between platforms.
These constants are also sometimes defined by the C preprocessor, so there is no way to get their values through FFI. If you don't have the values of those constants, you cannot make the FFI call.
----------

I have tested the tool on OS X and CentOS using the latest Pharo 5.0. It won't work on Windows right now. As usual, all classes and methods have comments and there are enough tests.

In the end, I decided that the C program will output a very naive, Smalltalk-literal-array kind of thing. The tool then parses that output and directly creates an init method for that platform (compiled into the SharedPool class), which is then called automatically at startup (only if initialization is needed).

As for real examples, I started to write constants for libc: signal.h (to use kill()), wait.h (to use the wait() family), fcntl.h (to use ... xxx()), and errno.h. You can take a look at the package 'FFICHeaderExtractor-LibC'.

Note that to run the tests you need 'cc' findable on the PATH on OS X and 'gcc' on Unix.

To load the code into the latest Pharo 5.0, execute:

Metacello new
    baseline: 'FFICHeaderExtractor';
    repository: 'github://marianopeck/FFICHeaderExtractor:master/repository';
    load.

Any feedback is appreciated. 

I will start writing the doc now.

BTW: Big thanks to Eliot Miranda, who helped me by answering noob questions and providing useful code and guidelines.

Best,






On Sat, Jan 23, 2016 at 1:12 PM, Eliot Miranda <[hidden email]> wrote:
 
Hi Denis,

On Jan 23, 2016, at 7:30 AM, Denis Kudriashov <[hidden email]> wrote:


2016-01-22 22:35 GMT+01:00 Eliot Miranda <[hidden email]>:
Let's measure this.  Let's say we have 8 platforms (that's an underestimate, because different Linux distributions may have different values for certain constants), but 8, which is 4 basic platforms times 32- & 64-bits.  We have Mac x86 32-bit, Mac x64 64-bit, Windows x86 32-bit, Windows x64 64-bit, Linux x86 32-bit, Linux ARM 32-bit, Linux x64 64-bit, and soon enough there will be more.  Further, there may be different versions over time.

So each of those initialization methods has
- 1 slot for the global variable to be assigned
- 1 slot for the literal value to assign to it
- 3 bytes of bytecode per initialization for small methods, 4 for large methods.  Let's say 4.

So the overhead in 32-bits is 12 bytes per constant (two 4-byte slots plus 4 bytes of bytecode), and in 64-bits is 20 bytes (two 8-byte slots plus 4).  So the overhead per constant for all platforms is 96 bytes in 32-bits and 160 bytes in 64-bits.  A full system with sockets, files, a database connexion, etc. could easily exceed 100 constants.  I think it would be nearer 1000.  So the overheads are in the 10- to 100-kbyte range (100k ~= 0.5% of the image) on 32-bits.  That's low, but it's also pure overhead.  Every GC has to visit them.  Every senders and implementors search has to visit them, but they offer nothing of value.  Whereas the small parser for whatever notation is used to store the constants externally (if they are needed in a given deployment) has a small constant overhead; it's simple code.

Further, you still need the machinery to export the constants to be able to generate these initialization methods.  If you've got the machinery and you don't need the methods why bother to generate the methods?

As the Scots say, many a mickle makes a muckle.

Thanks, Eliot, for such a detailed explanation. It makes sense.
But personally I prefer the Smalltalk solution, although Smalltalk itself is pure overhead compared to C.

I can see the draw of pure Smalltalk: simplicity and browsability.  But imagine a tiny headless image deployed on containers, say 2 MB.  Now 100 KB of initialization code doesn't look so good :-).  Anyway, I'm beating a dead horse.  Mariano is generating initialization methods.


My question was raised by Mariano's idea of saving STON files in methods. I think it can reduce the problems you described.
But then literal array syntax could be more suitable than STON.

I just want to be clear: I'm neutral about the notation used to export info from the C file.  Literal array syntax, chunk source format, STON, XML.  It doesn't matter as long as it's convenient for expressing an attribute dictionary from names to attributes such as value, size, offset.  Don't get hung up on the specific notation.  If one were to go with the external file, the only real requirements are that it be reasonably compact and quick to parse.  That might kill XML but leave plenty of other candidates.


_,,,^..^,,,_ (phone)




--

Re: [ANN] FFICHeaderExtractor first milestone (for early code reviewers) [WAS] Can OSProcess functionality be implemented using FFI instead of plugin?

Max Leske
Yay!

On 25 Jan 2016, at 23:27, Mariano Martinez Peck <[hidden email]> wrote:



Re: [ANN] FFICHeaderExtractor first milestone (for early code reviewers) [WAS] Can OSProcess functionality be implemented using FFI instead of plugin?

stepharo
In reply to this post by Mariano Martinez Peck
super!

Stef




Re: [Vm-dev] re: FFICHeaderExtractor first milestone (for early code reviewers)

Mariano Martinez Peck
In reply to this post by Mariano Martinez Peck


On Tue, Jan 26, 2016 at 8:34 AM, Craig Latta <[hidden email]> wrote:


     Nice. I was writing little C programs to tell me various
constants... it sounds like this automates that and keeps the
interaction in Smalltalk.

Yeah, exactly. It automates that, and the result can be stored as "init" methods (one per platform) directly in the shared pools.
Also, it takes care of initializing them at startup (searching for the correct init method for the current platform).

Note that I called the project FFICHeaderExtractor and not FFICConstantsExtractor. If time allows, I would also like to get info from structs: sizeof and how they are defined internally. That would be yet another feature that lets more things be done via FFI. See https://github.com/marianopeck/FFICHeaderExtractor/issues/1  But that would require some more effort!

 


-C

--
Craig Latta
Black Page Digital
postbus 10784
1001ET Amsterdam, Netherlands
[hidden email]
+31 6 2757 7177 (SMS ok)
+ 1 415 287 3547 (no SMS)




--

Re: [ANN] FFICHeaderExtractor first milestone (for early code reviewers) [WAS] Can OSProcess functionality be implemented using FFI instead of plugin?

Damien Pollet
In reply to this post by stepharo
Hi Mariano, I just saw your comment in OSSUnixSystemAccessor >> getcwd and wondered if there's a feasible plan for errno? What if the VM provided a known accessor function around the global/macro/whatever errno is defined as?

On 26 January 2016 at 08:15, stepharo <[hidden email]> wrote:
super!

Stef






--
Damien Pollet
type less, do more [ | ] http://people.untyped.org/damien.pollet

Re: [ANN] FFICHeaderExtractor first milestone (for early code reviewers) [WAS] Can OSProcess functionality be implemented using FFI instead of plugin?

Mariano Martinez Peck


On Fri, Mar 18, 2016 at 10:06 AM, Damien Pollet <[hidden email]> wrote:
Hi Mariano, I just saw your comment in OSSUnixSystemAccessor >> getcwd and wondered if there's a feasible plan for errno? What if the VM provided a known accessor function around the global/macro/whatever errno is defined as?



Hi Damien,

Unfortunately, the problem described in that comment applies not only to getcwd() but to every function that writes to errno. I already sent this to the mailing list and it was discussed [1]. I have also chatted with Esteban about this, and Eliot said that the VW VM has something like this.

To conclude, we lack such support in the FFI backend. We hope to add it at some point. Maybe Ronie will have some time to play with this?



 



--