Understanding the role of the sources file


Understanding the role of the sources file

kilon.alios
I was adding a short description of the sources file to the UPBE. I always thought that the sources file contains the source code of the image, because the image file itself stores only the bytecode.

However, it just came to my attention that the sources file does not contain code that was recently installed in the image.

So how exactly does the sources file work, and what is it?

Re: Understanding the role of the sources file

demarey

On 13 Jan 2016, at 10:57, Dimitris Chloupis wrote:

> I was adding a short description of the sources file to the UPBE. I always thought that the sources file contains the source code of the image, because the image file itself stores only the bytecode.
>
> However, it just came to my attention that the sources file does not contain code that was recently installed in the image.
>
> So how exactly does the sources file work, and what is it?


If I understand correctly:
- the sources file holds all the source code of the released Pharo version (it is written when the new PharoVXX.sources file is generated).
- the changes file holds the source code added or modified since the release.


Re: Understanding the role of the sources file

Sven Van Caekenberghe-2
In reply to this post by kilon.alios


The main perspective is from the object point of view: methods are just objects like everything else. In order to be executable they know their byte codes (which might be JIT compiled on execution, but that is an implementation detail) and they know their source code.

Today we would probably just store the source code strings in the image (maybe compressed) as memory is pretty cheap. But way back when Smalltalk started, that was not the case. So they decided to map the source code out to files.

So method source code is a magic string (RemoteString) that points to some position in a file. There are 2 files in use: the sources file and the changes file.

The sources file is a kind of snapshot of the source code of all methods at the point of release of a major new version. That is why there is a Vxy in its name. The sources file never changes once created or renewed (a process called generating the sources, see PharoSourcesCondenser).

While developing and creating new versions of methods, the new source code is appended to another file called the changes file, much like a transaction log. This is also a safety mechanism to recover 'lost' changes.

The changes file can contain multiple versions of a method. This can be reduced in size using a process called condensing the changes, see PharoChangesCondenser.

On a new release, the changes file will be (almost) empty.
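To make the pointer idea above concrete, here is a minimal sketch in Python (not Pharo code). It assumes a much simplified layout: two in-memory text files and a hypothetical RemotePointer record standing in for RemoteString. The names SourceFiles, accept, source_of and condense_changes are invented for illustration; the real files also use a chunk format that is not modelled here.

```python
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class RemotePointer:
    """Hypothetical stand-in for RemoteString: which file, and where in it."""
    file_index: int  # 0 = sources file, 1 = changes file
    position: int    # character offset where the source text starts
    length: int      # number of characters to read


class SourceFiles:
    """Toy model: the 'image' keeps only pointers, the text lives in two files."""

    def __init__(self, released_methods):
        self.files = [io.StringIO(), io.StringIO()]  # [sources, changes]
        self.pointers = {}  # method name -> RemotePointer
        # Generating the sources: snapshot every method at release time.
        for name, source in released_methods.items():
            self.pointers[name] = self._append(0, source)

    def _append(self, file_index, text):
        f = self.files[file_index]
        position = f.seek(0, io.SEEK_END)  # always write at the end
        f.write(text)
        return RemotePointer(file_index, position, len(text))

    def accept(self, name, source):
        """A new method version is appended to the changes file, like a log entry."""
        self.pointers[name] = self._append(1, source)

    def source_of(self, name):
        """Follow the pointer to fetch the current source text."""
        p = self.pointers[name]
        f = self.files[p.file_index]
        f.seek(p.position)
        return f.read(p.length)

    def condense_changes(self):
        """Rewrite the changes file, keeping only the latest version of each method."""
        live = {n: self.source_of(n)
                for n, p in self.pointers.items() if p.file_index == 1}
        self.files[1] = io.StringIO()
        for n, s in live.items():
            self.pointers[n] = self._append(1, s)


files = SourceFiles({"Point>>x": "x ^ x", "Point>>y": "y ^ y"})
files.accept("Point>>x", "x ^ x * 1")   # first edit, logged in changes
files.accept("Point>>x", "x ^ x + 0")   # second edit, also logged
print(files.source_of("Point>>x"))
```

In this toy model the sources file is written once in the constructor and never touched again, while every accept grows the changes file until condense_changes drops the superseded versions.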

HTH,

Sven




Re: Understanding the role of the sources file

kilon.alios
So I am correct that the image does not store the source code, and that the source code is stored in the sources and changes files. The only difference is that the objects have a source variable that points to the right place for finding the source code.

This is the final text; if you find anything incorrect, please correct me.

---------------

1. The virtual machine (VM) is the only component that is different for each operating system. The purpose of the VM is to take Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed. Pharo 4 comes with the Cog VM, a very fast JIT VM. The VM executable is named:

• Pharo.exe for Windows;
• pharo for Linux; and
• Pharo for OSX (inside a package also named Pharo.app).

The other components below are portable across operating systems, and can be copied and run on any appropriate virtual machine.

2. The sources file contains source code for parts of Pharo that don't change frequently. The sources file is important because the image file format stores only the bytecode of live objects and not their source code. Typically a new sources file is generated once per major release of Pharo. For Pharo 4.0, this file is named PharoV40.sources.

3. The changes file is a log of all source code modifications since the .sources file was generated. This facilitates a per-method history for diffs or reverting. It means that even if you do not manage to save the image file before a crash, or you simply forgot to, you can recover your changes from this file. Each release provides a near-empty file named for the release, for example Pharo4.0.changes.

4. The image file provides a frozen-in-time snapshot of a running Pharo system. This is the file where the Pharo bytecode is stored, and as such it is a cross-platform format. This is the heart of Pharo, containing the live state of all objects in the system (including classes and methods, since they are objects too). The file is named for the release (like Pharo4.0.image).

The .image and .changes files provided by a Pharo release are the starting point for a live environment that you adapt to your needs. Essentially, the image file contains the compiler of the language (not the VM), the language parser, the IDE tools and many libraries, and acts a bit like a virtual operating system that runs on top of a virtual machine (VM), similarly to ISO files.

As you work in Pharo, these files are modified, so you need to make sure that they are writable. The .image and .changes files are intimately linked and should always be kept together, with matching base filenames. Never edit them directly with a text editor, as the .image file holds your live object runtime memory, which indexes into the .changes file for the source. It is a good idea to keep a backup copy of the downloaded .image and .changes files so you can always start from a fresh image and reload your code. However, the most efficient way to back up code is to use a version control system, which provides an easier and more powerful way to back up and track your changes.

The four main component files above can be placed in the same directory, although it's also possible to put the virtual machine and sources file in a separate directory where everyone has read-only access to them.

If more than one image file is present in the same directory, Pharo will prompt you to choose the image file you want to load.

Do whatever works best for your style of working and your operating system.






Re: Understanding the role of the sources file

Sven Van Caekenberghe-2
Sounds about right.

Now, I would swap 1 and 4, as the image is the most important abstraction.

There is also a bit too much emphasis on (byte|source)code. This is already pretty technical (it assumes you know what compilation is and so on). But I understand it must be explained here, and you did it well.

However, I would start by saying that the image is a snapshot of the object world in memory that is effectively a live Pharo system. It contains everything that is available and that exists in Pharo. This includes any objects that you created yourself, windows, browsers, open debuggers, executing processes, all meta objects as well as all representations of code.

<sidenote>
The fact that there is a sources and changes file is an implementation artefact, not something fundamental. There are ideas to change this in the future (but you do not have to mention that).
</sidenote>

Also, the VM not only executes code, it maintains the object world, which includes the ability to load and save it from and to an image. It creates a portable (cross platform) abstraction that isolates the image from the particular details of the underlying hardware and OS. In that role it implements the interface with the outside world. I would mention that second part before mentioning the code execution.

The sentence "The purpose of the VM is to take Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed." is not 100% correct. It is possible to execute the byte code without converting it. This is called interpretation. JIT is a faster technique that includes converting (some often used) byte code to machine code and caching that.
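As a rough illustration of the difference, here is a minimal Python sketch (not how Cog actually works). It assumes a made-up four-instruction bytecode set, straight-line code without jumps, and an arbitrary hotness threshold. A generated Python function stands in for machine code; the names interpret, translate and Runtime are all hypothetical.

```python
PUSH, ADD, MUL, RETURN = range(4)  # invented opcodes, not Pharo's bytecode set
HOT_THRESHOLD = 2                  # assumption: translate on the second run


def interpret(code):
    """Execute the bytecode directly, one instruction at a time."""
    stack, pc = [], 0
    while True:
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


def translate(code):
    """Convert the bytecode once into a Python function (our 'machine code')."""
    lines = ["def compiled():", "    stack = []"]
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            lines.append(f"    stack.append({code[pc + 1]!r})")
            pc += 2
        elif op == ADD:
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a + b)")
            pc += 1
        elif op == MUL:
            lines.append("    b = stack.pop(); a = stack.pop(); stack.append(a * b)")
            pc += 1
        elif op == RETURN:
            lines.append("    return stack.pop()")
            pc += 1
        else:
            raise ValueError(f"unknown opcode {op}")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["compiled"]


class Runtime:
    """Interprets cold code; translates and caches code that is run often."""

    def __init__(self):
        self.call_counts = {}
        self.code_cache = {}

    def run(self, code):
        key = tuple(code)
        if key in self.code_cache:          # already translated: reuse it
            return self.code_cache[key]()
        count = self.call_counts.get(key, 0) + 1
        self.call_counts[key] = count
        if count >= HOT_THRESHOLD:          # hot enough: translate for next time
            self.code_cache[key] = translate(code)
        return interpret(code)


program = [PUSH, 3, PUSH, 4, ADD, PUSH, 2, MUL, RETURN]  # (3 + 4) * 2
runtime = Runtime()
results = [runtime.run(program) for _ in range(3)]
```

The first two runs go through interpret; the third one finds the cached translation and skips the instruction-by-instruction dispatch.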

I hope this helps (it is hard to write a 'definitive explanation' as there are so many aspects to this and it depends on the context/audience).


Re: Understanding the role of the sources file

kilon.alios
I mentioned bytecode because I don't want the user to see bytecode at some point and say "What the hell is that?" I want the reader to feel confident that they understand at least the basics of Pharo. I have also seen very brief explanations of bytecode in similar Python tutorials. Obviously I don't want to go any deeper than that, because the user won't have to worry about the technical details on a daily basis anyway.

I agree that I could add a bit more to the VM description, similar to what you posted. I am curious though: won't even the interpreter generate machine code in order to execute the code, or does it use existing machine code inside the VM binary?


Re: Understanding the role of the sources file

Sven Van Caekenberghe-2

> On 13 Jan 2016, at 13:42, Dimitris Chloupis <[hidden email]> wrote:
>
> I mentioned bytecode because I don't want the user to see bytecode at some point and say "What the hell is that?" I want the reader to feel confident that they understand at least the basics of Pharo. I have also seen very brief explanations of bytecode in similar Python tutorials. Obviously I don't want to go any deeper than that, because the user won't have to worry about the technical details on a daily basis anyway.
>
> I agree that I could add a bit more to the VM description, similar to what you posted. I am curious though: won't even the interpreter generate machine code in order to execute the code, or does it use existing machine code inside the VM binary?

No, a classic interpreter does not 'generate' machine code. It is just a program that reads and executes bytecodes in a loop; the interpreter 'is' machine code.

No offence, but you see why I think it is important not to try to use or explain too many complex concepts in the first chapter.

Learning to program is hard. It should first be done abstractly. Think about Scratch. The whole idea of Smalltalk is to create a world of interacting objects. (Even bytecode is not a necessary concept at all: in Pharo, for example, you can compile (translate) to an AST and execute that, I believe. There are Smalltalk implementations that compile directly to C or JavaScript.) Hell, even 'compile' is not necessary, just 'accept'. See?
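To show that bytecode really is optional, here is a small Python sketch that executes a syntax tree directly. It assumes a toy tree with only integer literals and binary message sends; the Literal and Send classes and the evaluate function are invented for illustration and are not Pharo's AST classes.

```python
from dataclasses import dataclass


@dataclass
class Literal:
    value: int


@dataclass
class Send:
    """A binary message send: receiver selector argument."""
    receiver: object
    selector: str
    argument: object


def evaluate(node):
    """Walk the tree and compute the result; no bytecode is ever produced."""
    if isinstance(node, Literal):
        return node.value
    if isinstance(node, Send):
        receiver = evaluate(node.receiver)
        argument = evaluate(node.argument)
        if node.selector == "+":
            return receiver + argument
        if node.selector == "*":
            return receiver * argument
        raise ValueError(f"unknown selector {node.selector}")
    raise TypeError(f"unknown node {node!r}")


# The Smalltalk expression 3 + 4 * 2 parses left to right as (3 + 4) * 2.
tree = Send(Send(Literal(3), "+", Literal(4)), "*", Literal(2))
result = evaluate(tree)
```

The tree is both the result of "accepting" the expression and the thing that gets executed, which is the point of the remark above.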

> >
> > The changes file can contain multiple versions of a method. This can be reduced in size using a process called condensing the changes, see PharoChangesCondenser.
> >
> > On a new release, the changes file will be (almost) empty.
> >
> > HTH,
> >
> > Sven
> >
> >
> >
>
>
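
As a rough illustration of the pointer idea quoted above (a method's source is a reference to a position in either the sources or the changes file), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names `SourcePointer`, `SourceFiles`, `log_change` and `condense_changes` are hypothetical, the file layout is simplified, and it does not show how Pharo's RemoteString, PharoSourcesCondenser or PharoChangesCondenser are actually implemented.

```python
import os
import tempfile


class SourcePointer:
    """Hypothetical stand-in for a RemoteString: which file, where, how long."""

    def __init__(self, file_index, position, length):
        self.file_index = file_index
        self.position = position
        self.length = length


class SourceFiles:
    SOURCES, CHANGES = 0, 1

    def __init__(self, sources_path, changes_path):
        self.paths = [sources_path, changes_path]

    def _append(self, file_index, text):
        data = text.encode("utf-8")
        with open(self.paths[file_index], "ab") as f:
            position = f.seek(0, os.SEEK_END)  # offset where this entry starts
            f.write(data)
        return SourcePointer(file_index, position, len(data))

    def write_release_source(self, text):
        # Only done when the sources file for a release is generated.
        return self._append(self.SOURCES, text)

    def log_change(self, text):
        # Every accepted method version is appended, never overwritten.
        return self._append(self.CHANGES, text)

    def read(self, pointer):
        with open(self.paths[pointer.file_index], "rb") as f:
            f.seek(pointer.position)
            return f.read(pointer.length).decode("utf-8")

    def condense_changes(self, methods):
        """Rewrite the changes file, keeping only versions still pointed to."""
        live = {name: self.read(p) for name, p in methods.items()
                if p.file_index == self.CHANGES}
        open(self.paths[self.CHANGES], "wb").close()  # truncate the log
        for name, text in live.items():
            methods[name] = self.log_change(text)


def demo():
    folder = tempfile.mkdtemp()
    files = SourceFiles(os.path.join(folder, "demo.sources"),
                        os.path.join(folder, "demo.changes"))
    # One method shipped with the release, stored in the sources file.
    methods = {"greet": files.write_release_source("greet ^ 'hello'")}
    # Later edits go to the changes file; old versions stay in the log.
    methods["area"] = files.log_change("area ^ width * height")
    methods["greet"] = files.log_change("greet ^ 'hi'")
    methods["greet"] = files.log_change("greet ^ 'hey'")
    size_before = os.path.getsize(files.paths[SourceFiles.CHANGES])
    files.condense_changes(methods)
    size_after = os.path.getsize(files.paths[SourceFiles.CHANGES])
    return files, methods, size_before, size_after
```

Calling `demo()` leaves the sources file untouched, while the changes file shrinks after condensing because the superseded version of `greet` is dropped.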



Re: Understanding the role of the sources file

kilon.alios
I assume you have never read an introduction to C++ then :D

Here is the final addition for the VM:

(VM) is the only component that is different for each operating system. The main purpose of the VM is to take the Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed, but also to generally handle low-level functionality like interpreting code, handling OS events (mouse and keyboard), calling C libraries, etc. Pharo 4 comes with the Cog VM, a very fast JIT VM.

I think it's clear and precise, and does not leave much room for confusion. Personally, I think it's very important for the absolute beginner to have a strong foundation in the fundamentals of Pharo, and for things not to appear magical and "don't touch this".




Re: Understanding the role of the sources file

Sven Van Caekenberghe-2

> On 13 Jan 2016, at 14:22, Dimitris Chloupis <[hidden email]> wrote:
>
> I assume you have never read an introduction to C++ then :D

I have and they are too complex.

> Here is the final addition for the VM:
>
> (VM) is the only component that is different for each operating system. The main purpose of the VM is to take the Pharo bytecode that is generated each time the user accepts a piece of code and convert it to machine code in order to be executed, but also to generally handle low-level functionality like interpreting code, handling OS events (mouse and keyboard), calling C libraries, etc. Pharo 4 comes with the Cog VM, a very fast JIT VM.

You added more technical stuff. I tried to make my point, but you are writing it, not me, so I won't continue.

> I think it's clear and precise, and does not leave much room for confusion. Personally, I think it's very important for the absolute beginner to have a strong foundation in the fundamentals of Pharo, and for things not to appear magical and "don't touch this".

Nobody can understand everything at the same time; even experts work on partial abstractions while ignoring most other details.

A beginner has to be guided to the most important concepts first. Explaining something in a simple way is very hard (and I fail most of the time doing that).

Pharo offers curious minds unlimited opportunity to explore, while staying in the same system, but that does not mean everything should be mentioned immediately.




Re: Understanding the role of the sources file

kilon.alios
It's an open book; if people don't like it, it can always be removed. Just because I am the most active, that does not make me the boss. But yes, I do disagree that a simple explanation of what the VM really is should be omitted, because for me it is just plain bad documentation to mention something and not at least provide a basic explanation of it, which was the case before my additions.

But once again, it's a free world; you are more than welcome to contribute, delete, etc. I am making a pass on all chapters, but in the end this is not a one-man work ;)

On Wed, Jan 13, 2016 at 3:35 PM Sven Van Caekenberghe <[hidden email]> wrote:

> On 13 Jan 2016, at 14:22, Dimitris Chloupis <[hidden email]> wrote:
>
> I assume you have never read a an introduction to C++ then :D

I have and they are too complex.

> here is the final addition for the vm
>
> (Vm) is the only component that is different for each operating system. The main purpose of the VM is to take Pharo bytcode that is generated each time user accepts a piece of code and convert it to machine code in order to be executed, but also to generally handle low level functionality like interpreting code, handling OS events (mouse and keyboard), calling C libraries etc. Pharo 4 comes with the Cog VM a very fast JIT VM.

You added more technical stuff. I tried to make my point, but you are writing it, not me, so I won't continue.

> I think its clear, precise and does not leave much room for confusion. Personally I think its very important for the absolute begineer to have strong foundations of understanding the fundamental of Pharo and not for things to appear magical and "dont touch this".

Nobody can understand everything at the same time, even experts work on partial abstractions while ignoring most other details.

A beginner has to be guided to the most important concepts first. Explaining something in a simple way is very hard (and I fail most of the time doing that).

Pharo offers curious minds unlimited opportunity to explore, while staying in the same system, but that does not mean everything should be mentioned immediately.

> On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <[hidden email]> wrote:
>
> > On 13 Jan 2016, at 13:42, Dimitris Chloupis <[hidden email]> wrote:
> >
> > I mentioned bytecode because I dont want the user to see at some point bytecode and say "What the hell is that" I want the reader to feel confident that at least understands the basic in Pharo. Also very brief explanations about bytecode I have seen in similar python tutorials. Obviously I dont want to go any deeper than that because the user wont have to worry about the technical details on a daily basis anyway.
> >
> > I agree that I could add a bit more on the VM description similar to what you posted. I am curious though, wont even the interpreter generate machine code in order to execute the code  or does it use existing machine code inside the VM binary ?
>
> No, a classic interpreter does not 'generate' machine code, it is just a program that reads and executes bytes codes in a loop, the interpreter 'is' machine code.
>
> No offence, but you see why I think it is important to not try to use or explain too much complex concepts in the 1st chapter.
>
> Learning to program is hard. It should first be done abstractly. Think about Scratch. The whole idea of Smalltalk is to create a world of interacting objects. (Even byte code is not a necessary concept at all, for example, in Pharo, you can compile (translate) to AST and execute that, I believe. There are Smalltalk implementations that compile directly to C or JavaScript). Hell, even 'compile' is not necessary, just 'accept'. See ?
>
> > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe <[hidden email]> wrote:
> > Sounds about right.
> >
> > Now, I would swap 1 and 4, as the image is the most important abstraction.
> >
> > There is also a bit too much emphasis on (byte|source)code. This is already pretty technical (it assume you know what compilation is and so on). But I understand it must be explained here, and you did it well.
> >
> > However, I would start by saying that the image is a snapshot of the object world in memory that is effectively a live Pharo system. It contains everything that is available and that exists in Pharo. This includes any objects that you created yourself, windows, browsers, open debuggers, executing processes, all meta objects as well as all representations of code.
> >
> > <sidenote>
> > The fact that there is a sources and changes file is an implementation artefact, not something fundamental. There are ideas to change this in the future (but you do not have to mention that).
> > </sidenote>
> >
> > Also, the VM not only executes code, it maintains the object world, which includes the ability to load and save it from and to an image. It creates a portable (cross platform) abstraction that isolates the image from the particular details of the underlying hardware and OS. In that role it implements the interface with the outside world. I would mention that second part before mentioning the code execution.
> >
> > The sentence "The purpose of the VM is to take Pharo bytcode that is generated each time user accepts a piece of code and convert it to machine code in order to be executed." is not 100% correct. It is possible to execute the byte code without converting it. This is called interpretation. JIT is a faster technique that includes converting (some often used) byte code to machine code and caching that.
> >
> > I hope this helps (it is hard to write a 'definitive explanation' as there are some many aspects to this and it depends on the context/audience).
> >
> > > On 13 Jan 2016, at 12:58, Dimitris Chloupis <[hidden email]> wrote:
> > >
> > > So I am correct that the image does not store the source code, and that the source code is stored in sources and changes. The only diffirence is that the objects have a source variable that points to the right place for finding the source code.
> > >
> > > This is the final text if you find anything incorrect please correct me
> > >
> > > ---------------
> > >
> > > 1. The virtual machine (VM) is the only component that is different for each operating system. The purpose of the VM is to take Pharo bytcode that is generated each time user accepts a piece of code and convert it to machine code in order to be executed. Pharo 4 comes with the Cog VM a very fast JIT VM. The VM executable is named:
> > >
> > > • Pharo.exe for Windows; • pharo for Linux ; and
> > >
> > > • Pharo for OSX (inside a package also named Pharo.app).
> > > The other components below are portable across operating systems, and
> > >
> > > can be copied and run on any appropriate virtual machine.
> > >
> > > 2. The sources file contains source code for parts of Pharo that don’t change frequently. The sources file is important because the image file format stores only the bytecode of live objects and not their source code. Typically a new sources file is generated once per major release of Pharo. For Pharo 4.0, this file is named PharoV40.sources.
> > >
> > > 3. The changes file is a log of all source code modifications since the .sources file was generated. This facilitates a per-method history for diffs or reverting. That means that even if you don't manage to save the image file on a crash, or you just forgot, you can recover your changes from this file. Each release provides a near-empty file named for the release, for example Pharo4.0.changes.
> > >
> > > 4. The image file provides a frozen-in-time snapshot of a running Pharo system. This is the file where the Pharo bytecode is stored, and as such it's a cross-platform format. This is the heart of Pharo, containing the live state of all objects in the system (including classes and methods, since they are objects too). The file is named for the release (like Pharo4.0.image).
> > >
> > > The .image and .changes files provided by a Pharo release are the starting point for a live environment that you adapt to your needs. Essentially, the image file contains the compiler of the language (not the VM), the language parser, the IDE tools, and many libraries, and acts a bit like a virtual operating system that runs on top of a virtual machine (VM), similarly to ISO files.
> > >
> > > As you work in Pharo, these files are modified, so you need to make sure that they are writable. The .image and .changes files are intimately linked and should always be kept together, with matching base filenames. Never edit them directly with a text editor, as the .image file holds your live object runtime memory, which indexes into the .changes file for the source. It is a good idea to keep a backup copy of the downloaded .image and .changes files so you can always start from a fresh image and reload your code. However, the most efficient way to back up code is to use a version control system, which provides an easier and more powerful way to back up and track your changes.
> > >
> > > The four main component files above can be placed in the same directory, although it’s also possible to put the Virtual Machine and sources file in a separate directory where everyone has read-only access to them.
> > >
> > > If more than one image file is present in the same directory, Pharo will prompt you to choose which image file you want to load.
> > >
> > > Do whatever works best for your style of working and your operating system.
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Jan 13, 2016 at 12:13 PM Sven Van Caekenberghe <[hidden email]> wrote:
> > >
> > > > On 13 Jan 2016, at 10:57, Dimitris Chloupis <[hidden email]> wrote:
> > > >
> > > > I was adding a short description of the sources file to the UPBE. I always thought that the sources file is the file that contains the source code of the image, because the image file itself stores only the bytecode.
> > > >
> > > > However, it just came to my attention that the sources file does not contain code that was recently installed in the image.
> > > >
> > > > So how exactly does the sources file work, and what is it?
> > >
> > > The main perspective is from the object point of view: methods are just objects like everything else. In order to be executable they know their byte codes (which might be JIT compiled on execution, but that is an implementation detail) and they know their source code.
> > >
> > > Today we would probably just store the source code strings in the image (maybe compressed) as memory is pretty cheap. But way back when Smalltalk started, that was not the case. So they decided to map the source code out to files.
> > >
> > > So method source code is a magic string (RemoteString) that points to some position in a file. There are 2 files in use: the sources file and the changes file.
> > >
> > > The sources file is a kind of snapshot of the source code of all methods at the point of release of a major new version. That is why there is a Vxy in its name. The sources file never changes once created or renewed (a process called generating the sources, see PharoSourcesCondenser).
> > >
> > > While developing and creating new versions of methods, the new source code is appended to another file called the changes file, much like a transaction log. This is also a safety mechanism to recover 'lost' changes.
> > >
> > > The changes file can contain multiple versions of a method. This can be reduced in size using a process called condensing the changes, see PharoChangesCondenser.
> > >
> > > On a new release, the changes file will be (almost) empty.
> > >
> > > HTH,
> > >
> > > Sven
> > >
> > >
> > >
> >
> >
>
>



Re: Understanding the role of the sources file

David Allouche
In reply to this post by Sven Van Caekenberghe-2
Hey,

Since I just recently figured that stuff out, my perspective might be useful.

I do not think it is a good idea to push external VCS too much at this point. They are important to collaboration, so they should be mentioned, but they add a lot of complexity.

Taking into account the feedback from Sven, and my own ideas, here's how I would write it. I hope this helps.

  1. The virtual machine (VM) provides the environment where the Pharo system lives. It is different for each operating system and hardware architecture, and runs as a machine language executable in the operating system. It implements the details of managing memory, executing Pharo byte-code, and communicating with the world outside of the Pharo system: files, other operating system processes, and the network.
  2. The image is the state of all objects in a running Pharo system: classes, methods as source and byte-code, windows, VM processes. All of those are objects. The virtual machine can load image files, and save running images back to disk.
  3. The sources file is a way to save space by putting the source code of classes outside of the image. Since an image contains byte-compiled methods, it can run without its associated sources file. This file only provides information that is useful to the programmer. It is generated once per major release of Pharo, and is usually stored in the same directory as the image. Several images can use the same sources file.
  4. The changes file logs all source code changes since the generation of the sources file. It lets you examine and revert source code changes. It also serves as a journal so you can recover changes you made but did not save to the disk image, by mistake or because of a crash. An image does not need its changes file to run. Each changes file belongs to a single image.
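
To make points 3 and 4 a bit more concrete, here is a rough sketch in Python (not Smalltalk, and not how Pharo actually implements it). It assumes that a method keeps its byte-code in the image plus a small pointer (which file, offset, length) to its source text. The SourceFiles class, the pointer layout, and the Demo.* file names are all made up for illustration.

```python
import os
import tempfile

# Hypothetical pointer layout: (file_index, offset, length_in_bytes).
SOURCES, CHANGES = 0, 1


class SourceFiles:
    """A read-only sources file plus an append-only changes file."""

    def __init__(self, sources_path, changes_path):
        self.paths = {SOURCES: sources_path, CHANGES: changes_path}

    def append_change(self, text):
        """Log new source at the end of the changes file; return its pointer."""
        data = text.encode("utf-8")
        with open(self.paths[CHANGES], "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(data)
        return (CHANGES, offset, len(data))

    def read(self, pointer):
        """Fetch the source text that a pointer refers to."""
        file_index, offset, length = pointer
        with open(self.paths[file_index], "rb") as f:
            f.seek(offset)
            return f.read(length).decode("utf-8")


def demo():
    with tempfile.TemporaryDirectory() as d:
        sources = os.path.join(d, "Demo.sources")
        changes = os.path.join(d, "Demo.changes")
        released = "answer ^ 41".encode("utf-8")
        with open(sources, "wb") as f:
            f.write(released)          # snapshot made at release time
        open(changes, "wb").close()    # (almost) empty on a new release
        files = SourceFiles(sources, changes)

        # The "image" side: byte-code stays in memory, source is only a pointer.
        method = {"bytecode": b"\x20\x7c", "source": (SOURCES, 0, len(released))}
        before = files.read(method["source"])

        # Accepting a new version appends to the changes file and moves the
        # pointer; the sources file is never touched.
        method["source"] = files.append_change("answer ^ 42")
        after = files.read(method["source"])
        return before, after


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is only that the byte-code is enough to run the method, and that the old text stays in the sources file while new versions pile up in the changes file.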

NOTE: Some virtual machines run in web browsers, and are JavaScript programs instead of machine language. They provide the same services as other virtual machines; they just treat JavaScript as the machine language and the browser as the operating system.

The image, sources and changes files are portable across operating systems, and can be copied and run on any appropriate virtual machine. The format of the sources and changes file is defined by code stored in the image, not the VM.

A complete Pharo release contains the following files:

  • An executable virtual machine, named:
    • Pharo.exe for Windows;
    • pharo for Linux;
    • Pharo for OS X (inside a package named Pharo.app).
  • An image, named after the release: Pharo4.0.image
  • A sources file, named after the release: PharoV40.sources.
  • A nearly empty changes file, named after the image: Pharo4.0.changes

They are the starting point for a live environment that you adapt to your needs. The .image file contains the byte-compiler of the language, the language parser, the IDE tools, and many libraries, and provides a virtual operating system that runs on top of a virtual machine.

As you work in Pharo, the .image and .changes files are modified, so you need to make sure that they are writable. The .image and .changes files are intimately linked and should always be kept together, with matching filenames. Never edit them directly with a text editor, as the .image file holds your live object runtime memory, which indexes into the .changes file for the source.
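
If you want to check a directory against the rules above, something like the following Python sketch would do. It is not a tool shipped with Pharo; the function name check_image_pairs and the wording of the messages are my own assumptions. It only looks for a .changes file with the same base name as each .image file and checks that both are writable.

```python
import os
from pathlib import Path


def check_image_pairs(directory):
    """Return a list of (image_name, problem) for the images in `directory`."""
    problems = []
    for image in sorted(Path(directory).glob("*.image")):
        # Pharo4.0.image is expected to sit next to Pharo4.0.changes
        changes = image.with_suffix(".changes")
        if not changes.exists():
            problems.append((image.name, "missing " + changes.name))
            continue
        for f in (image, changes):
            if not os.access(f, os.W_OK):
                problems.append((image.name, f.name + " is not writable"))
    return problems


if __name__ == "__main__":
    for name, problem in check_image_pairs("."):
        print(name + ": " + problem)
```

An empty result means every image in the directory has its changes file next to it and both can be modified.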

It is a good idea to keep a backup copy of the downloaded .image and .changes files so you can always start from a fresh image and reload your code. But to manage changes on larger projects you should also use a version control system that will provide more control to record and communicate your changes.

The four main component files above can be placed in the same directory, although it’s also possible to put the Virtual Machine and sources file in a separate directory where everyone has read-only access to them.

If more than one image file is present in the same directory, Pharo will prompt you to choose which image file you want to load.

Do whatever works best for your style of working and your operating system.

Re: Understanding the role of the sources file

wernerk
In reply to this post by kilon.alios
Hi Dimitris,
your formulation "...Pharo bytecode...and convert it to machine code..."
is insofar irritating to me as "convert it to machine code" would
suggest to me that a compiler is at work here. David's "executing Pharo
byte-code" seems more understandable to me here.
werner

On 01/13/2016 02:22 PM, Dimitris Chloupis wrote:

> I assume you have never read an introduction to C++ then :D
>
> here is the final addition for the vm
>
> (VM) is the only component that is different for each operating system.
> The main purpose of the VM is to take Pharo bytecode that is generated
> each time the user accepts a piece of code and convert it to machine code
> in order to be executed, but also to generally handle low-level
> functionality like interpreting code, handling OS events (mouse and
> keyboard), calling C libraries etc. Pharo 4 comes with the Cog VM, a very
> fast JIT VM.
>
> I think it's clear, precise and does not leave much room for confusion.
> Personally I think it's very important for the absolute beginner to have
> a strong foundation in the fundamentals of Pharo, and not for
> things to appear magical and "don't touch this".
>
> On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <[hidden email]
> <mailto:[hidden email]>> wrote:
>
>
>      > On 13 Jan 2016, at 13:42, Dimitris Chloupis
>     <[hidden email] <mailto:[hidden email]>> wrote:
>      >
>      > I mentioned bytecode because I don't want the user to see
>     bytecode at some point and say "What the hell is that?" I want the
>     reader to feel confident that they at least understand the basics
>     of Pharo. I have also seen very brief explanations of bytecode in
>     similar Python tutorials. Obviously I don't want to go any deeper
>     than that, because the user won't have to worry about the technical
>     details on a daily basis anyway.
>      >
>      > I agree that I could add a bit more to the VM description, similar
>     to what you posted. I am curious though: won't even the interpreter
>     generate machine code in order to execute the code, or does it use
>     existing machine code inside the VM binary?
>
>     No, a classic interpreter does not 'generate' machine code, it is
>     just a program that reads and executes byte codes in a loop, the
>     interpreter 'is' machine code.
>
>     No offence, but you see why I think it is important to not try to
>     use or explain too many complex concepts in the 1st chapter.
>
>     Learning to program is hard. It should first be done abstractly.
>     Think about Scratch. The whole idea of Smalltalk is to create a
>     world of interacting objects. (Even byte code is not a necessary
>     concept at all, for example, in Pharo, you can compile (translate)
>     to AST and execute that, I believe. There are Smalltalk
>     implementations that compile directly to C or JavaScript). Hell,
>     even 'compile' is not necessary, just 'accept'. See ?
>
>      > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe
>     <[hidden email] <mailto:[hidden email]>> wrote:
>      > Sounds about right.
>      >
>      > Now, I would swap 1 and 4, as the image is the most important
>     abstraction.
>      >
>      > There is also a bit too much emphasis on (byte|source)code. This
>     is already pretty technical (it assumes you know what compilation is
>     and so on). But I understand it must be explained here, and you did
>     it well.
>      >
>      > However, I would start by saying that the image is a snapshot of
>     the object world in memory that is effectively a live Pharo system.
>     It contains everything that is available and that exists in Pharo.
>     This includes any objects that you created yourself, windows,
>     browsers, open debuggers, executing processes, all meta objects as
>     well as all representations of code.
>      >
>      > <sidenote>
>      > The fact that there is a sources and changes file is an
>     implementation artefact, not something fundamental. There are ideas
>     to change this in the future (but you do not have to mention that).
>      > </sidenote>
>      >
>      > Also, the VM not only executes code, it maintains the object
>     world, which includes the ability to load and save it from and to an
>     image. It creates a portable (cross platform) abstraction that
>     isolates the image from the particular details of the underlying
>     hardware and OS. In that role it implements the interface with the
>     outside world. I would mention that second part before mentioning
>     the code execution.
>      >
>      > The sentence "The purpose of the VM is to take Pharo bytecode that
>     is generated each time the user accepts a piece of code and convert it
>     to machine code in order to be executed." is not 100% correct. It is
>     possible to execute the byte code without converting it. This is
>     called interpretation. JIT is a faster technique that includes
>     converting (some often used) byte code to machine code and caching that.
>      >
>      > I hope this helps (it is hard to write a 'definitive explanation'
>     as there are so many aspects to this and it depends on the
>     context/audience).
>      >
>
>


Re: Understanding the role of the sources file

kilon.alios
"The virtual machine (VM) provides the environment where the Pharo system lives. It is different for each operating system and hardware architecture, and runs as a machine language executable in the operating system. It implements the details of managing memory, executing Pharo byte-code, and communicating with the world outside of the Pharo system: files, other operating system process, and the network."

No, the environment is the image; the VM is basically what its name says, a machine emulated by software. The vast majority of the tools, and even the language itself, reside in the image. The VM is there so that the code can execute and interface with the underlying operating system. You could completely change the VM, for example move it to the JVM, and the Pharo environment would still be intact.

"Hi Dimitris,
your formulation "...Pharo bytecode...and convert it to machine code..."
is insofar irritating to me as "convert it to machine code" would
suggest to me that a compiler is at work here. David's "executing Pharo
byte-code" seems more understandable to me here."

That's correct, it's a compiler, a byte compiler: it compiles bytecode to machine code, and it does so while the code executes. This is why it's called JIT, which stands for Just-In-Time compilation: the machine code is generated just before the code is executed, so several optimizations can be applied that would not be known before the execution of the code. It is similar to Java's JIT compiler.

Note here that a compiler is not just something that produces machine code; a compiler can, for example, take one language and compile it to another language.


Reply | Threaded
Open this post in threaded view
|

Re: Understanding the role of the sources file

Ben Coman
In reply to this post by kilon.alios
Oh! I should have guessed there'd be more alternative suggestions and held back.
Anyway, it's done now, so just pick out what you like...
Now I tend to think of the Image as more than just the .image file, so...

The heart of Pharo is the *Image*.  This holds the live running state
of a Pharo system. Tools like the IDE, language parser and interactive
playground run *inside* the Image to write code and create interacting
objects which comprise your application.  Saving the Image to a
".image" file creates a frozen-in-time snapshot of the running system
state. This can be moved to another system and resumed running as if
the Image had never stopped.  The Pharo download provides a
/Pharo.image/ file as the starting point for the live environment that
you adapt to your needs.

The Image runs on a Virtual Machine.  This abstracts away operating
system and CPU architecture differences to allow the same ".image"
file (the snapshot of your live system) to open and run on any supported
platform. The VM executable file is different for each platform:
* pharo for Linux and Unix
* Pharo.exe for Windows
* Pharo for OSX

Your source-code is compiled to byte-code in the Image to become part of
the live system state.  This occurs as soon as a modified method is
accepted.  Only the modified method is compiled, not the whole source
code, and the bytecode is immediately runnable.  This facilitates a
very fast edit-compile-run-debug loop at the root of Pharo's
productivity.   The VM optimally interprets or just-in-time compiles
the bytecode as appropriate for best performance.
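
The per-method compile step described above can be pictured with a small
Python toy. This is only a sketch under stated assumptions, not how Pharo is
implemented: ToyClass, accept and send are hypothetical names, and Python's
built-in compile/exec stand in for the Pharo compiler.

```python
# Toy model: a "class" keeps a dictionary of compiled methods.
# Accepting a method compiles and installs that one method only.

class ToyClass:
    def __init__(self, name):
        self.name = name
        self.methods = {}        # selector -> compiled function
        self.compile_count = 0   # number of compilations performed so far

    def accept(self, selector, source):
        """Compile one method's source and install it; other methods are untouched."""
        namespace = {}
        code = compile(source, f"<{self.name}>>{selector}>", "exec")
        exec(code, namespace)
        self.methods[selector] = namespace[selector]
        self.compile_count += 1

    def send(self, selector, *args):
        """Look up the compiled method by selector and run it."""
        return self.methods[selector](*args)


account = ToyClass("Account")
account.accept("double", "def double(x):\n    return x * 2\n")
account.accept("triple", "def triple(x):\n    return x * 3\n")
old_triple = account.methods["triple"]

# Re-accepting 'double' replaces only that entry, and it is runnable at once.
account.accept("double", "def double(x):\n    return x + x\n")
result = account.send("double", 4)
```

After the second accept of 'double', the compiled 'triple' object is still
the very same one, which is the point of compiling method by method.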

As an implementation detail carried over from when RAM was scarce, the
source-code is not stored in-Image.  It is spread across two files:

* Sources for parts of Pharo (i.e. the tools) that don’t change
frequently are stored in the "*.sources" file.   It is generated as a
static file per major release. Thus for Pharo 4.0 it is named
PharoV40.sources, and keeps that name when the image file name
changes.   The *.sources file is not essential for running the Image,
but is important because without it, you can't examine and learn from
the implementation of Pharo's tools (e.g. Collections, Graphics
libraries, IDE and compiler).

* All source code changes are logged, journal-style, to the "*.changes" file.
This facilitates easy access to per method history for diffs or
reverting.  Further, if you close or crash your image without saving,
an in-Image tool can replay your "lost" changes from this file. The
"*.changes" file is tightly coupled to the "*.image" file because it
records the source code changes for a particular Image.  The basename
of these two files must always be the same (e.g. my.image &
my.changes).  Each release provides a near-empty file, for example
Pharo4.changes to match Pharo4.image.

Typically the VM and .sources files are distributed/stored together,
since both are static files (per machine per Pharo Release) and both
can be shared between multiple Images; and the dynamic .image and
.changes files are distributed/stored together.
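
To make the two-file arrangement concrete, here is a rough Python sketch of
a method holding a pointer into either the sources or the changes file
instead of holding its source text. Everything in it is an assumption made
for illustration: SourcePointer, ToySourceFiles and the (file index, offset,
length) layout are hypothetical, the files are in-memory buffers, and the
real on-disk format differs.

```python
import io

SOURCES, CHANGES = 0, 1   # hypothetical file indices


class SourcePointer:
    """What a method keeps in place of its source text."""
    def __init__(self, file_index, offset, length):
        self.file_index = file_index
        self.offset = offset
        self.length = length


class ToySourceFiles:
    def __init__(self, sources_text):
        # The sources buffer is fixed at "release"; the changes buffer starts empty.
        self.files = [io.StringIO(sources_text), io.StringIO()]

    def append_change(self, source):
        """Append new source to the changes buffer and return a pointer to it."""
        changes = self.files[CHANGES]
        changes.seek(0, io.SEEK_END)
        offset = changes.tell()
        changes.write(source)
        return SourcePointer(CHANGES, offset, len(source))

    def source_at(self, pointer):
        """Follow a pointer back to the text it refers to."""
        f = self.files[pointer.file_index]
        f.seek(pointer.offset)
        return f.read(pointer.length)


released = "size\n    ^ 0"
files = ToySourceFiles(released)
released_ptr = SourcePointer(SOURCES, 0, len(released))

# Editing the method appends to the changes buffer; the sources buffer is untouched.
edited_ptr = files.append_change("size\n    ^ items size")
```

In this toy, the old pointer still resolves to the released text and the new
pointer resolves to the edited text, which mirrors why an image needs the
matching changes file to show the source of recently edited methods.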







On Wed, Jan 13, 2016 at 7:58 PM, Dimitris Chloupis
<[hidden email]> wrote:

> So I am correct that the image does not store the source code, and that the
> source code is stored in sources and changes. The only difference is that
> the objects have a source variable that points to the right place for
> finding the source code.
>
> This is the final text; if you find anything incorrect, please correct me.
>
> ---------------
>
> 1. The virtual machine (VM) is the only component that is different for each
> operating system. The purpose of the VM is to take the Pharo bytecode that is
> generated each time the user accepts a piece of code and convert it to machine
> code in order to be executed. Pharo 4 comes with the Cog VM, a very fast JIT
> VM. The VM executable is named:
>
> • Pharo.exe for Windows;
> • pharo for Linux; and
> • Pharo for OSX (inside a package also named Pharo.app).
>
> The other components below are portable across operating systems, and can be
> copied and run on any appropriate virtual machine.
>
> 2. The sources file contains source code for parts of Pharo that don’t
> change frequently. The sources file is important because the image file format
> stores only the bytecode of live objects and not their source code.
> Typically a new sources file is generated once per major release of Pharo.
> For Pharo 4.0, this file is named PharoV40.sources.
>
> 3. The changes file logs all source code modifications since the .sources
> file was generated. This facilitates a per-method history for diffs or
> reverting. That means that even if you don't manage to save the image file
> on a crash, or you just forgot, you can recover your changes from this file.
> Each release provides a near-empty file named for the release, for example
> Pharo4.0.changes.
>
> 4. The image file provides a frozen-in-time snapshot of a running Pharo
> system. This is the file where the Pharo bytecode is stored, and as such it's
> a cross-platform format. This is the heart of Pharo, containing the live
> state of all objects in the system (including classes and methods, since
> they are objects too). The file is named for the release (like
> Pharo4.0.image).
>
> The .image and .changes files provided by a Pharo release are the starting
> point for a live environment that you adapt to your needs. Essentially the
> image file contains the compiler of the language (not the VM), the
> language parser, the IDE tools and many libraries, and acts a bit like a
> virtual Operating System that runs on top of a Virtual Machine (VM),
> similarly to ISO files.

I don't get this similarity with ISO files.

>
> As you work in Pharo, these files are modified, so you need to make sure
> that they are writable. The .image and .changes files are intimately linked
> and should always be kept together, with matching base filenames. Never edit
> them directly with a text editor, as the .image file holds your live object
> runtime memory, which indexes into the .changes file for the source. It is a
> good idea to keep a backup copy of the downloaded .image and .changes files so
> you can always start from a fresh image and reload your code. However, the
> most efficient way of backing up code is to use a version control system
> that will provide an easier and more powerful way to back up and track your
> changes.
>
> The four main component files above can be placed in the same directory,
> although it’s also possible to put the Virtual Machine and sources file in a
> separate directory where everyone has read-only access to them.
>
> If more than one image file is present in the same directory, Pharo will
> prompt you to choose the image file you want to load.
>
> Do whatever works best for your style of working and your operating system.
>

Reply | Threaded
Open this post in threaded view
|

Re: Understanding the role of the sources file

kilon.alios
Great explanation, Ben, thanks for sharing; I agree 100%. ISO files are a popular file format used by other VMs, like VirtualBox. Unlike a Pharo image, they are not a memory dump but rather a disk dump, which is later loaded into memory, usually to emulate booting from a DVD or a hard drive.

https://en.wikipedia.org/wiki/ISO_image

Of course, because the Pharo image stores every live object, which means everything, the two formats are used in a very similar way.


Reply | Threaded
Open this post in threaded view
|

Re: Understanding the role of the sources file

Sven Van Caekenberghe-2
I like Ben's version too. (It helps to be a native English speaker as well ;-)

Please write more, Ben.



Reply | Threaded
Open this post in threaded view
|

Re: Understanding the role of the sources file

Offray
In reply to this post by David Allouche
Hi,

I like David's proposal. It strikes a good balance between newbie friendly and detailed enough foundations (you don't learn Peano's Axioms before learning to count).

Cheers,

Offray

On 13/01/16 09:27, David Allouche wrote:
Hey,

Since I just recently figured that stuff out, my perspective might be useful.

I do not think it is a good idea to push external VCS too much at this point. They are important to collaboration, so they should be mentioned, but they add a lot of complexity.

Taking into account the feedback from Sven, and my own ideas, here's how I would write it. I hope this helps.

  1. The virtual machine (VM) provides the environment where the Pharo system lives. It is different for each operating system and hardware architecture, and runs as a machine language executable in the operating system. It implements the details of managing memory, executing Pharo byte-code, and communicating with the world outside of the Pharo system: files, other operating system processes, and the network.
  2. The image is the state of all objects in a running Pharo system: classes, methods as source and byte-code, windows, VM processes. All of those are objects. The virtual machine can load image files, and save running images back to disk.
  3. The sources file is a way to save space by putting the source code of classes outside of the image. Since an image contains byte-compiled methods, it can run without its associated sources file. This file only provides information that is useful to the programmer. It is generated once per major release of Pharo, and is usually stored in the same directory as the image. Several images can use the same sources file.
  4. The changes file logs all source code changes since the generation of the sources file. It lets you examine and revert source code changes. It also serves as a journal so you can recover changes you made but did not save to the disk image, by mistake or because of a crash. An image does not need its changes file to run. Each changes file belongs to a single image.
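
The way items 3 and 4 fit together can be sketched in a few lines. This is only an illustration in Python, not Pharo code: `SourcePointer`, `append_source` and `read_source` are made-up names, and the real pointer encoding and on-disk format are not described in this thread, so the layout below is an assumption.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePointer:
    """Hypothetical stand-in for a method's remote source reference."""
    path: str    # which file holds the text (.sources or .changes)
    offset: int  # byte position where the text starts
    length: int  # byte length of the text (a simplification)


def append_source(path, text):
    """Append source text to a file and return a pointer to it."""
    data = text.encode("utf-8")
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    with open(path, "ab") as f:
        f.write(data)
    return SourcePointer(path, offset, len(data))


def read_source(pointer):
    """Fetch the source text that a pointer refers to."""
    with open(pointer.path, "rb") as f:
        f.seek(pointer.offset)
        return f.read(pointer.length).decode("utf-8")


def demo():
    workdir = tempfile.mkdtemp()
    sources = os.path.join(workdir, "DemoV10.sources")  # written once per release
    changes = os.path.join(workdir, "Demo.changes")     # appended to while you work

    # At release time the method's source lives in the sources file.
    released = append_source(sources, "greet ^ 'hello'")
    # Accepting a new version appends to the changes file; the sources
    # file is untouched and the old version stays readable.
    edited = append_source(changes, "greet ^ 'hello world'")
    return read_source(released), read_source(edited)
```

The point of the sketch is that the "image" side only needs to keep the small pointer, while the text itself stays on disk in one of the two files.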

NOTE: Some virtual machines run in web browsers, and are JavaScript programs instead of machine language executables. They provide the same services as other virtual machines; they just treat JavaScript as the machine language and the browser as the operating system.

The image, sources and changes files are portable across operating systems, and can be copied and run on any appropriate virtual machine. The format of the sources and changes file is defined by code stored in the image, not the VM.

A complete Pharo release contains the following files:

  • An executable virtual machine, named:
    • Pharo.exe for Windows;
    • pharo for Linux;
    • Pharo for OS X (inside a package named Pharo.app).
  • An image, named after the release: Pharo4.0.image
  • A sources file, named after the release: PharoV40.sources.
  • A nearly empty changes file, named after the image: Pharo4.0.changes

They are the starting point for a live environment that you adapt to your needs. The .image file contains the byte-compiler of the language, the language parser, the IDE tools, many libraries and provides a virtual operating system that runs on top of a virtual machine.

As you work in Pharo, the .image and .changes files are modified, so you need to make sure that they are writable. The .image and .changes files are intimately linked and should always be kept together, with matching filenames. Never edit them directly with a text editor, as the .image file holds your live object runtime memory, which indexes into the .changes file for the source.
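
A quick way to check the pairing rule from outside Pharo is sketched below. It is a hypothetical helper in Python (`check_image_pair` is not part of any Pharo tool) that only tests what the paragraph above states: the two files share a base name and are both writable.

```python
import os
import tempfile


def check_image_pair(image_path):
    """Return a list of problems found for an image and its changes file."""
    base, ext = os.path.splitext(image_path)
    if ext != ".image":
        return ["not an image file: " + image_path]
    problems = []
    # The changes file must sit next to the image with the same base name.
    for path in (image_path, base + ".changes"):
        if not os.path.isfile(path):
            problems.append("missing: " + path)
        elif not os.access(path, os.W_OK):
            problems.append("not writable: " + path)
    return problems


def demo():
    workdir = tempfile.mkdtemp()
    image = os.path.join(workdir, "my.image")
    open(image, "wb").close()
    before = check_image_pair(image)  # my.changes does not exist yet
    open(os.path.join(workdir, "my.changes"), "wb").close()
    after = check_image_pair(image)   # both files present and writable
    return before, after
```

An empty list means the pair looks usable; anything else is worth fixing before starting the image.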

It is a good idea to keep a backup copy of the downloaded .image and .changes files so you can always start from a fresh image and reload your code. But to manage changes on larger projects you should also use a version control system that will provide more control to record and communicate your changes.

The four main component files above can be placed in the same directory, although it’s also possible to put the Virtual Machine and sources file in a separate directory where everyone has read-only access to them.

If more than one image file is present in the same directory, Pharo will prompt you to choose which image file to load.

Do whatever works best for your style of working and your operating system.


Re: Understanding the role of the sources file

Offray
In reply to this post by Ben Coman
Yep, I think that Ben's version is even better than David's draft: more
balanced for a newbie while still informative enough.

Cheers,

Offray

On 13/01/16 10:30, Ben Coman wrote:

> Oh! I should have guessed there'd be more alternative suggestions and held back.
> Anyway, its done now, so just pick out what you like...
> Now I tend to think of the Image as more than just the .image file, so...
>
> The heart of Pharo is the *Image*.  This holds the live running state
> of a Pharo system. Tools like the IDE, language parser and interactive
> playground run *inside* the Image to write code and create interacting
> objects which comprise your application.  Saving the Image to a
> ".image" file creates a frozen-in-time snapshot of the running system
> state. This can be moved to another system and resumed running as if
> the Image had never stopped.  The Pharo download provides
> /Pharo.image/ file as the starting point for the live environment that
> you adapt to your needs.
>
> The Image runs on a Virtual Machine.  This abstracts away operating
> system and CPU architecture differences to allow the same ".image"
> file (the snapshot of you live system) to open and run on any support
> platform. The VM executable file is a different for each platform:
> * pharo for Linux and Unix
> * Pharo.exe for Windows
> * Pharo for OSX
>
> Your source-code is compiled to byte-code in Image to become part of
> the live system state.  This occurs as soon as a modified method is
> accepted.  Only the modified method is compiled, not the whole source
> code, and the bytecode is immediately runnable.  This facilitates a
> very fast edit-compile-run-debug loop at the root of Pharo's
> productivity.   The VM optimally interprets or just-in-time compiles
> the bytecode as appropriate for best performance.
>
> As an implementation detail carried over from when ram was scarce, the
> source-code is not stored in-Image.  It is spread across two files:
>
> * Sources for parts of Pharo (i.e. the tools) that don’t change
> frequently are stored in the "*.sources" files.   It is generated as a
> static file per major release. Thus for  Pharo 4.0 it is named
> PharoV40.sources, and keeps that name when the image file name
> changes.   The *.sources file is not essential for running the Image,
> but is important because without it, you can't examine and learn from
> the implementation of Pharo's tools (e.g Collections, Graphics
> libraries, IDE and compiler.)
>
> * All source code changes are journal logged to the "*.changes" file.
> This facilitates easy access to per method history for diffs or
> reverting.  Further, if you close or crash your image without saving,
> an in-Image tool can replay your "lost" changes from this file. The
> "*.changes" file is tightly coupled to the "*.image" file because it
> records the source code changes for a particular Image.  The basename
> of these two files must always be the same (e.g. my.image &
> my.changes).  Each release provides a near empty file, for example
> Pharo4.changes to match Pharo4.image.
>
> Typically the VM and .sources files are distributed/stored together,
> since both are static files (per machine per Pharo Release) and both
> can be shared between multiple Images; and the dynamic .image and
> .changes files are distributed/stored together.
>
>
>
>
>
> On Wed, Jan 13, 2016 at 7:58 PM, Dimitris Chloupis
> <[hidden email]> wrote:
>> So I am correct that the image does not store the source code, and that the
>> source code is stored in sources and changes. The only difference is that
>> the objects have a source variable that points to the right place for
>> finding the source code.
>>
>> This is the final text if you find anything incorrect please correct me
>>
>> ---------------
>>
>> 1. The virtual machine (VM) is the only component that is different for each
>> operating system. The purpose of the VM is to take Pharo bytecode that is
>> generated each time the user accepts a piece of code and convert it to machine
>> code in order to be executed. Pharo 4 comes with the Cog VM, a very fast JIT
>> VM. The VM executable is named:
>>
>> • Pharo.exe for Windows;
>> • pharo for Linux; and
>> • Pharo for OSX (inside a package also named Pharo.app).
>>
>> The other components below are portable across operating systems, and
>> can be copied and run on any appropriate virtual machine.
>>
>> 2. The sources file contains source code for parts of Pharo that don’t
>> change frequently. The sources file is important because the image file format
>> stores only the bytecode of live objects and not their source code.
>> Typically a new sources file is generated once per major release of Pharo.
>> For Pharo 4.0, this file is named PharoV40.sources.
>>
>> 3. The changes file logs all source code modifications since the .sources
>> file was generated. This facilitates a per-method history for diffs or
>> reverting. That means that even if you don't manage to save the image file on a
>> crash, or you just forgot, you can recover your changes from this file. Each
>> release provides a near empty file named for the release, for example
>> Pharo4.0.changes.
>>
>> 4. The image file provides a frozen-in-time snapshot of a running Pharo
>> system. This is the file where the Pharo bytecode is stored, and as such it's
>> a cross-platform format. This is the heart of Pharo, containing the live
>> state of all objects in the system (including classes and methods, since
>> they are objects too). The file is named for the release (like
>> Pharo4.0.image).
>>
>> The .image and .changes files provided by a Pharo release are the starting
>> point for a live environment that you adapt to your needs. Essentially the
>> image file contains the compiler of the language (not the VM), the
>> language parser, the IDE tools, many libraries and acts a bit like a virtual
>> Operating System that runs on top of a Virtual Machine (VM), similarly to
>> ISO files.
> I don't get this similarity with ISO files.
>
>> As you work in Pharo, these files are modified, so you need to make sure
>> that they are writable. The .image and .changes files are intimately linked
>> and should always be kept together, with matching base filenames. Never edit
>> them directly with a text editor, as the .image file holds your live object runtime
>> memory, which indexes into the .changes file for the source. It is a good
>> idea to keep a backup copy of the downloaded .image and .changes files so
>> you can always start from a fresh image and reload your code. However, the
>> most efficient way of backing up code is to use a version control system
>> that will provide an easier and more powerful way to back up and track your
>> changes.
>>
>> The four main component files above can be placed in the same directory,
>> although it’s also possible to put the Virtual Machine and sources file in a
>> separate directory where everyone has read-only access to them.
>>
>> If more than one image file is present in the same directory, Pharo will
>> prompt you to choose which image file to load.
>>
>> Do whatever works best for your style of working and your operating system.



Re: Understanding the role of the sources file

Ben Coman
In reply to this post by Sven Van Caekenberghe-2
On Wed, Jan 13, 2016 at 11:41 PM, Dimitris Chloupis
<[hidden email]> wrote:
> great explanation Ben thanks for sharing, I agree 100% .

On Wed, Jan 13, 2016 at 11:46 PM, Sven Van Caekenberghe <[hidden email]> wrote:
> I like Ben's version too. (It helps to be a native English speaker as well ;-)
> Please write more, Ben.

Thanks.  But damn, I need to correct my own grammar...
I've _underlined_ updates...

>> On Wed, Jan 13, 2016 at 5:31 PM Ben Coman <[hidden email]> wrote:
>> Oh! I should have guessed there'd be more alternative suggestions and held back.
>> Anyway, it's done now, so just pick out what you like...
>> Now I tend to think of the Image as more than just the .image file, so...
>>
>> The heart of Pharo is the *Image*.  This holds the live running state
>> of a Pharo system. Tools like the IDE, language parser and interactive
>> playground run *inside* the Image to _modify_ code and create interacting
>> objects which comprise your application.  Saving the Image to _an_
>> ".image" file creates a frozen-in-time snapshot of the running system
>> state. This can be moved to another system and resumed running as if
>> the Image had never stopped.  The Pharo _Release_ download provides _the_
>> /Pharo.image/ file as the starting point for the live environment that
>> you adapt to your needs.
>>
>> The Image runs on a Virtual Machine.  This abstracts away operating
>> system and CPU architecture differences to allow the same ".image"
>> file (the snapshot of your live system) to open and run on any _supported_
>> platform. The VM executable file is __ different for each platform:
>> * pharo for Linux and Unix
>> * Pharo.exe for Windows
>> * Pharo for OSX
>>
>> Your source-code is compiled to byte-code in Image to become part of
>> the live system state.  This occurs as soon as a modified method is
>> accepted.  Only the modified method is compiled, not the whole _system_ source
>> __, and the bytecode is immediately runnable.  This facilitates a
>> very fast edit-compile-run-debug loop at the root of Pharo's
>> productivity.   The VM optimally interprets or just-in-time compiles
>> the bytecode as appropriate for best performance.
>>
>> As an implementation detail carried over from when _RAM_ was scarce, the
>> source-code is not stored in-Image.  It is spread across two files:
>>
>> * Sources for parts of Pharo (i.e. the _in-Image_ tools) that don’t change
>> frequently are stored in the "*.sources" _file_.   It is generated as a
>> static file per major release. Thus for  Pharo 4.0 it is named
>> PharoV40.sources, and keeps that name (_while_ the .image file name
>> _can change_).   The *.sources file is not essential for running the Image,
>> but is important because without it, you can't examine and learn from
>> the implementation of Pharo's tools (e.g Collections, Graphics
>> libraries, IDE and compiler.)
>>
>> * All _changes to_ source code __ are journal logged to the "*.changes" file.
>> This facilitates easy access to per method history for diffs or
>> reverting.  Further, if you close or crash your image _before_ saving,
>> an in-Image tool can replay your "lost" changes from this file. The
>> "*.changes" file is tightly coupled to the "*.image" file because it
>> records the source code changes for a particular Image.  The basename
>> of these two files must always _match_  (e.g. my.image &
>> my.changes).  Each release provides a near empty file, for example
>> Pharo4.changes to match Pharo4.image.
>>
>> Typically the VM and .sources files are distributed/stored together,
>> since both are static files (per machine, per Pharo Release) and both
>> can be shared between multiple Images.  _The paired_ dynamic .image and
>> .changes files _are inherently_ distributed/stored together.  _The static and dynamic pairs can be stored in separate folders._
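
The journal behaviour of the changes file described above (every accepted version is appended, older versions stay available, and the log can later be condensed) can be sketched as follows. This is a toy model in Python using an in-memory list instead of a file; the function names are invented for illustration and this is not how PharoChangesCondenser is implemented.

```python
def record_change(log, selector, source):
    """Append a new version of a method to the journal."""
    log.append((selector, source))


def versions_of(log, selector):
    """Per-method history, oldest first."""
    return [source for name, source in log if name == selector]


def condense(log):
    """Keep only the latest version of each method."""
    latest = {}
    for selector, source in log:
        latest[selector] = source  # later entries overwrite earlier ones
    return list(latest.items())


def demo():
    log = []
    record_change(log, "greet", "greet ^ 'hi'")
    record_change(log, "farewell", "farewell ^ 'bye'")
    record_change(log, "greet", "greet ^ 'hello'")
    return versions_of(log, "greet"), condense(log)
```

Replaying "lost" changes after a crash amounts to reading the tail of such a log and accepting each entry again.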



Re: Understanding the role of the sources file

Ben Coman
In reply to this post by kilon.alios
> On Wed, Jan 13, 2016 at 4:58 PM Werner Kassens <[hidden email]> wrote:
>>
>> Hi Dimitris,
> >> your formulation "...Pharo bytecode...and convert it to machine code..."
> >> confuses me, insofar as "convert it to machine code" would
> >> suggest to me that a compiler is at work here. David's "executing Pharo
> >> byte-code" seems more understandable to me here.

On Wed, Jan 13, 2016 at 11:27 PM, Dimitris Chloupis
<[hidden email]> wrote:

> That's correct, it's a compiler: a byte compiler. It compiles bytecode to
> machine code, and it does so while the code executes. This is why it's called
> JIT, which stands for Just-In-Time compilation: machine code is compiled
> just before the code is executed, so several optimizations can be applied
> that would not be known before the execution of the code. This is similar
> to Java's JIT compiler.
>
> Note here that a compiler is not just something that produces machine code;
> a compiler can, for example, take one language and compile it to another
> language.

Indeed.  The OpalCompiler takes Smalltalk and produces bytecode.

However, I think Sven and Werner were referring to the fact that much Pharo
code is not JIT-compiled, but *merely* interpreted (IIUC).  See the section
"Not so smart questions" here...
https://clementbera.wordpress.com/2014/01/09/the-sista-chronicles-i-an-introduction-to-adaptive-recompilation/
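
To make the distinction concrete, here is a toy interpreter in the sense Sven describes in the quoted text below: a program that reads and executes byte codes in a loop without generating any machine code. It is written in Python, and the four opcodes are invented for the example; they are not Pharo's bytecode set.

```python
# Hypothetical opcodes for a tiny stack machine.
PUSH, ADD, MUL, RETURN = range(4)


def interpret(bytecodes):
    """Execute a byte code sequence and return the value left on the stack."""
    stack = []
    pc = 0  # index of the next byte code to execute
    while True:
        op = bytecodes[pc]
        if op == PUSH:
            stack.append(bytecodes[pc + 1])  # operand follows the opcode
            pc += 2
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
            pc += 1
        elif op == RETURN:
            return stack.pop()
        else:
            raise ValueError("unknown opcode: %r" % op)


# (3 + 4) * 5
program = [PUSH, 3, PUSH, 4, ADD, PUSH, 5, MUL, RETURN]
result = interpret(program)
```

A JIT would, roughly speaking, replace frequently executed byte code sequences with equivalent machine code and cache it; nothing in the loop above does that.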

cheers -ben

>>
>> On 01/13/2016 02:22 PM, Dimitris Chloupis wrote:
>> > I assume you have never read an introduction to C++ then :D
>> >
>> > here is the final addition for the vm
>> >
>> > The virtual machine (VM) is the only component that is different for each
>> > operating system. The main purpose of the VM is to take Pharo bytecode that is
>> > generated each time the user accepts a piece of code and convert it to machine
>> > code in order to be executed, but also to generally handle low-level
>> > functionality like interpreting code, handling OS events (mouse and
>> > keyboard), calling C libraries etc. Pharo 4 comes with the Cog VM, a very
>> > fast JIT VM.
>> >
>> > I think it's clear, precise and does not leave much room for confusion.
>> > Personally I think it's very important for the absolute beginner to have
>> > strong foundations in understanding the fundamentals of Pharo, and not for
>> > things to appear magical and "don't touch this".
>> >
>> > On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <[hidden email]
>> > <mailto:[hidden email]>> wrote:
>> >
>> >
>> >      > On 13 Jan 2016, at 13:42, Dimitris Chloupis
>> >     <[hidden email] <mailto:[hidden email]>> wrote:
>> >      >
>> >      > I mentioned bytecode because I don't want the user to see bytecode at
>> >     some point and say "What the hell is that?" I want the reader to
>> >     feel confident that they at least understand the basics of Pharo. Also,
>> >     I have seen very brief explanations of bytecode in similar Python
>> >     tutorials. Obviously I don't want to go any deeper than that, because
>> >     the user won't have to worry about the technical details on a daily
>> >     basis anyway.
>> >      >
>> >      > I agree that I could add a bit more to the VM description, similar
>> >     to what you posted. I am curious though: won't even the interpreter
>> >     generate machine code in order to execute the code, or does it use
>> >     existing machine code inside the VM binary?
>> >
>> >     No, a classic interpreter does not 'generate' machine code, it is
>> >     just a program that reads and executes byte codes in a loop, the
>> >     interpreter 'is' machine code.
>> >
>> >     No offence, but you see why I think it is important to not try to
>> >     use or explain too many complex concepts in the 1st chapter.
>> >
>> >     Learning to program is hard. It should first be done abstractly.
>> >     Think about Scratch. The whole idea of Smalltalk is to create a
>> >     world of interacting objects. (Even byte code is not a necessary
>> >     concept at all, for example, in Pharo, you can compile (translate)
>> >     to AST and execute that, I believe. There are Smalltalk
>> >     implementations that compile directly to C or JavaScript). Hell,
>> >     even 'compile' is not necessary, just 'accept'. See ?
>> >
>> >      > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe
>> >     <[hidden email] <mailto:[hidden email]>> wrote:
>> >      > Sounds about right.
>> >      >
>> >      > Now, I would swap 1 and 4, as the image is the most important
>> >     abstraction.
>> >      >
>> >      > There is also a bit too much emphasis on (byte|source)code. This
>> >     is already pretty technical (it assume you know what compilation is
>> >     and so on). But I understand it must be explained here, and you did
>> >     it well.
>> >      >
>> >      > However, I would start by saying that the image is a snapshot of
>> >     the object world in memory that is effectively a live Pharo system.
>> >     It contains everything that is available and that exists in Pharo.
>> >     This includes any objects that you created yourself, windows,
>> >     browsers, open debuggers, executing processes, all meta objects as
>> >     well as all representations of code.
>> >      >
>> >      > <sidenote>
>> >      > The fact that there is a sources and changes file is an
>> >     implementation artefact, not something fundamental. There are ideas
>> >     to change this in the future (but you do not have to mention that).
>> >      > </sidenote>
>> >      >
>> >      > Also, the VM not only executes code, it maintains the object
>> >     world, which includes the ability to load and save it from and to an
>> >     image. It creates a portable (cross platform) abstraction that
>> >     isolates the image from the particular details of the underlying
>> >     hardware and OS. In that role it implements the interface with the
>> >     outside world. I would mention that second part before mentioning
>> >     the code execution.
>> >      >
>> >      > The sentence "The purpose of the VM is to take Pharo bytecode that
>> >     is generated each time user accepts a piece of code and convert it
>> >     to machine code in order to be executed." is not 100% correct. It is
>> >     possible to execute the byte code without converting it. This is
>> >     called interpretation. JIT is a faster technique that includes
>> >     converting (some often used) byte code to machine code and caching
>> >     that.
>> >      >
>> >      > I hope this helps (it is hard to write a 'definitive explanation'
>> >     as there are so many aspects to this and it depends on the
>> >     context/audience).
>> >      >
>> >      > > On 13 Jan 2016, at 12:58, Dimitris Chloupis
>> >     <[hidden email] <mailto:[hidden email]>> wrote:
>> >      > >
>> >      > > So I am correct that the image does not store the source code,
>> >     and that the source code is stored in sources and changes. The only
>> >     difference is that the objects have a source variable that points to
>> >     the right place for finding the source code.
>> >      > >
>> >      > > This is the final text if you find anything incorrect please
>> >     correct me
>> >      > >
>> >      > > ---------------
>> >      > >
>> >      > > 1. The virtual machine (VM) is the only component that is
>> >     different for each operating system. The purpose of the VM is to
>> >     take Pharo bytecode that is generated each time the user accepts a
>> >     piece of code and convert it to machine code in order to be executed.
>> >     Pharo 4 comes with the Cog VM, a very fast JIT VM. The VM executable
>> >     is named:
>> >      > >
>> >      > > • Pharo.exe for Windows;
>> >      > > • pharo for Linux; and
>> >      > > • Pharo for OSX (inside a package also named Pharo.app).
>> >      > >
>> >      > > The other components below are portable across operating
>> >     systems, and can be copied and run on any appropriate virtual
>> >     machine.
>> >      > >
>> >      > > 2. The sources file contains source code for parts of Pharo
>> >     that don’t change frequently. The sources file is important because
>> >     the image file format stores only the bytecode of live objects and
>> >     not their source code. Typically a new sources file is generated once
>> >     per major release of Pharo. For Pharo 4.0, this file is named
>> >     PharoV40.sources.
>> >      > >
>> >      > > 3. The changes file logs all source code modifications since
>> >     the .sources file was generated. This facilitates a per-method
>> >     history for diffs or reverting. That means that even if you don't
>> >     manage to save the image file on a crash, or you just forgot, you can
>> >     recover your changes from this file. Each release provides a near
>> >     empty file named for the release, for example Pharo4.0.changes.
>> >      > >
>> >      > > 4. The image file provides a frozen-in-time snapshot of a
>> >     running Pharo system. This is the file where the Pharo bytecode is
>> >     stored, and as such it's a cross-platform format. This is the heart
>> >     of Pharo, containing the live state of all objects in the system
>> >     (including classes and methods, since they are objects too). The
>> >     file is named for the release (like Pharo4.0.image).
>> >      > >
>> >      > > The .image and .changes files provided by a Pharo release are
>> >     the starting point for a live environment that you adapt to your
>> >     needs. Essentially the image file contains the compiler of the
>> >     language (not the VM), the language parser, the IDE tools, and many
>> >     libraries, and acts a bit like a virtual Operating System that runs
>> >     on top of a Virtual Machine (VM), similarly to ISO files.
>> >      > >
>> >      > > As you work in Pharo, these files are modified, so you need to
>> >     make sure that they are writable. The .image and .changes files are
>> >     intimately linked and should always be kept together, with matching
>> >     base filenames. Never edit them directly with a text editor, as
>> >     the .image holds your live object runtime memory, which indexes into
>> >     the .changes file for the source. It is a good idea to keep a
>> >     backup copy of the downloaded .image and .changes files so you can
>> >     always start from a fresh image and reload your code. However, the
>> >     most efficient way of backing up code is to use a version control
>> >     system, which provides an easier and more powerful way to back up
>> >     and track your changes.
>> >      > >
>> >      > > The four main component files above can be placed in the same
>> >     directory, although it’s also possible to put the Virtual Machine
>> >     and sources file in a separate directory where everyone has
>> >     read-only access to them.
>> >      > >
>> >      > > If more than one image file is present in the same directory,
>> >     Pharo will prompt you to choose the image file you want to load.
>> >      > >
>> >      > > Do whatever works best for your style of working and your
>> >     operating system.
>> >      > >
>> >      > >
>> >      > >
>> >      > >
>> >      > >
>> >      > > On Wed, Jan 13, 2016 at 12:13 PM Sven Van Caekenberghe
>> >     <[hidden email] <mailto:[hidden email]>> wrote:
>> >      > >
>> >      > > > On 13 Jan 2016, at 10:57, Dimitris Chloupis
>> >     <[hidden email] <mailto:[hidden email]>> wrote:
>> >      > > >
>> >      > > > I was adding a short description to the UPBE about sources
>> >     file , I always thought that the sources file is the file that
>> >     contains the source code of the image because the image file itself
>> >     stores only the bytecode.
>> >      > > >
>> >      > > > However, it just came to my attention that the sources file
>> >     does not contain code that is recently installed in the image.
>> >      > > >
>> >      > > > So how exactly does the sources file work, and what is it?
>> >      > >
>> >      > > The main perspective is from the object point of view: methods
>> >     are just objects like everything else. In order to be executable
>> >     they know their byte codes (which might be JIT compiled on
>> >     execution, but that is an implementation detail) and they know their
>> >     source code.
>> >      > >
>> >      > > Today we would probably just store the source code strings in
>> >     the image (maybe compressed) as memory is pretty cheap. But way back
>> >     when Smalltalk started, that was not the case. So they decided to
>> >     map the source code out to files.
>> >      > >
>> >      > > So method source code is a magic string (RemoteString) that
>> >     points to some position in a file. There are 2 files in use: the
>> >     sources file and the changes file.
>> >      > >
>> >      > > The sources file is a kind of snapshot of the source code of
>> >     all methods at the point of release of a major new version. That is
>> >     why there is a Vxy in its name. The sources file never changes once
>> >     created or renewed (a process called generating the sources, see
>> >     PharoSourcesCondenser).
>> >      > >
>> >      > > While developing and creating new versions of methods, the new
>> >     source code is appended to another file called the changes file,
>> >     much like a transaction log. This is also a safety mechanism to
>> >     recover 'lost' changes.
>> >      > >
>> >      > > The changes file can contain multiple versions of a method.
>> >     This can be reduced in size using a process called condensing the
>> >     changes, see PharoChangesCondenser.
>> >      > >
>> >      > > On a new release, the changes file will be (almost) empty.
>> >      > >
>> >      > > HTH,
>> >      > >
>> >      > > Sven
>> >      > >
>> >      > >
>> >      > >
>> >      >
>> >      >
>> >
>> >
>>
>
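
For anyone reading the quoted explanation above, here is a minimal sketch in Python (not Pharo) of the idea Sven describes: a method keeps a pointer into one of two files, new versions are appended to the changes file, and condensing drops superseded versions. The class name `SourceFiles`, the `(file_index, position)` pointer tuple, and the single-line `!` chunk terminator are all assumptions made for illustration. They are not the real RemoteString, PharoChangesCondenser, or on-disk chunk format.

```python
import io


class SourceFiles:
    """Toy model: index 0 is the read-only sources file,
    index 1 is the append-only changes file (both held in memory here)."""

    def __init__(self, sources_text=""):
        self.files = [io.StringIO(sources_text), io.StringIO()]

    def append_change(self, source):
        """Append a new method version to the changes file and
        return a pointer (file_index, position) to it."""
        changes = self.files[1]
        changes.seek(0, io.SEEK_END)
        position = changes.tell()
        # Hypothetical chunk terminator: a line containing only "!".
        changes.write(source + "\n!\n")
        return (1, position)

    def source_at(self, pointer):
        """Follow a pointer and read the source text up to the terminator."""
        file_index, position = pointer
        f = self.files[file_index]
        f.seek(position)
        lines = []
        for line in f:
            if line == "!\n":
                break
            lines.append(line)
        return "".join(lines).rstrip("\n")

    def condense_changes(self, methods):
        """Rewrite the changes file so it keeps only the versions that
        the given methods (name -> pointer) currently refer to."""
        live = {name: self.source_at(ptr)
                for name, ptr in methods.items() if ptr[0] == 1}
        self.files[1] = io.StringIO()
        for name, source in live.items():
            methods[name] = self.append_change(source)


# One method lives in the sources file, another is edited twice.
files = SourceFiles("bar\n    ^ 0\n!\n")
methods = {"bar": (0, 0)}
methods["foo"] = files.append_change("foo\n    ^ 1")
methods["foo"] = files.append_change("foo\n    ^ 2")
size_before = len(files.files[1].getvalue())
files.condense_changes(methods)
size_after = len(files.files[1].getvalue())
```

After condensing, `methods["foo"]` still resolves to the latest version, `methods["bar"]` still points into the untouched sources file, and the changes file is smaller because the first version of `foo` is gone.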
