Issue 99 in cog: Link LZ4 Compression


Issue 99 in cog: Link LZ4 Compression

cog
 
Status: Accepted
Owner: [hidden email]
Labels: Type-Defect Priority-Medium

New issue 99 by [hidden email]: Link  LZ4 Compression
http://code.google.com/p/cog/issues/detail?id=99

We should build our VM with lz4 support:

https://code.google.com/p/lz4/

Ideally we will compress all internal unused/static data.
Another application would be to compress all the fuel-ized data as well.
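[Editor's note: the "compress the fuel-ized data" idea can be sketched outside Smalltalk. Below is a minimal Python analogy, with pickle standing in for Fuel and stdlib zlib standing in for LZ4, since neither Fuel nor an lz4 binding is assumed here; the names `freeze`/`thaw` are hypothetical.]

```python
import pickle
import zlib

def freeze(obj):
    """Serialize an object, then compress the byte stream (Fuel + LZ4 analogy)."""
    return zlib.compress(pickle.dumps(obj))

def thaw(blob):
    """Decompress, then deserialize, restoring an equal object."""
    return pickle.loads(zlib.decompress(blob))

# Repetitive, rarely-touched data is the intended target and compresses well.
data = {"selectors": ["at:put:", "do:", "printString"] * 500}
blob = freeze(data)
assert thaw(blob) == data
print(len(pickle.dumps(data)), "->", len(blob), "bytes")
```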


Re: Issue 99 in cog: Link LZ4 Compression

David T. Lewis
 
On Fri, Oct 12, 2012 at 11:21:33AM +0000, [hidden email] wrote:

>
> Status: Accepted
> Owner: [hidden email]
> Labels: Type-Defect Priority-Medium
>
> New issue 99 by [hidden email]: Link  LZ4 Compression
> http://code.google.com/p/cog/issues/detail?id=99
>
> We should build our VM with lz4 support:
>
> https://code.google.com/p/lz4/
>
> Ideally we will compress all internal unused/static data.
> Another application would be to compress all the fuel-ized data as well.

I do not understand what this means. Is it a request for someone to
write a plugin? What is "internal unused/static data"?

confused,
Dave


Re: Issue 99 in cog: Link LZ4 Compression

Camillo Bruni-3
 
On 2012-10-12, at 15:02, David T. Lewis <[hidden email]> wrote:

> On Fri, Oct 12, 2012 at 11:21:33AM +0000, [hidden email] wrote:
>>
>> Status: Accepted
>> Owner: [hidden email]
>> Labels: Type-Defect Priority-Medium
>>
>> New issue 99 by [hidden email]: Link  LZ4 Compression
>> http://code.google.com/p/cog/issues/detail?id=99
>>
>> We should build our VM with lz4 support:
>>
>> https://code.google.com/p/lz4/
>>
>> Ideally we will compress all internal unused/static data.
>> Another application would be to compress all the fuel-ized data as well.
>
> I do not understand what this means. Is it a request for someone to
> write a plugin?
Link cog with the lz4 as a first step to write bindings with FFI/NativeBoost.

> What is "internal unused/static data"?
I was referring to image-side data, sorry my bad ;) But basically everything
that is not needed directly in the image could be serialized and compressed.

Plus, with a super-fast compression library at hand, decompression would
essentially be a NOP.

Re: Issue 99 in cog: Link LZ4 Compression

cog
In reply to this post by cog
 

Comment #1 on issue 99 by [hidden email]: Link  LZ4 Compression
http://code.google.com/p/cog/issues/detail?id=99

unused/static data => referring to objects in the image.


Re: Issue 99 in cog: Link LZ4 Compression

Igor Stasenko
 
sounds similar to #hibernate/#unhibernate
which we already have for forms.




--
Best regards,
Igor Stasenko.

Re: Issue 99 in cog: Link LZ4 Compression

Levente Uzonyi-2
In reply to this post by Camillo Bruni-3
 
On Fri, 12 Oct 2012, Camillo Bruni wrote:

>
> On 2012-10-12, at 15:02, David T. Lewis <[hidden email]> wrote:
>> On Fri, Oct 12, 2012 at 11:21:33AM +0000, [hidden email] wrote:
>>>
>>> Status: Accepted
>>> Owner: [hidden email]
>>> Labels: Type-Defect Priority-Medium
>>>
>>> New issue 99 by [hidden email]: Link  LZ4 Compression
>>> http://code.google.com/p/cog/issues/detail?id=99
>>>
>>> We should build our VM with lz4 support:
>>>
>>> https://code.google.com/p/lz4/
>>>
>>> Ideally we will compress all internal unused/static data.
>>> Another application would be to compress all the fuel-ized data as well.
>>
>> I do not understand what this means. Is it a request for someone to
>> write a plugin?
> Link cog with the lz4 as a first step to write bindings with FFI/NativeBoost.

An external plugin sounds a lot better to me. The larger the VM binary is,
the slower it will be on today's CPUs.

>
>> What is "internal unused/static data"?
> I was referring to image-side data, sorry my bad ;) But basically everything
> that is not needed directly in the image could be serialized and compressed.

Make it work, make it right, make it fast. You don't have the system yet,
but you want to make it fast already?
When the system is ready (so you can test whether compression makes sense),
try the compression via FFI (pick your favorite implementation), and
if it seems to give enough benefit, and the system is about to be
used by a wide enough audience, then (and only then) consider adding it to
the VM.

>
> Plus by having a super-fast compression library at hand decompression would
> essentially be a NOP.
>

I didn't see any benchmarks where (de)compression is done on small chunks
of data (a few kilobytes at most - which is your intended use case). And
even though the (de)compression might not make much difference in runtime,
it definitely will give higher CPU usage, which is unwelcome in some cases
(e.g. mobile devices). It might result in lower overall CPU usage too, but
the ~2x compression ratio makes me think that it's unlikely.
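[Editor's note: the small-chunk question is easy to measure. Here is a sketch of such a benchmark in Python, timing (de)compression of a ~4 KB payload; it uses stdlib zlib as a stand-in, so the absolute numbers say nothing about LZ4 itself, only what shape the measurement takes.]

```python
import time
import zlib

# A ~4 KB chunk of repetitive, source-like text.
payload = (b"Transcript showCr: 'hello world'. " * 128)[:4096]

def bench(fn, arg, n=5000):
    """Return average wall-clock seconds per call of fn(arg) over n runs."""
    start = time.perf_counter()
    for _ in range(n):
        fn(arg)
    return (time.perf_counter() - start) / n

compressed = zlib.compress(payload)
print("ratio: %.2f" % (len(payload) / len(compressed)))
print("compress:   %.1f us/chunk" % (bench(zlib.compress, payload) * 1e6))
print("decompress: %.1f us/chunk" % (bench(zlib.decompress, compressed) * 1e6))
```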


Levente

Re: Issue 99 in cog: Link LZ4 Compression

Camillo Bruni-3

>>> I do not understand what this means. Is it a request for someone to
>>> write a plugin?
>> Link cog with the lz4 as a first step to write bindings with FFI/NativeBoost.
>
> An external plugin sounds a lot better to me.
replace "link" with "bundle".

> The larger the VM binary is, the slower it will be on today's CPUs.

how come?

> Make it work, make it right, make it fast. You don't have the system yet, but you want to make it fast already?

I agree 100% if you implement everything from scratch by yourself.
But in this case it's relying on an external project, which will give me speedup for free ;).


>> Plus by having a super-fast compression library at hand decompression would
>> essentially be a NOP.
>>
>
> I didn't see any benchmarks where (de)compression is done on small chunks of data (a few kilobytes at most - which is your intended use case).

see [1]

> And even though the (de)compression might not make much difference in runtime, it definitely will give higher CPU usage, which is unwelcome in some cases (e.g. mobile devices).

well, it runs on multiple cores, and Cog runs on a single core, so wasting some
CPU cycles on the otherwise unused cores won't harm that much.

For mobile devices you might simply not want the image to swap, hence you will pay
a lot of attention to making sure it stays small. So yes, in this case you won't rely
directly on such a feature.

However, swapping out unused parts of the system and reloading them is still interesting
on such a "limited" platform [1]. And in that case you definitely don't want to waste
cycles on loading the data, so compression in memory becomes interesting again.

> It might result in lower overall CPU usage too, but the ~2 compression ration makes me think that it's unlikely.


If it makes swapping out data cheaper, that's a win. But you don't
show me benchmarks either ;)


[1] http://rmod.lille.inria.fr/archives/papers/Mart11c-COMLAN-ObjectSwapping.pdf


Re: Issue 99 in cog: Link LZ4 Compression

Levente Uzonyi-2
 
On Sat, 13 Oct 2012, Camillo Bruni wrote:

>
>>>> I do not understand what this means. Is it a request for someone to
>>>> write a plugin?
>>> Link cog with the lz4 as a first step to write bindings with FFI/NativeBoost.
>>
>> An external plugin sounds a lot better to me.
> replace "link" with "bundle".
>
>> The larger the VM binary is, the slower it will be on today's CPUs.
>
> how come?

It's because of the cache hierarchy. You don't have control over what will
be where in a binary, or what sizes and levels of cache a CPU has. The
smaller the binary is, the higher the chance that the part of the VM which
needs to be used right now is in the CPU cache. I wrote about this a few
years ago. I found that building all non-essential plugins as external
gives ~4-5% better performance (using the Interpreter).

>
>> Make it work, make it right, make it fast. You don't have the system yet, but you want to make it fast already?
>
> I agree 100% if you implement everything from scratch by yourself.
> But in this case it's relying on an external project, which will give me speedup for free ;).

So you have the system ready to be tested with compression via FFI to find
out if it makes sense to use it at all or not. There's still no reason to
add extra code to the VM yet, because no one knows if it's worth it or not.

>
>
>>> Plus by having a super-fast compression library at hand decompression would
>>> essentially be a NOP.
>>>
>>
>> I didn't see any benchmarks where (de)compression is done on small chunks of data (a few kilobytes at most - which is your intended use case).
>
> see [1]

I don't see where that paper is "talking" about the size of the exported
chunks, or where it contains (de)compression benchmarks done on small
chunks of data. Please be more specific.

>
>> And even though the (de)compression might not make much difference in runtime, it definitely will give higher CPU usage, which is unwelcome in some cases (e.g. mobile devices).
>
> well it runs on multiple cores. Cog runs on a single core. So wasting some CPU cycles
> on the non-used cores won't harm that much.
>
> For mobile devices you might simply not want the image to swap, hence you will pay
> a lot of attention to make sure it stays small. So yes, in this case you won't rely
> directly on such a features.
>
> However swapping out unused parts of the system and reload them are still interesting
> on such a "limited" platform  [1]. And in this case you exactly don't want to waste
> cycles on loading the data, so compression in memory is interesting again.
>
>> It might result in lower overall CPU usage too, but the ~2 compression ration makes me think that it's unlikely.
>
>
> It does if it makes swapping out data cheaper, that's a win. But here you don't
> show me benchmarks either ;)

IIUC there's an unspecified (probably non-public/not open source) project.
You'd like to do some experiments with it (because you have access to it)
and therefore you want to add some extra code (not useful for most users)
to the public VM (used by everyone) to support that experiment. Am I right?

Btw, I'm not showing any benchmarks, because I don't know what to measure.


Levente

>
>
> [1] http://rmod.lille.inria.fr/archives/papers/Mart11c-COMLAN-ObjectSwapping.pdf
>
>

Re: Issue 99 in cog: Link LZ4 Compression

Camillo Bruni-3

>>> The larger the VM binary is, the slower it will be on today's CPUs.
>>
>> how come?
>
> It's because of the cache hierarchy. You don't have control over what will be where in a binary, or what sizes and levels of cache a CPU has. The smaller the binary is, the higher the chance that the part of the VM which needs to be used right now is in the CPU cache. I wrote about this a few years ago. I found that building all non-essential plugins as external gives ~4-5% better performance (using the Interpreter).

sounds interesting, you have a link on that?


Re: Issue 99 in cog: Link LZ4 Compression

Camillo Bruni-3
In reply to this post by Levente Uzonyi-2

>
> IIUC there's an unspecified (probably non-public/not open source) project. You'd like to do some experiments with it (because you have access to it) and therefore you want to add some extra code (not useful for most users) to the public VM (used by everyone) to support that experiment. Am I right?

sorry, but you're quite wrong about that.
a) if anything I produce open source artifacts
b) I do research
c) I work on pharo

I am looking forward to improving our system, and having a state-of-the-art
compression library at hand might be useful. Besides that, I prefer shipping
a slightly overloaded VM to the community for the sake of wider functionality.

For instance we have the SSL plugin by default now, which is kind of something
you'd expect from any 2012 programming language.

We plan on adding the sources / ast to the image, a perfect use-case for a nice
compression library.

Re: Issue 99 in cog: Link LZ4 Compression

stephane ducasse-2


On Oct 13, 2012, at 5:07 PM, Camillo Bruni wrote:

>
>>
>> IIUC there's an unspecified (probably non-public/not open source) project. You'd like to do some experiments with it (because you have access to it) and therefore you want to add some extra code (not useful for most users) to the public VM (used by everyone) to support that experiment. Am I right?
>
> sorry, but you're quite wrong about that.
> a) if anything I produce open source artifacts
> b) I do research
> c) I work on pharo
>
> I am looking forward to improve our system and having a state of the art
> compression library at hand might be useful. besides that I prefer shipping
> a slightly overloaded VM to the community for the sake of wider functionality.
>
> For instance we have the SSL plugin by default now, which is kind of something
> you'd expect from any 2012 programming language.
>
> We plan on adding the sources / ast to the image, a perfect use-case for a nice
> compression library.

but don't you think that compressing ASTs
        - could benefit from a special/specific compression scheme just for them? Theo D'Hondt mentioned to me an algorithm used in P-code (of Oberon).
        - should have a fallback if this library is not there? Is it widely available?

do you really want to rely on yet another library?



Re: Issue 99 in cog: Link LZ4 Compression

Levente Uzonyi-2
In reply to this post by Camillo Bruni-3
 
On Sat, 13 Oct 2012, Camillo Bruni wrote:

>
>>>> The larger the VM binary is, the slower it will be on today's CPUs.
>>>
>>> how come?
>>
>> It's because of the cache hierarchy. You don't have control over what will be where in a binary, or what sizes and levels of cache a CPU has. The smaller the binary is, the higher the chance that the part of the VM which needs to be used right now is in the CPU cache. I wrote about this a few years ago. I found that building all non-essential plugins as external gives ~4-5% better performance (using the Interpreter).
>
> sounds interesting, you have a link on that?

It was on this list a few years ago, but I can't find it. Doing the
experiment yourself is rather easy. With a bit of googling I found
this:
http://stackoverflow.com/questions/2838345/when-does-code-bloat-start-having-a-noticeable-effect-on-performance


Levente

>
>

Re: Issue 99 in cog: Link LZ4 Compression

David T. Lewis
In reply to this post by Camillo Bruni-3
 
On Sat, Oct 13, 2012 at 05:07:40PM +0200, Camillo Bruni wrote:

>
> >
> > IIUC there's an unspecified (probably non-public/not open source) project. You'd like to do some experiments with it (because you have access to it) and therefore you want to add some extra code (not useful for most users) to the public VM (used by everyone) to support that experiment. Am I right?
>
> sorry, but you're quite wrong about that.
> a) if anything I produce open source artifacts
> b) I do research
> c) I work on pharo
>
> I am looking forward to improve our system and having a state of the art
> compression library at hand might be useful. besides that I prefer shipping
> a slightly overloaded VM to the community for the sake of wider functionality.
>
> For instance we have the SSL plugin by default now, which is kind of something
> you'd expect from any 2012 programming language.
>
> We plan on adding the sources / ast to the image, a perfect use-case for a nice
> compression library.

If you are looking for a use case to actually test and measure the
performance differences, the source file stream might be a good use
case. Start by moving it into the image so the sources "file" is in
the image rather than stored on disk. Then do the same with a compressed
sources file (an stc file created with "Smalltalk compressSources"). The
uncompressed version will be about 4 times larger than the compressed
one, so you could measure performance impact there. Finally, add your
LZ4 compression and do a version of compressed sources that uses LZ4.
Test it and see if it is measurably better or worse than zip compression.
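[Editor's note: Dave's proposed measurement can be sketched in Python, with zlib levels standing in for the two tradeoffs (level 9 for zip-style "small", level 1 for an LZ4-style "fast" setting). The sources blob and the function name `measure` are illustrative only; this shows the shape of the measurement, not the actual Smalltalk sources machinery.]

```python
import time
import zlib

def measure(sources, label, level):
    """Compress a sources-like blob at the given level and time one full decompression."""
    blob = zlib.compress(sources, level)
    start = time.perf_counter()
    zlib.decompress(blob)
    elapsed = time.perf_counter() - start
    print("%s: %d -> %d bytes, decompress %.3f ms"
          % (label, len(sources), len(blob), elapsed * 1e3))
    return blob

# Stand-in for a sources file; real Smalltalk sources are highly repetitive text.
sources = b"Object subclass: #Point instanceVariableNames: 'x y' " * 20000

measure(sources, "fast tradeoff (LZ4-like)", 1)
measure(sources, "small tradeoff (zip-like)", 9)
```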

Dave


Re: Issue 99 in cog: Link LZ4 Compression

Levente Uzonyi-2
In reply to this post by Camillo Bruni-3
 
On Sat, 13 Oct 2012, Camillo Bruni wrote:

>
>>
>> IIUC there's an unspecified (probably non-public/not open source) project. You'd like to do some experiments with it (because you have access to it) and therefore you want to add some extra code (not useful for most users) to the public VM (used by everyone) to support that experiment. Am I right?
>
> sorry, but you're quite wrong about that.
> a) if anything I produce open source artifacts
> b) I do research
> c) I work on pharo
>
> I am looking forward to improve our system and having a state of the art
> compression library at hand might be useful. besides that I prefer shipping
> a slightly overloaded VM to the community for the sake of wider functionality.
>
> For instance we have the SSL plugin by default now, which is kind of something
> you'd expect from any 2012 programming language.
>
> We plan on adding the sources / ast to the image, a perfect use-case for a nice
> compression library.

So I'm right in that the project is not public/not open source at the
moment, otherwise I'm sure you would have given a link to it already or
at least more information about it.

Experiments are great, but you should keep the changes local until you
finish them.

LZ4 is not a general-purpose compression algorithm, because its
compression ratio is low. It's designed for realtime compression.

The SSL plugin should be external, so it doesn't add anything to the VM
binary unless you use SSL from your image.

What I really wanted to point out is that you should show us the benefits
of your proposal, before adding it to the VM.


Levente

Re: Issue 99 in cog: Link LZ4 Compression

Camillo Bruni-3

> So I'm right in that the project is not public/not open source at the moment, otherwise I'm sure you would have given a link to it already or at least more information about it.

it is, it's called Pharo. So again you're wrong. Maybe it's in our heads;
luckily one cannot put copyright there yet ;). The idea IS the information.

We plan that somewhere in the glorious future we will have a great system.

I work full-time as a PhD student, so no time to do any closed-source projects
on the side; it's that simple ;)

> Experiments are great, but you should keep the changes local before you finish it.
>
> LZ4 is not a general purpose compression algorithm, because it's compression ratio is low. It's designed for realtime compression.

yes, you got it right; that's why we plan to use it for runtime
compression of more or less infrequently used ASTs and source code in
the image.

But again that's future talk, I know.

> The SSL plugin should be external, so it doesn't add anything to the VM
> binary unless you use SSL from your image.

if you read carefully, I corrected myself: the plugin is only bundled with the VM,
not linked against it, so yes.

> What I really wanted to point out is that you should show us the benefits of your proposal, before adding it to the VM.

We did, and if we ship a future PharoXX with it at some point, we need to have
it somewhere close to the VM.

Re: Issue 99 in cog: Link LZ4 Compression

stephane ducasse-2
In reply to this post by Levente Uzonyi-2

>>
>
> So I'm right in that the project is not public/not open source at the moment, otherwise I'm sure you would have given a link to it already or at least more information about it.

It is; look for Nabujito either on GitHub or SmalltalkHub.
And everything that Camillo is doing is public and open source.


> Experiments are great, but you should keep the changes local before you finish it.
>
> LZ4 is not a general purpose compression algorithm, because it's compression ratio is low. It's designed for realtime compression.

Ok
>
> The SSL plugin should be external, so it doesn't add anything to the VM
> binary unless you use SSL from your image.
>
> What I really wanted to point out is that you should show us the benefits of your proposal, before adding it to the VM.

Yes, this is what we traditionally do. For example, Fuel is the result of a real
experiment that showed us that (contrary to what I thought) ImageSegment was not worth it.

>
>
> Levente


Re: Issue 99 in cog: Link LZ4 Compression

cog
In reply to this post by cog
 

Comment #2 on issue 99 by [hidden email]: Link  LZ4 Compression
http://code.google.com/p/cog/issues/detail?id=99

Mirrored in  
https://pharo.fogbugz.com/default.asp?pre=preMultiSearch&pg=pgList&pgBack=pgSearch&search=2&searchFor=11368

--
You received this message because this project is configured to send all  
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings