NativeBoost pointer and "+"

Matthieu
Hello everyone,

I have a small question about NativeBoost: how does the "+" operator, when applied to a pointer, translate into NativeBoost code?

To give a bit of context, what I want to do is to reallocate some non-contiguous bytes in memory to a buffer. Basically, I have an array of integers in a buffer and I want to copy some chunks of it in another buffer. The chunks are always the same size and the offset between each chunk is always the same too.

Because a bit of actual code is easier to understand here is what I'd like to do in Pharo :

...

int i, j;
int *data = malloc(1000*sizeof(int));
int *newData = malloc(50*sizeof(int));

// Initialize the source data
for (i = 0; i < 1000; i++) {
  data[i] = i;
}

// Copy desired chunks into new buffer
for (j = 0; j < 5; j++) {
  memcpy(newData + j*10, data + 200 + j*30, 10*sizeof(int));
}

free(data);

...

Here basically I'll get in my buffer chunks of 10 integers starting at 200 with an offset of 30 between chunks, and this 5 times. (200 201 202 ... 208 209 230 231 ... 238 239 260 ... 328 329).

I am okay with the malloc, memcpy and free but I don't know how to handle the "+" operator in my memcpy function.

Thank you,

Matthieu
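
For reference, here is a minimal C sketch of what the post above is after. In C, `p + n` on an `int *` advances by `n` elements, that is `n * sizeof(int)` bytes, and this scaling is what has to be reproduced by hand on the NativeBoost side. The helper name `copy_chunks` and its parameters are made up for illustration and do not come from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy n_chunks runs of chunk_len ints from src into dst, back to back.
 * Run j starts at element (start + j * stride) of src.
 * Hypothetical helper: the name and signature are illustrative only. */
void copy_chunks(int *dst, const int *src, size_t start,
                 size_t stride, size_t chunk_len, size_t n_chunks)
{
    assert(dst != NULL && src != NULL);
    for (size_t j = 0; j < n_chunks; j++) {
        /* Pointer "+" counts in elements; memcpy's length is in bytes. */
        memcpy(dst + j * chunk_len,
               src + start + j * stride,
               chunk_len * sizeof(int));
    }
}
```

With `data[i] = i` for 1000 elements, `copy_chunks(newData, data, 200, 30, 10, 5)` fills `newData` with 200..209, 230..239, and so on up to 320..329.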

Re: NativeBoost pointer and "+"

Henrik Sperre Johansen

> On 08 Jun 2015, at 4:41 , Matthieu Lacaton <[hidden email]> wrote:
>
> Hello everyone,
>
> I have a small question about NativeBoost : How does the "+" operator when applied to a pointer translates into NativeBoost code ?

You can do relative addressing like this:
(destReg ptr: dataSize) + offsetReg + constant

So with offset registers containing j * 10 and j * 30, you might end up with an unrolled inner loop (barring any fancier longer-than-int moves) like:

0 to: 9 do: [ :constantOffset |
        asm mov: (destReg ptr: currentPlatform sizeOfInt) + dstOffsetReg + constantOffset
            with: (srcReg ptr: currentPlatform sizeOfInt) + 200 + srcOffsetReg + constantOffset ]

If the range of j is constant, you can just as easily unroll the whole thing in a similarly compact fashion, space and sensibilities permitting:

0 to: 4 do: [ :j | 0 to: 9 do: [ :constOffset |
        asm mov: (destReg ptr: currentPlatform sizeOfInt) + (j * 10) + constOffset
            with: (srcReg ptr: currentPlatform sizeOfInt) + 200 + (j * 30) + constOffset ] ]

Cheers,
Henry

Re: NativeBoost pointer and "+"

Matthieu
Hello Henrik,

Thank you very much for your answer. However, the code you provided is some sort of assembly right ? So does it mean that I need to learn assembly to do what I want ?

I'm asking that because I don't know anything about assembly so it will take me some time to learn.

Cheers,

Matthieu

2015-06-08 19:56 GMT+02:00 Henrik Johansen <[hidden email]>:



Re: NativeBoost pointer and "+"

Henrik Sperre Johansen
There are many ways to Rome :)
If you just need some externally allocated objects in the formats you specified you can do the cache extraction using nothing but normal Smalltalk:

intArray := NBExternalArray ofType: 'int'.

data := intArray new: 1000.
1 to: data size do: [ :i | data at: i put: i ].
cache := intArray new: 50.
0 to: 4 do: [ :j |
    1 to: 10 do: [ :k |
        cache at: (j * 10) + k put: (data at: 199 + (30 * j) + k) ] ].

But if you want to take full advantage of the performance boost NB offers, you'd write a NativeBoost function to do the cache extraction*, as I outlined last time:
MyClass class >> #createCacheOf: aSource in: aDestination
createCacheOf: aSource in: aDestination
    <primitive: #primitiveNativeCall module: #NativeBoostPlugin>
    "Should work on both x86 and x64, as long as sizeOf: lookups work correctly"
    ^ self nbCallout
        function: #(void (int * aSource, int * aDestination))
        emit: [ :gen :proxy :asm | | destReg srcReg tmpReg intSize ptrSize |
            intSize := NBExternalType sizeOf: 'int'.
            ptrSize := NBExternalType sizeOf: 'void *'.
            "Only use caller-saved regs, no preservation needed"
            destReg := asm EAX as: ptrSize.
            srcReg := asm ECX as: ptrSize.
            tmpReg := asm EDX as: intSize.
            asm pop: srcReg.
            asm pop: destReg.
            0 to: 4 do: [ :j | 0 to: 9 do: [ :offset |
                asm
                    "Displacement is in bytes, not in ptr element size :S, so we have to multiply the offset by that manually :S"
                    mov: tmpReg with: srcReg ptr + (199 + (j * 30) + offset * intSize);
                    mov: destReg ptr + ((j * 10) + offset * intSize) with: tmpReg ] ] ]

and use that:
intArray := NBExternalArray ofType: 'int'.
data := intArray new: 1000.
1 to: data size do: [ :i | data at: i put: i ].
cache := intArray new: 50.
MyClass createCacheOf: data in: cache.

The difference using a simple [] bench is about two orders of magnitude: 11 million cache extractions per second for the inline assembly version, while the naive loop achieves around 110k.

Cheers,
Henry

*as: is not yet defined; it could be something like:
AJx86GPRegister >> #as: aSize
    ^ self isHighByte
        ifTrue: [ self asLowByte as: aSize ]
        ifFalse: [
            AJx86Registers
                generalPurposeWithIndex: self index
                size: aSize
                requiresRex: self index > (aSize > 1 ifTrue: [ 7 ] ifFalse: [ 3 ])
                prohibitsRex: false ]
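
To make the structure of the generated code easier to follow, here is a rough C analogue, not the code NativeBoost actually emits. The two nested Smalltalk loops run while the machine code is being assembled and only produce 50 load/store pairs with fixed byte displacements; nothing is left to compute at call time. The names `move_pair`, `build_plan` and `run_plan` are invented for this sketch. It assumes the source buffer was filled as in the post (element at 1-based index i holds i), so zero-based element 199 holds 200.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { CHUNKS = 5, CHUNK_LEN = 10, N_MOVES = CHUNKS * CHUNK_LEN };

/* One load/store pair of the unrolled copy: byte displacements from the
 * source and destination base addresses. */
struct move_pair {
    size_t src_disp;
    size_t dst_disp;
};

/* "Assembly time": compute all 50 displacement pairs up front, mirroring
 * the two nested Smalltalk loops that emit the mov instructions.
 * Displacements are in bytes, hence the multiplication by int_size. */
void build_plan(struct move_pair plan[N_MOVES], size_t int_size)
{
    size_t n = 0;
    for (size_t j = 0; j < CHUNKS; j++) {
        for (size_t k = 0; k < CHUNK_LEN; k++) {
            plan[n].src_disp = (199 + j * 30 + k) * int_size;
            plan[n].dst_disp = (j * 10 + k) * int_size;
            n++;
        }
    }
    assert(n == N_MOVES);
}

/* "Run time": no index arithmetic left, only one load into a temporary
 * and one store per precomputed pair (x86 mov cannot copy memory to
 * memory directly, which is why a scratch register is needed). */
void run_plan(void *dst, const void *src, const struct move_pair plan[N_MOVES])
{
    for (size_t n = 0; n < N_MOVES; n++) {
        int tmp;
        memcpy(&tmp, (const char *)src + plan[n].src_disp, sizeof tmp);
        memcpy((char *)dst + plan[n].dst_disp, &tmp, sizeof tmp);
    }
}
```

In the real method the plan is never stored anywhere; each pair becomes two `mov` instructions with the displacement baked into the instruction encoding.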


On 09 Jun 2015, at 9:46 , Matthieu Lacaton <[hidden email]> wrote:




Re: NativeBoost pointer and "+"

Henrik Sperre Johansen

On 09 Jun 2015, at 2:59 , Henrik Johansen <[hidden email]> wrote:

MyClass createCacheOf: data in: cache.

Forgot to change this; you need to pass in the ExternalArray addresses as parameters, not the ExternalArrays themselves.

MyClass createCacheOf: data address in: cache address

Cheers,
Henry

Re: NativeBoost pointer and "+"

Igor Stasenko
In reply to this post by Matthieu
As I understand it, the general problem you describe is this:
- you want to pass the address of your buffer's contents, starting not from
the very first element of the buffer but from somewhere inside it.

In Smalltalk you cannot reference an element of an array,
only the object (the array, in this case) as a whole.

The reason is that the VM moves objects around, and you cannot directly control when that happens.
The VM is also responsible for updating all pointers (references) to moved objects
for all interested parties (other objects, the stack, etc.), making sure all references remain consistent after such a move.
So, with these constraints, the only valid way to point to an element inside an array
is to store two values separately:
 - a reference to the object that represents your buffer (which the VM updates at will)
 - an index (or offset) into that object, pointing to the element in your buffer

Unfortunately, this is the only way we could implement such an, let's say, 'ElementPointer' safely. It could then be passed to C functions
by converting object reference + offset into a plain address just before invoking the function (and, of course, knowing that there is no chance of triggering a GC, otherwise it turns into a pointer to the wrong place; but that is a general problem of passing pointers into the object memory heap, not one exclusive to an 'element pointer').
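
The reference-plus-offset idea above can be sketched in C as follows. This is only an analogy: `buffer` stands in for a VM-managed object, `relocate` imitates the VM moving it, and `element_ref` and `resolve` are hypothetical names for the 'ElementPointer' and its conversion to a raw address. None of these are NativeBoost or VM APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a VM-managed object whose storage may be relocated. */
struct buffer {
    int *data;
    size_t len;
};

/* Hypothetical "element pointer": owner plus index, never a raw address. */
struct element_ref {
    struct buffer *owner;
    size_t index;
};

/* Simulate the VM moving the object: copy to fresh storage, update the
 * one reference the "VM" knows about, release the old storage.
 * Returns 0 on success, -1 if allocation fails. */
int relocate(struct buffer *buf)
{
    int *fresh = malloc(buf->len * sizeof *fresh);
    if (fresh == NULL)
        return -1;
    memcpy(fresh, buf->data, buf->len * sizeof *fresh);
    free(buf->data);
    buf->data = fresh;
    return 0;
}

/* Turn the pair into a plain address. The result is valid only until the
 * next move, so call this immediately before handing it to C code. */
int *resolve(struct element_ref ref)
{
    assert(ref.owner != NULL && ref.index < ref.owner->len);
    return ref.owner->data + ref.index;
}
```

A raw `int *` taken before `relocate` would dangle afterwards, while an `element_ref` resolved after the move still reaches the right element.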

For buffers allocated externally, i.e. outside the heap governed by the VM,
nothing prevents you from having an address that points inside some buffer (or even outside it :)
 
For NBExternalAddress:

addr := self allocate: somespace.

newAddr := NBExternalAddress value: addr value + someoffset.

or

newAddr := addr copy value: addr value + someoffset

Of course, it is then up to you how to calculate offsets and buffer sizes, as well as to allocate and deallocate the memory for the buffers you are using.
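
As a small illustration of the offset calculation mentioned above: when an address is handled as a plain integer, as in `addr value + someoffset`, the offset is presumably in bytes, so reaching element n of an int buffer needs n times the element size. The C sketch below shows that arithmetic; `element_address` is a hypothetical helper, and the equivalence with typed pointer arithmetic is assumed to hold on the usual flat-address platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address of element `index` in a buffer that starts at raw address `base`.
 * The address is a plain integer, so the offset must be given in bytes:
 * index * elem_size. Hypothetical helper for illustration only. */
uintptr_t element_address(uintptr_t base, size_t index, size_t elem_size)
{
    assert(elem_size > 0);
    return base + (uintptr_t)(index * elem_size);
}
```

For the copy in the original question, the source address of chunk j would then be `element_address(data, 200 + j * 30, sizeof(int))` and the destination `element_address(newData, j * 10, sizeof(int))`, both of which can be handed to memcpy.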





--
Best regards,
Igor Stasenko.

Re: NativeBoost pointer and "+"

stepharo
In reply to this post by Henrik Sperre Johansen
Henrik

you amaze me :)

Stef





Re: NativeBoost pointer and "+"

Matthieu
In reply to this post by Igor Stasenko
@ Igor
 
> As i understand, in general, the problem that you described is in following:
> - you want to pass an address of your buffer contents, but started not from
> very first element of your buffer, but somewhere inside a buffer.

Yes ! Exactly that. I'm bad at explaining things :(

> Unfortunately, this is the only way how we could implement such, lets say 'ElementPointer' safely. Which then can be used to pass to C function(s),
> converting object reference + offset into simple address just before invoking a function (and sure thing, knowing that there's no chance triggering GC, else it will turn into pointer to wrong place, but that's general problem of passing pointers on object memory heap, not just exclusively for 'element pointer' and such).

Alright, thank you very much for your explanations ! By the way, is there a way to disable the GC for a short period of time and then re-enable it ?



@ Henrik

I am not sure I understand every bit of your code right now but I will definitely study it because it looks awesome.
Moreover, performance is quite important for me so your solution is very attractive and I'll try to use it. Thanks a lot !

I find it both fun and amazing what you can do with Pharo. I never thought I would do assembly inside Pharo !


Again, a big thanks to both of you,

Cheers,
Matthieu




2015-06-09 17:43 GMT+02:00 Igor Stasenko <[hidden email]>:


Re: NativeBoost pointer and "+"

Igor Stasenko


On 9 June 2015 at 20:05, Matthieu Lacaton <[hidden email]> wrote:
> @ Igor
>
>> As i understand, in general, the problem that you described is in following:
>> - you want to pass an address of your buffer contents, but started not from
>> very first element of your buffer, but somewhere inside a buffer.
>
> Yes ! Exactly that. I'm bad at explaining things :(

me too, sometimes. :)

>> Unfortunately, this is the only way how we could implement such, lets say 'ElementPointer' safely. Which then can be used to pass to C function(s),
>> converting object reference + offset into simple address just before invoking a function (and sure thing, knowing that there's no chance triggering GC, else it will turn into pointer to wrong place, but that's general problem of passing pointers on object memory heap, not just exclusively for 'element pointer' and such).
>
> Alright, thank you very much for your explanations ! By the way, is there a way to disable the GC for a short period of time and then re-enable it ?

Well, some aspects of GC behavior can be controlled, but they serve for fine tuning or for picking a strategy ahead of time, when you know what application is going to run. So you can use them at the application level, but not at the level of a library/framework (as in the case of NB), because there is no way to determine what will be used where, and so fiddling with the GC is the worst possible way to solve the problem :)
 
Also, in general, it is bad practice to rely on subtle and fuzzy details of the GC triggering logic, because it is one of the most sophisticated parts of the VM and subject to future change.

So, instead of relying on implementation details, a new contract between the VM and the language side is being introduced, called 'object pinning'. Pinned objects are no longer subject to relocation in memory. This means you will be able to guarantee that chosen objects are not relocated, regardless of how often the VM triggers a GC and what is involved.
And that comes with Spur.
 





--
Best regards,
Igor Stasenko.
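
The pinning contract described in the reply above can be pictured with a toy model in C. This is not how Spur implements pinning; it only illustrates the guarantee: a collection pass relocates every unpinned object and leaves pinned ones at their address, so a raw pointer into a pinned object stays valid. The `object` struct and the `collect` function are invented for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy managed object: a payload plus a flag the "collector" honours. */
struct object {
    int *payload;
    size_t len;
    bool pinned;
};

/* Toy collection pass: every unpinned object is copied to fresh storage,
 * pinned objects stay exactly where they are.
 * Returns the number of objects moved, or -1 on allocation failure. */
int collect(struct object *objs, size_t count)
{
    assert(objs != NULL || count == 0);
    int moved = 0;
    for (size_t i = 0; i < count; i++) {
        if (objs[i].pinned)
            continue;
        int *fresh = malloc(objs[i].len * sizeof *fresh);
        if (fresh == NULL)
            return -1;
        memcpy(fresh, objs[i].payload, objs[i].len * sizeof *fresh);
        free(objs[i].payload);
        objs[i].payload = fresh;
        moved++;
    }
    return moved;
}
```

A pointer taken into a pinned object's payload before `collect` can still be dereferenced afterwards; the same pointer into an unpinned object would be dangling.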