Hi, all!

How can I figure out whether the handle of an external object -- if it is a byte array -- was fabricated in the image or whether it is actually something created in the external library? For actual instances of ExternalAddress, it is obvious that those are meant to point to external memory. But what about instances of ByteArray being stored in the "handle" instVar?

I tried Object >> #isPinned. :-D Did not work. I am looking at all the primitives in ByteArray.

... is there actually a difference? Or are all byte arrays that I find in the image actually in the object memory?

Best,
Marcel
|
On Wed, May 20, 2020 at 10:00 AM Marcel Taeumel <[hidden email]> wrote:
Yes, they are.

- Vanessa -
|
In reply to this post by marcel.taeumel
On Wed, May 20, 2020 at 06:53:37PM +0200, Marcel Taeumel wrote:
> Hi, all!
>
> How can I figure out whether the handle of an external object -- if it is a byte array -- was fabricated in the image or whether it is actually something created in the external library? For actual instances of ExternalAddress, it is obvious that those are meant to point to external memory. But what about instances of ByteArray being stored in the "handle" instVar?
>
> I tried Object >> #isPinned. :-D Did not work. I am looking at all the primitives in ByteArray.
>
> ... is there actually a difference? Or are all byte arrays that I find in the image actually in the object memory?
>
> Best,
> Marcel
>

There is no simple answer. A good illustration is this:

    SourceFiles first fileID

On a 64-bit VM, you will see a ByteArray of size 24. On a 32-bit VM, it is a shorter ByteArray. In either case, the ByteArray instance exists entirely within the object memory. The byte values within that ByteArray happen to be the value of a C pointer, which is the address in the process virtual memory of a data structure that lives in FilePlugin within the VM. That data structure contains various things, including (on a Unix VM) another pointer to a FILE struct that lives in the C runtime library.

None of those pointers or internal things have any meaning within the image or within the object memory itself. It is best to think of the fileID field as an opaque handle to something in the external world, and the fact that the bytes just happen to be a C pointer is something that you are supposed to not notice.

I really wish that Andreas could be here to comment. I clearly recall his shock and dismay on finding out that I was using the actual byte contents of a fileID to do things in the OSProcess plugin. We had slightly different perspectives on that topic, but if you were at all interested in issues of security for the Squeak execution environment (as Andreas was), then you would want to hear his perspective.
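To make the "opaque handle" idea above a little more concrete, here is a minimal C sketch of how the bytes of a pointer can end up inside a plain byte buffer and be read back out. The names `Record`, `handle_store`, and `handle_load` are hypothetical and are not taken from FilePlugin; the byte buffer merely stands in for a ByteArray.

```c
#include <string.h>

/* Hypothetical stand-in for a plugin-side record; not the real SQFile. */
typedef struct {
    int value;
} Record;

/* Copy the raw bytes of a pointer into a caller-supplied byte buffer,
   roughly as a plugin might fill a ByteArray handed in by the image.
   The buffer must hold at least sizeof(Record *) bytes. */
static void handle_store(unsigned char *bytes, Record *rec) {
    memcpy(bytes, &rec, sizeof rec);
}

/* Recover the pointer from the bytes. Nothing in the bytes themselves
   says whether the address still refers to a live object. */
static Record *handle_load(const unsigned char *bytes) {
    Record *rec;
    memcpy(&rec, bytes, sizeof rec);
    return rec;
}
```

The buffer itself is ordinary memory owned by whoever allocated it; only the code that created the handle can say what the bytes mean, which is why the image is supposed to treat them as opaque.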
So in some sense, you should not really be able to know if a ByteArray contains a pointer to something elsewhere in the virtual memory of the VM.

On the other hand, if you already know that you are doing something dangerous and insecure, then it would be really convenient to be able to answer the question that you are asking: does this ByteArray object in the object memory contain a reference to some external thing outside of the object memory, and if so, is it safe for me to use it?

I don't know that there could ever be a safe answer to that question. The image and the VM have no way of knowing what happens to things at the other end of that C pointer. So for example in the case of fileID, you really need to keep track of when a FileStream refers to invalid addresses. Thus the data structure is:

    /* squeak file record; see sqFilePrims.c for details */
    typedef struct {
      int sessionID;                  /* ikp: must be first */
      void *file;
      squeakFileOffsetType fileSize;  /* 64-bits we hope. */
    #if defined(ACORN)
    // ACORN has to have 'lastOp' as at least a 32 bit field in order to work
      int lastOp;  // actually used to save file position
      char writable;
      char lastChar;
      char isStdioStream;
    #else
      char writable;
      char lastOp;  /* 0 = uncommitted, 1 = read, 2 = write */
      char lastChar;
      char isStdioStream;
    #endif
    } SQFile;

The first field of the structure is sessionID, which is a value associated with the currently running VM program. If you save your image and start it again, the sessionID in the new VM instance will be different. That allows the FilePlugin to figure out that the pointer to the FILE struct (or to a HANDLE on Windows) is no longer valid, and that it must not attempt to dereference that pointer, which would crash the VM.
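The session check described above could look roughly like the following sketch. It is not the FilePlugin code: `FileRecord`, `current_session_id`, and `file_if_valid` are hypothetical names, and the record is reduced to the two fields that matter for the check.

```c
#include <stddef.h>

/* Simplified, hypothetical record modeled on the first two SQFile fields. */
typedef struct {
    int sessionID;  /* session in which the record was created */
    void *file;     /* only meaningful within that session */
} FileRecord;

/* Assumed stand-in for the VM's per-launch session identifier. */
static int current_session_id = 1;

/* Return the stored pointer only if the record belongs to this session.
   A record restored from a saved image carries an old sessionID, so the
   pointer is rejected instead of being dereferenced. */
static void *file_if_valid(const FileRecord *rec) {
    if (rec == NULL || rec->sessionID != current_session_id) {
        return NULL;
    }
    return rec->file;
}
```

Note that this only detects handles that are stale across a restart. It says nothing about whether the external resource was closed or freed within the same session.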
This is just one example, but it illustrates the general case, which is that the VM cannot be expected to keep track of what people are doing on the other end of those C pointers, and the image in turn cannot be expected to know if a ByteArray that contains a C pointer is referring to anything useful or safe on the other end of the pointer that was saved in the ByteArray.

In specific cases, you can consider handling this by keeping track of the known valid external references. If you look at the Windows FilePlugin, you will see that Andreas did this by maintaining a registry of known valid HANDLE values, and failing the primitives when an unregistered HANDLE was passed, e.g. by my WindowsOSProcessPlugin, which attempted to pass unregistered HANDLE values for anonymous pipes. This was an annoyance for me because I could not pursue my OSProcess hacks on Windows (and I abandoned the effort). But from a security and system integrity point of view, Andreas was right. To this day, I do not have any good answer for how to handle this.

So it is not an easy problem.

Dave
|
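As a rough illustration of the registry approach mentioned above, here is a small C sketch. It is not the Windows FilePlugin implementation; the fixed-size table and the names `registry_add`, `registry_contains`, and `guarded_use` are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLES 16

/* Hypothetical table of handles that this code created itself. */
static void *known_handles[MAX_HANDLES];
static size_t known_count = 0;

/* Remember a handle as valid. Fails for NULL or when the table is full. */
static bool registry_add(void *h) {
    if (h == NULL || known_count >= MAX_HANDLES) {
        return false;
    }
    known_handles[known_count++] = h;
    return true;
}

/* Linear search: was this handle registered earlier? */
static bool registry_contains(const void *h) {
    for (size_t i = 0; i < known_count; i++) {
        if (known_handles[i] == h) {
            return true;
        }
    }
    return false;
}

/* Primitive-style entry point: returns -1 (failure) for unknown handles
   instead of touching memory it did not hand out itself. */
static int guarded_use(void *h) {
    if (!registry_contains(h)) {
        return -1;
    }
    return 0;
}
```

The cost of this design is the one described in the message: handles created elsewhere, such as by another plugin, are refused even when they happen to be valid.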
Hi Dave, hi Eliot, hi Vanessa!

Thank you very much. Those answers are very helpful!

I am trying to learn about the differences between talking to a C library from a C program and from within Squeak through FFI. I am especially interested in the existing safety nets to rely on or common patterns to follow when using FFI. This includes:

- how to care for external structures created through FFI calls
- how to care for external structures created in-image, then passed to FFI calls
- when to use #new or #externalNew (+#free)
- differences between a handle being a ByteArray or an ExternalAddress
- what happens to all my external structures when (re-)starting the image
- ...

In this learning process, I want to double-check whether more clues can be offered through Squeak's tools. Especially if an action would crash the VM.

Latest thing -- that's why this question about ByteArrays -- was how to re-think this code:

    MyStruct foo;
    someFunctionFillsMyStruct(&foo);

Into this code:

    foo := MyStruct new. "handle is ByteArray"
    self apiSomeFunctionFillsMyStruct: foo.

Meaning, what would be on the stack in C can conveniently be held in Squeak's object memory to be shared across Squeak processes and applications. No need to use malloc() and free():

    MyStruct *foo = malloc(sizeof(MyStruct));
    someFunctionFillsMyStruct(foo);
    ...
    free(foo);

Which I can translate to Squeak FFI:

    foo := MyStruct externalNew. "handle is ExternalAddress"
    self apiSomeFunctionFillsMyStruct: foo. "same method as above! :-)"
    ...
    foo free.

While there is no need to change #apiSomeFunctionFillsMyStruct: for this, Squeak FFI conveniently copies structs from C stack memory to object memory anyway. No need to address the heap from within Squeak. Or is there? Performance? :-)

Best,
Marcel
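For comparison, the two C variants from the message can be put side by side in a compilable sketch. The field layout of `MyStruct` and the body of `someFunctionFillsMyStruct` are made up here, since the thread does not define them; `fill_on_stack` and `fill_on_heap` are hypothetical helper names.

```c
#include <stdlib.h>

/* Field layout is hypothetical; the thread does not define MyStruct. */
typedef struct {
    int id;
    double value;
} MyStruct;

/* The callee only receives an address. It cannot tell whether the
   storage lives on the stack, on the heap, or somewhere else. */
static void someFunctionFillsMyStruct(MyStruct *out) {
    out->id = 7;
    out->value = 1.5;
}

/* Caller-owned storage with automatic lifetime
   (loosely comparable to "MyStruct new" in the image). */
static int fill_on_stack(void) {
    MyStruct foo;
    someFunctionFillsMyStruct(&foo);
    return foo.id;
}

/* Heap storage that the caller must release explicitly
   (loosely comparable to "MyStruct externalNew ... free"). */
static int fill_on_heap(void) {
    MyStruct *foo = malloc(sizeof *foo);
    if (foo == NULL) {
        return -1;
    }
    someFunctionFillsMyStruct(foo);
    int id = foo->id;
    free(foo);
    return id;
}
```

From the callee's point of view both calls are identical, which matches the observation that the same #apiSomeFunctionFillsMyStruct: method works for both kinds of handle.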
|
On Thu 21. May 2020 at 00:30, Marcel Taeumel <[hidden email]> wrote:
I was under the impression that is exactly how it works. You just need to make a ByteArray that is large enough to hold the struct. Passing that to FFI will pass a pointer to the first byte of the ByteArray. The API call would fill the ByteArray. So it should Just Work.

- Vanessa -
|
Hi Vanessa!

Ah, I thought so. But I did not verify it by looking at the FFI sources. :-) So, is there any need for #externalNew and #free?

Best,
Marcel
|
The old object memory did not have pinning. So if you needed an unchanging address, you had to allocate it externally.

With Spur’s pinned objects there is less need for external allocations, true. However, there is more risk of corrupting your object memory if the ByteArray is not large enough. Externally allocated memory is a little safer in that regard.

Also, you may have to think more about what happens after image reload. Then again, that’s tricky with FFI either way.

- Vanessa -

On Thu, May 21, 2020 at 11:30 Marcel Taeumel <[hidden email]> wrote:
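The risk of an undersized buffer can be shown with a small C sketch. It assumes an API that takes an explicit buffer length, which many real C functions do not; `fill_checked` and the `MyStruct` layout are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout; the thread does not define MyStruct. */
typedef struct {
    int id;
    double value;
} MyStruct;

/* Write a MyStruct into the caller's buffer only if it fits. A callee
   without such a check would write past the end of a short buffer and
   corrupt whatever is stored next to it. */
static bool fill_checked(unsigned char *buf, size_t buf_len) {
    MyStruct tmp = { 7, 1.5 };
    if (buf == NULL || buf_len < sizeof tmp) {
        return false;
    }
    memcpy(buf, &tmp, sizeof tmp);
    return true;
}
```

When the callee offers no length parameter, the only protection is on the caller's side: allocate at least as many bytes as the struct needs before making the call.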
|
There is a small difference though. Spur allocates on 8-byte boundaries, while the OS might allocate on 16-byte boundaries. Believe it or not, depending on alignment, the accelerated code path can differ.

On Thu, May 21, 2020 at 20:38, Vanessa Freudenberg <[hidden email]> wrote:
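A quick way to see the alignment difference from the C side is to test an address against a boundary. The helper name `is_aligned` is hypothetical, and the pointer-to-integer conversion relies on the usual flat address mapping, which the C standard leaves implementation-defined.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the address is a multiple of the given alignment.
   The alignment must be non-zero. */
static bool is_aligned(const void *p, uintptr_t alignment) {
    return ((uintptr_t)p % alignment) == 0;
}
```

An address that is 8-byte aligned is not necessarily 16-byte aligned, so a library that picks a faster path for 16-byte-aligned input may behave differently for the two kinds of storage.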