Greetings all,
As some of you may know, I've started a side project on GitHub to connect the Eclipse Paho MQTT C client library to VA Smalltalk. I was curious how many folks in this forum actively develop user-primitives in C/C++ and hook them up to Smalltalk. There are platform function call-outs in Smalltalk, but I'm specifically talking about users who write/maintain C/C++ code and link to the virtual machine.

Motivation: Doing this project let me put on my "user" hat. In a short time I have identified and submitted 4 cases that caused me pain in user-prim development and that I believe will make the product better in a way I wouldn't otherwise have known how to identify. The other team members and I are finding exactly the same thing as we work on some of the IoT projects we have been buzzing about. 9.2 will have an incredible number of fixes and refinements in the areas of XD cross-development, Packaging, Seaside, Remote Debugging, and the StackDump tooling, and all of those fixes are a direct result of the IoT projects.

Questions:
- Do you generally feel writing user-prims is approachable? Desirable?
- What shortcomings would you like addressed, or what additional features would you like to see in user-prim development?
- Would a GitHub project that scaffolds a cross-platform user-prim project and provides some utilities help (more details later)?
I would love any feedback.

User-prim scaffolding project: In short, I would like to provide a project that lets users easily get going with user-prims without having to worry about the details of setting up the project, hooking up to the VM, compiler directives, and so on. I think many of the items in paho-mqtt-vast apply. The current paho-mqtt-vast project has a lot of code patterns that we use on the VM project and that have served us well over the last 5+ years.

We like CMake. It provides a cross-platform build generation system that can spit out Visual Studio projects, makefile projects, MinGW projects, and a bunch of others. I use the CLion IDE (JetBrains' C/C++ counterpart to IntelliJ) and it integrates nicely with CMake. I've heard people complain about the obfuscated artifacts CMake generates, which is true... but I generally find it just works. The main CMake build script contains a bunch of settings, features, and patterns that we have learned over the years. In our larger projects we split these up into many scripts, but currently I have it as one consumable script. Additionally, it pulls down dependent projects from GitHub, configures and builds them, locates the esvm40.dll on your machine for linking, builds unit tests, and builds documentation. You can see from the build instructions at the top that it's pretty simple to get started and how it's organized.

Additionally, I'm building up reusable producer/consumer queues and utilities that folks might find helpful. In this project those are just prefixed with Es. The MQTT-specific ones are prefixed with EsMqtt and are not generalized. There is also a reusable header file that helps build portable C code. There is a small unit test framework, and you can see how it's used. "Framework" is a strong word... maybe "unit test utilities", but it gets the job done.

The following are some notes on how this got started, the issues I ran into, and the cases I have submitted so far.

Eclipse Paho development notes: This was supposed to be an easy Smalltalk binding to Eclipse Paho (in short, MQTT is a lightweight messaging protocol). And it started out that way, using EsEntryPoints for MQTT message callbacks.
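For readers who haven't used the Paho C client: it delivers incoming messages through a callback that Paho invokes on one of its own native threads. Below is a minimal sketch of that shape using the real MQTTClient callback API; the hand-off to Smalltalk is only a hypothetical placeholder comment, not the actual paho-mqtt-vast code.

```c
#include <stdio.h>
#include "MQTTClient.h"   /* Eclipse Paho C client header */

/* Invoked by Paho on one of ITS threads whenever a message arrives. */
static int onMessageArrived(void *context, char *topicName, int topicLen,
                            MQTTClient_message *message)
{
    (void)context; (void)topicLen;
    printf("Got %d bytes on topic %s\n", message->payloadlen, topicName);

    /* The naive first attempt: hand the payload straight to Smalltalk from
     * here (e.g. via an entry point).  This runs on a non-Smalltalk native
     * thread, which is exactly what caused the crashes described next. */

    MQTTClient_freeMessage(&message);
    MQTTClient_free(topicName);
    return 1;   /* message handled */
}

/* Registration (client creation/connection omitted):
 *   MQTTClient_setCallbacks(client, NULL, onConnectionLost,
 *                           onMessageArrived, NULL);
 */
```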
But after a bunch of VM crashes and triple-checking the binding code, I resorted to VM debugging. Then it told me what the documentation (had I looked at it) would have said. Whoops. Ok, fine; looking at the VM machinery this makes sense. And given that all the MQTT events are coming in on separate native threads, I needed a way to get them back to Smalltalk in a safe way.

This is how the paho-mqtt-vast (user-prim) project got started. It is basically designed to funnel incoming MQTT events from multiple threads back to Smalltalk via the async queue (to be processed during interrupt points). In between sits a producer/consumer queue that helps conduct this activity, which is another part of the project. Now Smalltalk can happily consume events without trashing the VM.

Current cases to be fixed for 9.2

Case 64625: Review header files that are shipped with VA in InstallDir/samples/include
- Need to make sure these are all updated for 64-bit. I got the impression that they were not. So much so that I have my own copy of esuser.h, which makes me sad:( https://github.com/vasmalltalk/paho-mqtt-vast/blob/master/c/include/es/esuser.h

Case 64624: Export EsPostAsyncMessageThruGlobal in the vm
- EsPostAsyncMessage is actually a thread-safe queue (mutex-guarded internally).
- But it needs a vmContext object... and from a separate OS thread callback, you can't get one! And you can't cache one, because these are transient (a vmContext maps to a Smalltalk process).
- So I have a callback... but I can't post anything to the async queue from it:(
- Having the source code, I see the only reason it needs the context is that it needs access to the globalInfo object (a static object you can cache).
- Basically, I had to stand up a dummy context that holds on to a globalInfo just so I could post to the queue from a callback on a separate thread:(
- With this additional message, we can just provide the globalInfo directly and not worry about it.
- Search for _DummyVMContext and you'll see the silliness in https://github.com/vasmalltalk/paho-mqtt-vast/blob/master/c/source/EsMqttAsyncMessages.c
- It would have been nice to just call EsPostAsyncMessageThruGlobal.

Case 64623: Promote EsAllocateMemory/EsFreeMemory to be exported on Windows
- Some may know that there are a few different ways to allocate/free in various C APIs.
- You always want to be sure you free with the same API you allocated with (if not... crash).
- Yes, I crashed. Because I allocated memory using C malloc for an OSObject wrapper that later used OSObject>>free (which calls EsFreeMemory... which uses a Windows API to free). See the short sketch after this post.
- So it would be nice to offer the user a way to ensure this is consistent.

Case 64622: Build .lib files on Windows and distribute in INSTALL/samples/lib
- Since the new VM, we don't build with Visual Studio. We use MinGW-w64 on Windows and GCC everywhere else (and LLVM for our core interpreter, but that's a specialized creature).
- Lib files are not things we need, and we did not consider this when releasing the new VM.
- Users might want to build their projects with MSVC and, while they can generate the lib file themselves, it would be nice to provide it in order to link to the VM.
- What's there now is outdated. And the only reason I know is that I now have a reason to find and try all this stuff.

I'm sure there will be more to come. And if you have more, please tell me... chances are I don't know about it:)

- Seth
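To illustrate Case 64623 above: the crash comes from pairing one allocator with a different deallocator. EsAllocateMemory/EsFreeMemory are the names from the case, but their exact signatures here (size in, pointer out) are my assumption, so treat this as a sketch rather than the real esuser.h API.

```c
#include <stdlib.h>
#include "esuser.h"   /* assumed to declare EsAllocateMemory/EsFreeMemory */

void allocationPairingExample(void)
{
    /* WRONG pairing: buffer allocated with C malloc, but later wrapped in an
     * OSObject whose #free ends up in EsFreeMemory (which frees through a
     * different, platform-specific API).  Result: crash. */
    char *bad = (char *)malloc(64);
    EsFreeMemory(bad);              /* mismatched -- do not do this */

    /* RIGHT pairing: allocate and free through the same API. */
    char *good = (char *)EsAllocateMemory(64);
    EsFreeMemory(good);

    /* Also right: malloc'd memory is released with free(). */
    char *alsoGood = (char *)malloc(64);
    free(alsoGood);
}
```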
On Tuesday, March 19, 2019 at 1:29:19 PM UTC-7, Seth Berman wrote:
In 25 years of working with VA, I can honestly say that I have successfully avoided user-primitives! :-) But hats off to you for revisiting them.
Very, very interesting topic ... but actually I've only written ONE user primitive in my 20 years of VA Smalltalk ... and that was only out of personal interest. It was too painful ... Marten

On Saturday, March 23, 2019 at 02:00:35 UTC+1, Richard Sargent wrote:
Hi All,
The avoidance and pain is basically what I suspected, because I don't often hear much user-prim feedback even though it has been a part of VA for a long time. So thanks for taking the time, and for the validation. For folks who are interested, I include some more thoughts below. One day I may collect these and centralize this information for others.

I tend to avoid user-prims myself, because VA's FFI makes it easy, assuming you are comfortable with C concepts, to achieve a lot of this from Smalltalk. I also enjoy doing bindings more in a live environment than in edit-compile-run-crash-debug environments. What I'm having to do now with user-prims is basically by force, though I'm more annoyed at some of the missing pieces I see and the needless setup I have to do than at the C development itself. This could be made a lot easier. Working with the framework I'm putting together in the mqtt project, I would already shave off a lot of time.

It might be useful for me to enumerate some advantages/disadvantages of user-prims that I see. This is an incomplete list because, as I've said, I only recently discovered that they are required for dealing with multi-threaded callbacks. I welcome additions from those who have others, because they might help me during some of the improvements, which I feel can be minimal but with an overall positive impact. And I just want to capture this, as I would like a place to put this information one day.

User-prims vs Smalltalk FFI

1. User-prims are significantly faster, by many factors, than platform function call-outs. If you're calling a small C function (one with just a handful of instructions) many times from Smalltalk and using a PlatformFunction instead of a user-prim, then just know that you could probably get a 3x speedup by switching to a user-prim that does the same thing. You're mostly just executing overhead, in terms of the ratio of instructions that get you to the C function and back versus the instructions in the C function itself. The FFI that Smalltalk uses has overhead: VM state saving, the calling convention (saving off registers, stack operations, restoring registers), constructing a call dynamically that can invoke an exported C function, and converting Smalltalk objects to C for the arguments and back for the return type. Async (threaded) call-outs have even more overhead. There are lots of instructions involved in all of this.

In the common case, this is ok. Usually C libraries are constructed such that the exported functions are the entry points to what might be larger algorithms. For example, a lot of the compression call-outs in EsCompressionStreams and related classes are functions like "compress this buffer from i to the buffer size and put the resulting bytes in this other buffer starting at j". With most realistic buffer sizes, the FFI overhead becomes nothing more than noise... you don't need user-prims for this. Most of the libraries that I deal with are in this situation, but there are exceptions. As a historical note, IBM's virtual machine had the actual FFI machinery defined in Smalltalk-generated assembly. This meant supporting new calling conventions and platforms became a major issue in the porting process. We use libffi today, and I think it was a fantastic choice. This is one of the reasons we ported the full virtual machine to ARM with relative ease, and that we got cdecl support for entry points with almost no effort, since we were already hooking up to libffi. It does not work out this way for most.
2. The Smalltalk FFI gives you access to threaded call-outs. This could be a dedicated thread resource (i.e., staticFuture) or a thread pool (asyncCalls). And this is obviously very nice, because you can take advantage of multiple processors and really push your machine's resources from one image. I've done examples of this with async calls to crypto functions and pushed 3 or 4 cores to full utilization. And the reason it topped out at 4 was that the system's random number generator was the bottleneck for the algorithms I was running. During this time, other Smalltalk processes may continue to run; only the process that called the threaded function needs to block. The downside is that the code you are running is not Smalltalk code; it must be C code.

3. User-prims expose some inescapable VM details. This is one of the reasons they are so fast. It's a quick call, and what you are presented with is the active process and the raw arguments that are on the Smalltalk stack, unconverted. The arguments are still in Smalltalk object form, which doesn't resemble what they might look like in C. Arguments have to be converted if you want to use them. "esuser.h" has macros to help, but it's something you are exposed to. The place where you can get into a world of hurt is allocating a Smalltalk object from a user-prim. Sometimes you have to, but you really have to be careful. Given the following 3 statements, statement 3 is absolutely wrong. And you will find that out very obtusely, "at some point". It's not deterministic.

    EsObject a = EsAllocateArray(...);   /* statement 1 */
    EsObject b = EsAllocateArray(...);   /* statement 2 */
    EsPrimSucceed(a);                    /* statement 3 <-- wrong */

If the allocation in statement 2 causes a garbage collection to occur, then 'a' from statement 1 is most likely pointing to an invalid location. 'a' no longer points to the array that was allocated, because the garbage collector likely moved it. You need to save 'a' on the Smalltalk stack first (there are macros for this) and then restore 'a' afterwards; a sketch of the safe pattern appears at the end of this post. This works because Smalltalk's garbage collector will update everything on the stack of all vmContexts (the backing data structure for a Smalltalk process). So you can be sure that when you ask for it back from the stack, the address has been updated to wherever it was moved. The VM team writes and modifies user-prims all the time. The Swapper (ObjectDumper/ObjectLoader) is a bunch of very complex user-prims. When we were porting that, we had bugs like this that took me weeks to find. When you have multiple branches, loops, and an 800-line user-prim with lots of these allocations... it can get bad. Splitting it up into multiple functions can unfortunately help obfuscate this. So please be careful!

4. User-prims are in C and FFI bindings are in Smalltalk (enough said:)

5. There are some useful patterns that we just don't see, but that can be done through user-prims. One thing that is difficult is constants, which with bindings we usually put in pool dictionaries. For any reasonable library, the same constant values are provided across all supported platforms, so this is generally ok. But there are a host of platform-specific constants that share the same name but actually have different values across platforms. This is so much easier in C: you just write the name of the constant and the compiler injects the value for the platform it's compiling for. In Smalltalk, you can't do this. You would have to start using conditional code that does per-platform injection of the actual values, and it can get complex.
One pattern that you don't see a lot of, but that we actually have an example of, is on Unix for some X/Motif constants. It makes a user-prim call which returns the keys and values for the pool dictionary names and values. So you can use the user-prim to resolve the actual values (since it runs in C) and inject them into the pool dictionary in one location. No platform-specific code needed. A tiny standalone illustration of this name/value idea is at the very end of this post.

I'm sure there are a lot more, but I'll cut it off for now. Thanks again to all who provided feedback.

- Seth

On Saturday, March 23, 2019 at 3:50:39 AM UTC-4, Marten Feldtmann wrote:
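Following up on point 3 in the list above, here is a sketch of the safe shape of that allocation example. EsObject, EsAllocateArray and EsPrimSucceed are the names used above; the save/restore macro names and the argument lists are placeholders made up for illustration only (the real macros live in esuser.h).

```c
/* Sketch only: EsSaveObjectOnStack / EsRestoreObjectFromStack are hypothetical
 * names standing in for the real esuser.h macros; argument lists are guesses. */
EsObject a = EsAllocateArray(vmContext, 10);

EsSaveObjectOnStack(vmContext, a);            /* protect 'a' before the next allocation */
EsObject b = EsAllocateArray(vmContext, 10);  /* may trigger a GC that moves 'a'        */
a = EsRestoreObjectFromStack(vmContext);      /* re-fetch the (possibly moved) 'a'      */

/* ... store 'b' into 'a', or do whatever the primitive needs ... */
EsPrimSucceed(a);                             /* 'a' is now a valid, current reference  */
```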
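And to illustrate point 5: the same symbolic constant compiles to different values on different platforms, so a small C helper can hand Smalltalk the right values at runtime, in the spirit of the X/Motif pool-dictionary trick described above. This snippet is plain, self-contained C with no VM API involved; a real user-prim would answer the pairs to Smalltalk instead of printing them.

```c
#include <stdio.h>
#include <fcntl.h>   /* O_NONBLOCK */
#include <errno.h>   /* EAGAIN, EWOULDBLOCK */

/* A name/value pair suitable for populating a pool dictionary. */
typedef struct { const char *name; long value; } ConstEntry;

static const ConstEntry platformConstants[] = {
    { "O_NONBLOCK",  O_NONBLOCK  },   /* 0x800 on Linux, 0x4 on macOS/BSD */
    { "EAGAIN",      EAGAIN      },
    { "EWOULDBLOCK", EWOULDBLOCK },
};

int main(void)
{
    /* The compiler injects the per-platform values; no conditional code needed. */
    for (size_t i = 0; i < sizeof platformConstants / sizeof *platformConstants; ++i)
        printf("%s = %ld\n", platformConstants[i].name, platformConstants[i].value);
    return 0;
}
```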