Hi,
I want to run some benchmarks with the Magma object serializer... Am I using it in the right way?

Thanks!!
Martin

| serializer graphBuffer anObject classDefinitionsByteArray graphBufferByteArray loadedObject |

anObject := Array with: 1 with: 'string'.

serializer := MaObjectSerializer new.
graphBuffer := serializer serializeGraph: anObject.

classDefinitionsByteArray := serializer classDefinitionsByteArray.
graphBufferByteArray := graphBuffer byteArray.

"put these two bytearrays into a stream, and reload them..."

loadedObject := MaObjectSerializer new
    classDefinitionsByteArray: classDefinitionsByteArray;
    materializeGraph: graphBufferByteArray

_______________________________________________
Magma mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/magma
Hi Martin and Mariano, thanks for asking. The answer depends on what you want to measure.

Benchmarking software as complex as a serializer is tricky because, as you know, there are multiple functions to measure, which are used independently in various real-world use cases. To gain a meaningful understanding of the performance, you need to bench the most atomic operations that a user of a serializer would use individually.

- Instantiation / initialization of a MaObjectSerializer.
- Serialization of object graphs of various sizes.
- Materialization of said graphs.
- Also, if doing comparisons to other serializers (which, I'm guessing, you are), it is crucial to ensure each serializer is configured to serialize the same number of objects (i.e., the same depth, etc.) as the other serializers it is being compared to.
- It's also important to discover whether any special configuration options / preferences which affect performance can be used.

So, given the one example object you've provided, here are some starter scripts which could be used for benching useful serialization operations with MaObjectSerializer.

"Initialization"
[ MaObjectSerializer new ] bench.

"Serialization"
| obj ser |
obj := Array with: 1 with: 'string'.
ser := MaObjectSerializer new.
[ ser serializeGraph: obj ] bench

"Materialization"
| obj ser ba |
obj := Array with: 1 with: 'string'.
ser := MaObjectSerializer new.
ba := (ser serializeGraph: obj) byteArray.
[ ser materializeGraph: ba ] bench

I have not researched any speed optimizations of MaObjectSerializer in many years, so I'm sure it will not be as fast as Fuel if it was designed for speed from the ground up. I need to profile and revisit performance aspects of MaObjectSerializer.

As I said, benching is tricky, and publishing comparisons even trickier, and so I do appreciate your asking for my input and hope whatever you publish to the world will be based on fair, responsible measuring. Toward that end, I support you and thank you for your work on Fuel.
Regards,
Chris

On Thu, Jun 2, 2011 at 12:14 AM, Martin Dias <[hidden email]> wrote:
> Hi,
>
> I want to run some benchmarks with the Magma object serializer... Am I using
> it in the right way?
>
> Thanks!!
> Martin
>
>
> | serializer graphBuffer anObject classDefinitionsByteArray
> graphBufferByteArray loadedObject |
>
> anObject := Array with: 1 with: 'string'.
>
> serializer := MaObjectSerializer new.
> graphBuffer := serializer serializeGraph: anObject.
>
> classDefinitionsByteArray := serializer classDefinitionsByteArray.
> graphBufferByteArray := graphBuffer byteArray.
>
> "put these two bytearrays into a stream, and reload them..."
>
> loadedObject := MaObjectSerializer new
>     classDefinitionsByteArray: classDefinitionsByteArray;
>     materializeGraph: graphBufferByteArray
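The "object graphs of various sizes" item in the list above could be sketched by repeating the serialization script over a few graph sizes. This is only a rough sketch: the sizes and the shape of the test graph are arbitrary assumptions, not something taken from Magma. Only MaObjectSerializer new, serializeGraph: and bench come from the scripts in this thread.

```smalltalk
| ser |
"One serializer, reused for every size."
ser := MaObjectSerializer new.

"Sizes and graph shape are illustrative assumptions."
#(10 1000 100000) do: [:size |
    | graph |
    graph := (1 to: size) collect: [:i | Array with: i with: i printString].
    Transcript
        show: size printString, ' elements: ';
        show: [ser serializeGraph: graph] bench asString;
        cr]
```

Because the serializer is created once outside the loop, each figure should reflect serialization only, not initialization.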
On Fri, Jun 3, 2011 at 3:23 AM, Chris Muller <[hidden email]> wrote:

> Hi Martin and Mariano, thanks for asking. The answer depends on what

Yes, exactly.

> To gain a meaningful understanding of the performance, you need to

If you happen to have some code snippets that generate graphs for testing (maybe you have that in Magma), let us know :) We have a piece of code that generates binary trees...

> - Materialization of said graphs.

Yes, exactly. This is the hardest part when you do not know the serializer that well. That's why we were asking :)
With the rest of the serializers, what we do is serialize the graph into a file. How could we do this with the Magma serializer? serializeGraph: answers a MaSerializedGraphBuffer, so I guess I can ask it for its byteArray and do a nextPutAll: or something like that on our stream?

We also have in Fuel what we call "in memory serialization", which basically returns the byteArray, and then we can materialize from that. So this case would be similar to this usage of Magma, wouldn't it?

> "Materialization"

The same question as for serialization.
No problem. Speed is only one more measure, and it is needed only in certain scenarios. And usually speed comes together with trade-offs. In addition, you cannot compare the serializer of a database to a general-purpose serializer, even if the Magma serializer could be used outside Magma. There are a lot of things that the Magma serializer has to do that others may not need, or things it cannot do because of Magma. So, each serializer has its own goals.
Yes, exactly. An additional problem is that the features a serializer supports impact the results of a benchmark. For example, support for class reshapes, initialization after materialization, transient instVars, etc. Supporting all of those things usually costs more time. So maybe you support all that and, in the results, you are slower in comparison with someone who does not support it. So yes, measuring speed alone is not good; taking the rest of the properties into account is better.

> , and so I do appreciate your asking for my input and hope

Thanks Chris.

Regards,

--
Mariano
http://marianopeck.wordpress.com
>> - Instantiation / initialization of a MaObjectSerializer.
>> - Serialization of object graphs of various sizes.
>
> If you have by chance some code snippets to generate graphs for testing
> (maybe you have that in Magma) let us know :)
> We have a piece of code that generates binary trees...

Yes, see MaFixtureFactory.

    MaFixtureFactory current samples

or

    MaFixtureFactory current knot

There was a discussion recently about comparing object graphs -- you may be interested in MaObjectSerializerTester, and how it verifies serialized-->rematerialized object graphs of any shape against the original fixture graph. Very useful for ensuring your serializer is _working_. :) See MaObjectSerializerTestCase>>#testSamples, which sends #maEquivalentForSerializationTest: to determine that.

>> "Serialization"
>> | obj ser |
>> obj := Array with: 1 with: 'string'.
>> ser := MaObjectSerializer new.
>> [ ser serializeGraph: obj ] bench
>>
>
> With the rest of the serializers that we do is to serialize the graph into a
> file. How could we do this with Magma Serializer?
> because serializeGraph: answers a MaSerializedGraphBuffer. So, I guess I can
> ask the byteArray to it and do a nextPutAll: or something like that to our
> stream?

Yes. I, of course, always appreciated the elegance of the notion that MaObjectSerializer could operate directly on Streams, but the problem is that I also want a secure client-server protocol which wraps the serialized requests and responses. So to, for example, calculate a MAC, the full byteArray of the request is required in advance. It's ok though: serialized / materialized objects have to fit into memory anyway, so a streaming API doesn't really offer any practical advantage - just elegance.

> We also have in Fuel what we call "in memory serialization" that basically
> returns the byteArray and then we can materialize from that. So this case
> would be similar to this usage of Magma, wouldn't it ?

Yeah - similar to use of "Ma object serializer" (IOW, you don't have to load Magma to use MaObjectSerializer; you can just load MaBase).

>> As I said, benching is tricky, and publishing comparisons even
>> trickier
>
> yes, exactly. The problem is in addition that there are supported features
> of a serializer that impacts on the results of a benchmark. For example,
> supported class reshapes, initialization after materialization, support
> transient instVars...etc... to support all those things you usually spend
> more time. So maybe you support all that and in the results you are slower
> in comparison with someone that do not support that. So yes, measuring
> speed only is not good. But taking into account the rest of the properties
> is better.

Yip! Glad you said that.

- Chris
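For a quick sanity check before benchmarking, the in-memory round trip discussed above can be sketched with only the selectors already shown in this thread. Comparing with plain = is an assumption made here for this tiny sample; for arbitrary graphs, MaObjectSerializerTester and #maEquivalentForSerializationTest: (mentioned above) are the tools Chris points to.

```smalltalk
| ser original bytes copy |
ser := MaObjectSerializer new.
original := Array with: 1 with: 'string'.

"Serialize to a plain ByteArray held in memory."
bytes := (ser serializeGraph: original) byteArray.

"Materialize from that ByteArray with the same serializer."
copy := ser materializeGraph: bytes.

"Plain = is enough only for a simple sample like this one (assumption)."
copy = original
```

If the last expression does not answer true, any timing numbers for that serializer configuration would be measuring something that is not working.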
On 03.06.2011 at 18:30, Chris Muller wrote:

> There was a discussion recently about comparing object graphs

Do you have a link to the thread? I somehow missed that and cannot find it.

Norbert
I shouldn't have said, "a discussion about". It was only mentioned.
I was referring to "ESUG SummerTalk - Fuel, binary object serializer" on the Pharo list.

On Fri, Jun 3, 2011 at 6:34 PM, Norbert Hartl <[hidden email]> wrote:
>
> On 03.06.2011 at 18:30, Chris Muller wrote:
>
>> There was a discussion recently about comparing object graphs
> Do you have a link to the thread? I somehow missed that and cannot find it.
>
> Norbert
Hi Chris,
Sorry for my delay. Thank you very much for your answer and for the tips about benchmarking. I also agree that the two serializers have different purposes, and therefore different features and limitations, so looking at who is the fastest is not the best comparison.

Another question: have you benchmarked the memory use while serializing or materializing? I never have. I know that there is something called SpaceTally, but I don't know much about it.

Cheers,
Martin

On Fri, Jun 3, 2011 at 1:30 PM, Chris Muller <[hidden email]> wrote:
On Fri, Jun 3, 2011 at 6:30 PM, Chris Muller <[hidden email]> wrote:
Thanks Chris. We have looked at it (and still are).
Ok. So if aStream is a fileStream, for example, then the following two methods are correct:

>> serialize: anObject on: aStream
    | serializer graphBuffer classDefinitionsByteArray graphBufferByteArray |
    aStream binary.
    serializer := MaObjectSerializer new.
    graphBuffer := serializer serializeGraph: anObject.
    classDefinitionsByteArray := serializer classDefinitionsByteArray.
    graphBufferByteArray := graphBuffer byteArray.
    self nextByteArrayPut: classDefinitionsByteArray on: aStream.
    self nextByteArrayPut: graphBufferByteArray on: aStream.

and

>> materializeFrom: aStream
    | size classDefinitionsByteArray graphBufferByteArray |
    aStream binary.
    classDefinitionsByteArray := self nextByteArrayFrom: aStream.
    graphBufferByteArray := self nextByteArrayFrom: aStream.
    ^ MaObjectSerializer new
        classDefinitionsByteArray: classDefinitionsByteArray;
        materializeGraph: graphBufferByteArray

>> nextByteArrayPut: aByteArray on: aWriteStream
    aWriteStream
        nextNumber: 4 put: aByteArray size;
        nextPutAll: aByteArray

>> nextByteArrayFrom: aReadStream
    ^ aReadStream next: (aReadStream nextNumber: 4)

Is that correct?

> I, of course, always appreciated the elegance of the notion that

What do you mean by "operate directly on Streams"?

> but the problem

Ok, I understand.
Good!! We didn't know. Martin fixed that now :)

--
Mariano
http://marianopeck.wordpress.com
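The length-prefix framing used by nextByteArrayPut:on: and nextByteArrayFrom: in the message above can be tried on its own with in-memory streams, independent of any serializer. This is only a sketch: the payload bytes are made up for illustration, and the stream selectors are the same ones used in the methods above.

```smalltalk
| payload out in roundTripped |
"Arbitrary bytes standing in for a serialized graph (illustrative only)."
payload := #[1 2 3 4 5].

"Write a 4-byte size prefix, then the bytes themselves."
out := WriteStream on: ByteArray new.
out
    nextNumber: 4 put: payload size;
    nextPutAll: payload.

"Read the 4-byte size back, then exactly that many bytes."
in := ReadStream on: out contents.
roundTripped := in next: (in nextNumber: 4).

roundTripped = payload
```

The final expression is expected to answer true when the framing is symmetric; a 4-byte prefix also implies each byte array must be smaller than 4 GB.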
> Ok. So if aStream is a fileStream for example, then the following two
> methods are correct:
>
>>> serialize: anObject on: aStream
> ....
> and
>
>>> materializeFrom: aStream
>
> | size classDefinitionsByteArray graphBufferByteArray |
> ...
>
> is correct?

Hmm, well, that might work, but you should just use the helper methods that are already provided for this, which do essentially the same thing.

To serialize an object to a file, you may use MaObjectSerializer>>#fileOut:toFileNamed:in:, which calls MaObjectSerializer>>#object:toStream: (operates on any binary WriteStream).

For materialization, use MaObjectSerializer class>>#fileIn:, which calls MaObjectSerializer class>>#objectFromStream: (operates on any binary ReadStream).

BTW, I just noticed these two methods are incorrectly categorized under 'debugging'; they should be under their own category called 'file' or something.

These are just convenience methods for saving / loading a user's work to a single file. If you need to load multiple files and performance is a concern, you would want to instantiate only one serializer and use it for all of them.

- Chris
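A minimal sketch of the two stream-level helpers named above, using in-memory binary streams instead of a file. The selectors #object:toStream: and #objectFromStream: are taken from Chris's message; the argument order (object first, then stream) is inferred from the selector names and is an assumption, as is the use of a ByteArray-backed WriteStream as the "binary WriteStream".

```smalltalk
| out loaded |
"Instance-side helper: write one object to a binary WriteStream.
 Argument order is assumed from the selector name."
out := WriteStream on: ByteArray new.
MaObjectSerializer new
    object: (Array with: 1 with: 'string')
    toStream: out.

"Class-side helper: read the object back from a binary ReadStream."
loaded := MaObjectSerializer objectFromStream: (ReadStream on: out contents).
loaded
```

For a real file, the same two calls would presumably be given a file stream switched to binary mode, which is what the fileOut:/fileIn: convenience methods are said to do.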
On Thu, Jun 9, 2011 at 10:43 PM, Chris Muller <[hidden email]> wrote:
Thanks Chris. In fact, those methods were exactly the kind of thing I was looking for :)
I am not sure I understood. In our benchmarks, we have a list of samples, and each sample is in turn an array of objects that we serialize/materialize. For each sample we instantiate a serializer and a materializer.

Do I understand correctly that we should reuse the same serializer/materializer instance for all samples? If so, why not use a Singleton? I mean... it is not clear to me when to instantiate a serializer.

Thanks Chris

--
Mariano
http://marianopeck.wordpress.com
> I am not sure if I understood. In our benchmarks, we have a list of samples
> and each sample is at the same time an array of objects that we
> serialize/materialize. For each sample we create instantiate a serializer
> and a materializer.
>
> I understood now that we should reuse the same serializer/materialize
> instance for all samples? if true, why I don't use a Singleton ? I
> mean...it is not clear for me when to instantiate a serializer.

Users of MaObjectSerialization instantiate only one serializer to handle a related "group" of objects, usually related by being instances of the same set of classes. In this case, if the samples of your benchmark are small (e.g., < 100 objects), you should reuse the same serializer for each sample being serialized/materialized.

A generic pattern for improving real-world performance is to off-load work to an initialization step. For example, many applications pre-cache certain objects from a database at system startup (a.k.a. the "initialization step"). Since startup of the system is only done once, it is ok if it takes, say, an additional 10 or 30 seconds to pre-cache if it means that users will have sub-second response times after the system comes up rather than response times of 5 seconds.

MaObjectSerialization uses this pattern: it is expensive to instantiate a MaObjectSerializer (about 500 milliseconds) but, in exchange, the performance of the serializer is improved.

So if the benchmark is going to include initialization of a new MaObjectSerializer for each of _many tiny_ "samples", then that is not a good measurement of how it would be used in actual practice. The repeated initialization time will account for 99% of the time consumed, and the "benchmark" would favor the serializers which have fast initialization times but, in fact, may be slower for serialization and/or materialization.

This is why I stressed that it is important to measure each operation - initialization, serialization, and materialization - individually, so that interpretation of the results can be made with respect to how it would be used.

It's up to you, of course. But to bring the benchmark into a real-world usage pattern for MaObjectSerializer, I hope you will consider:

1) measure and report initialization, serialization and materialization separately,
2) reuse the same serializer for all of the tiny samples, or
3) use very large samples for the benchmark, so that the initialization cost is not such a large factor.

Regards,
Chris
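Options 1) and 2) above could be combined along the following lines. This is a hedged sketch, not Chris's benchmark: the number and shape of the samples are arbitrary assumptions, Time millisecondsToRun: is used in place of bench so the three phases can be reported side by side, and the copy of each byteArray is a defensive assumption in case the serializer reuses its buffer between calls.

```smalltalk
| samples ser initMs buffers serializeMs materializeMs |
"Many tiny samples; their count and shape are arbitrary assumptions."
samples := (1 to: 1000) collect: [:i | Array with: i with: 'string'].

"1) Initialization, measured on its own."
initMs := Time millisecondsToRun: [ser := MaObjectSerializer new].

"2) Serialization, reusing the one serializer for every sample.
    The copy is defensive (assumption: the buffer might be reused)."
serializeMs := Time millisecondsToRun: [
    buffers := samples collect: [:each |
        (ser serializeGraph: each) byteArray copy]].

"3) Materialization, again with the same serializer."
materializeMs := Time millisecondsToRun: [
    buffers do: [:each | ser materializeGraph: each]].

Transcript
    show: 'init: ', initMs printString, ' ms'; cr;
    show: 'serialize: ', serializeMs printString, ' ms'; cr;
    show: 'materialize: ', materializeMs printString, ' ms'; cr
```

Reporting the three numbers separately lets a reader weigh the one-time initialization cost against per-sample cost for their own usage pattern.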