Google Protobuf and usage of Slots

Holger Freyther
Hi,

From my point of view the pillars of future services are Google protobuf[1] and gRPC. Protobuf can be used for configuration files[2], tracing/logging[3] and storage[4]. gRPC is built on top of HTTP/2 and uses the protobuf IDL and marshalling. Many projects (etcd, envoy, Google Cloud, ...) already provide a gRPC-based API (and some automatically map REST to gRPC), and I would like to build clients and servers in Pharo.


My plan of action is to start with a protobuf compiler/model and then look into binding the gRPC C implementation to Pharo.


I started from gokr's Protobuf code on Smalltalkhub, defined base types and a Slot, and could use some help structuring this more nicely.

The code is at: https://github.com/zecke/pharo-protobuf and I could use some comments/review of how to use Slots in a significant way (add "type" validation on write, constraint checks, builder pattern?).

Given a definition like:

  syntax = "proto2";
  package foo;
  enum Color {
        RED = 0;
        GREEN = 1;
        BLUE = 2;
  }
  message MyMessage {
        optional Color color = 1;
  }

Given an encoded binary message of #[8 2], the following expression decodes it and sets the color field:

  PBTestMessage materializeFrom: #[8 2] readStream


Proper handling of mandatory and repeated fields is still missing, as is decoding of the other value types, nested messages... :)
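
To make the #[8 2] example concrete, here is a minimal, language-neutral sketch in Python of how those two bytes are read on the wire. It is not the code from the pharo-protobuf repository; the function names and the COLOR_NAMES table are made up for illustration, and only varint fields (wire type 0) are handled.

```python
def read_varint(data, pos=0):
    """Decode one base-128 varint at data[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(data):
            raise ValueError("truncated varint")
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry payload
        if not byte & 0x80:                # high bit clear: last byte
            return result, pos
        shift += 7


def decode_varint_fields(data):
    """Return (field_number, value) pairs; supports wire type 0 only."""
    fields = []
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type != 0:
            raise NotImplementedError("only varint fields in this sketch")
        value, pos = read_varint(data, pos)
        fields.append((field_number, value))
    return fields


# Hypothetical lookup table mirroring the Color enum from the .proto above.
COLOR_NAMES = {0: "RED", 1: "GREEN", 2: "BLUE"}

fields = decode_varint_fields(bytes([8, 2]))
color_name = COLOR_NAMES[fields[0][1]]
```

The key byte 8 is (1 << 3) | 0, i.e. field number 1 with wire type 0 (varint), and the value byte 2 maps to BLUE in the enum.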



Looking forward to having some eyes on the code.

holger





[1] A simple IDL to define structs/enums with marshaller/materializer for a binary protocol, JSON (and yaml).

[2] Without having to write an XML/JSON schema, while still getting validation from the IDL.

[3] Many of us log structured data as text, then have something like logstash re-parse it, run regexps over it and dump it into ElasticSearch. With protobuf we can have plain objects end to end for logging.

[4] We have FUEL so this is less interesting for us than others but quite interesting if you are mixing Pharo with other implementation languages.

Re: Google Protobuf and usage of Slots

Marcus Denker-4
Hello!

Nice, I will have a look.

I am interested to see how class layouts / slots can be used here.
(but it might take some days to find time…)

        Marcus

> On 26 Feb 2019, at 07:54, Holger Freyther <[hidden email]> wrote:
> [...]



Re: Google Protobuf and usage of Slots

HenrikNergaard5
In reply to this post by Holger Freyther
Hi Holger,

> My plan of action is to start with a protobuf compiler/model and then look
> into binding the gRPC C implementation to Pharo.

I have an almost complete parser and compiler (class generation)
implementation of protobuf in Pharo around somewhere...
I can't quite remember the state of it, but I think everything is implemented
parsing-wise except the gRPC part.

Instead of using slots the compiler generates accessor methods for each
variable, initialization, custom encoding and decoding based on its message
definition.

I can probably dig it up and make it publicly available if it is of
interest.

Best regards,
Henrik



--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html


Re: Google Protobuf and usage of Slots

Holger Freyther


> On 5. Mar 2019, at 19:06, HenrikNergaard5 <[hidden email]> wrote:
>
> Hi Holger,
>
>> My plan of action is to start with a protobuf compiler/model and then look
>> into binding the gRPC C implementation to Pharo.
>
> I have an almost complete parser and compiler (class generation)
> implementation of protobuf in Pharo around somewhere...
> I can't quite remember the state of it, but I think everything is implemented
> parsing-wise except the gRPC part.
>
> Instead of using slots the compiler generates accessor methods for each
> variable, initialization, custom encoding and decoding based on its message
> definition.

That would be good timing. I have written enough descriptions by hand to be able to materialize descriptor.proto (the model definition of protobuf itself), and class generation is next.

I have experimented (by using the right superclass) with automatic selector creation, and I think it will work once it is tweaked to handle collections/repeated fields correctly.

A class/type definition currently reads as:

PBTypeMessage subclass: #GBPFieldDescriptorProto
        slots: { #name => (PBTypeString asSlot fieldName: 'name'; fieldNumber: 1; beOptional).
                 #extendee => (PBTypeString asSlot fieldName: 'extendee'; fieldNumber: 2; beOptional).
                 #number => (PBTypeInt32 asSlot fieldName: 'number'; fieldNumber: 3; beOptional).
                 #label => (GBPFieldDescriptorProto_Label asSlot fieldName: 'label'; fieldNumber: 4; beOptional).
                 #type => (GBPFieldDescriptorProto_Type asSlot fieldName: 'type'; fieldNumber: 5; beOptional).
                 #type_name => (PBTypeString asSlot fieldName: 'type_name'; fieldNumber: 6; beOptional).
                 #default_value => (PBTypeString asSlot fieldName: 'default_value'; fieldNumber: 7; beOptional).
                 #options => (GBPFieldOptions asSlot fieldName: 'options'; fieldNumber: 8; beOptional).
                 #json_name => (PBTypeString asSlot fieldName: 'json_name'; fieldNumber: 10; beOptional) }
        classVariables: {  }
        package: 'Protobuf-Compiler-Definitions'


> I can probably dig it up and make it publicly available if it is of
> interest.

That would be helpful!





Re: Google Protobuf and usage of Slots

Holger Freyther
In reply to this post by Marcus Denker-4


> On 4. Mar 2019, at 16:12, Marcus Denker <[hidden email]> wrote:
>


Thanks for the help!

[ToC]

I will try to keep this brief, so please tell me where I should elaborate.


(1) Handling of default values.
(2) Fast lazy initialization of OrderedCollection for repeated fields
(3) Installation of selectors based on the type
(4) Caching a descriptor reverse table somewhere
(5) Mandatory field handling
(6) Verbosity/Tooling


(1) In Protobuf a field can have a default value associated with it. I am torn between using the option a Slot offers to write these fields and having a PBTypeMessage>>#initialize that iterates over the slots and writes the default values.

(2) PBRepeatedType>>#decodeFrom:instVarIndex:to: lazily initializes the collection for a repeated field. I am not sure whether I can do (much) better and what the trade-offs are (e.g. using the option to emit instructions so that reads lazily initialize the collection).

(3) I started to use AccessorInstanceVariableSlot as the base class and it is quite neat. I wonder whether I want to, should, or can protect a user-written selector from being overwritten.

(4) For decoding I am building a field_number -> Slot map. I currently build this whenever a message needs to be parsed. I would like to initialize it (lazily) and store it somewhere. I am considering using a hidden slot, but maybe there is something more obvious?

(5) After materialization I should check whether all mandatory fields were written. The laziest implementation seems to be to select all slots that are neither repeated nor optional and verify that their fields are not nil.

(6) My slot syntax is (too) complex and writing it by hand is difficult. I wonder whether Class>>#addSlot: could be exposed nicely in Calypso. And in GTInspector my class is shown without slots. :)
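
To illustrate points (2), (4) and (5) independently of the Slot machinery, here is a small Python sketch of one possible shape: the field-number table is built once per class on first use, repeated fields get their collection on first write, and the mandatory check is a simple scan. All names (FieldSpec, Message, store, missing_required, Person) are hypothetical and do not come from pharo-protobuf; in Pharo the cache might live on the class side or in a hidden slot instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    number: int
    label: str = "optional"  # "optional", "required" or "repeated"


class Message:
    FIELDS = ()  # subclasses list their FieldSpec entries here

    @classmethod
    def fields_by_number(cls):
        # (4) Build the reverse table lazily and cache it on the class
        # itself, so every subclass gets its own table.
        cache = cls.__dict__.get("_fields_by_number")
        if cache is None:
            cache = {spec.number: spec for spec in cls.FIELDS}
            cls._fields_by_number = cache
        return cache

    def __init__(self):
        for spec in self.FIELDS:
            setattr(self, spec.name, None)

    def store(self, number, value):
        spec = self.fields_by_number()[number]
        if spec.label == "repeated":
            # (2) Create the collection only when the first value arrives.
            current = getattr(self, spec.name)
            if current is None:
                current = []
                setattr(self, spec.name, current)
            current.append(value)
        else:
            setattr(self, spec.name, value)

    def missing_required(self):
        # (5) Names of required fields that are still unset.
        return [spec.name for spec in self.FIELDS
                if spec.label == "required"
                and getattr(self, spec.name) is None]


class Person(Message):
    FIELDS = (FieldSpec("name", 1, "required"),
              FieldSpec("emails", 2, "repeated"))


p = Person()
p.store(2, "a@example.org")
p.store(2, "b@example.org")
missing = p.missing_required()
```

After the two store calls, missing names the unset required field, so a decoder could raise an error at the end of materialization instead of handing out an incomplete object.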



The following works now:

$ protoc -odescriptor.pb src/google/protobuf/descriptor.proto

GBPFileDescriptorSet materializeFrom: 'descriptor.pb' asFileReference binaryReadStream.

