Bruno Costa

Protogen

From April 2018 to present

https://github.com/brunexgeek/protogen

Protogen is a tool that streamlines serializing and deserializing data as JSON in C++ programs. Data models are defined using a subset of proto3 syntax, and the protogen tool automatically generates the C++ code for serialization and deserialization. The generated code requires a C++11-capable compiler and has no external dependencies. Both the compiler and the generated code are distributed under the permissive Apache License 2.0.

Build

    # mkdir build && cd build
    # cmake ..
    # make

Usage

Create a .proto file:

    syntax = "proto3";

    message Person {
        string name = 1;
        int32 age = 2;
        repeated string pets = 3;
    }

Compile the .proto file using the protogen program:

# ./protogen model.proto model.pg.hh

Include the generated header file in your source code and use it. That’s all!

    #include "model.pg.hh"

    ...

    // create and populate an object
    Person person1;
    person1.name = "이주영";
    person1.age = 32;
    person1.pets.push_back("티그");

    // JSON serialization
    std::string json;
    if (!person1.serialize(json))
        std::cerr << "Error" << std::endl;

    // JSON deserialization
    Person person2;
    if (!person2.deserialize(json))
        std::cerr << "Error" << std::endl;
    else
        std::cout << person2.name << std::endl;

    // JSON deserialization with parameters and error information (optional)
    Person person3;
    protogen_3_0_0::Parameters params;
    if (!person3.deserialize(json, &params))
        std::cerr << "Error: " << params.error.message << " at "
                  << params.error.line << ':' << params.error.column << std::endl;
    else
        std::cout << person3.name << std::endl;

    ...

Compile the program as usual. In the example above, the output would be:

    {"name":"이주영","age":32,"pets":["티그"]}
    이주영
    이주영

Types generated by the protogen compiler provide helper functions such as clear and empty, as well as comparison operators.

Supported proto3 options

These options can be set in the proto3 file:

  • obfuscate_strings (top-level) – Enable string obfuscation. If enabled, all strings in the generated C++ file are obfuscated with a very simple (and insecure) algorithm. The default value is false. This option makes it slightly harder for curious people to discover your JSON field names by inspecting binary files.
  • number_names (top-level) – Use field numbers as JSON field names. The default value is false. If enabled, every JSON field name will be the number of the corresponding field in the .proto file. This can significantly reduce the size of the JSON output.
  • transient (field-level) – Make the field transient (true) or not (false). Transient fields are neither serialized nor deserialized. The default value is false.
  • cpp_use_lists (top-level) – Use std::list (true) instead of std::vector (false) for repeated fields. This can perform better if your program frequently adds or removes items in repeated fields. This option does not affect bytes fields, which always use std::vector. The default value is false (i.e. use std::vector).
  • name (field-level) – Specify a custom name for the JSON field, while retaining the C++ field name as defined in the message. If no custom name is provided, the JSON field and the C++ field name will be the same.
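
The size effect of number_names can be estimated with a little arithmetic. The sketch below is not part of protogen: key_bytes is a hypothetical helper that adds up the bytes spent on quoted JSON keys for a set of fields, either by name or by field number, assuming the number is written as a quoted decimal string.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not a protogen API): total bytes used by the quoted
// JSON keys of one object, given (field name, field number) pairs.
// When number_names is true, each key is assumed to be the field number
// written in decimal, as the option description suggests.
std::size_t key_bytes(const std::vector<std::pair<std::string, int> > &fields,
                      bool number_names)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < fields.size(); ++i)
    {
        const std::string key =
            number_names ? std::to_string(fields[i].second) : fields[i].first;
        total += key.size() + 2; // +2 for the surrounding double quotes
    }
    return total;
}

// Field list of the Person message from the Usage section.
std::vector<std::pair<std::string, int> > person_fields()
{
    std::vector<std::pair<std::string, int> > fields;
    fields.push_back(std::make_pair(std::string("name"), 1));
    fields.push_back(std::make_pair(std::string("age"), 2));
    fields.push_back(std::make_pair(std::string("pets"), 3));
    return fields;
}
```

For the Person message, the keys take 17 bytes with names and 9 bytes with numbers, so each serialized object would be 8 bytes smaller. The saving grows with longer field names and with the number of objects in the output.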

Features

Supported field types:

  • messages (see Limitations)
  • repeated
  • optional - Fields are always optional, but the syntax is accepted for completeness.
  • double, float
  • int32, sint32, uint32, fixed32, sfixed32
  • int64, sint64, uint64, fixed64, sfixed64
  • bool
  • string
  • bytes
  • any
  • oneof - The compiler does not actually support it, but you can get similar behavior by calling empty to check whether a field is present.
  • map
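
The oneof workaround mentioned above can be sketched as follows. The generated header is not reproduced here, so OptionalField is a hypothetical stand-in for a generated field wrapper that models only assignment, empty and clear. Contact, set_email, set_phone and which are also invented for illustration.

```cpp
#include <string>

// Hypothetical stand-in for a generated field wrapper (NOT the real
// protogen type). It only tracks whether a value has been assigned.
template <typename T>
class OptionalField
{
public:
    OptionalField() : value_(), present_(false) {}

    OptionalField &operator=(const T &value)
    {
        value_ = value;
        present_ = true;
        return *this;
    }

    bool empty() const { return !present_; }

    void clear()
    {
        value_ = T();
        present_ = false;
    }

    const T &get() const { return value_; }

private:
    T value_;
    bool present_;
};

// Hypothetical message that mimics
//     oneof contact { string email = 1; string phone = 2; }
// by keeping both fields and clearing the other one on every assignment.
struct Contact
{
    OptionalField<std::string> email;
    OptionalField<std::string> phone;

    void set_email(const std::string &value) { phone.clear(); email = value; }
    void set_phone(const std::string &value) { email.clear(); phone = value; }

    // Uses empty() to find out which alternative is currently present.
    std::string which() const
    {
        if (!email.empty()) return "email";
        if (!phone.empty()) return "phone";
        return "none";
    }
};
```

With this arrangement the "only one field at a time" rule is enforced by the setters rather than by the compiler, so code that assigns to the fields directly can still set both.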

Proto3 syntax features:

  • Line and block comments
  • Packages
  • Imports
  • Options
  • Nested messages
  • Enumerations

Type mapping

The following table maps proto3 types to their corresponding C++ types in the generated code. In the namespace protogen_x_y_z, x_y_z stands for the current version number (for example, protogen_3_0_0).

| proto3   | C++                             |
| -------- | ------------------------------- |
| message  | class                           |
| repeated | std::vector or std::list        |
| double   | protogen_x_y_z::field<double>   |
| float    | protogen_x_y_z::field<float>    |
| int32    | protogen_x_y_z::field<int32_t>  |
| sint32   | protogen_x_y_z::field<int32_t>  |
| uint32   | protogen_x_y_z::field<uint32_t> |
| int64    | protogen_x_y_z::field<int64_t>  |
| sint64   | protogen_x_y_z::field<int64_t>  |
| uint64   | protogen_x_y_z::field<uint64_t> |
| fixed32  | protogen_x_y_z::field<uint32_t> |
| fixed64  | protogen_x_y_z::field<uint64_t> |
| sfixed32 | protogen_x_y_z::field<int32_t>  |
| sfixed64 | protogen_x_y_z::field<int64_t>  |
| bool     | protogen_x_y_z::field<bool>     |
| string   | protogen_x_y_z::string_field    |
| bytes    | std::vector<uint8_t>            |

Some considerations:

  • optional is accepted only for compatibility, since every field is optional in protogen and all field types have the empty function to check their presence.
  • Exact precision for 64-bit integers (e.g. int64, uint64) is guaranteed only for values that fit in 53 bits, since JSON numbers are commonly handled as IEEE-754 doubles.
  • C++ integer types are defined by <cstdint>.
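
The 53-bit limit can be checked directly in C++. The sketch below is independent of protogen: survives_double_round_trip is a hypothetical helper that converts a 64-bit integer to a double and back, which is roughly what happens when a JSON number passes through a parser that stores numbers as doubles.

```cpp
#include <cstdint>

// Hypothetical helper (not a protogen API): returns true if `value` is
// unchanged after being converted to an IEEE-754 double and back.
bool survives_double_round_trip(int64_t value)
{
    const double as_double = static_cast<double>(value);

    // Values close to INT64_MAX round up to 2^63, which does not fit in
    // int64_t; converting it back would be undefined behavior.
    if (as_double >= 9223372036854775808.0)
        return false;

    return static_cast<int64_t>(as_double) == value;
}
```

For example, 9007199254740992 (2^53) survives the round trip, while 9007199254740993 (2^53 + 1) does not, because the nearest double is 2^53. Values that must stay exact beyond this range are often carried as JSON strings instead.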

Limitations

These are the current limitations of the implementation. Some of them may be removed in future versions.

Proto3 parser:

  • Circular references are not supported.

License

The library and the compiler are distributed under the Apache License 2.0.

Code generated by the protogen compiler is distributed under The Unlicense, but it depends on Protogen code, which is licensed under the Apache License 2.0.