This summer we introduced nProbe Cento 2.0. Before this release, Cento supported only JSON serialization when exporting flows to Kafka. JSON is straightforward and widely used, but it is verbose and less efficient in high-throughput or resource-sensitive environments. To address this when exporting flows to ntopng, some time ago we introduced a binary/TLV format for data serialization, implemented in our open-source nDPI library. Although this is an open format, it is not widely used. To improve interoperability with other solutions, we therefore decided to add Avro as a further serialization option when exporting to Kafka. The key Avro features are:
- Compact and efficient: similar to the nDPI TLV, Avro uses a binary format, making it more compact than JSON. This results in reduced storage and bandwidth usage.
- Flexible schema: Avro includes schemas with the data, which makes it easier to handle changes in the data structure over time. Kafka consumers can dynamically adapt to new schemas, ensuring forward and backward compatibility.
- Faster processing: parsing binary data is faster than processing JSON, which can improve performance in high-throughput environments.
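To give a rough feel for the size difference, the following Python sketch hand-encodes one flow record using the binary encoding rules of the Avro specification (zig-zag variable-length integers, length-prefixed UTF-8 strings, fields concatenated in schema order) and compares it with the same record as JSON. The sample values, the choice of fields and their types are assumptions made for illustration. The schema generated by Cento may type these fields differently, and in practice you would use an Avro library rather than this hand-written encoder.

```python
import json


def zigzag_varint(n: int) -> bytes:
    """Encode a signed integer as Avro does for int/long: zig-zag, then base-128 varint."""
    z = (n << 1) ^ (n >> 63)  # zig-zag: small negative numbers become small positive ones
    out = bytearray()
    while True:
        byte = z & 0x7F
        z >>= 7
        if z:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def avro_string(s: str) -> bytes:
    """Avro string: byte length (as a long) followed by the UTF-8 bytes."""
    raw = s.encode("utf-8")
    return zigzag_varint(len(raw)) + raw


def encode_flow(flow: dict) -> bytes:
    """Avro record body: field values concatenated in schema order, without field names."""
    out = b""
    for value in flow.values():
        out += avro_string(value) if isinstance(value, str) else zigzag_varint(value)
    return out


# Hypothetical flow; the field names come from the template examples in this post,
# the values and the string/integer typing are assumptions.
flow = {
    "IPV4_SRC_ADDR": "192.168.1.10",
    "IPV4_DST_ADDR": "10.0.0.5",
    "L4_SRC_PORT": 51514,
    "L4_DST_PORT": 443,
    "PROTOCOL": 6,
    "SRC_TO_DST_PKTS": 12,
    "SRC_TO_DST_BYTES": 3456,
}

json_size = len(json.dumps(flow).encode("utf-8"))
avro_size = len(encode_flow(flow))
print(f"JSON: {json_size} bytes, Avro record body: {avro_size} bytes")
```

Most of the saving comes from leaving the field names out of every record, since they are described once in the schema, and from storing small integers in one or two bytes.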
In addition to introducing Avro serialization, we’ve implemented custom templates for Kafka flow export. This feature lets you define the Information Elements (IEs) to include in your flow exports, whether you’re using JSON or Avro serialization. Information Elements are the individual fields that describe a network flow, such as source and destination IP addresses, ports, protocols, and in/out packet and byte counters. Depending on your use case, you might not need all the available IEs in every export: selecting only the ones you need further optimises exported data size and processing.
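As a conceptual illustration of what a template does, the sketch below parses a template string in the "%NAME %NAME" form used in this post and keeps only the listed IEs from a flow record. This is not how Cento is implemented: the helper names `parse_template` and `project` are hypothetical, and the sample flow values are made up.

```python
import json


def parse_template(template: str) -> list:
    """Split a '%NAME %NAME ...' template string into a list of IE names."""
    return [token[1:] for token in template.split() if token.startswith("%")]


def project(flow: dict, fields: list) -> dict:
    """Keep only the IEs named in the template, in template order."""
    return {name: flow[name] for name in fields if name in flow}


# Hypothetical flow carrying more IEs than this consumer needs (values are made up).
full_flow = {
    "SRC_VLAN": 10,
    "IPV4_SRC_ADDR": "192.168.1.10",
    "IPV4_DST_ADDR": "10.0.0.5",
    "L4_SRC_PORT": 51514,
    "L4_DST_PORT": 443,
    "PROTOCOL": 6,
    "SRC_TO_DST_PKTS": 12,
    "SRC_TO_DST_BYTES": 3456,
    "DST_TO_SRC_PKTS": 9,
    "DST_TO_SRC_BYTES": 20480,
}

template = "%IPV4_SRC_ADDR %IPV4_DST_ADDR %L4_SRC_PORT %L4_DST_PORT %PROTOCOL"
fields = parse_template(template)
trimmed = project(full_flow, fields)

print("IEs kept:", fields)
print("JSON size:", len(json.dumps(full_flow)), "->", len(json.dumps(trimmed)), "bytes")
```

The fewer IEs a template lists, the smaller each exported record becomes, regardless of the serialization format.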
To start using these new features, follow these simple steps:
- Enable Avro Serialization:
  - Update your Kafka configuration in Cento and select Avro as the serialization format (--avro). Cento generates the Avro schema automatically.
- Create a Custom Template:
  - In the Cento configuration file, define a template specifying the IEs you want to include in the export (--template). In this case too, Cento automatically generates the corresponding Avro schema.
Example 1 – Exporting flows to Kafka with Avro serialization
cento -i eth0 --kafka "127.0.0.1:9092,127.0.0.1:9093,127.0.0.1:9094;topicFlows" --avro
Example 2 – Exporting flows to Kafka with JSON serialization using a custom template
cento -i eth0 --kafka "127.0.0.1:9092,127.0.0.1:9093,127.0.0.1:9094;topicFlows" --json-labels --template "%SRC_VLAN %SRC_MAC %DST_MAC %IP_PROTOCOL_VERSION %IPV4_SRC_ADDR %IPV4_DST_ADDR %IPV6_SRC_ADDR %IPV6_DST_ADDR %EXPORTER_IPV4_ADDRESS %DIRECTION %INPUT_SNMP %OUTPUT_SNMP %SRC_TO_DST_PKTS %SRC_TO_DST_BYTES %DST_TO_SRC_PKTS %DST_TO_SRC_BYTES %FIRST_SWITCHED %LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %PROTOCOL %L7_PROTO %L7_PROTO_NAME"
Example 3 – Exporting flows to Kafka with Avro serialization using a custom template
cento -i eth0 --kafka "127.0.0.1:9092,127.0.0.1:9093,127.0.0.1:9094;topicFlows" --avro --template "%SRC_VLAN %SRC_MAC %DST_MAC %IP_PROTOCOL_VERSION %IPV4_SRC_ADDR %IPV4_DST_ADDR %IPV6_SRC_ADDR %IPV6_DST_ADDR %EXPORTER_IPV4_ADDRESS %DIRECTION %INPUT_SNMP %OUTPUT_SNMP %SRC_TO_DST_PKTS %SRC_TO_DST_BYTES %DST_TO_SRC_PKTS %DST_TO_SRC_BYTES %FIRST_SWITCHED %LAST_SWITCHED %L4_SRC_PORT %L4_DST_PORT %PROTOCOL %L7_PROTO %L7_PROTO_NAME"
Please also check the documentation for the full list of IEs and to learn more about all the export settings and customisations.
We’re committed to continuously improving our software to meet the evolving needs of our users. We look forward to hearing your feedback and seeing how you leverage Avro serialization and custom templates in your workflows. Stay tuned!