How to export BGP routing information (AS Path) in network flows

Tools like traceroute have been used for a long time to track the forward path of packets, i.e. the journey of our packets to a remote destination. Unfortunately traceroute can say nothing about the path of ingress packets unless we assume that routing is symmetrical, an assumption that is often incorrect. For this reason we have designed a solution that allows path information to be reported in emitted flows. As BGP is the most popular exterior gateway protocol used on the Internet, we have designed a tool that allows nProbe to receive BGP messages and use them to infer routing information. Using the nProbe BGP plugin it is possible to export the first ten Autonomous Systems (ASes) in both the AS path towards the client and towards the server of each flow.

AS-paths, which are well-known mandatory BGP attributes, are determined by establishing a BGP session with a BGP router. BGP sessions are established by a helper script, bgp_probe_client.pl, which encapsulates all of the functionality needed to establish and maintain a BGP peering session and exchange routing update information. Specifically, the script:

  • Establishes a BGP session with a BGP-router
  • Reads BGP updates to extract AS-paths
  • Sends AS-paths to nProbe

The script is opensource and can be downloaded from GitHub.

nProbe uses AS-paths received from the script to export additional elements SRC_AS_PATH_1, ..., SRC_AS_PATH_10 and DST_AS_PATH_1, ..., DST_AS_PATH_10, indicating up to the first 10 ASes in the AS-path to the client and to the server. Those elements can be specified in the nProbe template (option -T) as any other regular element.

So for example, one can export the first 8 ASes of the client and the server of flows monitored from interface eth1 as follows:

sudo ./nprobe --bgp-port 9999 --zmq "tcp://*:5556" -i eth1 -n none -T "@NTOPNG@ %SRC_TOS %DST_TOS %INPUT_SNMP %OUTPUT_SNMP %SRC_AS_PATH_1 %SRC_AS_PATH_2 %SRC_AS_PATH_3 %SRC_AS_PATH_4 %SRC_AS_PATH_5 %SRC_AS_PATH_6 %SRC_AS_PATH_7 %SRC_AS_PATH_8 %DST_AS_PATH_1 %DST_AS_PATH_2 %DST_AS_PATH_3 %DST_AS_PATH_4 %DST_AS_PATH_5 %DST_AS_PATH_6 %DST_AS_PATH_7 %DST_AS_PATH_8"

Apart from the well-known options, the extra --bgp-port 9999 is required for the communication between nProbe and bgp_probe_client.pl.

For an in-depth description of the plugin, refer to the official nProbe documentation. Keep in mind that the BGP plugin also works when nProbe is in collector mode.

 


nDPI 2.6-stable is Out

This new release brings several fixes that make nDPI more stable. These fixes especially involve DNS and HTTP traffic dissection.

Here is the full list of changes:

  • New Supported Protocols and Services
    • Added Modbus over TCP dissector
  • Improvements
    • Wireshark Lua plugin compatibility with Wireshark 3
    • Improved MDNS dissection
    • Improved HTTP response code handling
    • Full dissection of HTTP responses
  • Fixes
    • Fixed false positive mining detection
    • Fixed invalid TCP DNS dissection
    • Releasing buffers upon realloc failures
    • ndpiReader: Prevents references after free
    • Endianness fixes
    • Fixed IPv6 HTTP traffic dissection
    • Fixed H.323 detection
  • Other
    • Disabled ookla statistics which need to be improved
    • Support for custom protocol files of arbitrary length
    • Updated radius.c to RFC 2865

Cento 1.6 Stable Just Released

More than one year after the previous stable release, we are glad to announce cento 1.6-stable. This new release brings stability, fixes and several new features.

Among the new features, it is worth mentioning that:

  • Flows can be exported to text files in a standardized JSON format.
  • By default, the process runs as a dedicated cento user which also owns the process files. This makes running cento more secure than using root. In addition, any user in the system can be specified to run cento.
  • A capture direction can be indicated so that cento will capture only TX, only RX, or both directions, depending on what has been specified.
  • Traffic counters can now be dumped in a second-by-second fashion to increase traffic visibility, as already done in nProbe.

Here is the full list of changes:

  • Main New Features
    • Ability to configure multiple netflow collectors with load-balancing
    • Support for Ubuntu 18
    • Added PEN to JSON fields
    • Added heuristic for handling DNS response dissection
    • HTTP dissection and export in textual flows
    • Check drop stats for the egress aggregated queue in case of multiple threads
    • Running the cento process as user cento by default, falling back to nobody if that user does not exist
    • Print Kafka metadata
  • New Options
    • Implemented --dump-second-counters for dumping (-P) second counters
    • Added --dump-json-format to dump text files in JSON (instead of text) format
    • --unprivileged-user|-u <username> to run cento with any given user
    • Added VxLAN support with --tunnel
    • Added new --capture-direction to capture from TX, RX or both directions
    • Added -K to set kernel cluster id
    • New --timestamp-format option
  • Fixes
    • Fixed segfault with Kafka on Ubuntu 14
    • Fixes for ZC interfaces with RSS enabled in bridge mode
    • IPV6 export fixes
    • Packet slicing fixes (aggregated egress queues)
    • Added missing DNS response code handling
    • Fixed DNS response dissection in uDPI
    • Added fix for “remembering” the L7 protocol in case a flow lasts too long
    • Added optimization to avoid reprocessing DPI when taken from the previous flow fragment
    • Fixed broken non-systemd cento control due to missing pid file
    • Fixed systemd / non-systemd packages installation
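
Combining some of the options listed above, a cento invocation could look like the following. This is just a hedged sketch: the interface name, dump directory and user are placeholders, -P is assumed to be the dump directory referenced in the changelog, and the --capture-direction value of 1 is assumed to select RX-only capture as in other PF_RING-based tools; please check cento --help and the cento documentation for the authoritative syntax.

cento -i eth1 -P /storage/cento --dump-json-format --dump-second-counters --capture-direction 1 -u cento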

ntopng Multilanguage Support: EN, IT, DE and JP

We are happy to announce that ntopng has gone fully international! The following languages are now officially supported:

  • English
  • Italian
  • Japanese
  • German

Language files are completely opensource, meaning that you can choose your preferred ntopng language, no matter whether you are a Community, Professional or Enterprise user!

Languages are supported on a per-user basis, hence, multiple ntopng users (both administrators and normal users) can simultaneously use ntopng, each one with his/her language of choice.

Switching the language is a breeze. Just visit the “Manage Users” page, select the user of interest, click “Manage” then “Preferences” and pick a language.

Last but not least, a big thank you goes to our friend Hideyuki Yamashita (yhidey_2001@yahoo.co.jp) who has donated his time to turn the dream of having a Japanese-speaking ntopng into reality! This is a big service for the community! Kudos Hideyuki!

How to enable DPI-based Traffic Management in pfSense using nEdge

We have been receiving several inquiries from pfSense users who would love to complement the classical firewall-style pfSense features with the inline Layer-7-based traffic policing offered by nEdge. Being able to place pfSense and nEdge side by side makes it possible to overcome the common belief which sees the bad guys on the Internet and the good guys on the Local Area Network (LAN). Bad guys are on the Internet, and this is true. Period. However, bad guys are also on the LAN, especially today in the Bring-Your-Own-Device (BYOD) era. Think of infected personal computers, vulnerable IoT devices (video surveillance cameras, for example), or compromised smartphones, just to name a few. nEdge allows you to enforce Layer-7 policies to prevent LAN devices, whether compromised or not, from using Tor, using unsafe or unwanted DNS servers, or performing unencrypted plain HTTP traffic, just to give a bunch of examples.

Unfortunately, creating this synergy is not that easy as nEdge has not been ported to FreeBSD and, consequently, to pfSense. Indeed, nEdge heavily relies on certain functionalities provided by Linux kernels and kernel modules. Specifically, such functionalities are mostly offered by the Netfilter framework and by its corresponding userspace utilities such as conntrack, iptables and ebtables. This strong coupling between Linux and nEdge makes it actually unfeasible to work on a FreeBSD port, as it would basically mean rewriting the majority of the code to use FreeBSD utilities such as ipfw.

As it is virtually unfeasible to port nEdge to FreeBSD, we would like to briefly discuss how to set up nEdge to make it work in close cooperation with pfSense. Typically, pfSense firewalls are deployed between the Internet and the Local Area Network.

nEdge, in the configuration above, can be placed between the Internet and pfSense, or between pfSense and the LAN. You can choose to leave pfSense directly exposed to the Internet (for example if you want it to perform the first checks and cleanups on the traffic), or you can choose nEdge to be exposed to the Internet, to let pfSense receive Internet traffic which has already been cleaned at Layer 7.

 

The easiest way to set up nEdge is to use its bridge mode. In bridge mode, nEdge acts as a transparent bridge which enforces Layer-7 policies and cleans the traffic from unwanted applications or devices which are jeopardizing the network. This mode is described in detail here.

nEdge can be run on small low-end devices such as PC Engines apu2 system boards and ZOTAC Mini PCs, as well as on fully-fledged computers, always with a minimum of 2 network interfaces to actually bridge the traffic between the LAN and the WAN interface.

nEdge interfaces should be connected as follows:

  • When nEdge is placed between the Internet and pfSense
    • The nEdge WAN interface should be connected to the Internet and;
    • the nEdge LAN interface should be connected to the pfSense WAN interface, previously connected to the Internet.
  • When nEdge is placed between pfSense and the LAN
    • The nEdge WAN interface should be connected to the pfSense LAN interface and;
    • The nEdge LAN interface should be connected to the LAN.

In both cases, you are guaranteed that the traffic will go through the nEdge before reaching the Internet or the clients.

If you are interested in knowing more about nEdge, feel free to drop us an email, check out the official documentation, or have a look at the introductory YouTube video.

 

Detecting Hidden Hosts and Networks on your (shared) LAN

In theory, on switched networks each portion of a LAN is independent. This means, for instance, that networks 192.168.1.0/24 and 192.168.2.0/24 use different switch ports, communicate through a router, and do not share the same physical network. Unfortunately, sometimes people violate this principle by putting multiple networks on the same physical port.

The reasons are manifold:

  • You want to run a VM on your host that can (silently) communicate with other devices and thus you want to use a different network address plan.
  • You use devices that have an embedded switch (e.g. Apple Airport Time Capsule NAS device) to which you connect both your PC (with a publicly accessible IP address) and the backup device that is not supposed to be accessed from the Internet and thus living on a different network.
  • Some of your colleagues are trying to hide some devices and thus have decided to use a network other than the one used on the LAN.
  • You migrated your network to a different addressing scheme but you forgot to update some devices that are still configured with the old network.
  • Somebody attached (without configuring it) a new device just purchased that is then using a different network address.

So in essence there are many reasons, ranging from misconfiguration to malicious users who attach devices to the network hoping not to be discovered. Fortunately modern devices are rather verbose and advertise their presence, for instance through MDNS (Multicast DNS), IPv6 advertisements, and for sure ARP on IPv4 networks.

If you want to discover these devices living on “ghost networks”, you can now do it easily using ntopng in a matter of clicks.

Just go to the interface menu, select your network device, and then click on the “Networks” tab. There you will see the list of networks that have been learnt by ntopng from ARP messages. In case they do not overlap with the IP networks configured on your network card (i.e. eth0 in the above image), ntopng will tell you if:

  • There is a misconfiguration (i.e. on your network card you configured 192.168.10.0/24 but ntopng has learnt 192.168.10.0/25, so the network mask was wrong).
  • There are devices belonging to networks that were not supposed to exist; such networks are marked with a ghost icon.

If you want to find out what devices belong to such a ghost network, you can click on the network label and see something similar to the image below:

Now the final question: where are those network devices attached, so your network admins can go and chase them? Just click on the device IP address and, if you have configured your SNMP devices in ntopng, you can find out where those devices are physically located and on what network port they have been connected to the network.

All done using ntopng, without having to use several tools. Easy isn’t it?

Enjoy!

 

Monitoring Containerised Application Environments with eBPF

Earlier this week ntop and InfluxData held a joint webinar about monitoring containerised applications. We discussed solutions for monitoring both legacy (i.e. non-containerised) and containerised applications, and the technologies we can use for this purpose. As most of you know, we have developed libebpfflow, an open source library for generating IPFIX-like flows using not packets but system events captured with eBPF. In addition to this, we are developing a new version of the nProbe product family that is also able to exploit Netlink to complement eBPF statistics with traffic counters.

You can read more about this ongoing work in the slides we have prepared.

We’re finalising this work and we’ll soon announce it for general availability, so stay tuned.

ntopConf 2019 Retrospective

On May 8-9 we organised our yearly event in Padova, Italy. The first day was dedicated to training and the second day to the conference. Overall about 150 people attended the event, and we’re glad about that. Our gratitude goes to the speakers, to Wintech who took care of logistics, and to all those who made this event a success. Below you can find the presentation slides used during the conference.

If you want to organise an ntop-centric event/meetup, please contact us!


Packets vs Flows: Which Option is the Best?

One of the most difficult steps in a monitoring deployment is choosing the best point where traffic has to be monitored, and the best strategy to observe this traffic. The main options are basically:

  • Port Mirroring/Network Tap
  • NetFlow/sFlow Flow Collector

Port Mirroring/Network Tap

Port mirroring (often called a span port) and network taps have already been covered in a previous post. They are two techniques used to provide packet access and are often the best way to troubleshoot network issues, as packets are often perceived as the ground truth (“packets never lie”). This means that we are able to have “full packet visibility” because we have visibility at L2 (mirror) or L1 (taps). There are various types of hardware taps, the most complex of which is called a network packet broker. A good introduction to this topic, Tap vs. Span, can be found in this deep dive article. Note that if you are monitoring traffic on a computer you have access to, you can avoid this technique by simply running your monitoring tool on this host: just be aware that you will introduce some extra load on the server and thus that your network communications might be slightly affected by your monitoring activities.

NetFlow/sFlow Flow Collector

In flow collection we have no direct access to packets, with some small differences between technologies. In NetFlow/IPFIX it is the probe running inside the router that clusters together similar packets according to a 5-tuple key (protocol, source/destination IP and port) and computes metrics such as bytes and packets; in a way NetFlow/IPFIX “compresses” (not literally) traffic in order to produce a “digest” of network communications. In sFlow instead, the probe running inside the network switch emits samples that include “packet samples”, which in essence are packets captured on switch ports, cut to a snaplen (usually 128 bytes) and sent to an sFlow collector encapsulated in the sFlow packet format. When comparing sFlow to packet capture you have no full packet visibility (both in terms of packet length and because you see only a sample of all packets), but on the other hand you have access to additional metadata such as the name of the authenticated user (e.g. via RADIUS or 802.1X) that generated such traffic. This is very important information that can be very helpful during troubleshooting or security analysis.

In flow collection, ntopng will show you flows collected by nProbe and sent to ntopng via ZMQ, as in the sketch below. This has the advantage of being able to monitor multiple NetFlow/sFlow/IPFIX exporters and combine all of them into a single ntopng instance. Doing the same with packets would have been much more complicated, if doable at all. In this scenario, you need to keep in mind that you can see only prefiltered and presummarized traffic from the device that is sending flows: this means that you can’t have “full packet visibility” but only a summarized version of it.
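
As a minimal sketch (ports and addresses are just examples; this mirrors the setup used in other ntop posts), nProbe collects NetFlow on a UDP port and re-exports flows over ZMQ, while ntopng consumes the ZMQ stream:

nprobe -i none -n none --collector-port 2055 --zmq "tcp://*:5556"
ntopng -i tcp://127.0.0.1:5556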

Which Option is the Best?

It is now time to decide the visibility strategy to follow, based on your monitoring expectations. Taps are definitely a good option for packet-oriented people, but keep in mind that it is also possible to have mixed scenarios where some networks are monitored using packets, and others with flows.

Physical or Virtual Monitoring Tools?

Often people ask us whether a physical box has advantages over a VM used to monitor traffic. The IT trend is towards virtual, but physical can help, in particular in proof-of-concept scenarios. Remember, there is no single “best scenario” to follow. Virtual environments allow you to avoid possible hardware problems, but in tap-mode scenarios they require a dedicated physical NIC, which is not always possible. Physical hardware can be easier to deal with, but it varies case by case.

Technical requirements depend on what you need to see and collect, but the minimum should be:

  • Intel CPU with two cores
  • 4 GB of RAM
  • 120 GB of disk space. Whether SATA HDD or SSD depends on the amount of traffic you need to verify, but SSD is preferred.
  • NICs: 1 NIC is enough for flow collector mode; at least 2 NICs are needed for tap visibility.
  • Linux operating system. ntop builds prepackaged packages for Debian, Ubuntu LTS and CentOS. You can choose the distro you like, but if you ask us we suggest using Ubuntu LTS.

Licensing

Most ntop software is open source and thus, for most people, free of licensing fees. However, even in the case of ntopng we offer premium versions that allow us to keep developing the product. Hence not all the components are freely available, so you need to choose the right deployment based on your budget or on the features you need. ntopng can run in community mode: it means that you can capture from the wire all the traffic presented to ntopng via tap interfaces, but you are going to have limited functions and capabilities. If you choose to have all the features on, you need an ntopng Pro or Enterprise license.

Otherwise, if you plan to add or use flow collector mode, remember that you need an nProbe license to grab all the flows from devices and present them to ntopng, preferably a licensed ntopng as well, so that you can have, for instance, full integration with other protocols such as SNMP. Probably, if you try both scenarios, you will end up adopting an ntopng plus nProbe deployment (check the main features here: https://www.ntop.org/products/traffic-analysis/ntop/).

Enjoy!

PS: Many thanks to Giordano Zambelli for writing this post.

Telemetry Data in ntopng: Giving Back to the Community

The latest ntopng 3.9 dev gives you the possibility to choose whether to send telemetry data back to ntop. We collect and analyze telemetry data to diagnose ntopng issues and make sure it’s functioning properly. In other words, telemetry data help us in finding and fixing certain bugs that may affect certain versions of ntopng.

And don’t worry, we won’t use any data to try and identify you. However, if you want to, you can decide to provide an email address we can use to reach you in case we detect your instance has anomalies.

So which kind of telemetry data is sent? Currently, the only telemetry data sent to ntop are crash reports. That is, when ntopng terminates anomalously, a tiny JSON containing ntopng version, build platform, operating system and startup options is sent to notify us that something went wrong.

At any time you can see the status of the telemetry by visiting the “Telemetry” page accessible from the “Home” menu. You can see all the details of the data that may be sent to our server, and also the most recent data which have been sent.

At any time you can give or revoke consent to send telemetry data; this is completely up to you. Deciding to send telemetry is a small act, but it has great value for the community as it can foster a continuous improvement of ntopng. So please, visit the “Preferences” page and choose to contribute!

TLS/SSL Analysis: When Encryption and Safety Are Not Alike

Most people think that SSL means safety. While this is not a false statement, you should not take it for granted. In fact, while your web browser warns you when a certain encrypted communication has issues (for instance the SSL certificates don’t match), you should not assume that SSL = HTTPS, as:

  • TLS/SSL encryption is becoming (fortunately) pervasive also for non web-based communications.
  • The web browser can warn you for the main URL, but you should look into the browser developer console for other alerts (most people ignore the existence of this component).

When TLS/SSL communications are insecure (see below for details) we are in a very bad situation: we believe we have done our best, but in practice SSL is hiding our data without really implementing safety, as attackers have tools to exploit SSL weaknesses. In the past weeks we have spent quite some time enhancing SSL support in both nDPI and ntopng. This is to make people aware of SSL issues on their network, understand the risks, and implement countermeasures (e.g. update old servers). What we have implemented in the latest ntopng dev version (and that will be merged into the next stable release) is SSL handshake dissection for detecting:

  • Insecure and weak ciphers
    Your communication is encrypted (i.e. you will see a lock on the URL bar) but the data you exchange might potentially be decrypted.
  • Client/server certificate mismatch
    You are not talking with the server you want to talk to.
  • Insecure/obsolete SSL/TLS versions
    It’s time to update your device/application.

When an SSL communication does not satisfy all safety criteria, ntopng detects it and triggers an alert. In essence we have implemented a lightweight SSL monitoring console that allows you (without having to install an IDS or similar application) to understand the security risks and fix them before it’s too late.

Below you can find a valid SSL communication: for your convenience we have highlighted the SSL detection fields (in a future blog post we’ll talk more about JA3).

When ntopng detects TLS/SSL issues, it reports them both in the flow

and in the alerts

The goal of this post is not to scare the reader, but increase awareness in network communications and use ntopng for understanding the risks and implement countermeasures to keep your network safe.

Remember: you should not implement secure communications because you are scared of attackers, but because it’s the right thing to do for preserving your privacy.

Enjoy!

 

Released nProbe Cento 1.8

This is to announce the nProbe Cento 1.8 stable release. This is a maintenance release where we have made many reliability fixes and added new options to integrate this tool with the latest ntopng developments. We suggest all our users update to this new release so they can benefit from the enhancements.

New Features

  • Added --json-labels option to print labels as keys in JSON
  • Added --tcp <host>:<port> option to export JSON over TCP
  • Added --disable-l7-protocol-guess option to disable the nDPI protocol guess
  • Support for ZMQ flows export with/without compression
  • Keepalive support with ZMQ
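
As a hedged illustration of the new export options (the interface name and TCP endpoint are placeholders; please check cento --help for the authoritative syntax), a JSON-over-TCP export with labelled keys could be started as:

cento -i eth1 --tcp 192.168.1.10:5555 --json-labels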

Fixes

  • Fixed JSON generation and string escape
  • Fixed export drops when reading from a PCAP file
  • Fixed wrong detection of misbehaving flows
  • Fixed pkts and bytes counters in logged flows
  • Fixed license check when reading from PCAP file
  • Fixed size endianness in ZMQ export
  • Fixed ZMQ header version to be compatible with the latest ntopng
  • Crash and memory leak fixes

Enjoy!

Talking about Network, Service, and Container Monitoring at InfluxDays

Later this week the ntop team will attend InfluxDays, June 13-14, London, UK. We’ll be talking about traffic monitoring in containerised environments, and give you an outlook on our roadmap.

 

If you are attending this event (we’ll have a booth at InfluxDays), or if you live in London and want to meet us, please show up at the event and contact us so we can arrange an informal meeting and hear from you. We need feedback from our users so that together we can plan the future of ntop.

Hope to see you!

Introducing nProbe Agent: Packetless, System-Introspected Network Visibility

A few months ago at FOSDEM we introduced the concept of network and container visibility through system introspection and we released an opensource library based on eBPF that can be used for this purpose. Based on this technology, we created a lightweight probe, nProbe™ Agent (formerly known as nProbe mini), able to detect, count and measure all network activities taking place on the host where it is running. Thanks to this agent it is possible to enrich the information that a traditional probe extracts from network traffic packets with system data such as users and processes responsible for network communications. In fact, this agent is able to extract and export a rich set of information, including:

  • TCP and UDP network communications (5-tuple, status).
  • TCP counters, including retransmissions, out-of-order packets, round-trip times read reliably from the Linux kernel without having to mimic them using packets.
  • The user behind a communication.
  • The process and executable behind a communication.
  • Container and orchestrator information (e.g. POD), if any.

For example, nProbe Agent gives you the answer to questions like: who is the user trying to download a file from a malware host? Which process is he running? From which container, if any?

nProbe™ Agent does all this without even looking at network packets; in fact, it implements low-overhead, event-based monitoring mainly based on hooks provided by the operating system, leveraging well-established technologies such as Netlink and eBPF. In particular, eBPF support is implemented by means of the open source libebpfflow library we developed to mask eBPF complexity. This also allows the agent to detect communications between containers on the same host. nProbe Agent is able to export all the extracted information in JSON format over a ZMQ socket or to a Kafka cluster.
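
As a minimal sketch of the ZMQ export (the endpoint address is a placeholder; ntopng would be configured to collect from the corresponding port, as shown in the quick start guide below), running the agent can be as simple as:

nprobe-agent --zmq tcp://192.168.2.100:1234c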

nProbe™ Agent is natively integrated with ntopng out-of-the-box so you can finally seamlessly merge system and network information.

As eBPF requires modern Linux kernels, nProbe™ Agent is available only for Ubuntu 18.04 LTS and CentOS 7 (please upgrade your distro with the latest CentOS packages). If you just need basic system visibility information, there is also libebpflowexport, a fully open-source tool that is natively supported by ntopng out of the box.

For further information about this agent and getting started with system-introspected network visibility, please visit the nProbe Agent documentation and product page.

Stay tuned!

System-Introspected Network and Container Visibility: A Quick Start Guide

Recently, we have introduced the concept of network and container visibility through system introspection and also demonstrated its feasibility with the opensource library libebpfflow. In other words, by leveraging certain functionalities of the Linux operating system, we are able to detect, count and measure the network activity that is taking place on a certain host. We have published a paper and also presented the work at FOSDEM 2019, and therefore a detailed discussion falls outside the scope of this post. However, we would like to recall that the information we are able to extract is very rich and is absolutely not limited to mere byte and packet counters. For example we can determine:

  • All the TCP and UDP network communications, including their peers, ports, and status
  • TCP counters and also retransmissions, out-of-orders, round-trip times, and other metrics which are useful as a proxy for the communication quality
  • Users, processes and executables behind every communication (e.g., /bin/bash, executed with root privileges, is contacting a malware IP address)
  • Container and orchestrator information, when available (e.g., /usr/sbin/dnsmasq is being run inside container dnsmasq which, in turn, belongs to Kubernetes pod kube-dns-6bfbdd666c-jjt75 within namespace kube-system)

By the way, do you know what the really cool innovation behind all of this is? Well, actually, it is that we do not have to look at the packets to get this information out! This is why we also love to use the term packetless network visibility, which may seem an oxymoron at first, but eventually it makes a lot of sense. Indeed, not looking at the packets is not only cool but it is also somehow necessary under certain circumstances. Think of multiple containers communicating with each other on the same host. Their packets would never leave the system and, thus, would never reach the network, making any mirror or TAP totally useless. In this case, having visibility into the inter-container communications would require an introspection-based approach such as the one we have proposed.

Getting Started

Ok so now that we have gone through a brief recap of our technology it is time to see it in action. To start you need two pieces:

  • nprobe-agent, a small application which integrates libebpfflow and is responsible for performing system introspection
  • ntopng, our visualization tool, which receives introspected data from nprobe-agent and visualizes it in a handy GUI

Configuration is straightforward. You can fire up nprobe-agent with just a single option which basically tells it the address on which ntopng is listening for introspected data

# nprobe-agent -v --zmq tcp://127.0.0.1:1234c

In this example, we are going to use nprobe-agent and ntopng on the same host so we are safely using the loopback address 127.0.0.1 to make them communicate. Note, however, that this is not necessary as nprobe-agent and ntopng can run on two physically separate hosts and you can also run multiple nprobe-agent and let them export to the same instance of ntopng.

To collect data from nprobe-agent, ntopng can be started as follows

./ntopng -i tcp://*:1234c -m "192.168.2.0/24"

The -i option specifies the port on which ntopng has to listen for incoming data (note the port is 1234, the same used for nprobe-agent), whereas the -m option specifies a local network of interest.

Once both applications are running, point your browser to the address of ntopng and you will start seeing network communications along with users, processes and container information. Cool, isn’t it?

Combining Network Packets with System Introspection

When you have packets, you can also combine them with data from system introspection. This is straightforward to do. You just have to indicate a packet interface in ntopng as a companion of the interface which is responsible for the collection of system introspection data from nprobe-agent.

For example, assuming an ntopng instance is monitoring the loopback interface lo in addition to receiving data from nprobe-agent as

 ./ntopng -i tcp://*:1234c -i lo -m "192.168.2.0/24"

We can declare the tcp://*:1234c as the companion of lo from the preferences as

From that point on, system-introspected data arriving at tcp://*:1234c will also be delivered to lo and automatically combined with real packets:

 


Introducing PF_RING Configuration Wizard

Getting started with PF_RING can be a bit tricky as it requires the creation of a few configuration files in order to set up the service, especially when ZC drivers need to be used.

First of all, it requires package installation: PF_RING comes with a set of packages for installing the userspace libraries (pfring), the kernel module (pfring-dkms), and the ZC drivers (<driver model>-zc-dkms). Installing the main package, pfring, is quite intuitive and straightforward following the instructions available at http://packages.ntop.org, however installing and configuring the proper package when it comes to installing the ZC driver for the actual NIC model available on the target machine can lead to some headaches.

In fact, doing the driver configuration manually means (in this example we consider the ixgbe driver):

  • Creating a configuration file for PF_RING (/etc/pf_ring/pf_ring.conf).
  • Checking the model of the installed NIC.
  • Installing the proper dkms driver from the repository (ixgbe-zc-dkms)
  • Creating a configuration file for the NIC model (/etc/pf_ring/zc/ixgbe/ixgbe.conf), to indicate the number of RSS queues and other driver settings.
  • Creating a .start file for the NIC model (/etc/pf_ring/zc/ixgbe/ixgbe.start) to indicate that we actually want to load the driver.
  • Creating a configuration file for the hugepages (/etc/pf_ring/hugepages.conf)
  • Restarting the service.

In order to simplify all of this, since PF_RING 7.5 the pfring package includes the pf_ringcfg script that can be used to automatically install the required driver package and create the full configuration for the PF_RING kernel module and ZC drivers. With this method, configuring and loading the ZC driver for an interface is straightforward; it can be done in a few steps:

1. Configure the repository as explained at http://packages.ntop.org and install the pfring package which includes the pf_ringcfg script (example for Ubuntu):

apt-get install pfring

Note: it is not required to install any additional package like pfring-dkms or <driver model>-zc-dkms, pf_ringcfg will take care of that, installing selected packages according to what is actually required by the configuration.

2. List the interfaces and check the driver model:

pf_ringcfg --list-interfaces
Name: em1  Driver: igb    [Supported by ZC]
Name: p1p2 Driver: ixgbe  [Supported by ZC]
Name: p1p1 Driver: ixgbe  [Supported by ZC]
Name: em2  Driver: e1000e [Supported by ZC]

3. Configure and load the driver specifying the driver model and (optionally) the number of RSS queues per interface:

pf_ringcfg --configure-driver ixgbe --rss-queues 1

Note: this also installs the required packages, including pfring-dkms and <driver model>-zc-dkms for the selected driver, before configuring and loading the driver.

4. Check that the driver has been successfully loaded by looking for ‘Running ZC’:

pf_ringcfg --list-interfaces
Name: em1  Driver: igb    [Supported by ZC]
Name: p1p2 Driver: ixgbe  [Running ZC]
Name: p1p1 Driver: ixgbe  [Running ZC]
Name: em2  Driver: e1000e [Supported by ZC]

Note: there are corner cases that require particular attention and need to be handled with a custom configuration. For example, if you’re configuring a ZC driver for an adapter that you’re currently using as the management interface, the driver is not reloaded by default as this may break network connectivity. In this case you need to add the --force option when running the pf_ringcfg script, or follow the Manual Configuration section in the PF_RING User’s Guide.
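
For instance, assuming p1p1 above were also the management interface, the driver reload could be forced (at the risk of briefly losing connectivity) with a command along these lines:

pf_ringcfg --configure-driver ixgbe --rss-queues 1 --force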

 

Enjoy!

Building a (Cheap) Continuous Packet Recorder using n2disk and PF_RING [Part 2]

Continuous packet recorders are devices that capture raw traffic to disk, providing a window into network history that allows you to go back in time when a network event occurs and analyse traffic down to the packet level to find the exact network activity that caused the problem.

n2disk is a software application, part of the ntop suite, able to capture traffic at high speed (it relies on the PF_RING packet capture framework, which is able to deliver line-rate packet capture up to 100 Gbit/s) and dump traffic to disk using the standard PCAP format (which is used by packet analysis tools like Wireshark and ntopng). Network traffic is recorded permanently and the oldest data is overwritten as disk space fills up, in order to provide continuous recording and the best data retention time.

Besides storing network data to disk, n2disk can also:

  • Index and organize data in a timeline, to be able to retrieve traffic searching for packets matching a specific BPF filter in a selected time interval.
  • Compress data to save disk space (if you compile pcap-based applications on top of PF_RING-aware libpcap, any application compatible with the PCAP format can read compressed pcap files seamlessly).
  • Filter traffic, up to L7: you can discard traffic matching selected application protocols.
  • Shunt traffic: you can save disk space by recording only a few packets from the beginning of each communication for selected application protocols (e.g. encrypted or multimedia elephant flows).
  • Slice packets: the ability to reduce packet size by cutting them (e.g. up to the IP or TCP/UDP header).

In a previous post (part 1) we described how to build a 2×10 Gbit continuous packet recorder using n2disk and PF_RING; however, it’s time to update it as a few years have passed, new features have been added, and new capture and storage technologies are available.

Network Adapter: Intel vs FPGAs

All ntop applications, including n2disk, are based on PF_RING and can operate on top of commodity adapters (accelerated Zero-Copy drivers are available for Intel) as well as specialised FPGA adapters like Napatech, Fiberblaze and others (the full list is available in the PF_RING documentation).

In order to choose the best adapter we need to take into account a few factors, including capture speed, features and price. Intel adapters are cheap and can deliver 10+ Gbps packet capture with 64-byte packets using PF_RING ZC accelerated drivers. FPGA adapters are (more or less, depending on the manufacturer) expensive, but provide, in addition to higher capture speed with small packet sizes, support for useful features like port aggregation, nanosecond timestamping and traffic filtering. We can summarize this in a small table:

Link Speed / Features Required                       Recommended Adapter
1 Gbit                                               Any adapter
10 Gbit                                              Intel (e.g. 82599/X520)
2 x 10 Gbit / Aggregation / Nanosecond Timestamp     FPGA (Napatech, Silicom)
40/100 Gbit                                          FPGA (Napatech, Silicom)

What Storage System Do I Need?

If you need to record 1 Gbps, even a single (fast) HDD is enough to keep up with the traffic throughput. If you need to record 10+ Gbps, you need to increase the I/O throughput by using a RAID system with many drives. At ntop we usually use 2.5″ 10K RPM SAS HDD drives for compact systems (e.g. 2U form factor with up to 24 disks), or 3.5″ 7.2K RPM SAS HDDs for modular systems when rack space is not a problem and many units are required to increase data retention (in this case you need a RAID controller able to drive a SAS expander, which can handle hundreds of disks). More space in the storage system translates into a higher retention time and thus the ability to go further back in time to find old data.

The number of drives, combined with the I/O throughput of each drive and the RAID configuration, determines the final I/O throughput you are able to achieve. The drive speed depends on the drive type and model, as summarized in the table below:

Drive Type           Sustained Sequential I/O
SAS/SATA HDD         100-250 MB/s
SSD                  100-500 MB/s
NVMe (PCIe SSD)      500-3000 MB/s

In order to record traffic at 10 Gbps, for instance, you need 8-10 SAS HDDs in RAID 0, or 10-12 disks in RAID 50. The RAID controller should have at least 1-2 GB of onboard buffer in order to keep up with 10+ Gbps. Alternatively you can use 3-5 SSDs, or 1-2 NVMe (PCIe SSD) drives. SSDs are usually used when concurrent writes and reads are required under an intensive workload, to avoid HDD seek times jeopardizing the performance. Please make sure that you select write-intensive flash disks that guarantee great endurance over time.
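
As a quick sanity check of these numbers: 10 Gbit/s corresponds to 10 / 8 = 1.25 GByte/s of sustained writes. Assuming roughly 150 MB/s of sustained sequential throughput per SAS HDD (see the table above), 1.25 GB/s ÷ 0.15 GB/s ≈ 8-9 drives are needed before accounting for RAID parity and indexing overhead, which is why 8-10 drives in RAID 0 (or a few more in RAID 50) are recommended.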

At 20 Gbps at ntop we usually use 16-24 HDDs. At 40-100 Gbps you probably also need to use multiple controllers, as most controllers are able to handle up to 35-40 Gbps sustained and you need to distribute the load across a few of them. In fact, since version 3.2, n2disk implements multithreaded dump, which means it is able to write to multiple volumes in parallel. This is also useful when using NVMe disks: they are directly attached to the PCIe bus and lightning fast, but they cannot be driven by a standard controller, thus you can use n2disk to write to many NVMe disks in parallel. We have been able to achieve 140 Gbps of sustained throughput using 8 write-intensive NVMe disks!

What CPU Do I Need?

Choosing the right CPU depends on a few factors.

First of all, the adapter model. Intel adapters transfer packets one by one, putting pressure on the PCIe bus and thus increasing the overall system utilisation with respect to FPGA adapters like Napatech or Silicom, which are able to work in “chunk” mode (other NIC vendors such as Accolade, for instance, do not support it yet). FPGA adapters are also able to aggregate traffic in hardware at line rate, whereas with Intel we need to merge packets on the host, and it is hard to scale above 20-25 Mpps in this configuration. A CPU with a high frequency (3+ GHz) is required with Intel.

The second factor is definitely traffic indexing. You probably want to index traffic to accelerate traffic extraction, and this requires a few CPU cores. In order to index traffic on the fly at 10 Gbps, 1-2 dedicated cores/threads are required (in addition to the capture and writer threads). At 40 Gbps you probably need 4-6 indexing threads. At 100 Gbps, at least 8-10 threads.

In short, if you need to record 10 Gbps, a cheap Intel Xeon E3 with 4 cores and 3+ GHz is usually enough even with Intel adapters. If you need to record and index 20+ Gbps, you should probably go with something more expensive like an Intel Xeon Scalable  (e.g. Xeon Gold 6136) with 12+ cores  and 3+ GHz. Pay attention to the core affinity and NUMA as already discussed in the past.

How Much Does It Cost?

Continuous packet recorders on the market are expensive devices because they need fast/expensive storage systems and they are usually part of enterprise-grade solutions designed for high-end customers. At ntop we want to deliver the best technology to everyone at affordable prices, and we recently updated our price list, lowering prices for the n2disk product (please check the shop for more info). Education and non-profit organisations can use our commercial tools at no cost.

For further information about the n2disk configuration and tuning, please refer to the n2disk documentation.

Measuring nProbe ElasticSearch Flow Export Performance

nProbe (via its export plugin) supports ElasticSearch flow export. Setting up nProbe for the ElasticSearch export is a breeze: it just boils down to specifying the option --elastic. For example, to export NetFlow flows collected on port 2058 (--collector-port 2058) to an ElasticSearch cluster running on localhost port 9200, one can use the following

nprobe -i none -n none --collector-port 2058 --elastic "flows;nprobe-%Y.%m.%d;http://localhost:9200/_bulk"

nProbe will take care of pushing a template to ElasticSearch to have IP fields properly indexed, and will also POST flows in bulk to maximize the performance.
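
Assuming a default ElasticSearch setup on localhost, a quick way to verify that the daily nprobe-* indices are actually being created is the standard ElasticSearch cat API (the index name pattern follows the template specified in the --elastic argument above):

curl "http://localhost:9200/_cat/indices/nprobe-*?v"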

Recently we’ve made several improvements to nProbe performance (you need to use the latest dev nProbe version) when it comes to exporting flows to ElasticSearch, and therefore we believe it is time to publish some official numbers.

Performance tests have been run on an Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz machine with 16GB RAM with both nProbe and ElasticSearch:

  • OS: Ubuntu 16.04.6 LTS
  • nProbe v.8.7.190712 (r6564)
  • ElasticSearch 6.8.1

In order to measure the export performance, we’ve pushed NetFlow at increasing rates using pfsend as described in another post and we’ve disabled nProbe internal caches (--disable-cache).

We’ve seen that the maximum number of flows per second that a single nProbe instance (but remember you can instantiate one instance per-core on a multicore system, all sharing the same license) can export to ElasticSearch is approximately 45,000 flows per second. Above that threshold, flows will be dropped, that is, it won’t be possible to bulk-POST the incoming NetFlow fast enough.

For the sake of completeness, this is the full nprobe command used in the tests

./nprobe -i none -n none --collector-port 2058 -T "@NTOPNG@" --elastic "flows;nprobe-%Y.%m.%d;http://localhost:9200/_bulk" --disable-cache -b 1

We’ve also extended nProbe export stats shown when using option -b=1 to accurately report the rates. This allowed us to make the measurements and will also allow you to accurately monitor the performance of nProbe. Note that the drops you are seeing below are normal as we pushed nProbe above its limit to see the maximum successful flow export rate.

12/Jul/2019 12:54:26 [nprobe.c:3448] ElasticSearch flow exports (successful) [1307219 flows][avg: 42168.4 flows/sec][latest 30 seconds: 42372.5 flows/sec]
12/Jul/2019 12:54:26 [nprobe.c:3455] ElasticSearch flow drops [export queue full: 311977][post failed: 0]
12/Jul/2019 12:54:56 [nprobe.c:3448] ElasticSearch flow exports (successful) [2655429 flows][avg: 43531.6 flows/sec][latest 30 seconds: 44940.3 flows/sec]
12/Jul/2019 12:54:56 [nprobe.c:3455] ElasticSearch flow drops [export queue full: 560827][post failed: 0]
12/Jul/2019 12:55:26 [nprobe.c:3448] ElasticSearch flow exports (successful) [4036654 flows][avg: 44358.8 flows/sec][latest 30 seconds: 46040.8 flows/sec]
12/Jul/2019 12:55:26 [nprobe.c:3455] ElasticSearch flow drops [export queue full: 778416][post failed: 0]

The main advantage of the direct export to ELK instead of using intermediate tools such as LogStash is that you can do it more efficiently and without having to configure too many intermediate components. Please also note that you can obtain similar figures when using nProbe to export flows towards Kafka using the export plugin.

Enjoy!

Containers and Networks Visibility with ntopng and InfluxDB

For a while we have investigated how to combine system and network monitoring in a simple and effective way. In 2014 we did a few experiments with Sysdig, and recently, thanks to eBPF, we have revamped our work to exploit this technology as well, to be able to monitor containerised environments. Months ago we showed how to detect, count and measure the network activity which is taking place at a certain host just by leveraging certain functionalities of the Linux operating system, without even looking at the traffic packets. Our seminal work has been published in the paper “Combining System Visibility and Security Using eBPF”. Since then, we have given the talk “Merging System and Network Monitoring with BPF” at FOSDEM 2019 and co-authored the article “IT Monitoring in the Era of Containers: Tapping into eBPF Observability” with our friends at InfluxDB, among other activities.

In this post, we would like to show you how to get started with containers and network visibility, that is, what tools you need to end up having new fancy metrics delivered straight to your InfluxDB instance; metrics which will support you in observing, understanding and troubleshooting containerized environments. In essence this is a guide that highlights what you need to install and run in order to combine system and network monitoring.

The Tools

Three lightweight tools are needed, namely:

  • nprobe-agent, formerly known as nProbe Mini, a small application which is responsible for performing system introspection. If you prefer a pure opensource solution (yet more limited) you can have a look at libebpfflow.
  • ntopng, a visualization tool which receives introspected data from the nprobe-agent, and slices and dices it for producing metrics and sending them to InfluxDB. This article assumes that you are using the latest ntopng version available at the date of this blog post.
  • InfluxDB, the popular timeseries database to store metrics generated by ntopng.

The top-right part of the following picture shows graphically how they work together. The other parts of the picture show also how they relate to the overall ntop visibility ecosystem.

The tools can run on the same host or on three different and physically independent hosts; it does not matter, as they all communicate with each other over the network. For simplicity, in the remainder of this post we assume the tools are installed and run on the same host.

Installation

nprobe-agent and ntopng are distributed by ntop. Follow the instructions at https://packages.ntop.org to add the ntop repositories so that you can use the package manager of your distro to do the installation, which is a one-liner

$ sudo apt-get -y install nprobe-agent ntopng

InfluxDB installation instructions are available at https://docs.influxdata.com/influxdb/latest/introduction/installation/.

Configuration

To configure nprobe-agent copy its default configuration file into /etc/nprobe-agent/nprobe-agent.conf

$ sudo cp /etc/nprobe-agent/nprobe-agent.conf.example /etc/nprobe-agent/nprobe-agent.conf

The default configuration file contains a single line, -z=tcp://127.0.0.1:1234c, which instructs nprobe-agent to export introspected data to localhost, port 1234. As we want ntopng to consume such data, we have to add the line -i=tcp://*:1234c to the ntopng configuration file /etc/ntopng/ntopng.conf, so that it will listen for incoming data on port 1234.
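
To recap, the two configuration files end up containing the following lines (the addresses are the defaults discussed above):

/etc/nprobe-agent/nprobe-agent.conf:
-z=tcp://127.0.0.1:1234c

/etc/ntopng/ntopng.conf (line to append):
-i=tcp://*:1234c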

Now that the configurations are done, we can safely start our tools as follows

$ systemctl restart ntopng
$ systemctl restart nprobe-agent

The final step is to tell ntopng to export metrics to InfluxDB, which basically consists in changing a preference as described in the documentation

Getting the licenses

nprobe-agent requires a license to work. Without a license it will work in a fully functional demo mode, but only for 5 minutes. ntopng does not require a license, but you may want to consider its Professional or Enterprise versions for a richer set of features. Similarly, InfluxDB works without a license, but an Enterprise version is available as well.

Metrics Metrics Metrics!

As all the pieces are working together now, you’ll start seeing all the network communications which are taking place on the monitored host, including information on users, processes, pods, containers, round trip times, and so on. We have already discussed how this rich information becomes available for browsing within ntopng in an earlier post System-Introspected Network and Container Visibility: A Quick Start Guide of this series.

In this post we would like to focus more on the metrics which are produced by ntopng and inserted into InfluxDB. Metrics which are not only produced but also consumed by ntopng. Indeed, ntopng transparently queries InfluxDB to produce any of the charts you’ll see when browsing its graphical user interface.

Wait, but what if you already have your dashboarding solution such as Grafana or Chronograf in place? Well, this is perfect and it is going to work seamlessly. You don’t have to stick with the ntopng graphical user interface. You are free to use your favourite solution just by connecting it to InfluxDB, which safely stores metrics and will happily serve them as well – InfluxDB is supported out of the box as a datasource by Grafana and Chronograf.

Let’s go back to the ntopng graphical user interface for a moment to have a look at some charts produced by (transparently) querying InfluxDB metrics.

The following is a stacked chart of all the network interfaces traffic on the host. Interfaces are physical as well as virtual (see for example the veth… which stands for virtual ethernet…).

 

This is a chart showing the number of flows (i.e., network communications) active at a certain container (named heapster) over time, both as client and as server.

For the same heapster container it is possible to chart the average round trip time in milliseconds over time.

Charts are also available at the level of pod, where values are averages of all the containers of the pod. For example, this is the chart with the average round trip time of pod heapster:

Again, all the charts shown above can be obtained with any other dashboarding solution, it is absolutely not necessary to use ntopng.

Metrics in Detail

Let’s have a closer look at the metrics pushed by ntopng into InfluxDB. An exhaustive list of metrics is available here. With reference to the container visibility, it is worth mentioning the following:

  • Number of flows per Container and POD
  • RTT/RTT-Variance per Container and POD
  • Number of containers per POD

Once in InfluxDB, each metric becomes a timeseries as its values are written into the database periodically. In the remainder of this post we discuss the details of the metrics and how they are stored in InfluxDB.

Name

All metrics are identified with a name. This name is what is also referred to as measurement in the InfluxDB parlance. The convention <prefix>:<suffix> is used by ntopng for metric names:

  • <prefix> is the subject of the metric, e.g., a host, a container, or an interface
  • <suffix> is what the metric represents, e.g., the traffic, the round trip time or the number of flows.

Examples of names are:

  • pod:rtt
  • pod:num_flows
  • container:num_flows
  • host:ndpi

Tags

Each metric has one or more tags associated. Tags are used to filter a metric as explained here. ntopng uses tags to enrich metrics with the name or the identifier of an interface, the IP address of a host, the identifier of a pod or container, and so on.

Examples of tags are:

  • ifid=0
  • subnet=192.168.2.0/24
  • container=89b0acbdba4b
  • pod=heapster-v1.5.2-6b5d7b57f9-g2cjz

Resolution

ntopng pushes each metric into InfluxDB at regular intervals of time, as low as 10 sec. Hence, each metric has a resolution associated. The higher the resolution, the shorter the regular interval of time the metric is pushed into InfluxDB. Resolution is configurable for certain metrics and is documented here.

Type

Metrics are of two types, namely gauges and counters.

  • Counters are metrics which always increase in time, such as for example the traffic of a certain network interface.
  • Gauges are metrics such as the number of active flows or active hosts at a certain point in time and can have any value without any constraint.

The documentation indicates the type of each metric generated.

Examples

Now that we have seen all the details behind the metrics, it’s time to use the InfluxDB CLI, influx, to execute some queries, to give a better idea of how easy it is to query and operate on metrics.

Let’s connect to InfluxDB and select database ntopng:

$ influx -precision rfc3339
Connected to http://localhost:8086 version 1.7.4
InfluxDB shell version: 1.7.4
Enter an InfluxQL query
> use ntopng
Using database ntopng
>

To list all the measurements involving a container we can do

> show measurements with measurement =~ /^container*/
name: measurements
name
----
container:num_flows
container:rtt
container:rtt_variance
>

We have 3 measurements (3 metrics), one for the number of flows and two for the round trip time and its variance, respectively.

To select the 10 most recent metric values for container:num_flows we can do

> select * from "container:num_flows" order by time desc limit 10
name: container:num_flows
time as_client as_server container ifid
---- --------- --------- --------- ----
2019-07-29T15:42:00Z 1 11 f427db2c87ac 6
2019-07-29T15:42:00Z 1 0 0a7f0bfa1a2b 6
2019-07-29T15:42:00Z 0 2 7c983b788320 6
2019-07-29T15:42:00Z 4 48 4a32234f6c35 6
2019-07-29T15:42:00Z 2 0 ceaa16daddd5 6
2019-07-29T15:42:00Z 138 34 1edb6c16e3d2 6
2019-07-29T15:42:00Z 19 121 f1e6f2b128e9 6
2019-07-29T15:42:00Z 1 0 a2a9c82c759f 6
2019-07-29T15:42:00Z 3 12 8a74af30d974 6
2019-07-29T15:41:00Z 1 0 a2a9c82c759f 6
>

As we see, there are multiple containers. To filter results for a certain container we can use the container tag as follows

> select * from "container:num_flows" where "container" = '1edb6c16e3d2' order by time desc limit 10
name: container:num_flows
time as_client as_server container ifid
---- --------- --------- --------- ----
2019-07-29T15:44:00Z 136 24 1edb6c16e3d2 6
2019-07-29T15:43:00Z 142 40 1edb6c16e3d2 6
2019-07-29T15:42:00Z 138 34 1edb6c16e3d2 6
2019-07-29T15:41:00Z 142 28 1edb6c16e3d2 6
2019-07-29T15:40:00Z 140 21 1edb6c16e3d2 6
2019-07-29T15:39:00Z 139 38 1edb6c16e3d2 6
2019-07-29T15:38:00Z 136 23 1edb6c16e3d2 6
2019-07-29T15:37:00Z 132 36 1edb6c16e3d2 6
2019-07-29T15:36:00Z 137 21 1edb6c16e3d2 6
2019-07-29T15:35:00Z 143 26 1edb6c16e3d2 6
>
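
Standard InfluxQL aggregations work as well. For instance, the following hedged example (the container identifier is the one from the output above) computes the average number of client flows per 5-minute interval over the last hour:

> select mean(as_client) from "container:num_flows" where "container" = '1edb6c16e3d2' and time > now() - 1h group by time(5m)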

Now we believe we have given you the basics to start playing with InfluxDB and system introspection. Don’t forget to drop us an email or use our official github page for any question!

Enjoy!

New Challenges in DPI Protocol Detection

In the early Internet days, each network protocol was designed for a specific purpose: SMTP for sending emails, HTTP for the web, and so on. In order to make sure that implementations were compliant with the specification, there was an RFC per protocol describing it. If a connection started with a protocol, let’s say SMTP, it remained an SMTP connection for its whole duration, meaning that the protocol behind a given connection was persistent. At least in the early days.

Unfortunately the modern Internet does not respect this rule anymore. The use of NAT and firewalls, which started in the late 90s, created several issues for Internet communications, and some companies decided that ease of use was better than being standards compliant. For instance, VoIP is a typical application that is not firewall-friendly, as SIP/RTP/H.323/STUN are complex to operate compared to double clicking on the Skype icon. In order for Skype to operate through the firewall, it impersonated protocols likely to pass through a firewall (for instance HTTP) and, once the communication was established, Skype used the open connection to exchange data using a new protocol (the proprietary Skype protocol, no longer HTTP). This is where the mess started: an HTTP connection turned into a different protocol.

Today this practice has exploded with protocols such as Facebook WhatsApp/Messenger and Google Hangout/Duo/Meet. These protocols start with STUN and then become something else after a few packets. See the example below: this is a WhatsApp call that started as STUN and then, at packet 13, became something else that Wireshark was unable to decode.

There are many other protocols that behave like that: for instance the popular Signal messenger application starts as STUN and then changes to DTLS.

Another problem with DPI and apps is that companies like Google and Facebook have several applications that overlap in terms of functionality and thus, from the DPI standpoint, are almost alike. Furthermore, such apps share services as they are very similar. For instance, WhatsApp, Instagram and Messenger chat are all based on HTTPS services provided by edge-mqtt.facebook.com. So should traffic from/to edge-mqtt.facebook.com be classified as WhatsApp, Instagram or Messenger? You might consider this an unimportant question, but it is instead very important when using nDPI to monitor inline traffic (e.g. in ntopng Edge), because you might want to block Instagram but not WhatsApp.

To make things short, we have started to enhance nDPI to support these changes in Internet protocols. The latest nDPI versions implement a STUN cache to handle protocols based on it, such as WhatsApp. This cache is used to detect as WhatsApp both the main connection and the sub-connections that would otherwise be marked as STUN and not as WhatsApp.

For the other problem (i.e. the same service shared by multiple protocols), we are implementing another cache so that, if for instance an Instagram user accesses edge-mqtt.facebook.com, that connection will be marked as Instagram instead of Messenger, as it is marked today. As soon as we have finalised this implementation we will merge the code into nDPI.

Bottom line: DPI is still relevant today, but new protocols (that don’t follow standards but jeopardise them) are creating new challenges. nDPI is tackling them, but over time things are getting more complicated due to encryption and these bad practices. At ntop we like challenges, so we’re implementing solutions; however, it is a pity that Internet protocols are becoming so messy and completely non-standard. I really miss the early Internet days!

Enjoy!
