From time to time, we receive inquiries asking us to position PF_RING (DNA and Libzero) against Intel DPDK (Data Plane Development Kit). As we have no access to DPDK, all we can do is compare the two technologies using the DPDK documents we can find on the Internet. The first difference is that PF_RING is an open technology, whereas DPDK is available only to licensees. Judging from DPDK performance reports, PF_RING seems to be slightly more efficient than DPDK on minimal-size packets (DNA/Libzero sustain 14.88 Mpps on multiple 10 Gbit ports, and you can run the DNA tests yourself using the companion demo applications), whereas with larger packets their performance should be equivalent. PF_RING also holds a performance advantage (in both speed and packet latency) over netmap, which we believe is because, for minimal-size packets, the cost of netmap’s system calls is not negligible.
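For readers wondering where the 14.88 Mpps figure comes from, it is the theoretical line rate of a 10 Gbit Ethernet port with minimal-size frames. The short sketch below reproduces the arithmetic, assuming standard Ethernet framing overhead (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap on top of each frame). The function name `line_rate_pps` is ours and is not part of PF_RING.

```python
# Theoretical Ethernet line rate in packets per second.
# Assumption: standard Ethernet overhead per frame on the wire
# (preamble + SFD = 8 bytes, inter-frame gap = 12 bytes).

PREAMBLE_SFD_BYTES = 8
INTERFRAME_GAP_BYTES = 12


def line_rate_pps(link_bps, frame_bytes):
    """Return the maximum packets/sec for a link speed and frame size.

    frame_bytes is the Ethernet frame size including the 4-byte FCS
    (64 bytes for a minimal-size frame).
    """
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    for size in (64, 128, 512, 1518):
        mpps = line_rate_pps(10e9, size) / 1e6
        print(f"{size:>5}-byte frames on 10 Gbit: {mpps:.2f} Mpps")
```

With 64-byte frames each packet occupies 84 bytes (672 bits) on the wire, so a 10 Gbit port tops out at roughly 14.88 million packets per second. With 1518-byte frames the rate drops below 1 Mpps, which is why the per-packet cost of a framework matters most at minimal packet size.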
So where is the main difference then? PF_RING was created as a comprehensive packet processing framework that can be used out of the box by both end-users and application developers (the DPDK name suggests it is aimed only at developers).
PF_RING for End-Users
Using libpcap-over-PF_RING you can take any pcap-based legacy application (e.g. Wireshark, Bro), or use PF_RING DAQ for Snort, and run all those applications at line rate without any code change. The support for symmetric flow-aware hardware traffic balancing in DNA allows applications to really scale by simply spawning more application instances as the number of packets to process increases. Using Libzero you can implement your own packet balancing policy across applications in zero-copy, as demonstrated by the pfdnacluster_master application. Versatile packet distribution in zero-copy is a unique feature of Libzero: you can take existing applications and, without changing a single line of code, balance the traffic across them at line rate. This enables PF_RING users to scale existing applications by balancing incoming traffic across multiple instances, and thus preserve their investments (i.e. you do not have to buy new/faster hardware as the traffic rate increases).
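To illustrate what "symmetric flow-aware" balancing means, here is a minimal sketch of a balancing policy that sends both directions of a flow to the same application instance. It is only an illustration of the idea: the function name `symmetric_flow_queue` and the CRC32-based hash are our assumptions, and this is neither the hash computed by the NIC nor the code used by pfdnacluster_master.

```python
import ipaddress
import zlib


def symmetric_flow_queue(src_ip, dst_ip, src_port, dst_port, proto, num_queues):
    """Pick a queue (application instance) for a packet.

    The two endpoints are sorted before hashing, so A->B and B->A
    produce the same key and therefore land on the same queue.
    """
    a = (int(ipaddress.ip_address(src_ip)), src_port)
    b = (int(ipaddress.ip_address(dst_ip)), dst_port)
    low, high = sorted((a, b))  # direction-independent ordering
    key = f"{low[0]}:{low[1]}-{high[0]}:{high[1]}-{proto}".encode()
    return zlib.crc32(key) % num_queues


if __name__ == "__main__":
    forward = symmetric_flow_queue("10.0.0.1", "192.168.1.5", 43512, 80, 6, 4)
    reverse = symmetric_flow_queue("192.168.1.5", "10.0.0.1", 80, 43512, 6, 4)
    print("client->server queue:", forward)
    print("server->client queue:", reverse)
```

Because every instance sees both halves of each conversation, flow-based applications (IDS, DPI, NetFlow probes) can be scaled by adding instances without sharing state between them.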
PF_RING for Programmers
We have designed PF_RING mostly for programmers, trying to implement all those features that programmers need, so that developers can focus on application development, rather than on packet processing.
- PF_RING is released under the LGPL, so you can build royalty-free open-source and proprietary (closed-source) applications.
- No maintenance or SDK fees: ntop does it for free.
- Line rate packet RX/TX by means of a simple receive/send API call.
- Low-latency (3.45 usec) packet switching across interfaces. DNA takes care of all low-level details, so that for instance you can receive a packet on a 1 Gbit NIC and forward it on a 10 Gbit NIC. All in zero-copy, and at line-rate of course.
- Seamless support for hardware (Intel 82599 and Silicom Redirector) and software (PF_RING filters and BPF) packet filtering, so that independently of the NIC you use, the PF_RING framework takes care of filtering.
- PF_RING pre-fetches packet contents for you, so that if instead of just counting traffic you want to do real accounting or payload inspection (e.g. DPI or NetFlow), your CPU does not have to spend cycles just fetching the packet from memory. You can verify this yourself by running “pfcount -t”, or by doing the same on other similar frameworks (the performance of netmap’s test application recv_test.c drops by about 50% when it reads a single byte from each received packet), to see what we mean.
- Packet processing in software does not happen in sequence as it does in hardware. Using Libzero you can queue packets in zero-copy and hold a packet buffer until you are done with it (you can even transmit the buffer holding a packet you have just received), so that your application can work on multiple packets at once (e.g. whenever you need to reassemble a fragmented packet) or across multiple threads without copying packet contents, and thus jeopardizing performance, as happens with other frameworks.
- Libzero’s native zero-copy packet distribution supports both divide-and-conquer and scatter-and-gather, so that you can partition your packet processing workload across both applications and threads by leveraging the PF_RING framework.
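The idea of holding packet buffers until you are done with them can be pictured with a small buffer pool. The sketch below is a conceptual model only: `BufferPool` and `receive_into` are hypothetical names, not the Libzero API, and the copy performed in `receive_into` merely stands in for the NIC writing a packet into the buffer. What it shows is that the application keeps buffer slots (not copies) for as long as it needs them and returns them to the pool afterwards.

```python
class BufferPool:
    """Fixed set of packet buffers carved out of one memory region."""

    def __init__(self, num_slots, slot_size):
        self._slot_size = slot_size
        self._memory = bytearray(num_slots * slot_size)
        self._view = memoryview(self._memory)
        self._free = list(range(num_slots))

    def acquire(self):
        """Take a free slot; the caller owns it until release()."""
        if not self._free:
            raise RuntimeError("no free packet buffers")
        return self._free.pop()

    def view(self, slot, length):
        """Return a window onto the slot's memory (no bytes are copied)."""
        start = slot * self._slot_size
        return self._view[start:start + length]

    def release(self, slot):
        """Give the slot back so it can hold a new packet."""
        self._free.append(slot)

    def free_slots(self):
        return len(self._free)


def receive_into(pool, payload):
    """Simulated receive: stands in for the NIC filling a buffer."""
    slot = pool.acquire()
    pool.view(slot, len(payload))[:] = payload
    return slot, len(payload)


if __name__ == "__main__":
    pool = BufferPool(num_slots=8, slot_size=2048)
    # Hold every fragment's buffer until the whole datagram has arrived.
    held = [receive_into(pool, frag) for frag in (b"frag-1|", b"frag-2|", b"frag-3")]
    total = sum(length for _, length in held)
    print("fragments held:", len(held), "total bytes:", total)
    # Inspect the fragments in place, then hand the buffers back.
    for slot, length in held:
        print(bytes(pool.view(slot, length)))  # bytes() copies only for printing
        pool.release(slot)
    print("free slots after release:", pool.free_slots())
```

In a real deployment the held slots could equally be handed to another thread or queued for transmission, since ownership travels with the slot rather than with a copy of the data.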
Final Remarks
While most benchmark tools simply count packets without doing any processing on them, real-life packet processing is a much more complex activity. We do not believe that packet capture/transmission performance is the only metric to look at. Seamless hardware NIC support, the ability to support legacy applications, out-of-the-box components for popular applications (e.g. PF_RING DAQ), zero-copy operations across threads and applications, and the richness of the framework API are all as important as speed. PF_RING features all of them.