Datadog Network Monitoring

Device and performance monitoring

Network Performance Monitoring

End-to-end visibility into on-prem and cloud networks, including application-layer performance and the health of bare-metal appliances.

Act on real-time network insights

  • Monitor the health of traffic between any two endpoints at the app, IP address, port, or process ID (PID) layers.
  • Track key network metrics, such as TCP retransmits, latency, and connection churn.
  • Use visualizations of network traffic across applications, containers, availability zones, and datacenters to help optimize your migrations.

See what matters—not just IP addresses

  • View communication between services, pods, cloud regions, and cloud resources.
  • Isolate network issues in your Envoy-powered service mesh and troubleshoot inefficient load balancing.
  • Manage cloud networking costs by pinpointing the services and teams responsible for large traffic spikes.

Gain deep DNS visibility

  • Analyze system-wide DNS performance without having to SSH into individual machines.
  • Distinguish between client-side errors and server-side failures.
  • Assess DNS server health with request-volume, response-time, and error-code metrics.
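
To make the client-side versus server-side distinction concrete, here is a minimal Python sketch that buckets DNS response codes. The RCODE numbers are the standard ones from the IANA registry; the grouping into "client" and "server" buckets and the function names are assumptions made for illustration, not a description of how Datadog classifies responses.

```python
from collections import Counter

# Standard DNS RCODE values (RFC 1035 / IANA DNS parameters registry).
RCODE_NAMES = {
    0: "NOERROR",
    1: "FORMERR",
    2: "SERVFAIL",
    3: "NXDOMAIN",
    4: "NOTIMP",
    5: "REFUSED",
}

# Assumed grouping, for illustration only: malformed queries and lookups of
# nonexistent names usually point at the client; the rest point at the server.
CLIENT_SIDE = {"FORMERR", "NXDOMAIN"}
SERVER_SIDE = {"SERVFAIL", "NOTIMP", "REFUSED"}


def classify_rcode(rcode):
    """Map a numeric RCODE to 'ok', 'client_error', 'server_failure', or 'other'."""
    name = RCODE_NAMES.get(rcode, "OTHER")
    if name == "NOERROR":
        return "ok"
    if name in CLIENT_SIDE:
        return "client_error"
    if name in SERVER_SIDE:
        return "server_failure"
    return "other"


def summarize(rcodes):
    """Count responses per bucket for a batch of observed RCODEs."""
    return Counter(classify_rcode(r) for r in rcodes)


if __name__ == "__main__":
    # Hypothetical sample of response codes from one resolver.
    print(summarize([0, 0, 3, 2, 0, 5]))
```

A sustained rise in the server-side bucket for one resolver suggests a problem with that DNS server, while a rise in the client-side bucket more often points to a misconfigured caller.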

Monitor connections to cloud services

  • Pivot to integration metrics to determine if an issue lies with a cloud provider or originates from your systems.
  • Filter down into subcomponents such as specific S3 buckets or RDS databases for more granular insights.
  • Observe and analyze traffic to Amazon S3, Amazon Elastic Load Balancing (ELB), GCP BigQuery, and other managed cloud services.

Network Device Monitoring

Unified health monitoring, troubleshooting, and capacity planning for network equipment.

View stats on any interface, on any device, on any network

  • Take inventory with a complete list of all your network hardware.
  • Monitor even the largest environments with a highly scalable and lightweight agent.
  • Automatically discover and collect metrics on your network from any device, including those from Cisco, Palo Alto, Dell, F5, Juniper, and other leading brands.

Quickly expose performance with next-level tooling

  • Search for devices and then drill down to view the performance of specific interfaces with a single click.
  • Filter device data according to helpful parameters, such as location.
  • Create custom dashboards with drag-and-drop widgets to help you evaluate device performance at a glance.

Get full visibility into traffic flows for troubleshooting and capacity planning

  • Easily surface top talkers and rates of TCP flags such as RST, SYN, ACK, PSH, FIN, and URG.
  • Filter network flow data by key information including volume, IP address, protocol, TCP/UDP source/destination port, and other out-of-the-box and custom tags.
  • Collect network flow data in common formats and standards including NetFlow, IPFIX, sFlow, and J-Flow.
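
As a rough illustration of what "top talkers" and TCP flag rates mean, the Python sketch below aggregates a handful of flow records. The record layout (`src`, `bytes`, `tcp_flags`) is a hypothetical one chosen for the example and is not the schema of any flow format or of Datadog; the flag bit values are the standard ones from the TCP header.

```python
from collections import Counter

# Standard TCP header flag bits.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def top_talkers(flows, n=3):
    """Return the n source addresses that sent the most bytes."""
    totals = Counter()
    for flow in flows:
        totals[flow["src"]] += flow["bytes"]
    return totals.most_common(n)


def flag_counts(flows):
    """Count how many flows carried each TCP flag at least once."""
    counts = Counter()
    for flow in flows:
        flags = flow.get("tcp_flags", 0)
        for name, bit in TCP_FLAGS.items():
            if flags & bit:
                counts[name] += 1
    return counts


# Hypothetical flow records, for illustration only.
SAMPLE_FLOWS = [
    {"src": "10.0.0.1", "bytes": 1000, "tcp_flags": 0x12},  # SYN + ACK
    {"src": "10.0.0.2", "bytes": 300, "tcp_flags": 0x04},   # RST
    {"src": "10.0.0.1", "bytes": 500, "tcp_flags": 0x18},   # PSH + ACK
]

if __name__ == "__main__":
    print(top_talkers(SAMPLE_FLOWS, n=2))
    print(flag_counts(SAMPLE_FLOWS))
```

A spike in RST counts from a single source, for example, is the kind of signal that usually warrants a closer look at that host or the path to it.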

Identify device-level problems with SNMP Traps

  • Create monitors on specific SNMP Trap events to get alerted to device issues as soon as they occur.
  • Speed up troubleshooting with access to full Trap context, including Trap name, OID, namespace, admin status, and more.
  • Drill down into an individual device to see its SNMP Trap history.

Detect issues faster through advanced alerting

  • Quickly configure alerts on many devices or interfaces at once.
  • Use forecasting to determine when interfaces will exceed their available bandwidth.
  • Rely on anomaly detection to quickly spot a malfunctioning device.
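
The idea behind bandwidth forecasting can be sketched with a simple linear trend. The Python example below fits a least-squares line to utilization samples and extrapolates to the point where the line crosses the interface capacity. This is a deliberately simplified stand-in: the source does not describe the forecasting algorithm Datadog uses, and the function names and sample figures are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def time_to_capacity(xs, ys, capacity):
    """Estimate the x value (e.g. day) at which the trend reaches capacity.

    Returns None when usage is flat or decreasing, since the line
    never crosses the capacity in that case.
    """
    slope, intercept = fit_line(xs, ys)
    if slope <= 0:
        return None
    return (capacity - intercept) / slope


if __name__ == "__main__":
    days = [0, 1, 2, 3]
    mbps = [100, 200, 300, 400]  # hypothetical daily peak utilization
    print(time_to_capacity(days, mbps, capacity=1000))
```

Real traffic is seasonal and noisy, so a production forecast would need to account for daily and weekly cycles rather than rely on a straight line.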

Reduce mean time to resolution (MTTR)

  • Collaborate across teams, move faster, and solve problems more efficiently.
  • Consolidate all your network monitoring needs into a single pane of glass.
  • Correlate issues between network and application teams thanks to an all-in-one tool.

Troubleshoot network problems and create alerts using syslog messages

  • Set syslog-based monitors on specific SNMP Trap names or other elements of the syslog record.
  • See syslog messages in the context of other events for improved troubleshooting.
  • Retain device logs for root cause analysis and compliance.
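
For readers unfamiliar with the syslog record, the Python sketch below decodes the priority (PRI) field that begins each message. Per the syslog RFCs, PRI encodes the facility and severity as facility * 8 + severity. The helper names and the sample message are hypothetical, and the alerting rule is only an assumption about what a severity-based monitor might check.

```python
import re

# The PRI field is a 1-3 digit number in angle brackets at the start of the message.
PRI_RE = re.compile(r"^<(\d{1,3})>")

# Severity names indexed by their numeric value (0 = most severe).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def parse_pri(line):
    """Return (facility, severity_name) for a syslog line, or None if PRI is absent."""
    match = PRI_RE.match(line)
    if match is None:
        return None
    pri = int(match.group(1))
    return pri // 8, SEVERITIES[pri % 8]


def should_alert(line, threshold="err"):
    """Assumed rule: alert when severity is at least as severe as the threshold."""
    parsed = parse_pri(line)
    if parsed is None:
        return False
    return SEVERITIES.index(parsed[1]) <= SEVERITIES.index(threshold)


if __name__ == "__main__":
    # Hypothetical device message: facility 23 (local7), severity 3 (err).
    sample = "<187>Jan  5 10:00:00 switch01 LINK-3-UPDOWN: Interface Gi0/1, changed state to down"
    print(parse_pri(sample), should_alert(sample))
```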