Combining Passive DNS and BGP Big‑Data for Threat Intelligence
- by Staff
In the evolving landscape of cyber threats, the integration of diverse telemetry sources has become essential for achieving high-fidelity detection and attribution. Two of the most powerful yet complementary datasets available for network-centric threat intelligence are passive DNS (pDNS) and Border Gateway Protocol (BGP) telemetry. Passive DNS captures DNS resolution activity across broad populations of resolvers and clients, revealing which domains are queried, how often, and what IP addresses are returned. BGP, on the other hand, records the control-plane dynamics of internet routing, including how IP prefixes are announced, withdrawn, and propagated across autonomous systems (ASes). When analyzed together at big-data scale, pDNS and BGP expose the infrastructure footprint of malicious actors in more detail than either source provides alone, enabling analysts to detect, correlate, and anticipate threats that neither dataset would surface in isolation.
Passive DNS is inherently behavioral and temporal. It records the actual usage of DNS on the internet, providing insight into which domains are being resolved by users and systems in near real time. These logs typically include the queried domain, query type, response IPs, TTLs, and the timestamp of observation. When collected from multiple vantage points and federated into a central data platform, pDNS creates a longitudinal history of domain-to-IP mappings. This is invaluable for tracking fast-flux infrastructure, identifying domains used in botnets, phishing, or malware delivery, and observing patterns of domain aging, use, and retirement. However, while pDNS excels at the application layer, it offers limited insight into the ownership and routing properties of the resolved IPs.
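As a rough illustration of the longitudinal history described above, the sketch below collapses raw observations into one row per domain-to-IP pair with first-seen and last-seen timestamps. The `PdnsObservation` record and the `build_history` helper are hypothetical names chosen for this example, and the field layout is an assumption rather than the schema of any particular pDNS provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdnsObservation:
    ts: int      # Unix seconds at which the sensor saw the response
    qname: str   # queried domain
    qtype: str   # query type, e.g. "A" or "AAAA"
    rdata: str   # returned IP address
    ttl: int     # TTL carried in the response


def build_history(observations):
    """Collapse raw observations into one row per (domain, IP) pair."""
    history = {}
    for obs in observations:
        # Normalize case and the trailing root dot so variants share a key.
        key = (obs.qname.lower().rstrip("."), obs.rdata)
        row = history.setdefault(
            key, {"first_seen": obs.ts, "last_seen": obs.ts, "count": 0}
        )
        row["first_seen"] = min(row["first_seen"], obs.ts)
        row["last_seen"] = max(row["last_seen"], obs.ts)
        row["count"] += 1
    return history
```

Keeping only first-seen, last-seen, and a count per pair is one common way to bound storage; a pipeline that needs per-query detail would retain the raw rows as well.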
BGP telemetry fills this gap by showing how IP space is actually advertised and routed. Each BGP announcement carries details such as the prefix, the AS path and its origin AS, and the next hop, and collectors record the timestamp of each announcement or withdrawal. This data reveals which ASes originate which blocks of IP addresses, how routes change over time, and whether prefixes are being hijacked, rapidly reassigned, or announced inconsistently. BGP also helps identify transient, newly announced, or hijacked networks, all of which are common indicators of infrastructure abuse.
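One way to surface transient prefixes is to pair announcements with withdrawals and measure how long each route stayed up. The sketch below does this for a simplified, single-peer view of the update stream. `BgpUpdate`, `announcement_intervals`, and `short_lived` are illustrative names, and real collector feeds would need per-peer state that this example omits.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BgpUpdate:
    ts: int                          # Unix seconds recorded by the collector
    kind: str                        # "A" = announcement, "W" = withdrawal
    prefix: str                      # e.g. "192.0.2.0/24"
    origin_as: Optional[int] = None  # absent on withdrawals


def announcement_intervals(updates):
    """Return (prefix, origin_as, start, end) tuples; end is None if still up."""
    active = {}      # prefix -> (origin_as, start)
    intervals = []
    for u in sorted(updates, key=lambda u: u.ts):
        if u.kind == "A":
            current = active.get(u.prefix)
            if current is None:
                active[u.prefix] = (u.origin_as, u.ts)
            elif current[0] != u.origin_as:
                # Origin changed: close the old interval and open a new one.
                intervals.append((u.prefix, current[0], current[1], u.ts))
                active[u.prefix] = (u.origin_as, u.ts)
            # A repeat announcement from the same origin keeps the interval open.
        elif u.kind == "W" and u.prefix in active:
            origin, start = active.pop(u.prefix)
            intervals.append((u.prefix, origin, start, u.ts))
    for prefix, (origin, start) in active.items():
        intervals.append((prefix, origin, start, None))
    return intervals


def short_lived(intervals, max_lifetime_s=3600):
    """Keep only closed intervals shorter than max_lifetime_s."""
    return [i for i in intervals
            if i[3] is not None and i[3] - i[2] < max_lifetime_s]
```

The one-hour default for `max_lifetime_s` is an arbitrary placeholder; a sensible threshold depends on the collector and on how much routine churn the monitored prefixes show.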
The power of combining these datasets lies in their cross-resolution. When a domain is observed in pDNS resolving to an IP address, that IP can be mapped to its corresponding BGP prefix and origin AS at the time of observation. This allows analysts to construct a timeline of domain-to-infrastructure binding, enriched with ownership and routing metadata. For example, if a previously benign domain suddenly resolves to an IP in a new AS with a short-lived BGP announcement, it could indicate that the domain has been repurposed for malicious use. Conversely, if multiple suspicious domains begin resolving to different IPs within a single BGP prefix controlled by an obscure hosting provider with minimal historical reputation, this could indicate the emergence of a bulletproof hosting operation.
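The mapping from a resolved IP to its prefix and origin AS at the time of observation can be sketched as a time-aware longest-prefix match. The example assumes the routing history is already available as `(prefix, origin_as, start, end)` tuples. That layout and the `lookup_origin` name are assumptions made for illustration, and the addresses and AS numbers come from ranges reserved for documentation.

```python
import ipaddress


def lookup_origin(ip, ts, intervals):
    """Return (prefix, origin_as) of the most specific route covering ip at ts.

    intervals: iterable of (prefix, origin_as, start, end), where end is None
    for a route that is still announced. Returns None if nothing matches.
    """
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix, origin_as, start, end in intervals:
        net = ipaddress.ip_network(prefix)
        if net.version != addr.version or addr not in net:
            continue
        # The route must have been announced at the moment of the DNS answer.
        if ts < start or (end is not None and ts >= end):
            continue
        if best is None or net.prefixlen > best[0].prefixlen:
            best = (net, origin_as)
    return (str(best[0]), best[1]) if best else None


routes = [
    ("192.0.2.0/24", 64500, 1000, 2000),
    ("192.0.2.0/25", 64501, 1500, None),
]
print(lookup_origin("192.0.2.10", 1200, routes))
print(lookup_origin("192.0.2.10", 1600, routes))
```

The linear scan keeps the example short. At production volume the same logic would more likely sit on top of a radix trie or an interval-indexed store.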
To operationalize this fusion of pDNS and BGP data, organizations deploy large-scale ingestion and processing pipelines. pDNS logs are typically streamed from sensors into cloud data lakes or distributed file systems in columnar formats like Parquet. BGP updates, sourced from collectors like RIPE RIS, RouteViews, or BGPmon, are ingested into time-indexed stores using frameworks such as Kafka, Apache Flink, or Apache Beam. Both datasets are time-partitioned, with pDNS indexed by domain and IP and BGP indexed by prefix, to enable efficient joins. A common enrichment step involves matching pDNS records against BGP prefix tables to derive the origin AS, route age, and route stability at the time of the DNS resolution. Because both DNS and BGP data are highly dynamic, this join must be temporal: each DNS observation has to be matched against the routing state that was valid at its timestamp, or the resulting attribution will be wrong.
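The route age and stability attributes mentioned above could be derived along the following lines once a DNS observation has been matched to a prefix. The function name, its parameters, and the 24-hour window are assumptions for illustration. The sketch expects the update timestamps for the matched prefix to be pre-sorted.

```python
from bisect import bisect_left, bisect_right


def route_context(observation_ts, announced_since, update_times, window_s=86_400):
    """Describe the matched route as it looked when the DNS answer was seen.

    announced_since: start of the announcement interval covering observation_ts.
    update_times: sorted Unix timestamps of BGP updates seen for the prefix.
    """
    # Count updates in [observation_ts - window_s, observation_ts].
    lo = bisect_left(update_times, observation_ts - window_s)
    hi = bisect_right(update_times, observation_ts)
    return {
        "route_age_s": observation_ts - announced_since,
        "updates_in_window": hi - lo,  # crude churn / stability signal
    }
```

Only updates at or before the observation timestamp are counted, so the enrichment never uses routing events that happened after the DNS answer was seen.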
Once enriched, the combined dataset enables a wide range of threat intelligence use cases. Analysts can identify domain clusters that resolve into the same BGP block over time, suggesting common operator control. They can detect changes in AS origin for the same domain, revealing shifts in hosting strategy or infrastructure reallocation. They can build models to score the reputation of IP addresses not only by the domains that resolve to them, but also by the volatility and reputation of the AS announcing them. These insights are particularly useful for early detection of malicious campaigns, which often rely on blending into benign-looking DNS patterns while leveraging transient or compromised IP space.
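Two of these use cases, grouping domains by shared prefix and spotting origin AS changes for a single domain, can be sketched over already-enriched records. The record keys (`domain`, `ts`, `prefix`, `origin_as`) and both function names are hypothetical.

```python
from collections import defaultdict


def domains_by_prefix(records):
    """Group domains by the BGP prefix they resolved into."""
    clusters = defaultdict(set)
    for r in records:
        clusters[r["prefix"]].add(r["domain"])
    return dict(clusters)


def origin_changes(records):
    """List (domain, ts, old_as, new_as) whenever a domain's origin AS changes."""
    last_origin = {}
    changes = []
    for r in sorted(records, key=lambda r: r["ts"]):
        previous = last_origin.get(r["domain"])
        if previous is not None and previous != r["origin_as"]:
            changes.append((r["domain"], r["ts"], previous, r["origin_as"]))
        last_origin[r["domain"]] = r["origin_as"]
    return changes
```

Domains served from several ASes at once, such as those behind CDNs, will alternate between origins and produce noisy output from `origin_changes`, so an allowlist or a per-domain set of known origins would probably be needed in practice.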
Machine learning also benefits from this data fusion. Feature vectors for DNS-based classifiers can be enriched with BGP-derived attributes such as AS age, prefix size, frequency of route changes, presence of MOAS (multiple origin AS) conflicts, or similarity to known bad infrastructure. Time-series models can correlate domain resolution bursts with route flaps, sudden prefix advertisements, or AS-path anomalies. Graph-based analysis can link domains to shared infrastructure not only by common IPs but by routing proximity, revealing more persistent threat actor fingerprints.
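A minimal sketch of how a few of these BGP-derived attributes might be computed for one prefix follows. The input shapes, the feature names, and the idea of passing the route-change count and AS first-seen time in as precomputed values are all assumptions made to keep the example small.

```python
import ipaddress
from collections import defaultdict


def moas_prefixes(rib_entries):
    """Return {prefix: sorted origins} for prefixes seen with more than one origin.

    rib_entries: iterable of (prefix, origin_as) pairs from a routing snapshot.
    """
    origins = defaultdict(set)
    for prefix, origin_as in rib_entries:
        origins[prefix].add(origin_as)
    return {p: sorted(o) for p, o in origins.items() if len(o) > 1}


def bgp_features(prefix, rib_entries, route_changes_24h, as_first_seen_ts, now_ts):
    """Build BGP-side features to append to a DNS classifier's feature vector."""
    net = ipaddress.ip_network(prefix)
    return {
        "prefix_len": net.prefixlen,
        "is_moas": int(prefix in moas_prefixes(rib_entries)),
        "route_changes_24h": route_changes_24h,
        "as_age_days": (now_ts - as_first_seen_ts) / 86_400,
    }
```

A MOAS conflict is not malicious on its own, since anycast and some multihoming setups produce it legitimately. It is better treated as one input among several than as a verdict.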
Another practical benefit is in attack surface reduction and blocking. Instead of blocking only known malicious domains, defenders can preemptively block or monitor entire BGP prefixes or ASNs associated with malicious infrastructure, particularly when supported by pDNS evidence showing active use in phishing, malware distribution, or command-and-control. For example, if multiple DGA domains start resolving to IPs within a newly allocated /24 that has no prior history and is being advertised by an AS with known abuse, that entire prefix can be flagged for inspection or blackholing.
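The prefix-level flagging in that example could look roughly like the sketch below. It assumes each enriched record already carries a suspicion flag (for instance from a DGA classifier) and the age of the covering prefix. The record keys and both thresholds are placeholders, not recommended values.

```python
from collections import defaultdict


def prefixes_to_review(records, min_domains=3, max_prefix_age_s=7 * 86_400):
    """Return prefixes where several suspicious domains landed in young IP space.

    records: dicts with keys "domain", "prefix", "prefix_age_s", "is_suspicious".
    """
    hits = defaultdict(set)
    for r in records:
        if r["is_suspicious"] and r["prefix_age_s"] <= max_prefix_age_s:
            hits[r["prefix"]].add(r["domain"])
    # Count distinct domains so repeated lookups of one name do not inflate it.
    return sorted(p for p, domains in hits.items() if len(domains) >= min_domains)
```

The output is intended as a review queue. Blackholing an entire prefix risks collateral damage to unrelated tenants, so a human or a stricter policy check would normally sit between this list and any block action.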
For threat attribution, this dual telemetry approach is invaluable. Threat actors often reuse ASNs or specific routing behaviors across campaigns. By correlating domain use in pDNS with the BGP characteristics of their hosting infrastructure, analysts can track adversary behavior even when domain names and IPs change. Attribution models can incorporate patterns such as preference for small ASNs in specific regions, short-lived prefix announcements, or usage of dark fiber providers, adding a layer of infrastructure behavior to traditional content and payload analysis.
Visualization and dashboarding are also enhanced by this fusion. Heatmaps can show resolution activity to high-risk ASNs over time. Graphs can map domain clusters to BGP topology, illustrating infrastructure reuse or divergence. Alerting systems can combine triggers, such as a domain newly resolving to a prefix that has shown no resolution activity in the previous six months, with automated scoring and triage workflows.
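A trigger of that kind can be sketched as a small stateful check that remembers when each prefix last appeared in pDNS. `NewPrefixAlerter` and its alert dictionary are hypothetical, and the six-month lookback is approximated here as 182 days.

```python
SIX_MONTHS_S = 182 * 86_400  # approximate six-month lookback, in seconds


class NewPrefixAlerter:
    """Alert when a domain resolves into a prefix with no recent pDNS history."""

    def __init__(self, lookback_s=SIX_MONTHS_S):
        self.lookback_s = lookback_s
        self.last_seen = {}  # prefix -> timestamp of most recent resolution

    def observe(self, domain, prefix, ts):
        """Record one enriched resolution; return an alert dict or None."""
        last = self.last_seen.get(prefix)
        self.last_seen[prefix] = ts if last is None else max(last, ts)
        if last is None or ts - last > self.lookback_s:
            return {"domain": domain, "prefix": prefix, "ts": ts,
                    "reason": "prefix has no resolution history in lookback"}
        return None
```

With empty state every prefix looks new, so the alerter would need to be warmed up on historical data before its alerts are fed into scoring or triage.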
In terms of compliance and operational transparency, combining BGP with pDNS also supports better data governance. Organizations can track which DNS queries resolve to IP space that leaves national boundaries, enters embargoed jurisdictions, or passes through ASNs with privacy or censorship concerns. This is essential for companies subject to data sovereignty rules or security policies that mandate traffic locality.
In conclusion, the integration of passive DNS and BGP big data unlocks a powerful multidimensional view of internet activity. Where DNS provides a window into application-layer behavior and domain usage patterns, BGP adds the structural and ownership context of the underlying infrastructure. Together, they form a complete picture of how names are resolved to resources, how those resources are routed, and how adversaries attempt to exploit the gaps between them. For threat intelligence teams, security researchers, and large-scale defenders, this fusion provides the analytical edge necessary to detect, understand, and mitigate advanced threats with greater precision, speed, and scope. As both datasets continue to grow in availability and fidelity, their combination will remain a cornerstone of proactive and infrastructure-aware cybersecurity.