Passive Information Gathering
Interesting files to check
robots.txt
Check /robots.txt for hidden directories/files.
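As a rough illustration of what to look for, the sketch below pulls the Disallow entries out of robots.txt content that has already been downloaded. The function name disallowed_paths and the sample content are hypothetical, and the parsing is deliberately simple (it ignores which User-agent group a rule belongs to).

```python
def disallowed_paths(robots_txt: str) -> list[str]:
    """Return the unique Disallow paths found in robots.txt content, in order."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        # An empty Disallow means "nothing is disallowed", so skip it.
        if field.strip().lower() == "disallow" and value and value not in paths:
            paths.append(value)
    return paths


# Hypothetical robots.txt content used only for demonstration.
sample_robots = """\
User-agent: *
Disallow: /admin/
Disallow: /backup.zip  # old archive
Allow: /public/
Disallow:
"""

if __name__ == "__main__":
    for path in disallowed_paths(sample_robots):
        print(path)
```

Each path printed is a candidate to review manually; a Disallow rule only tells crawlers to stay away, it does not prove the path exists.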
sitemap.xml/sitemaps.xml
An XML file that provides search engines with a map of a site.
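A minimal sketch of listing the URLs a sitemap advertises, assuming the file has already been saved locally and follows the common urlset/loc layout. The function name sitemap_urls and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return the text of every <loc> element in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    urls = []
    for element in root.iter():
        # Namespaced tags look like "{namespace}loc"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "loc" and element.text:
            urls.append(element.text.strip())
    return urls


# Hypothetical sitemap content used only for demonstration.
sample_sitemap = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/staff/login</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url in sitemap_urls(sample_sitemap):
        print(url)
```

The same function also works on a sitemap index file, since those list child sitemaps in loc elements too.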
Web technologies footprinting
Add-ons:
- BuiltWith
- Wappalyzer: provides insights about the technologies used on the websites you visit. Such an extension is handy mainly because it collects this information while you browse the website like any other user.
Kali utility:
- whatweb: whatweb TARGET_URL
Host command
Usage: host DOMAIN
WHOIS, dig and nslookup
- Identifying Key Personnel: WHOIS records often reveal the names, email addresses, and phone numbers of individuals responsible for managing the domain. This information can be leveraged for social engineering attacks or to identify potential targets for phishing campaigns.
- Discovering Network Infrastructure: Technical details like name servers and IP addresses provide clues about the target's network infrastructure. This can help penetration testers identify potential entry points or misconfigurations.
- Historical Data Analysis: Accessing historical WHOIS records through services like WhoisFreaks can reveal changes in ownership, contact information, or technical details over time. This can be useful for tracking the evolution of the target's digital presence.
- This information might be redacted!
| Purpose | Commandline Example |
|---|---|
| Lookup WHOIS record for a domain | whois WEBSITE |
| Lookup WHOIS record for an IP address | whois IP |
| Lookup DNS A records | nslookup -type=A WEBSITE |
| Lookup DNS MX records at DNS server | nslookup -type=MX WEBSITE 1.1.1.1 |
| Lookup DNS TXT records | nslookup -type=TXT WEBSITE |
| Lookup DNS A records | dig WEBSITE A |
| Lookup DNS MX records at DNS server | dig @1.1.1.1 WEBSITE MX |
| Lookup DNS TXT records | dig WEBSITE TXT |
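To make the WHOIS points above concrete, here is a hedged sketch that turns raw whois output into a dictionary and flags values that look redacted. WHOIS output is not standardised across registries, so the "Key: Value" layout, the function names parse_whois and is_redacted, and the sample record are all assumptions for illustration.

```python
def parse_whois(raw: str) -> dict[str, list[str]]:
    """Collect 'Key: Value' lines from raw WHOIS text, keyed by lowercase field name."""
    record: dict[str, list[str]] = {}
    for line in raw.splitlines():
        line = line.strip()
        # Skip blanks, registry comments, and lines without a key/value separator.
        if not line or line.startswith(("%", "#", ">>>")) or ":" not in line:
            continue
        key, value = line.split(":", 1)
        value = value.strip()
        if value:
            record.setdefault(key.strip().lower(), []).append(value)
    return record


def is_redacted(value: str) -> bool:
    """Heuristic: treat values mentioning redaction or privacy as withheld."""
    lowered = value.lower()
    return "redacted" in lowered or "privacy" in lowered


# Hypothetical WHOIS output used only for demonstration.
sample_whois = """\
% This is a comment line from the registry
Domain Name: EXAMPLE.COM
Registrar: Example Registrar, Inc.
Name Server: NS1.EXAMPLE.NET
Name Server: NS2.EXAMPLE.NET
Registrant Name: REDACTED FOR PRIVACY
"""

if __name__ == "__main__":
    record = parse_whois(sample_whois)
    print("Name servers:", record.get("name server", []))
    for name in record.get("registrant name", []):
        print("Registrant:", "(redacted)" if is_redacted(name) else name)
```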
Netcraft
It is a service used to gather information regarding a target domain.
Research tools: https://www.netcraft.com/resources/research-tools
DNS
The Hosts File
- It is a simple text file used to map hostnames to IP addresses, providing a manual method of domain name resolution that bypasses the DNS process.
- It is located in C:\Windows\System32\drivers\etc\hosts on Windows and in /etc/hosts on Linux and macOS.
- It can also be used to block unwanted websites by redirecting their domains to a non-existent IP address.
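The hostname-to-IP mapping described above can be sketched as a small parser. This is only an illustration of the file format; the function name parse_hosts and the sample entries are hypothetical, and it assumes the first entry for a name takes precedence.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each hostname in hosts-file content to its IP address."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        # Everything after '#' is a comment.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        ip, names = fields[0], fields[1:]
        for name in names:
            # Assumption: the first entry for a hostname wins.
            mapping.setdefault(name.lower(), ip)
    return mapping


# Hypothetical hosts-file content used only for demonstration.
sample_hosts = """\
# local overrides
127.0.0.1   localhost
10.10.10.5  target.local www.target.local  # lab machine
0.0.0.0     ads.example.com
"""

if __name__ == "__main__":
    for hostname, ip in parse_hosts(sample_hosts).items():
        print(f"{hostname} -> {ip}")
```

The last sample entry shows the blocking use case: the domain is pointed at an address that will not serve the site.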
Tools
DNSRecon
dnsrecon -d DOMAIN_NAME
DNS Dumpster
Website: https://dnsdumpster.com/
WAF Detection with wafw00f
It identifies whether a website is protected by a WAF (Web Application Firewall) and, if so, reports which one.
GitHub repository: https://github.com/EnableSecurity/wafw00f
Usage
- Default usage: wafw00f DOMAIN_NAME
- Test for all possible WAF instances: wafw00f DOMAIN_NAME -a
Passive Subdomain Enumeration
This relies on external sources of information to discover subdomains without directly querying the target's DNS servers. One valuable resource is Certificate Transparency (CT) logs, public repositories of SSL/TLS certificates. Another passive approach involves utilising search engines like Google or DuckDuckGo. By employing specialised search operators (e.g., site:), you can filter results to show only subdomains related to the target domain.
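As a sketch of the Certificate Transparency approach, the function below extracts subdomains from CT log records that have already been fetched. It assumes records shaped like the JSON output of a CT search service such as crt.sh, where each entry carries a name_value field holding one or more newline-separated hostnames; that shape, the function name subdomains_from_ct, and the sample data are assumptions.

```python
def subdomains_from_ct(records: list[dict], domain: str) -> list[str]:
    """Return the sorted, de-duplicated subdomains of `domain` seen in CT records."""
    domain = domain.lower().rstrip(".")
    found = set()
    for record in records:
        # Assumed field: "name_value" may contain several names, one per line.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # treat a wildcard entry as its parent name
            if name.endswith("." + domain):
                found.add(name)
    return sorted(found)


# Hypothetical CT log records used only for demonstration.
sample_records = [
    {"name_value": "www.example.com\nmail.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "example.com"},
    {"name_value": "www.example.com"},
    {"name_value": "notexample.com"},
]

if __name__ == "__main__":
    for subdomain in subdomains_from_ct(sample_records, "example.com"):
        print(subdomain)
```

Matching on "." + domain rather than the bare domain keeps look-alike names such as notexample.com out of the results.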
Sublist3r
Sublist3r is a Python tool designed to enumerate subdomains of websites using OSINT.
GitHub repository: https://github.com/aboul3la/Sublist3r
Usage
- Default usage: sublist3r -d DOMAIN_NAME
- Check with specific engines: sublist3r -d DOMAIN_NAME -e google,yahoo
Wayback URLs
To dump all of the links that the Wayback Machine has saved for a domain, we can use a tool called waybackurls, which is hosted on GitHub.
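For a sense of what such a dump involves, the sketch below builds a query for the Wayback Machine's CDX search endpoint and extracts the archived URLs from its JSON response. It is not the waybackurls implementation; the helper names cdx_query_url and urls_from_cdx and the sample rows are hypothetical, and no request is actually sent here.

```python
from urllib.parse import urlencode

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def cdx_query_url(domain: str) -> str:
    """Build a CDX query listing archived URLs for a domain and its subdomains."""
    params = {
        "url": f"*.{domain}/*",  # wildcard over subdomains and paths
        "output": "json",
        "fl": "original",        # only return the original URL field
        "collapse": "urlkey",    # collapse repeated captures of the same URL
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def urls_from_cdx(rows: list[list[str]]) -> list[str]:
    """Extract unique URLs from a parsed CDX JSON response (first row is a header)."""
    if not rows:
        return []
    header, *data = rows
    index = header.index("original")
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(row[index] for row in data))


# Hypothetical parsed response used only for demonstration.
sample_rows = [
    ["original"],
    ["http://example.com/"],
    ["http://example.com/old-login.php"],
    ["http://example.com/"],
]

if __name__ == "__main__":
    print(cdx_query_url("example.com"))
    for url in urls_from_cdx(sample_rows):
        print(url)
```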
Google Dorks
Examples
Limit all results to a specific domain: site:DOMAIN_NAME
Look for specific results in URL: site:DOMAIN_NAME inurl:URL_STRING
Look for specific results in the page title: site:DOMAIN_NAME intitle:TITLE_STRING
Look for specific file types (PDF,docx,zip,etc): site:DOMAIN_NAME filetype:FILE_TYPE
Enumerate subdomains: site:*.DOMAIN_NAME
Sites with directory listing enabled: intitle:"index of"
View Google's cached copy of a website (Google has since retired this operator, so it may return nothing): cache:DOMAIN_NAME
Search for exposed passwords: inurl:auth_user_file.txt
Search for exposed passwords: inurl:password.txt
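When the same dorks are run against many targets, it can help to generate them. The sketch below composes the operators listed above into a query string and a search URL; the helper names build_dork and google_search_url are hypothetical, and the queries are meant to be opened manually in a browser.

```python
from urllib.parse import quote_plus


def build_dork(domain: str = "", **operators: str) -> str:
    """Combine an optional site: filter with other operators such as filetype or inurl."""
    parts = []
    if domain:
        parts.append(f"site:{domain}")
    for operator, value in operators.items():
        # Quote values containing spaces so they are matched as a phrase.
        if " " in value:
            value = f'"{value}"'
        parts.append(f"{operator}:{value}")
    return " ".join(parts)


def google_search_url(query: str) -> str:
    """Return a Google search URL for the given query string."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    queries = [
        build_dork("example.com", filetype="pdf"),
        build_dork("example.com", inurl="admin"),
        build_dork("*.example.com"),
        build_dork(intitle="index of"),
    ]
    for query in queries:
        print(query, "->", google_search_url(query))
```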
Google Hacking Database
https://www.exploit-db.com/google-hacking-database
Email Harvesting using theHarvester
A tool that enumerates email addresses belonging to a specific domain.
GitHub repository: https://github.com/laramies/theHarvester
It also contains modules that perform active recon.
Usage
- Using the domain name: theHarvester -d DOMAIN_NAME -b SOURCES
- Using the organization name: theHarvester -d ORG_NAME -b SOURCES
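The core idea of email harvesting can be sketched in a few lines: scan text collected from public sources and keep the addresses that belong to the target domain. This is not how theHarvester is implemented; the regular expression is a simplification, and the function name emails_for_domain and the sample text are hypothetical.

```python
import re

# Simplified pattern; real-world address syntax is broader than this.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def emails_for_domain(text: str, domain: str) -> list[str]:
    """Return sorted, de-duplicated addresses at `domain` or any of its subdomains."""
    domain = domain.lower()
    matches = {match.lower() for match in EMAIL_PATTERN.findall(text)}
    selected = []
    for address in matches:
        host = address.split("@", 1)[1]
        if host == domain or host.endswith("." + domain):
            selected.append(address)
    return sorted(selected)


# Hypothetical scraped text used only for demonstration.
sample_text = (
    "Contact alice@example.com, or Bob.Smith@mail.example.com for sales. "
    "Press enquiries: press@other.org. Also listed: ALICE@example.com"
)

if __name__ == "__main__":
    for address in emails_for_domain(sample_text, "example.com"):
        print(address)
```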
Leaked Password Databases
Website: https://haveibeenpwned.com/
Other passive reconnaissance tools
- https://www.shodan.io/
- https://archive.org