Next generation web scanner. Identify what websites are running.
Released 29th November, 2009 at the Kiwicon conference (kiwicon.org) in Wellington, New Zealand.
Download whatweb-0.3.tar.gz
Latest Version 0.3
License GPLv3
Author Andrew Horton aka urbanadventurer
Introduction
Identify content management systems (CMS), blogging platforms, stats/analytics packages, javascript libraries, servers and more.
WhatWeb has over 70 plugins and needs community support to develop more. Plugins can identify systems with obvious signs removed by looking for subtle clues. For example, a WordPress site might remove the meta generator tag, but the WordPress plugin also looks for “wp-content”, which is less easy to disguise. Plugins are flexible and can return any datatype; for example, plugins can return version numbers, email addresses, account IDs and more.
There are both passive and aggressive plugins. Passive plugins use information on the page, in cookies and in the URL to identify the system. A passive request is as lightweight as a simple GET / HTTP/1.1 request. Aggressive plugins guess URLs and request more files. Plugins are easy to write; you don’t need to know Ruby to make them.
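The passive approach can be sketched in a few lines of Ruby. This is only an illustration, not WhatWeb's plugin code: the helper name wordpress_clues and the exact regular expressions are assumptions made for the example.

```ruby
# Hypothetical helper, not WhatWeb's real plugin API: looks for WordPress
# clues using only data from a single, already fetched response.
def wordpress_clues(body, url)
  clues = []
  # Obvious sign: the meta generator tag, which site owners often remove.
  clues << "meta generator" if body =~ /<meta[^>]+name=["']generator["'][^>]+content=["']WordPress/i
  # Subtle sign: asset paths under wp-content are harder to disguise.
  clues << "wp-content path" if body =~ %r{/wp-content/}i
  # The URL itself can also give the system away.
  clues << "wp URL" if url =~ %r{/wp-(login|admin)}i
  clues
end

page = '<html><head><link rel="stylesheet" href="/wp-content/themes/x/style.css"></head></html>'
p wordpress_clues(page, "http://example.com/")   # prints ["wp-content path"]
```

The page in the example has no generator tag, yet the wp-content path still identifies it.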
Example Usage
Using WhatWeb on a handful of websites.

Aggressive Plugins
There are currently aggressive plugins for Joomla, phpBB, FluxBB, OSCommerce and Tomcat.
With the passive plugin we know that ardentcreative.co.nz is running Joomla version 1.5
Be careful when using aggressive plugins with recursive site crawling. WhatWeb has no understanding of a website as a whole; it currently treats each URL separately. It also has no caching, so if you use aggressive plugins with recursion you will fetch the same files multiple times.
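To give an idea of what an aggressive test does, the Joomla plugin is described as identifying the version by comparing md5 hashes of a few files. The sketch below shows that general technique only. The file path, the fingerprint table and the function name are made up for the example and are not taken from the real plugin.

```ruby
require 'digest/md5'

# Hypothetical fingerprint table: file path => { md5 of file => version }.
# The hashes are computed from made-up contents, not from real Joomla files.
FINGERPRINTS = {
  "/media/system/js/example.js" => {
    Digest::MD5.hexdigest("contents shipped with 1.5.0")  => "1.5.0",
    Digest::MD5.hexdigest("contents shipped with 1.5.15") => "1.5.15"
  }
}

# fetch is any callable returning the body for a path (or nil).
# Injecting it keeps the sketch runnable without network access.
def identify_version(fetch, fingerprints = FINGERPRINTS)
  versions = []
  fingerprints.each do |path, known|
    body = fetch.call(path)          # one extra request per guessed URL
    next if body.nil?
    version = known[Digest::MD5.hexdigest(body)]
    versions << version if version
  end
  versions.uniq
end

fake_site = { "/media/system/js/example.js" => "contents shipped with 1.5.15" }
p identify_version(->(path) { fake_site[path] })   # prints ["1.5.15"]
```

Every entry in the table costs one more request per URL scanned, which is why this kind of test adds up quickly when combined with recursion.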
Help
WhatWeb - Next generation web scanner.
Version 0.4 by Andrew Horton aka urbanadventurer, MorningStar Security
www.morningstarsecurity.com
Usage: whatweb [options]
--input-file=FILE, -i Identify URLs found in FILE
--aggression, -a 1 passive - on-page
2 polite - follow on-page links if in the extra-urls list (default)
3 impolite - try extra-urls when plugin matches (smart, guess a few urls)
4 aggressive - try extra-urls for every plugin (guess a lot of urls)
--recursion, -r Follow links recursively. Only follows links under the path (default: off)
--depth, -d Maximum recursion depth (default: 3)
--max-links, -m Maximum number of links to follow on one page (default: 25)
--list-plugins, -l List the plugins
--run-plugins, -p Run comma delimited list of plugins. Default is to run all
--info-plugins, -I Display information about a comma delimited list of plugins. Default is all
--example-urls, -e Add example urls for each plugin to the target list
--colour=[WHEN],
--color=[WHEN] control whether colour is used. WHEN may be `never', `always', or `auto'
--log-full=FILE Log verbose output
--log-brief=FILE Log brief, one-line output
--user-agent, -U Identify as user-agent instead of WhatWeb/VERSION.
--max-threads, -t Number of simultaneous threads identifying websites in parallel (CPU intensive). Default is 5.
--help, -h This help
--verbose, -v Increase verbosity (recommended), use twice for debugging.
Plugins
There are over 70 plugins as of March, 2010. Plugins are easy to make.
Matches are made with regular expressions, Google Hack Database queries, and custom Ruby code.
For now, the probability values mean maybe (25%), probably (75%) and certain (100%).
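As a small illustration, the three probability levels could be turned into words like this. The function name is hypothetical, and the handling of in-between values is an assumption; only 25, 75 and 100 are given a meaning above.

```ruby
# Maps a match's :probability value to the wording used above.
# Treating in-between values as the nearest lower label is an assumption.
def certainty_label(probability)
  case probability
  when 100      then "certain"
  when 75...100 then "probably"
  when 25...75  then "maybe"
  else "unknown"
  end
end

[25, 75, 100].each { |pr| puts "#{pr}% => #{certainty_label(pr)}" }
```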
Plugin Information
------------------------------
Acclipse version 0.1
[53] examples, [4] matches, [ ] aggressive, [ ] passive.
Description: Acclipse is a web CMS found mainly in New Zealand and Australia. It is popular with accountants. Websites: www.acclipse.co.nz & www.acclipse.com.au
--------------------------------------------------------------------------------
Advanced-Guestbook version 0.1
[5] examples, [3] matches, [ ] aggressive, [x] passive.
Description: Web guestbook script. Homepage: http://proxy2.de/scripts.php. http://johnny.ihackstuff.com/ghdb/?function=detail&id=228 Version 2.2 is vulnerable http://www.securityfocus.com/bid/10209/info
--------------------------------------------------------------------------------
BlogSmithMedia version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Pro bloggers - www.blogsmithmedia.com
--------------------------------------------------------------------------------
Blogger version 0.1
[ ] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: Blogger.com free blogging site
--------------------------------------------------------------------------------
DiBos version 0.1
[2] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: DiBos security surveillance system homepage: www.boschsecurity.com
--------------------------------------------------------------------------------
Drupal version 0.1
[41] examples, [5] matches, [ ] aggressive, [x] passive.
Description: CMS drupal.org
--------------------------------------------------------------------------------
EarlyImpact-ProductCart version 0.1
[3] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: EarlyImpact ProductCart is an ASP commercial ecommerce system from www.earlyimpact.com. Version < 2.53 is vulnerable http://www.securityfocus.com/bid/9669 Googledork http://johnny.ihackstuff.com/ghdb/?function=detail&id=64
--------------------------------------------------------------------------------
Echo version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: CMS - www.helloecho.com
--------------------------------------------------------------------------------
FluxBB version 0.1
[7] examples, [4] matches, [x] aggressive, [ ] passive.
Description: Opensource forum written in PHP. Homepage: http://fluxbb.org/. Aggressive plugin can identify 1.2.x, 1.3-r718, 1.3-beta2, 1.4-beta2, 1.4-rc1
--------------------------------------------------------------------------------
GoAhead-Webs version 0.1
[2] examples, [1] matches, [ ] aggressive, [x] passive.
Description: Opensource, embedded webserver. Homepage: http://www.goahead.com/products/webserver/default.aspx
--------------------------------------------------------------------------------
Google-Analytics-GA version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: web visitor statistics www.google-analytics.com
--------------------------------------------------------------------------------
Google-Analytics-urchin version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: web visitor statistics www.google-analytics.com
--------------------------------------------------------------------------------
IIS-SiteNotFound version 0.1
[1] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Microsoft/IIS 5.0 default
--------------------------------------------------------------------------------
IIS-UnderConstruction version 0.1
[6] examples, [4] matches, [ ] aggressive, [ ] passive.
Description: Microsoft/IIS under construction page
--------------------------------------------------------------------------------
ISP-Config version 0.1
[1] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: ISPConfig is a free, opensource hosting control panel
--------------------------------------------------------------------------------
JQuery version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: Javascript library
--------------------------------------------------------------------------------
Jboss version 0.1
[ ] examples, [4] matches, [ ] aggressive, [ ] passive.
Description:
--------------------------------------------------------------------------------
Joomla version 0.2
[14] examples, [3] matches, [x] aggressive, [x] passive.
Description: Opensource CMS written in PHP. Homepage: http://joomla.org. Plugin can aggressively identify version by comparing md5 hashes of 4 files.
--------------------------------------------------------------------------------
Lightbox version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Javascript for nice image popups
--------------------------------------------------------------------------------
Mailto version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: email addresses in mailto: links
--------------------------------------------------------------------------------
Mambo version 0.1
[ ] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: CMS Mambo.org
--------------------------------------------------------------------------------
Minify version 0.1
[5] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: Minify is a PHP5 app that can combine multiple CSS or Javascript files, compress their contents (i.e. removal of unnecessary whitespace/comments), and serve the results with HTTP encoding (gzip/deflate) and headers that allow optimal client-side caching. It uses an enhanced port of Douglas Crockford's JSMin library. http://code.google.com/p/minify/
--------------------------------------------------------------------------------
Moodle version 0.1
[ ] examples, [2] matches, [ ] aggressive, [x] passive.
Description: Educational. Homepage: www.moodle.org
--------------------------------------------------------------------------------
MovableType version 0.1
[ ] examples, [4] matches, [ ] aggressive, [x] passive.
Description: Blogging platform www.movabletype.org
--------------------------------------------------------------------------------
NovellGroupwise version 0.1
[ ] examples, [3] matches, [ ] aggressive, [ ] passive.
Description:
--------------------------------------------------------------------------------
OSCommerce version 0.1
[32] examples, [3] matches, [x] aggressive, [x] passive.
Description: Open Source Ecommerce System in PHP. It was first released in March 2000 as 'The Exchange Project'. Branched projects include : Ian's Loaded, ZenCart, CRE Loaded, http://www.oscommerce.com
--------------------------------------------------------------------------------
Oce version 0.1
[1] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: Oce Print Exec Workgroup is easy-to-use, web-based print management software for job submission of sets of technical drawings to a single large format printer. Homepage: global.oce.com/products/print-exec-workgroup/default.aspx
--------------------------------------------------------------------------------
OpenCms version 0.1
[ ] examples, [5] matches, [ ] aggressive, [x] passive.
Description: OpenCms, professional and easy to use CMS. Homepage: http://www.opencms.org
--------------------------------------------------------------------------------
PHP-Nuke version 0.1
[11] examples, [7] matches, [ ] aggressive, [x] passive.
Description: PHP-Nuke is a free CMS. Homepage: phpnuke.org. The plugin passively recognises modules. An obvious improvement would be to aggressively discover modules and discover the phpnuke version
--------------------------------------------------------------------------------
Plesk version 0.1
[3] examples, [10] matches, [ ] aggressive, [ ] passive.
Description: Plesk is a web control panel Homepage: http://www.parallels.com/products/plesk/
--------------------------------------------------------------------------------
Plone version 0.1
[4] examples, [5] matches, [ ] aggressive, [x] passive.
Description: CMS http://plone.org
--------------------------------------------------------------------------------
Prototype version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Javascript library
--------------------------------------------------------------------------------
Quantcast version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Visitor demographics and statistics. www.quantcast.com
--------------------------------------------------------------------------------
Scriptaculous version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Javascript library
--------------------------------------------------------------------------------
Siemens-SpeedStream-Router version 0.1
[ ] examples, [3] matches, [ ] aggressive, [ ] passive.
Description:
--------------------------------------------------------------------------------
SilverStripe version 0.1
[11] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: CMS http://www.silverstripe.org
--------------------------------------------------------------------------------
Tomcat version 0.1
[ ] examples, [2] matches, [x] aggressive, [ ] passive.
Description:
--------------------------------------------------------------------------------
TypePad version 0.1
[ ] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Blogging platform http://www.typepad.com/
--------------------------------------------------------------------------------
VSNS-Lemon version 0.1
[4] examples, [6] matches, [ ] aggressive, [x] passive.
Description: VSNS is a Very Simple News System written in PHP. VSNS Lemon vulnerabilities: http://evuln.com/vulns/106/summary.html
--------------------------------------------------------------------------------
Windows-SBS version 0.1
[4] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: Microsoft Small Business Server Homepage:www.microsoft.com/sbs/en/us/default.aspx
--------------------------------------------------------------------------------
WordPress version 0.2
[ ] examples, [3] matches, [ ] aggressive, [x] passive.
Description:
--------------------------------------------------------------------------------
WordPressSpamFree version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: Wordpress spam package
--------------------------------------------------------------------------------
antiboard version 0.1
[4] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: PHP forum. homepage (gone):http://www.resynthesize.com/code/antiboard.php.
--------------------------------------------------------------------------------
apache-default version 0.2
[3] examples, [5] matches, [ ] aggressive, [ ] passive.
Description: homepage:www.apache.org
--------------------------------------------------------------------------------
asp-nuke version 0.1
[9] examples, [10] matches, [ ] aggressive, [x] passive.
Description: ASP Nuke homepage: www.aspnuke.com
ASP Nuke is an open-source software application for running a community-based web site on a web server. The requirements for the ASP Nuke content management system are: 1. Microsoft SQL Server 2000 and 2. Microsoft Internet Information Server (IIS) 5.0
--------------------------------------------------------------------------------
belkin-modem version 0.1
[3] examples, [6] matches, [ ] aggressive, [ ] passive.
Description: Homepage: http://www.belkin.com/
--------------------------------------------------------------------------------
bing-searchengine version 0.1
[1] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: Bing.com is Microsoft's search engine
--------------------------------------------------------------------------------
citrix-metaframe version 0.1
[1] examples, [3] matches, [ ] aggressive, [ ] passive.
Description:
--------------------------------------------------------------------------------
comersus version 0.1
[3] examples, [4] matches, [ ] aggressive, [ ] passive.
Description: ASP opensource shopping cart. homepage: www.comersus.com
Comersus is an active server pages software for running a professional store, seamlessly integrated with the rest of your web site. Comersus Cart is free and it can be used for commercial purposes. Full source code included and compatible with Windows and Linux Servers.
--------------------------------------------------------------------------------
coppermine version 0.1
[11] examples, [6] matches, [ ] aggressive, [x] passive.
Description: PHP & MySQL Photo Gallery homepage: www.coppermine-gallery.net
--------------------------------------------------------------------------------
cpanel version 0.1
[1] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: homepage:www.cpanel.net
--------------------------------------------------------------------------------
formmail version 0.1
[5] examples, [8] matches, [ ] aggressive, [x] passive.
Description: Common form email script.
FormMail is a Perl script written by Matt Wright to send mail with sendmail from the cgi-gateway. Early versions didn’t have a referer check. New versions could be misconfigured. Spammers are known to hunt them down (by means of cgi-scanning) and abuse them for their own evil purposes if the admin forgot to check the settings. http://www.securityfocus.com/bid/3954/discussion/
--------------------------------------------------------------------------------
index-of version 0.1
[7] examples, [7] matches, [ ] aggressive, [ ] passive.
Description: Index of
--------------------------------------------------------------------------------
invision-power-board version 0.1
[15] examples, [12] matches, [ ] aggressive, [x] passive.
Description: PHP Web Forum homepage:www.invisionpower.com
--------------------------------------------------------------------------------
ispCP-omega version 0.1
[1] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: PHP opensource, virtual hosting system homepage: http://www.isp-control.net/
--------------------------------------------------------------------------------
mailsite-express version 0.1
[5] examples, [5] matches, [ ] aggressive, [ ] passive.
Description: Webmail in ASP. Versions < 6.1.2 insecure http://marc.info/?l=bugtraq&m=113053680631151&w=2 Homepage: http://www.mailsite.com/products/express-webmail-server.asp
--------------------------------------------------------------------------------
md5 version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: MD5 sum of html body. Useful to find matching pages
--------------------------------------------------------------------------------
meta-generator version 0.1
[1] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: meta generator tag
--------------------------------------------------------------------------------
meta-powered-by version 0.1
[1] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: meta generator tag
--------------------------------------------------------------------------------
mnoGoSearch version 0.1
[5] examples, [2] matches, [ ] aggressive, [ ] passive.
Description: mnoGoSearch is an opensource website search engine. Versions 3.1.19 to 3.2.15 are vulnerable, http://www.securityfocus.com/bid/9667. http://johnny.ihackstuff.com/ghdb/?function=detail&id=65. Homepage www.mnogosearch.org
--------------------------------------------------------------------------------
oki-pbx version 0.1
[1] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: OKI PBX (phone exchange) http://www.oki.com/en/iptel/products/mxsx/maintenance.html
--------------------------------------------------------------------------------
php-cake version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: PHP MVC web framework
--------------------------------------------------------------------------------
phpBB version 0.1
[16] examples, [6] matches, [x] aggressive, [x] passive.
Description: phpBB is a free forum phpbb.org
--------------------------------------------------------------------------------
powered by... version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: Powered by xxx. This needs improvement to strip out
--------------------------------------------------------------------------------
redirect-location version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: HTTP Server string location. used with http-status 301 and 302
--------------------------------------------------------------------------------
server-header version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: HTTP Server strings
--------------------------------------------------------------------------------
snom-phone version 0.1
[2] examples, [4] matches, [ ] aggressive, [ ] passive.
Description: voip phone homepage:www.snom.com
--------------------------------------------------------------------------------
title version 0.1
[1] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: The page title
--------------------------------------------------------------------------------
toshiba-printer version 0.1
[1] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Toshiba printer Top Access
--------------------------------------------------------------------------------
uncommon-headers version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: Uncommon HTTP server headers. The blacklist includes all the standard headers and many non standard but common ones. Interesting but fairly common headers should have their own plugins, eg. x-powered-by, server and x-aspnet-version. Info about headers can be found at www.http-stats.com
--------------------------------------------------------------------------------
vbulletin version 0.1
[12] examples, [7] matches, [ ] aggressive, [x] passive.
Description: VBulletin is a PHP forum. http://johnny.ihackstuff.com/ghdb/?function=detail&id=336
--------------------------------------------------------------------------------
vp-asp version 0.1
[7] examples, [5] matches, [ ] aggressive, [x] passive.
Description: VP-ASP (Virtual Programming - ASP) Shopping Cart. Free & commercial versions. http://johnny.ihackstuff.com/ghdb/?function=detail&id=324 Homepage:www.vpasp.com
--------------------------------------------------------------------------------
webguard version 0.1
[3] examples, [3] matches, [ ] aggressive, [ ] passive.
Description: security surveillance homepage: http://novuscctv.com/
--------------------------------------------------------------------------------
x-aspnet-version-header version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: HTTP header, x-aspnet-version
--------------------------------------------------------------------------------
x-powered-by-header version 0.1
[ ] examples, [ ] matches, [ ] aggressive, [x] passive.
Description: HTTP header, x-powered-by
--------------------------------------------------------------------------------
xtra-business-hosting version 0.1
[1] examples, [1] matches, [ ] aggressive, [ ] passive.
Description: Hosting at Xtra.co.nz
--------------------------------------------------------------------------------
Writing Plugins
A typical plugin looks like this:

There are 3 levels to a plugin: simple matches, passive tests and aggressive tests. You don’t need to know Ruby to write plugins with simple matches. Passive and aggressive tests are written in Ruby.
If you port a GHDB match, use :ghdb. I usually rewrite the GHDB matches with regular expressions, especially if they require inurl:
Example:
# http://johnny.ihackstuff.com/ghdb?function=detail&id=1840
{:name=>"GHDB: \"Powered by Vsns Lemon\" intitle:\"Vsns Lemon\"",
:probability=>100,
:ghdb=>'"Powered by Vsns Lemon" intitle:"Vsns Lemon"'}
Note that GHDB queries are case insensitive, as Google queries are. The supported operators are intitle:, inurl: and filetype:.
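To make the semantics of a :ghdb match concrete, here is a rough sketch of how such a query could be checked against a page. It is not WhatWeb's implementation. The function name ghdb_match? and the parsing rules are assumptions, and it covers only quoted phrases, bare words and the three operators listed above.

```ruby
# Hypothetical evaluator for a small subset of GHDB-style queries.
# Terms are compared case-insensitively, as a Google query would be.
def ghdb_match?(query, body, url)
  title = body[/<title[^>]*>(.*?)<\/title>/im, 1].to_s
  # Each token is an optional operator followed by a quoted phrase or a bare word.
  query.scan(/(?:(intitle|inurl|filetype):)?(?:"([^"]*)"|(\S+))/i).all? do |op, phrase, word|
    term = (phrase || word).downcase
    case op && op.downcase
    when "intitle"  then title.downcase.include?(term)
    when "inurl"    then url.downcase.include?(term)
    when "filetype" then url.downcase.split("?").first.to_s.end_with?(".#{term}")
    else body.downcase.include?(term)
    end
  end
end

page  = "<html><head><title>VSNS Lemon news</title></head><body>powered by vsns lemon</body></html>"
query = '"Powered by Vsns Lemon" intitle:"Vsns Lemon"'
p ghdb_match?(query, page, "http://example.com/")   # prints true
```

Every term in the query has to match, which is why a page with the right footer text but a different title is rejected.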
Each plugin can access @body, @meta, @status and @base_uri variables.
Passive tests add matches to the m array. Each match is a hash containing the name of the match, its probability and more.
The entire hash is returned with Full output; Brief output returns just the match, :version and :string.
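A passive test might look roughly like the sketch below. It does not run inside WhatWeb, so a stand-in class supplies the instance variables. The class name SketchPlugin, the product name ExampleCMS and the idea that @meta holds the response headers as a hash are all assumptions made for the example.

```ruby
# Stand-in for a plugin object. WhatWeb itself supplies @body, @meta,
# @status and @base_uri; this class and its constructor are hypothetical.
class SketchPlugin
  def initialize(body, meta, status, base_uri)
    @body, @meta, @status, @base_uri = body, meta, status, base_uri
  end

  # A passive test: inspect what was already fetched and return matches.
  def passive
    m = []
    # Assumes @meta is a hash of response headers with lowercase keys.
    if @meta["x-powered-by"].to_s =~ /ExampleCMS\/([\d.]+)/
      m << { :name => "x-powered-by header", :probability => 100, :version => $1 }
    end
    if @body =~ /Powered by ExampleCMS/i
      m << { :name => "powered by text", :probability => 75 }
    end
    m
  end
end

plugin = SketchPlugin.new("<p>Powered by ExampleCMS</p>",
                          { "x-powered-by" => "ExampleCMS/1.2" },
                          200, "http://example.com/")
p plugin.passive
```

Each hash carries a :name and a :probability, and the header match also records a :version captured by the regular expression.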
To discover the regular expressions to match against, wget about 20-30 examples into the tests/ folder. Be aware that some software can have dramatic variations between versions.
First view the META data and HTML of a few examples.
The find-common-stuff tool can help discover unexpected similarities in the examples.
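The idea of looking for things that all the examples share can be sketched as follows. This is not the find-common-stuff tool itself, whose internals are not described here. It is just one simple way to do it, comparing whole lines, and the function name is made up.

```ruby
# Returns the lines (stripped of surrounding whitespace) that occur in every page.
def common_lines(pages)
  per_page = pages.map { |html| html.lines.map(&:strip).reject(&:empty?).uniq }
  per_page.reduce { |a, b| a & b } || []
end

pages = [
  "<html>\n<meta name='generator' content='ExampleCMS'>\n<p>first site</p>\n</html>",
  "<html>\n<p>second site</p>\n<meta name='generator' content='ExampleCMS'>\n</html>"
]
puts common_lines(pages)
```

Lines that survive the intersection, such as the generator tag in this made-up data, are candidates for a regular expression match.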
Recursive Spider
The recursion option is used to scan some or all of a website with whatweb. Recursive spidering will follow each link on a webpage if it is within the same website, then repeat the process on the followed pages.
The configurable settings for recursive spidering are:
--recursion, -r Follow links recursively. Only follows links under the path (default: off)
--depth, -d Maximum recursion depth (default: 3)
--max-links, -m Maximum number of links to follow on one page (default: 25)
Limitations of the spidering: the spider follows links in <a> tags, the HTML tags designed specifically for links. It does not obtain URLs from other sources. Some good choices for future improvement are image tags, e.g. <img src="/images/boats.jpg">, form tags, e.g. <form action="/vote.php">, URL paths in CSS files, etc.
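The link selection described above can be sketched like this. It is not WhatWeb's spider. The function name links_to_follow and the regular expression are assumptions; only the rules themselves (links from <a> tags, same website, under the starting path, at most max-links per page) come from the text.

```ruby
require 'uri'

# Hypothetical helper: collects links from <a> tags only, keeps those on the
# same host and under the starting path, and stops after max_links.
def links_to_follow(html, base_url, max_links = 25)
  base = URI.parse(base_url)
  base_dir = base.path.sub(%r{[^/]*\z}, "")   # directory part of the starting path
  base_dir = "/" if base_dir.empty?
  links = []
  html.scan(/<a\s[^>]*href\s*=\s*["']([^"'#]+)/i) do |(href)|
    begin
      uri = URI.join(base_url, href)
    rescue URI::Error
      next   # skip hrefs that do not parse as URLs
    end
    next unless uri.host == base.host && uri.path.to_s.start_with?(base_dir)
    links << uri.to_s unless links.include?(uri.to_s)
    break if links.size >= max_links
  end
  links
end

html = '<a href="/blog/post1.html">one</a> <a href="http://other.example.org/">away</a> ' \
       '<a href="/about.html">up</a> <img src="/images/boats.jpg">'
p links_to_follow(html, "http://example.com/blog/")   # prints ["http://example.com/blog/post1.html"]
```

The image URL is ignored because only <a> tags are scanned, which is exactly the limitation noted above. A recursive spider would call this on each followed page until the maximum depth is reached.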
Related Projects
WhatWeb is unique; however, there are some web projects with the same goal of identifying a website.
WAFP – Web Application Finger Printing
Wafp identifies systems by requesting a large quantity of URLs and comparing md5 sums of the results against a database. This method is reliable for known systems in the database and it is simple to add new ones. Unlike whatweb, this method is intrusive and will create a lot of webserver log entries.
http://www.mytty.org/wafp
Wappalyzer
This is the most similar project to WhatWeb.
A Firefox plugin that identifies sites using 1 regexp each. It only looks for obvious identifiers like meta generator tags, sends all recognized URLs to a DB, and has nice icons.
https://addons.mozilla.org/en-US/firefox/addon/10229
www.http-stats.com
Lots of info about HTTP server names
Funny & Unusual
Slashdot.org
X-Fry: You mean Bender is the evil Bender? I’m shocked! Shocked! Well not that shocked.
popurls.com
X-popurls-a: in the future every url will be popular for 1.5 seconds
Notes
Version 0.3 Released at Kiwicon III (kiwicon.org), 2009.
Credits
Written by Andrew Horton aka urbanadventurer