website fingerprinting

Next generation web scanner. Identify what websites are running.

Download whatweb-0.4.7.tar.gz
Latest Version 0.4.7, 5th April 2011
License GPLv2
Author urbanadventurer aka Andrew Horton
Wiki The WhatWeb Wiki
Development Version WhatWeb on GitHub

Introduction

WhatWeb identifies websites. Its goal is to answer the question, “What is that Website?”. WhatWeb recognises web technologies including content management systems (CMS), blogging platforms, statistic/analytics packages, JavaScript libraries, web servers, and embedded devices. WhatWeb has over 900 plugins, each to recognise something different. WhatWeb also identifies version numbers, email addresses, account IDs, web framework modules, SQL errors, and more.

WhatWeb can be stealthy and fast, or thorough but slow. WhatWeb supports an aggression level to control the trade off between speed and reliability. When you visit a website in your browser, the transaction includes many hints of what web technologies are powering that website. Sometimes a single webpage visit contains enough information to identify a website but when it does not, WhatWeb can interrogate the website further. The default level of aggression, called ‘passive’, is the fastest and requires only one HTTP request of a website. This is suitable for scanning public websites. More aggressive modes were developed for in penetration tests.

Most WhatWeb plugins are thorough and recognise a range of cues from subtle to obvious. For example, most WordPress websites can be identified by the meta HTML tag, e.g. ‘‘, but a minority of WordPress websites remove this identifying tag but this does not thwart WhatWeb. The WordPress WhatWeb plugin has over 15 tests, which include checking the favicon, default installation files, login pages, and checking for “/wp-content/” within relative links.

Features:
* Over 900 plugins
* Control the trade off between speed/stealth and reliability
* Plugins include example URLs
* Performance tuning. Control how many websites to scan concurrently.
* Multiple log formats: Brief (greppable), Verbose (human readable), XML, JSON, MagicTree, RubyObject, MongoDB.
* Recursive web spidering
* Proxy support including TOR
* Custom HTTP headers
* Basic HTTP authentication
* Control over webpage redirection
* Nmap-style IP ranges
* Fuzzy matching
* Result certainty awareness
* Custom plugins defined on the command line

Example Usage

Using WhatWeb on a couple of websites:

Using a higher aggression level to identify the version of Joomla in use.

Help


.$$$     $.                                   .$$$     $.
$$$$     $$. .$$$  $$$ .$$$$$$.  .$$$$$$$$$$. $$$$     $$. .$$$$$$$. .$$$$$$.
$ $$     $$$ $ $$  $$$ $ $$$$$$. $$$$$ $$$$$$ $ $$     $$$ $ $$   $$ $ $$$$$$.
$ `$     $$$ $ `$  $$$ $ `$  $$$ $$' $ `$ `$$ $ `$     $$$ $ `$      $ `$  $$$'
$. $     $$$ $. $$$$$$ $. $$$$$$ `$  $. $  :' $. $     $$$ $. $$$$   $. $$$$$.
$::$  .  $$$ $::$  $$$ $::$  $$$     $::$     $::$  .  $$$ $::$      $::$  $$$$
$;;$ $$$ $$$ $;;$  $$$ $;;$  $$$     $;;$     $;;$ $$$ $$$ $;;$      $;;$  $$$$
$$$$$$ $$$$$ $$$$  $$$ $$$$  $$$     $$$$     $$$$$$ $$$$$ $$$$$$$$$ $$$$$$$$$'

WhatWeb - Next generation web scanner.
Version 0.4.7 by Andrew Horton aka urbanadventurer from Security-Assessment.com
Homepage: http://www.morningstarsecurity.com/research/whatweb

Usage: whatweb [options] 

TARGET SELECTION:
                  Enter URLs, filenames or nmap-format IP ranges.
                        Use /dev/stdin to pipe HTML directly
  --input-file=FILE, -i Identify URLs found in FILE, eg. -i /dev/stdin
  --url-prefix          Add a prefix to target URLs
  --url-suffix          Add a suffix to target URLs
  --url-pattern         Insert the targets into a URL. Requires --input-file,
                        eg. www.example.com/%insert%/robots.txt
  --example-urls, -e    Add example URLs for each selected plugin to the target
                        list. By default will add example URLs for all plugins.

AGGRESSION LEVELS:
  --aggression, -a=LEVEL The aggression level controls the trade-off between
                        speed/stealth and reliability. Default: 1
                        Aggression levels are:
        1 (Passive)     Make one HTTP request per target. Except for redirects.
        2 (Polite)      Reserved for future use
        3 (Aggressive)  Triggers aggressive plugin functions only when a
                        plugin matches passively.
        4 (Heavy)       Trigger aggressive functions for all plugins. Guess a
                        lot of URLs like Nikto.

HTTP OPTIONS:
  --user-agent, -U=AGENT Identify as AGENT instead of WhatWeb/0.4.7.
  --user, -u= HTTP basic authentication
  --header, -H          Add an HTTP header. eg "Foo:Bar". Specifying a default
                        header will replace it. Specifying an empty value, eg.
                        "User-Agent:" will remove the header.
  --follow-redirect=WHEN Control when to follow redirects. WHEN may be `never',
                        `http-only', `meta-only', `same-site', `same-domain'
                        or `always'. Default: always
  --max-redirects=NUM   Maximum number of contiguous redirects. Default: 10

SPIDERING:
  --recursion, -r       Follow links recursively. Only follow links under the
                        path Default: off
  --depth, -d           Maximum recursion depth. Default: 10
  --max-links, -m       Maximum number of links to follow on one page
                        Default: 250
  --spider-skip-extensions Redefine extensions to skip.
                        Default: zip,gz,tar,jpg,exe,png,pdf

PROXY:
  --proxy                Set proxy hostname and port
                        Default: 8080
  --proxy-user           Set proxy user and password

PLUGINS:
  --plugins, -p         Comma delimited set of selected plugins. Default is all.
                        Each element can be a directory, file or plugin name and
                        can optionally have a modifier, eg. + or -
                        Examples: +/tmp/moo.rb,+/tmp/foo.rb
                        title,md5,+./plugins-disabled/
                        ./plugins-disabled,-md5
			-p + is a shortcut for -p +plugins-disabled
  --list-plugins, -l    List the plugins
  --info-plugins, -I    Display information for all plugins. Optionally search
                        with keywords in a comma delimited list.
  --custom-plugin       Define a custom plugin called Custom-Plugin,
                        Examples: ":text=>'powered by abc'"
                        ":regexp=>/powered[ ]?by ab[0-9]/"
                        ":ghdb=>'intitle:abc \"powered by abc\"'"
                        ":md5=>'8666257030b94d3bdb46e05945f60b42'"
                        "{:text=>'powered by abc'},{:regexp=>/abc [ ]?1/i}"

LOGGING & OUTPUT:
  --verbose, -v         Increase verbosity, use twice for plugin development.
  --colour,--color=WHEN control whether colour is used. WHEN may be `never',
                        `always', or `auto'
  --quiet, -q		Do not display brief logging to STDOUT
  --log-brief=FILE      Log brief, one-line output
  --log-verbose=FILE    Log verbose output
  --log-xml=FILE        Log XML format
  --log-json=FILE       Log JSON format
  --log-json-verbose=FILE Log JSON Verbose format
  --log-magictree=FILE  Log MagicTree XML format
  --log-object=FILE     Log Ruby object inspection format
  --log-mongo-database  Name of the MongoDB database
  --log-mongo-collection Name of the MongoDB collection. Default: whatweb
  --log-mongo-host      MongoDB hostname or IP address. Default: 0.0.0.0
  --log-mongo-username  MongoDB username. Default: nil
  --log-mongo-password  MongoDB password. Default: nil
  --log-errors=FILE     Log errors

PERFORMANCE & STABILITY:
  --max-threads, -t     Number of simultaneous threads. Default: 25.
  --open-timeout        Time in seconds. Default: 15
  --read-timeout        Time in seconds. Default: 30
  --wait=SECONDS        Wait SECONDS between connections
                        This is useful when using a single thread.

HELP & MISCELLANEOUS:
  --help, -h            This help
  --debug               Raise errors in plugins
  --version             Display version information. (WhatWeb 0.4.7)

EXAMPLE USAGE:
  whatweb example.com
  whatweb -v example.com
  whatweb -a 3 example.com
  whatweb 192.168.1.0/24

Logging & Output

Verbose output is specified with -v


The following types of logging are supported:

–log-brief=FILE Brief, one-line, greppable format
–log-verbose=FILE Verbose
–log-xml=FILE XML format. XSL stylesheet is provided
–log-json=FILE JSON format
–log-json-verbose=FILE JSON verbose format
–log-magictree=FILE MagicTree XML format
–log-object=FILE Ruby object inspection format
–log-mongo-database Name of the MongoDB database
–log-mongo-collection Name of the MongoDB collection. Default: whatweb
–log-mongo-host MongoDB hostname or IP address. Default: 0.0.0.0
–log-mongo-username MongoDB username. Default: nil
–log-mongo-password MongoDB password. Default: nil
–log-errors=FILE Log errors. This is usually printed to the screen in red.

You can output to multiple logs simultaneously by specifying multiple command line logging options.

Plugins

Matches are made with:
* Text strings (case sensitive)
* Regular expressions
* Google Hack Database queries (limited set of keywords)
* MD5 hashes
* URL recognition
* HTML tag patterns
* Custom ruby code for passive and aggressive operations

$ ./whatweb -l

WhatWeb Plugin List

Plugin Name               Description
-------------------------------------------------------------------------------
1024-CMS                  1024 is one of a few CMS's leading the way with the i
360-Web-Manager           360-Web-Manager - homepage: http://www.360webmanager.
4images                   4images is a powerful web-based image gallery managem
... (truncated - there are a lot)

To view more detail about a plugin or plugins

$ ./whatweb -I phpBB
WhatWeb Plugin Information
Searching for phpBB
--------------------------------------------------------------------------------
Plugin Name               Details
phpBB
        Author:              Andrew Horton
        Version:             0.3
        Examples:            16
        Matches:             7
        Passive function:    Yes
        Aggressive function: Yes
        Version detection:   Yes
        Description:
        phpBB is a free forum phpbb.org

--------------------------------------------------------------------------------
1 plugins found

All plugins are loaded by default.

Plugins can be selected by directories, files or plugin names as a comma delimited list with the -p or –plugin command line option.

Each list item may have a modifier: + adds to the full set, – removes from the full set and no modifier overrides the defaults.

Examples :

–plugins +plugins-disabled,-foobar
–plugins +/tmp/moo.rb
–plugins foobar (only select foobar)
-p title,md5,+./plugins-disabled/
-p ./plugins-disabled,-md5

Aggressive Plugins

WhatWeb features several levels of aggression. By default the aggression level is set to 1 (passive) which sends a single HTTP GET request.

1 (Passive) Make one HTTP request per target. Except for redirects.
2 (Polite) Reserved for future use
3 (Aggressive) Triggers aggressive plugin functions only when a plugin matches passively.
4 (Heavy) Trigger aggressive functions for all plugins. Guess a lot of URLs like Nikto.

If aggression is enabled the aggressive plugins will guess more URLs and perform actions that are potentially unsuitable without permission.

With the passive matches we know that smartor.is-root.com/forum/ is running phpBB version 2:

$ ./whatweb smartor.is-root.com/forum/
http://smartor.is-root.com/forum/ [200] PasswordField[password], HTTPServer[Apache/2.2.15], PoweredBy[phpBB], Apache[2.2.15], IP[88.198.177.36], phpBB[2], PHP[5.2.13], test[Smartors Mods Forums – Reloaded], X-Powered-By[PHP/5.2.13], Cookies[phpbb2mysql_data,phpbb2mysql_sid], Title[Smartors Mods Forums – Reloaded], Country[GERMANY][DE]

With the aggressive matches in the phpBB plugin we know that the same website is running phpBB version 2.0.20 or higher:

$ ./whatweb -p plugins/phpbb.rb -a 3 smartor.is-root.com/forum/
http://smartor.is-root.com/forum/ [200] phpBB[2,>2.0.20]

Note the use of the -p argument to select only the phpBB plugin. It is advisable, but not mandatory, to select a specific plugin when attempting to fingerprint software versions in aggressive mode. This approach is far more stealthy as it will limit the number of requests.

Do not use aggressive plugins with recursive site crawling. WhatWeb has no understanding of a website, instead it currently treats each URL separately.

It also has no caching so if you use aggressive plugins with recursion you will fetch the same files multiple times. The same is true for aggressive modes on redirecting URLs.

Recursive Spider

The recursion option is used to scan some or all of a website with WhatWeb. Recursive spidering will follow each link on a webpage if it is within the same website, then repeat the process on the followed pages.

The configurable settings for recursive spidering are:
–recursion, -r Follow links recursively. Only follows links under the path (default: off)
–depth, -d Maximum recursion depth (default: 10)
–max-links, -m Maximum number of links to follow on one page (default: 250)
–spider-skip-extensions Redefine extensions to skip. (Default: zip,gz,tar,jpg,exe,png,pdf)

Limitations of the spidering. This follows links in <a> tags, these are the HTML tags designed specifically for links. The spider does not obtain URLs from other sources. Some good choices for future improvement are image tags, e.g. <img src=”/images/boats.jpg”>, form tags, e.g. <form action=”/vote.php”>, URL paths in CSS files, etc.

The spider is provided by Anemone, a third party ruby gem. It doesn’t follow redirects. For example the URL treshna.com will fail and www.treshna.com will produce results.

Performance & Stability

WhatWeb features several options to increase performance and stability.

–max-threads, -t Number of simultaneous threads. Default: 25.
–open-timeout Time in seconds. Default: 60
–read-timeout Time in seconds. Default: 120
–wait=SECONDS Wait SECONDS between connections
This is useful when using a single thread.

The –wait and –max-threads commands can be used to assist in IDS evasion.

Furthermore, changing the user-agent using the -U or –user-agent command line option will avoid the Snort IDS rule for WhatWeb.

Without the em-resolve-replace gem performance is significantly degraded.

If you are scanning ranges of IP addresses, it is much more efficient to use a port scanner like nmap to discover which have port 80 open before scanning with WhatWeb.

Character set detection, with the Charset plugin, required by JSON and MongoDB logging uses more CPU than otherwise.

Optional Dependencies

Without the em-resolve-replace gem performance is significantly degraded.
gem install em-resolv-replace

To enable JSON logging install the json gem.
gem install json
gem install bson_ext

To enable MongoDB logging install the mongo gem.
gem install mongo

To enable character set detection and MongoDB logging install the rchardet gem.
gem install rchardet

Release History

Version 0.3 Released at Kiwicon III (kiwicon.org), November 2nd, 2009
Version 0.4 Released March 14th, 2010
Version 0.4.1 Released April 28th, 2010
Version 0.4.2 Released April 30th, 2010
Version 0.4.3 Released May 24th, 2010
Version 0.4.4 Released June 29th, 2010
Version 0.4.5 Released August 17th, 2010
Version 0.4.6 Released March 25th, 2011
Version 0.4.7 Released April 5th, 2011

Credits

Written by urbanadventurer aka Andrew Horton from Security-Assessment.com
Homepage: http://www.morningstarsecurity.com/research/whatweb
License: GPLv2

Anemone library (used for spidering) is written by Chris Kite
Homepage: http://anemone.rubyforge.org/
License: MIT

DEVELOPERS

Andrew Horton
Brendan Coles

CONTRIBUTORS

Thank you to the following people who have contributed to WhatWeb

Emilio Casbas
Louis Nyffenegger
Patrik Wallström
Caleb Anderson
Tonmoy Saikia
Aung Khant
Erik Inge Bolsø
nk@dsigned.gr
Michal Ambroz for writing the Makefile and Man pages
Gremwell for improving the MagicTree logging

Updates & Additional Information

The WhatWeb development build features regular updates.

* WhatWeb-dev: https://github.com/urbanadventurer/WhatWeb/
* WhatWeb-dev-unstable: https://github.com/bcoles/WhatWeb/

Browse the wiki for more documentation and advanced usage techniques.

* Wiki: https://github.com/urbanadventurer/WhatWeb/wiki/