I have barely used this blog at all and it’s time for some updates. Let me know if you actually read this blog…

Since January I have been living in Melbourne, Australia and working as a security consultant for Stratsec. I’m keen to hear from other infosec people in the area. Before that I was in Wellington, working for Security-Assessment.com, New Zealand’s coolest IT security consultancy.

So what’s new? Late last year I did a couple of presentations at Kiwicon V. The presentations are:

1) Abode Vulnerabilities. Learn how to bring hardware hacking closer to home by hacking New Zealand’s most popular garage doors. This project is powered by the Arduino, the open-source hardware platform that makes electronics more accessible.

2) Decrypting the Cloud. This is a cautionary tale about failed opsec, weak crypto and misplaced trust in the cloud. Take a guided tour through a treasure trove of cracked ciphertext booty including CCs, SQLis, 0days, password dumps, and more.

I haven’t published anything on this website about these projects yet. I’m considering making a page for the Arduino garage door hacking if there’s sufficient interest.

 _______ _____ _____                       __
|     __|   __|   __|.-----..-----..-----.|  |.-----..-----..----..---.-..-----.
|    |  |  |  |  |  ||  _  ||  _  ||  _  ||  ||  -__||__ --||  __||  _  ||     |
|_______|_____|_____||_____||_____||___  ||__||_____||_____||____||___._||__|__|
   G-G-Googlescan vo.4 (o4/2o1o)   |_____|          by urbanadventurer

Google scraper for automated searching. Returns URLs and hostnames

Homepage http://www.morningstarsecurity.com/research/gggooglescan
Download http://www.morningstarsecurity.com/downloads/gggooglescan-0.4.tar.gz
Version 0.4, 18th April 2011
License GPLv3
Author urbanadventurer aka Andrew Horton from Security-Assessment.com

| Contents
1. Introduction
2. Antibot Avoidance Features
3. Search within a country
4. Using Proxies
5. Troubleshooting
6. Making a Wordlist

| 1. Introduction
GGGooglescan is a Google scraper which performs automated searches and returns the results of search queries as URLs or hostnames. Data mining Google’s search index is useful for many applications, yet Google makes it difficult for researchers to perform automated search queries. The aim of GGGooglescan is to make automated searches possible by avoiding the search activity that is detected as bot behaviour. Please note that using this software may be in violation of Google’s terms of service.

| 2. Antibot Avoidance Features

* To appear more like a human, gggooglescan does not search beyond a page containing the text: “In order to show you the most relevant results, we have omitted some entries very similar”

* When Google detects bot-type activity, it redirects to this page which contains a captcha: http://sorry.google.com/sorry/?continue=. gggooglescan will detect this and wait for 60 minutes before re-attempting the search query.

* Horizontal searching is a technique for harvesting a large number of search results without appearing to be a bot. Deep query searching means requesting many pages of results for a single query, e.g. result pages 1 through 50, which Google detects as bot activity. Horizontal searching lets you cover a large search result space without deep queries. It works by combining your search query with each word in a wordlist to create a large number of shallow search queries.
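
The last two features can be sketched together in a few lines of Python. This is only an illustration of the idea, not the code gggooglescan uses: the function names are made up, `fetch` is a hypothetical callable that performs one search and returns the final URL plus the result URLs, and appending the dictionary word after the base query is an assumption about how the two are combined.

```python
import time

# Captcha page named in the text above; matched without the scheme.
SORRY_MARKER = "//sorry.google.com/sorry/"
WAIT_SECONDS = 60 * 60  # the 60 minute pause described above


def horizontal_queries(base_query, words):
    """Yield one shallow query per wordlist entry (base query plus that word)."""
    for word in words:
        word = word.strip()
        if word:
            yield f"{base_query} {word}"


def search_with_wait(query, fetch, sleep=time.sleep):
    """Run one query, pausing and retrying while the captcha page comes back.

    `fetch` is hypothetical: it takes a query string and returns
    (final_url, result_urls). It is not part of gggooglescan.
    """
    while True:
        final_url, results = fetch(query)
        if SORRY_MARKER not in final_url:
            return results
        sleep(WAIT_SECONDS)  # detected as a bot: wait, then re-attempt


def horizontal_search(base_query, words, fetch, sleep=time.sleep):
    """Collect results from many depth-1 queries instead of one deep query."""
    collected = []
    for query in horizontal_queries(base_query, words):
        collected.extend(search_with_wait(query, fetch, sleep))
    return collected
```

Passing `sleep` as a parameter keeps the sketch easy to try out with a stand-in that records the requested pause instead of actually waiting an hour.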

For example, if you wanted to gather a list of forums powered by the MyBB software you could use the following search query: “MyBB Group. Copyright”.
Google reports 79,500 results; however, it is not easy to obtain them all.
Let’s try the following command, which requests 200 pages of 10 results and will hopefully net 2000 results.

./gggooglescan -l mybb-forum.log -d 200 '"powered by mybb"'

After just 46 pages and 464 results Google does not display a link to the next page and reports “In order to show you the most relevant results, we have omitted some entries very similar to the 457 already displayed.”. This query returned 374 unique hostnames. After this search completed, the IP address used was restricted from further searching due to bot detection.

Using the horizontal search technique we can gain a larger list of MyBB forums.

./gggooglescan -l mybb-forum-horizontal.log -d 1 -e ./wordlist '"powered by mybb"'

This search was detected as bot activity after 47 combined dictionary words (a,aachen,aachen’s,aaliyah,…) and obtained 8236 results and 1231 unique hostnames.

./gggooglescan -l mybb-forum-horizontal2.log -d 4 -e ./wordlist '"powered by mybb"'

This search got as far as ‘abbreviates’, the 62nd word in the list, and yielded 2389 results and 721 unique hostnames.

Here are some results I got; your mileage may vary:

./gggooglescan -l mybb-forum-horizontal.log -d 1 -e ./wordlist '"powered by mybb"'
Detected as a bot. Got 8236 results.

./gggooglescan -l mybb-forum-horizontal2.log -d 4 -e ./wordlist '"powered by mybb"'
Detected as a bot. Got 2389 results.

./gggooglescan -s 2 -l mybb-forum-horizontal3.log -d 1 -e ./wordlist " "
Avoided detection. Got over 10,000 results before I stopped it.

./gggooglescan -s 2 -l mybb-forum-horizontal4.log -d 1 -e ./wordlist " "
Avoided detection. Got over 100,000 results before I stopped it.

./gggooglescan -s 2 -l mybb-forum-horizontal5.log -d 1 -e ./wordlist '"powered by mybb"'
Detected as a bot. Got 4612 results.

./gggooglescan -s 2 -l phpbb-forums2.log -d 1 -e ./wordlist "memberlist goto page"
Avoided detection. Got over 6279 results.
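
The gap between result counts and unique hostname counts quoted above (for example 8236 results but 1231 hostnames) comes from many result URLs sharing a host. Here is a minimal Python sketch of that deduplication. It assumes the input is a list of URLs, one per entry; the actual gggooglescan log format is not described here, and the function name is made up for illustration.

```python
from urllib.parse import urlsplit


def unique_hostnames(urls):
    """Return the sorted, de-duplicated hostnames found in an iterable of URLs."""
    hosts = set()
    for line in urls:
        # hostname is lower-cased by urlsplit and is None when the
        # line has no scheme://host part (blank lines, stray text).
        host = urlsplit(line.strip()).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)
```

Because the function accepts any iterable of strings, an open file object can be passed in directly if the log does turn out to hold one URL per line.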

| 3. Search within a country

Are you only interested in results in Australia? You can restrict your searches to a country with the -c parameter:

./gggooglescan -c au hello

Perhaps you want a large set of Australian hostnames. Try the following:

./gggooglescan -c au -d 5 -v -l aussie.log -e ./wordlist a

| 4. Using Proxies

Use the -x option to pass curl command line parameters.

The following curl options control proxy usage:
-x/--proxy Use HTTP proxy on given port
--proxy-anyauth Pick "any" proxy authentication method (H)
--proxy-basic Use Basic authentication on the proxy (H)
--proxy-digest Use Digest authentication on the proxy (H)
--proxy-negotiate Use Negotiate authentication on the proxy (H)
--proxy-ntlm Use NTLM authentication on the proxy (H)
-U/--proxy-user Set proxy user and password
--socks4 SOCKS4 proxy on given host + port
--socks4a SOCKS4a proxy on given host + port
--socks5 SOCKS5 proxy on given host + port
--socks5-hostname SOCKS5 proxy, pass host name to proxy
--socks5-gssapi-service SOCKS5 proxy service name for gssapi
--socks5-gssapi-nec Compatibility with NEC SOCKS5 server

For example:
gggooglescan -x "--socks5 localhost:1234" hax0r
gggooglescan -x "--proxy localhost:8080 --proxy-user bob:12345" hax0r

You can also specify proxy settings with environment variables which will be used by curl, for example:

export ALL_PROXY=http://localhost:8118/
./gggooglescan foo

| 5. Troubleshooting

Q. I’m getting detected as a bot, what can I do?
A. Experiment with different (lower) values for -d depth.
Try combining searches with a wordlist.
Use proxies.

Q. I can’t see any output.
A. Try using -v for verbose output. gggooglescan may be waiting out the 60 minute pause that follows a bot captcha.

Q. I’m using a custom user-agent and it fails to get results.
A. Google responds differently depending on the user-agent provided.
-c is incompatible with a user agent for a mobile device.
User agents with MSIE 6.0 do not work with gggooglescan.

| 6. Making a Wordlist

You can make a wordlist on a Unix system with a command such as:

cat /usr/share/dict/words | tr '[:upper:]' '[:lower:]' | egrep "[a-z']" | sort -u > wordlist

There are more wordlists at openwall.com


It’s time for the Christchurch ISIG (Information Security Interest Group) meeting again.

When: 6.45pm, Thursday the 29th of April (The last Thursday of the month)

Where: Upstairs in the couch area at the Canterbury Innovation Incubator, 200 Armagh St. The doors to the Canterbury Innovation Incubator will be locked. Press the doorbell inside the open roller doors or TXT 0272 646 959 for entry.

Speaker: Caleb Anderson will be presenting on social network security issues.

Sponsor: Nick FitzGerald of Computer Virus Consulting is sponsoring this month’s beer.

As usual, it’s a casual beer-friendly event.

For more information and the mailing list see NZISIG.


It’s time for the ISIG (Information Security Interest Group) meeting again. This week MorningStar Security will be sponsoring some beer.

When: 6.45pm, Thursday the 25th of March (The last Thursday of the month)

Where: Upstairs in the couch area at the Canterbury Innovation Incubator, 200 Armagh St. The doors to the Canterbury Innovation Incubator will be locked. Press the doorbell inside the open roller doors or TXT 0272 646 959 for entry.

Speaker: Andrew Horton (urbanadventurer) will be speaking about fingerprinting the top 100,000 websites with WhatWeb.

See you there :)


Tomorrow I leave for India to conduct a workshop at IIT Guwahati, a prestigious Indian university. I was invited by Vivek Ramachandran of Security Tube fame to lecture and provide a workshop on information security for ISEA (Indian Security Education & Awareness) which is a project organised by the Department of Information Technology of the Government of India.

The purpose of ISEA is to improve understanding of IT security. My first thought was that the OWASP Top 10 Risks is perfect for this, so I’m going to explain the new 2010 release candidate list.

Here’s my talk abstract:
Introduction to web hacking. Information on how to detect, prevent and exploit the top ten most
common web vulnerabilities as specified by OWASP (Open Web Application Security Project). Practical
attack scenarios and demonstrations will be given for each of the classes of vulnerability. The 2010
OWASP Top 10 vulnerability classes are injection, cross site scripting (XSS), broken authentication
and session management, insecure direct object references, cross site request forgery (CSRF),
security misconfiguration, failure to restrict url access, unvalidated redirects and forwards,
insecure cryptographic storage, insufficient transport layer protection. Examples will be given in
PHP because it is the most common web language.

Interestingly enough, IIT is an Indian university featured in the Dilbert cartoon strips.


Update: CISG has been absorbed into the Information Security Interest Group (ISIG). All meeting details are the same except for the name, which is now ISIG Christchurch Chapter.

I’m setting up the ISIG Christchurch Chapter (formerly the Christchurch Information Security Group, CISG) to help organise the local information security community. It’s a casual meeting for information security enthusiasts to network and collaborate on projects. Business, academic and amateur people are welcome.

When: 6.45pm, the last Thursday of the month, beginning Thursday 28th of January.

Where: Upstairs in the couch area at the Canterbury Innovation Incubator, 200 Armagh St.
The doors to the Canterbury Innovation Incubator will be locked. Press the doorbell inside the open roller doors or TXT 0272 646 959 for entry.

Questions and comments are welcome :)


I was a speaker at Kiwicon, the annual New Zealand IT security conference, in Wellington this year. I spoke on “New Zealand Web Reconnaissance with WhatWeb”. Kiwicon is fast gaining a reputation as a conference of the highest international standard.

Talk abstract: Ever wanted to web scan all of New Zealand but didn’t have the right tools? Me too, so I developed WhatWeb, a next generation website identification scanner. With stealth-mode turned all the way up to 11 it’s less intrusive than the Google crawler and eminently suitable for large scale internet scanning. Look forward to juicier web statistics than at NetCraft.com and a guided tour to the unindexed websites hidden among NZ’s 6 million allocated IPs. The web space is littered with VoIP phones, web cameras, printers, routers and bizarre devices to amaze and astound you. WhatWeb will be officially released at Kiwicon 2009.

Tools published at the Kiwicon conference:

  • WhatWeb – next generation webscanner. WhatWeb homepage
  • bing-ip2hosts – Enumerate hostnames from Bing.com for an IP address.
    Bing.com is Microsoft’s search engine which has an IP: search parameter. Homepage
  • gggooglescan – Enumerate hostnames and URLs from Google.
    Features: antibot avoidance, search within a country, custom search appliance Homepage
  • basedomainname – Extract TLD (Top Level Domain), domain extensions (Second Level Domain + TLD), domain name, and hostname from fully qualified domain names. Homepage
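
To make the basedomainname terminology concrete, here is a rough Python sketch of splitting a fully qualified domain name into those four parts. It is not the tool's implementation: the function name is invented, and the short set of two-part extensions is a hypothetical stand-in for whatever list a real tool would consult.

```python
# Hypothetical, deliberately tiny list of two-part domain extensions.
TWO_PART_EXTENSIONS = {"co.nz", "org.nz", "com.au", "co.uk"}


def split_fqdn(fqdn):
    """Split an FQDN into TLD, domain extension, domain name and hostname."""
    labels = fqdn.lower().rstrip(".").split(".")
    tld = labels[-1]
    last_two = ".".join(labels[-2:])
    if len(labels) >= 3 and last_two in TWO_PART_EXTENSIONS:
        extension, ext_len = last_two, 2   # e.g. "co.nz"
    else:
        extension, ext_len = tld, 1        # e.g. "com"
    # The domain name is the extension plus the one label to its left;
    # anything further left is treated as the hostname part.
    domain = ".".join(labels[-(ext_len + 1):])
    hostname = ".".join(labels[:-(ext_len + 1)])
    return {"tld": tld, "extension": extension,
            "domain": domain, "hostname": hostname}
```

With this sketch, www.example.co.nz splits into the TLD nz, the extension co.nz, the domain name example.co.nz and the hostname www.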

Link to Kiwicon III presentations: https://kiwicon.org/presentations/#urbanadventurer


I was a guest speaker at the NZITF (New Zealand Internet Task Force) meeting on Friday, November 27th. I spoke on the topic of WebWatcher and next generation web scanning. I wish to thank Paul McKitrick for inviting me to speak. The talk was well received; I enjoyed presenting and met some interesting people.