Using the Google custom search engine for OSINT
Today we take a step back towards the most basic OSINT resource – the Google search engine.
Google products are often labelled as privacy-unfriendly, equipped with built-in overt and covert features that concentrate on tracking users’ online activities and their physical movements.
That said, the Google search engine remains the best and most effective one out there, so no OSINT practitioner can afford to disregard it.
Anybody can google – but the results will vary drastically.
With speed, accuracy and efficiency in mind, the objective is to refine, narrow down, isolate and prioritize your search results by using a correct combination of sources (websites) to query.
This is the true power of Google custom search engines (CSEs).
So let’s take a look at how they work and how to build them.
When setting up a CSE, you can add the websites that your queries will be run against, and also filter by language.
Here you can decide whether to search whole websites (for example, the whole of Reddit), only selected parts (say, individual Reddit threads), or specific subdomains of the main site, including ones you might wish to leave out of your search.
If you are unsure about how domain addressing works, check out this post that contains an explanation on web domain addressing structure.
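To make the difference between a whole site, a section and a subdomain more concrete, here is a minimal Python sketch. It only approximates scoping with shell-style wildcards; it is not Google's actual matching logic, and the patterns and the `scope_of` helper are hypothetical examples made up for illustration.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlparse

# Hypothetical site patterns, one per level of scope.
# Note: "*.reddit.com/*" needs a subdomain, so a bare "reddit.com/..." would not match it.
PATTERNS = {
    "whole site": "*.reddit.com/*",
    "one section": "www.reddit.com/r/OSINT/*",
    "one subdomain": "old.reddit.com/*",
}


def scope_of(url: str) -> list[str]:
    """Return the labels of every pattern that the given URL falls under."""
    parsed = urlparse(url)
    # Compare on "hostname + path", ignoring the scheme and query string.
    target = f"{parsed.hostname}{parsed.path}"
    return [
        label
        for label, pattern in PATTERNS.items()
        if fnmatchcase(target, pattern)
    ]


if __name__ == "__main__":
    print(scope_of("https://www.reddit.com/r/OSINT/comments/abc"))
    print(scope_of("https://old.reddit.com/r/privacy/"))
```

The narrower the pattern, the fewer URLs fall under it, which is the whole point of picking sources carefully before you run a query.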
After you have created your custom search engine, you can modify it and change the parameters using the “Edit search engine” tab.
You can also choose to embed a CSE you created on a website (or link it up via traditional URL pasting / shortening methods outside of this panel).
One helpful option is “Refinements” – available under the “Search features” section, after you’ve chosen to edit the search engine.
This lets you limit search results to a specific website per refinement, and display the results separated into tabs, one tab per refinement.
Refining search results by file formats and file extensions will allow you to build effective custom search engines for PDF documents, Excel spreadsheets, video files and whatever else you want to focus on.
By applying the method described above, you can filter your search by limiting results to files with a specific file extension.
This will require specifying the “Optional word(s)” value in a way that Google understands as filtering by file extension, for instance:
ext:pdf ext:jpeg ext:ppt
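If you plan to set up several file-type refinements, it can help to generate the values up front. The sketch below assumes one refinement per file extension, with the tab label derived from the extension; the `refinement_filters` helper is a hypothetical name, and the output is simply text to paste into the “Optional word(s)” field by hand.

```python
def refinement_filters(extensions: list[str]) -> dict[str, str]:
    """Map a tab label (e.g. 'PDF') to its 'ext:' filter value (e.g. 'ext:pdf')."""
    filters = {}
    for ext in extensions:
        # Tolerate input such as ".PDF" or " ppt " by normalising it first.
        cleaned = ext.strip().lstrip(".").lower()
        if cleaned:
            filters[cleaned.upper()] = f"ext:{cleaned}"
    return filters


if __name__ == "__main__":
    for label, value in refinement_filters(["pdf", ".JPEG", " ppt "]).items():
        print(f"{label}: {value}")
```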
The main advantages of building custom search engines with Google are accuracy of sources and results limitation.
The trade-off is that your results will be limited to 10 pages, with each page displaying only 10 results – so you get a maximum of 100 hits per query.
This means you really have to define your queries well and avoid broad searches – for which you can always use the general Google search engine.
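The 10 × 10 limit is easy to picture as pagination offsets. The sketch below goes slightly beyond the web interface described here: it assumes you query your CSE through Google's Custom Search JSON API, using the endpoint and the `key`, `cx`, `q`, `num` and `start` parameters as I understand them. The engine ID and API key are placeholders, and the code only builds the request URLs without sending anything.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"
RESULTS_PER_PAGE = 10  # results shown per page
MAX_PAGES = 10         # pages available per query, so 100 hits in total


def page_urls(query: str, engine_id: str, api_key: str) -> list[str]:
    """Build one request URL per result page: start=1, 11, 21, ... 91."""
    urls = []
    for page in range(MAX_PAGES):
        params = {
            "key": api_key,    # placeholder, supply your own
            "cx": engine_id,   # placeholder ID of your custom search engine
            "q": query,
            "num": RESULTS_PER_PAGE,
            "start": page * RESULTS_PER_PAGE + 1,  # 1-based index of the first result
        }
        urls.append(f"{API_ENDPOINT}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in page_urls("leaked report", "YOUR_ENGINE_ID", "YOUR_API_KEY"):
        print(url)
```

Ten URLs is all you get per query, so a vague search term burns through the whole allowance without ever reaching the useful results.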
So that’s it in a nutshell; there are some more granular options within the CSE interface that you can explore and tweak to make the results display better or be more relevant to your OSINT angle.
Or, if you are feeling lazy, you can use some of my own custom search engines that I share below…
Social media sites
Forums & chats
Corporate & business
Files & content
GitHub – searches for code and open source software on GitHub (also works on usernames!)
Documents – searches for document files online, filters by extension type.
Photos & images – as above, but for graphical files in varieties of file formats.
Slideshare – looks for slide decks and presentations on Slideshare.
Google Drive – searches through publicly available content on people’s Google Drives.
Most wanted & sanctioned lists
Feel free to use my searches and tell me what works / doesn’t work!
Let me know if you would like to see any other custom search engines – reach out on Twitter.