How to Extract a List of Image URLs Indexed by Google

 

Back in 2014 I coded a nifty little JavaScript tool (adapting previous work by Liam Delahunty) to help with site auditing by extracting a list of URLs indexed by Google for a domain of your choice. The tool was initially created for internal use, but due to its popularity within the High Position office I decided to release it for all to use.

Since I published the bookmarklet and the related tutorial it has proved very useful to many people, and I’ve had some fantastic feedback from those who have used it, which is very welcome for a relatively straightforward piece of JavaScript. For some people, however, that’s just not enough.

The Challenge

In a recent comment a chap named David Radicke requested a similar tool, but this time extracting a list of URLs from Google’s Image search.

It’s not often I get the chance to code anymore, so when the opportunity does present itself I feel obliged to grab it with both hands. So, David, your wish is my command.

The Bookmarklet

I’ve created a bookmarklet which extracts the following information:

  • Image Domain - the domain on which the image resides
  • Image URL - the full URL of the image
  • Source Domain - the domain which references the image
  • Source URL - the page on which the image is referenced

In addition I’ve provided a raw list of image URLs for your reference.
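
To make those four fields a bit more concrete, here is a rough sketch of how they could be pulled out of a single Google image result link. It is not the bookmarklet's own code (that's further down); `extractFields` is a made-up helper name, the sample link is invented, and it assumes the link carries `imgurl` and `imgrefurl` parameters holding absolute URLs.

```javascript
// Hypothetical helper: splits one Google "imgres" link into the four fields
// listed above. Assumes imgurl and imgrefurl hold absolute URLs.
function extractFields(href) {
  var params = new URL(href).searchParams; // parses and decodes the query string
  var imageUrl = params.get('imgurl');
  var sourceUrl = params.get('imgrefurl');
  if (!imageUrl || !sourceUrl) {
    return null; // not an image result link
  }
  return {
    imageDomain: new URL(imageUrl).origin,   // e.g. https://example.com
    imageUrl: imageUrl,
    sourceDomain: new URL(sourceUrl).origin,
    sourceUrl: sourceUrl
  };
}

// Invented sample link, for illustration only
var sample = 'https://www.google.co.uk/imgres' +
  '?imgurl=https://example.com/img/photo.jpg' +
  '&imgrefurl=https://blog.example.org/post/';
console.log(extractFields(sample));
```

The built-in `URL` object does the splitting here, which saves handling `&` and `=` by hand.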

To get started drag and drop this bookmarklet into your ‘Bookmarks’ toolbar:

Next, head to Google and perform your search query within Google Image search. For the purpose of this example I’ve simply queried my own name.

With the image results on-screen in front of you, click on the bookmarklet and all being well you should see something similar to this:

Voila! Mission complete.

You can now use the tool to extract a list of images for a particular query. Please note the code is restricted to UK image SERPs only; see below for more information on how to change that. In David’s case he wanted to see how many images for a particular query originate from his domain, so I hope I’ve helped him to achieve his goal.
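
For a check like David's, the raw image URL list can be tallied by host. The snippet below is only a sketch: `countByHost` is a hypothetical helper and the URLs are made-up sample data standing in for the list the tool produces.

```javascript
// Hypothetical helper: counts how many image URLs sit on each host.
function countByHost(imageUrls) {
  var counts = Object.create(null); // plain lookup table with no inherited keys
  imageUrls.forEach(function (url) {
    var host = new URL(url).hostname;
    counts[host] = (counts[host] || 0) + 1;
  });
  return counts;
}

// Made-up sample data standing in for the raw image URL list
var urls = [
  'https://www.example.com/a.jpg',
  'https://www.example.com/b.png',
  'https://cdn.example.net/c.jpg'
];
var counts = countByHost(urls);
console.log(counts['www.example.com'] + ' of ' + urls.length + ' images'); // prints "2 of 3 images"
```

Paste your own list in place of `urls` and look up your own host name in the result.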

The Code

In case anyone is interested in developing and/or improving the bookmarklet I have provided the code below. As I mentioned earlier it’s relatively straightforward, but please let me know if you have any questions.


javascript:(function(){
	var output='<html><head><title>High Position SERP Link Generator</title><style type=\'text/css\'>body,table{font-family:Tahoma,Verdana,Segoe,sans-serif;font-size:11px;color:#000}h1,h2,th{color:#405850}th{text-align:left}h2{font-size:11px;margin-bottom:3px}table.data{table-layout: fixed;width: 100%;border-collapse: collapse;}table.data th, table.data td {overflow: hidden;border-bottom:1px solid #9eb8b0;padding:4px;}table.data th.id, table.data td.id {width: 50px;}</style></head><body>';
	output+='<table><tbody><tr><td><a href=\'https://www.highposition.com\'><img src=\'https://www.highposition.com/images/high-position.png\'></a></td><td><h1>Google Image Search URL Extractor</h1></td></tr></tbody></table>';
	// Image result links on the page carry the 'rg_l' class
	var pageAnchors=document.getElementsByClassName('rg_l');
	var linkcount=0;
	var imgURLList='';
	output+='<table class=\'data\'><tr><th class=\'id\'>ID</th><th>Image Domain</th><th>Image URL</th><th>Source Domain</th><th>Source URL</th></tr>';
	for(var i=0;i<pageAnchors.length;i++){
		var query = pageAnchors[i].href;
		var vars = query.split("&");
		var arr;
		// Reset for every link so a link without these parameters cannot reuse the previous row's values
		var imgurl = null;
		var sourceurl = null;
		for (var t=0;t<vars.length;t++) {
			var pair = vars[t].split("=");
			if(pair[0] == 'https://www.google.co.uk/imgres?imgurl'){
				imgurl=pair[1];
			}
			else if (pair[0] == 'imgrefurl'){
				sourceurl=pair[1];
			}
		}
		// Skip links that did not yield both URLs instead of failing on split()
		if(!imgurl || !sourceurl){
			continue;
		}
		linkcount++;
		imgURLList+=imgurl+'<br />';
		output+='<tr>';
		output+='<td class=\'id\'>'+linkcount+'</td>';
		// Splitting on '/' gives the scheme in arr[0] and the host in arr[2]
		arr = imgurl.split('/');
		output+='<td style=\'width:80%;\'>'+arr[0] + '//' + arr[2]+'</td>';
		output+='<td>'+decodeURI(imgurl)+'</td>';
		arr = sourceurl.split('/');
		output+='<td>'+arr[0] + '//' + arr[2]+'</td>';
		output+='<td>'+decodeURI(sourceurl)+'</td>';
		output+='</tr>\n';
	}
	output+='</table><br/><h2>Image URL List</h2><div>';
	output+=imgURLList;
	output+='</div><br/><br/><p align=center><a href=\'https://www.highposition.com\'>www.highposition.com</a></p>';
	// Write the report into a new window (avoids the deprecated with statement)
	var win=window.open();
	win.document.write(output);
	win.document.close();
})();

Using the Bookmarklet Outside of the UK?

As I mentioned earlier (hat tip to David for this quick fix) the code is currently restricted to Google.co.uk which inevitably means that image searches in other languages/geographic locations will fail.

This is because there is an IF statement in the inner loop of the code which references Google.co.uk as part of the URL stripping method:


if(pair[0] == 'https://www.google.co.uk/imgres?imgurl'){

To make the bookmarklet work in other geo-locations simply change .co.uk to your ccTLD. For example:

Change to Google.de for Germany:


if(pair[0] == 'https://www.google.de/imgres?imgurl'){

Change to Google.com.au for Australia:


if(pair[0] == 'https://www.google.com.au/imgres?imgurl'){

In future I’ll try to re-code the bookmarklet so it’ll work in any geo-location but for the time being you’ll have to manually amend the code. Still, it’s pretty simple! :-)
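
If you'd rather not edit the domain by hand, one possible approach is to match any Google country domain with a pattern instead of a fixed string. This is only a sketch and hasn't been tested against live results; `IMGRES_KEY` and `isImgUrlKey` are made-up names, and the pattern is deliberately loose and assumes, as the existing check does, that `imgurl` is the first parameter in the link.

```javascript
// Hypothetical replacement for the hard-coded comparison: accepts
// www.google.<any country domain> over http or https.
var IMGRES_KEY = /^https?:\/\/www\.google\.[a-z.]+\/imgres\?imgurl$/;

function isImgUrlKey(key) {
  return IMGRES_KEY.test(key);
}

console.log(isImgUrlKey('https://www.google.co.uk/imgres?imgurl'));  // true
console.log(isImgUrlKey('https://www.google.de/imgres?imgurl'));     // true
console.log(isImgUrlKey('http://www.google.com.au/imgres?imgurl'));  // true
console.log(isImgUrlKey('imgrefurl'));                               // false
```

In the bookmarklet the IF statement would then read `if(isImgUrlKey(pair[0])){`, with the two definitions above placed inside the function.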

Other Nifty Tools

For anyone looking for other bookmarklets to assist with common tasks, Tom Jepson created this awesome list of SEO bookmarklets which features many time-saving tools. Check it out, and if you have any suggestions please let us know.

Disclaimer: There are a couple of points to bear in mind when using my Google Images URL extractor tool.

  1. I threw this together pretty quickly. I’ve tested it using several different scenarios and to the best of my knowledge it works fine, but it may be prone to error.
  2. Google change their coding all the time so whilst the tool works at the moment it may need tweaking at a later date.

If anyone notices the tool not functioning as expected please let me know.

Happy extracting!

 

11 thoughts on “How to Extract a List of Image URLs Indexed by Google”

    • Glad you found it useful. Interesting that it dies in Google.de. If you could share how you altered it that would be great and save me debugging it. Perhaps I can somehow incorporate a location filter? Now that really would be a challenge.

      As for a beer unfortunately I’m not sure. We definitely need a UK version of beer2buds.com for cases just like this. Alternatively I’ll take a PayPal donation 😉

      Chris :-)

      • Hi Chris,
        well, for the country change I just replaced (.co.uk) with (.de) in this line :-)
        if(pair[0] == 'http://www.google.co.uk/imgres?imgurl'){
        That was so easy that even I could do it :-)

        If you had a paypal-buy-a-beer button or a link to your amazon wishlist, I’d be glad to use it :-) (there are too many Chris Aintworths on amazon.co.uk….)

      • Thanks for the code change, I’ll add that to the tutorial in a sec as that’s a good point.

        I’ll definitely suggest a PayPal ‘Buy a Beer’ button to the web dev team. Someone also suggested a ‘donate Bitcoin’ button but I’m not sure either of the extract URL tutorials are that worthy lol :-)

        Chris

  1. hi chris, this is the most useful bookmarklet I’ve ever used. I’m using it to save lots of wallpapers to my computer. it’s such a time saver.

    By the way chris, do you have bookmarklet to extract url from bing image search?

    thanks, your article fans
    Gerga

  2. Hi Chris,

    Good idea. I could use this tool.
    Unfortunately, I have no idea how the recoding works.
    I simply don’t know where the code is which I could manually amend…
    Can you give me a hint?

    Thank you so much,

    jens
