
Manipulating Google Results – Ajax Version ...

A good friend of mine recently asked for my help with JavaScript. Don’t get me wrong, I’m no expert on the subject, but I’m always curious when experts turn to me for help (unfortunately it doesn’t happen very often).

My friend has developed a Greasemonkey script that manipulates Google search results. This is not a complicated task, and there are plenty of scripts out there that do it for various purposes. For example:

    * Google FX
    * Twitter Search Result on Google

A short description of Greasemonkey:
According to Wikipedia: “Greasemonkey is a Mozilla Firefox add-on that allows users to install scripts that make on-the-fly changes to most HTML-based web pages”. Basically, Greasemonkey allows you to run JavaScript code after a page is loaded, meaning you can manipulate the DOM as much as you want. In our case, after the page is loaded you can access the search results and do whatever you want with them (for example, add a thumbnail of each URL).

The thing that bothered my friend was that his code suddenly stopped working in some cases. After running a few tests, he found out that his code was being called before the search results were even loaded into the DOM. Considering what I’ve just explained about Greasemonkey, this is very strange behavior.

After running a few more tests (thank god for Firebug), I found out that Google has started to use their Ajax engine for some of their traditional searches. I say “some” since I didn’t actually figure out when they use Ajax and when they return the results straight from the server. To tell whether a search is being done through the Ajax engine, check the URL of the results: if you see the # sign, it’s the Ajax version. Using the # sign is actually a nice trick when developing web applications (I think it would also make an interesting post).
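As a rough sketch, the URL check described above could look like the following. The function name `isAjaxSearch` and the example URLs are made up for illustration; the post only says to look for the # sign.

```javascript
// Returns true when the results URL contains a "#" sign,
// which the post takes as the mark of the Ajax version.
// The function name is hypothetical.
function isAjaxSearch(url) {
  return url.indexOf("#") !== -1;
}

// Illustrative URLs only, not taken from the original post.
var plainUrl = "http://www.google.com/search?q=greasemonkey";
var ajaxUrl = "http://www.google.com/#q=greasemonkey";

var plainResult = isAjaxSearch(plainUrl); // false
var ajaxResult = isAjaxSearch(ajaxUrl);   // true
```

Inside a user script you would typically pass `window.location.href` to such a function.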


Ajax search

Google’s Ajax search works in the following way: the first page that the server returns is a skeleton document, or container document, which makes Ajax queries to Google and fetches the search results. So now the question is: how can one know when the Ajax callback has been executed, that is, when the results are ready for use?

I can ask the question in a more general way: how can you tell when external code (code that you didn’t write) has run its Ajax callback? This is a very legitimate question, especially when writing Greasemonkey scripts.

I want to introduce three techniques that solve this problem. While one of these solutions is better than the other two, I think it’s important to go over all of them, since you may find one of them better suited to a similar situation:

   1. Recursive Lookups
   2. Hijacking the Callback
   3. DOM Notification Events

Recursive Lookups

This is the most straightforward implementation that I can think of, and it will probably work in almost any similar situation. Since we know which DOM object we are trying to access, we can define a time frame of X seconds during which we check every Y milliseconds whether or not the DOM object is accessible.

The following code demonstrates this technique:


<html>
<head>
<script type="text/javascript">
// the number of seconds to keep checking
var timeFrame = 5; // up to 5 seconds
// the number of checks per second
var repeatsInSecond = 5; // 5 checks per second (every 200 milliseconds)
// the id of the DOM object we are waiting for
var domObjId = "myDomObj";

var domExist = false;
(function()
{
  // milliseconds between two consecutive checks
  var interval = 1000 / repeatsInSecond;
  for( var x = 0 ; x < timeFrame ; x++ )
  {
    for( var i = 0 ; i < repeatsInSecond ; i++ )
    {
      // schedule every check up front, one per interval
      window.setTimeout( function(){
        // an earlier check already found the object
        if( domExist ) return;

        if( document.getElementById(domObjId) !== null )
        {
          domExist = true;
          alert("do something here...");
        }
      } , interval * ( x * repeatsInSecond + i + 1 ) );
    }
  }
})();
</script>

<script type="text/javascript">
// simulates an ajax callback that inserts the object after 2.5 seconds
window.setTimeout( function(){ document.getElementById("myDiv").innerHTML = "<span id='myDomObj'>now i am here</span>"; } , 2500 );
</script>
</head>

<body>
<div id="myDiv">
When you try your best, but you don't succeed
When you get what you want, but not what you need
When you feel so tired, but you can't sleep
Stuck in reverse...
</div>
</body>
</html>
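The same idea can also be written so that each check schedules the next one, which is closer to what the name “recursive” suggests and leaves no pending timers once the object is found. This is only a minimal sketch: `pollUntil` and its `interval`, `timeout`, `schedule` and `onTimeout` options are hypothetical names, not part of the original script.

```javascript
// Poll until check() returns a truthy value or the time budget runs out.
// All names here are illustrative assumptions.
function pollUntil(check, onFound, options) {
  var opts = options || {};
  var interval = opts.interval || 200; // milliseconds between checks
  var timeout = opts.timeout || 5000;  // total time budget in milliseconds
  // The timer function can be swapped out; by default it is setTimeout.
  var schedule = opts.schedule || function (fn, ms) { return setTimeout(fn, ms); };
  var elapsed = 0;

  function attempt() {
    var result = check();
    if (result) {
      onFound(result); // the object is there, stop polling
      return;
    }
    elapsed += interval;
    if (elapsed < timeout) {
      schedule(attempt, interval); // not there yet, look again later
    } else if (opts.onTimeout) {
      opts.onTimeout(); // give up after the time budget is spent
    }
  }

  attempt();
}

// Usage in a page or user script (browser only, so left as a comment):
// pollUntil(
//   function () { return document.getElementById("myDomObj"); },
//   function (node) { alert("do something here..."); }
// );
```

The first check runs immediately, and a later check is only scheduled when the previous one failed.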

Hijacking the Callback

This method uses some of the built-in functionality of the JavaScript language, which resembles method overriding in C++ and Java. The concept here is that if you know the name of the callback method, you can hijack it (don’t worry, it’s not criminal, at least not yet), so that your code is called as the callback function for the Ajax request. Once you accomplish that, you can decide whether to run your code and then the original callback, or the other way around.

I’ve written the following script to demonstrate this technique:


<html>
<head>
<script type="text/javascript">
// the object whose method we are going to hijack
var MyObject = {
  oldMethod: function(text) {
    alert("oldMethod:" + text);
  }
};

// stands in for the ajax callback: called once the page has loaded
window.onload = function(e){
  MyObject.oldMethod("It is me!");
};
</script>

<script type="text/javascript">
var Hijacking = {
  // keep a reference to the original method
  oldMethod: MyObject.oldMethod,
  // our replacement: run our code first, then the original
  newMethod: function(text) {
    alert("newMethod:" + text);
    Hijacking.oldMethod(text);
  }
};
// the actual hijack: callers of MyObject.oldMethod now reach newMethod
MyObject.oldMethod = Hijacking.newMethod;
</script>

</head>

<body>
You got wires, going in
You got wires, coming out of your skin
You got tears, making tracks
I got tears, that are scared of the facts
</body>
</html>

I’ve created an object called MyObject that has only one method, which takes a text and prints it with the alert function along with the method name. For convenience I call this method when the document is loaded, but it could just as well be the Ajax callback.

Then I’ve created a script with the Hijacking object, which saves a reference to the original callback method that we’re trying to hijack and also implements its own callback method, “newMethod”. This new method prints the text and its name, and then calls the original method. At the end, I hijack the callback (oldMethod), and that is pretty much it.

When you run this script, you should get two alerts, in this order: “newMethod:It is me!” followed by “oldMethod:It is me!”.

While this is a nice trick, for our specific problem this technique requires us to know the name of the callback method that Google uses. Since Google obfuscates their code, it could take a while to locate the callback method name, and even then the name could change. This is why we need a simpler and non-intrusive approach.
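For cases where the callback name is known, the hijacking step can be wrapped in a small reusable helper. The sketch below is an assumption layered on the technique, not the script from this post: the name `hijack` is made up, and unlike the demo it forwards `this` and all arguments to the original method and returns a function that undoes the hijack.

```javascript
// Replace obj[name] with a wrapper that runs `before` and then the original.
// Returns a function that puts the original method back.
// The helper name and shape are hypothetical.
function hijack(obj, name, before) {
  var original = obj[name];
  obj[name] = function () {
    before.apply(this, arguments);           // our code runs first
    return original.apply(this, arguments);  // then the original callback
  };
  return function restore() {
    obj[name] = original;
  };
}

// A log array replaces alert() so the order of calls is easy to inspect.
var log = [];
var Target = {
  prefix: "oldMethod:",
  oldMethod: function (text) {
    log.push(this.prefix + text); // relies on `this` being Target
  }
};

var restore = hijack(Target, "oldMethod", function (text) {
  log.push("newMethod:" + text);
});

Target.oldMethod("It is me!"); // logs newMethod first, then oldMethod
restore();
Target.oldMethod("again");     // only the original runs now
```

Forwarding `this` matters when the original callback reads state from its own object, which the demo method does not do.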


DOM Notification Events

Usually when an Ajax call is made, handling the response involves some changes to the DOM. Luckily for us, we can register for such events. According to the W3C, there are events that notify you about changes in the DOM:

    * DOMNodeInserted
    * DOMNodeRemoved
    * DOMAttrModified

After a short examination of the DOM of the Google search results, I found out that Google has a div with id=’foot’.
Google foot div


This is the div that contains all the HTML elements responsible for the paging controls. I found out that Google changes this div’s visibility from visible to hidden and back. This means that Google hides this div when the Ajax request is made and shows it again when the results are ready to be displayed. All we need to do is catch this event and do whatever it is that we want to do.

The following code demonstrates this technique:


document.addEventListener("DOMAttrModified", function (event)
{
    // react only to attribute changes on the paging div
    if (event.target.id == "foot")
    {
         //do your thing...
    }
}, false);
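Mutation events such as DOMAttrModified have since been deprecated in browsers in favor of the MutationObserver API. The following is a rough sketch of the same check with that API. It is a swapped-in technique, not what the original script used; the function names are made up, and watching the whole document for attribute changes is an assumption that mirrors the listener above.

```javascript
// True when a mutation record describes an attribute change
// on the element with the given id. Hypothetical helper name.
function isAttributeChangeOn(record, id) {
  return record.type === "attributes" &&
    Boolean(record.target) &&
    record.target.id === id;
}

// Calls onChange for every attribute change on the element with this id.
// Returns the observer, or null where the API is not available.
function watchElementAttributes(id, onChange) {
  if (typeof MutationObserver === "undefined" || typeof document === "undefined") {
    return null;
  }
  var observer = new MutationObserver(function (records) {
    for (var i = 0; i < records.length; i++) {
      if (isAttributeChangeOn(records[i], id)) {
        onChange(records[i]);
      }
    }
  });
  // Watch attribute changes anywhere below the root element.
  observer.observe(document.documentElement, { attributes: true, subtree: true });
  return observer;
}

// Usage in a page or user script (browser only, so left as a comment):
// watchElementAttributes("foot", function (record) {
//   // do your thing...
// });
```

Calling `disconnect()` on the returned observer stops the notifications when they are no longer needed.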

 Original Source:
http://www.amirharel.com/2009/07/19/manipulating-google-results-ajax-version/


Posted at 08:58:13 am | Permalink | Posted in Google  
