
Google Base for Vacation Rentals

Posted: Sun, 5th August 2007 in PHP

Recently, one of our clients, Riviera Rental Guide, a provider of holiday rental accommodation in the French and Italian Rivieras, contacted us to discuss the possibility of integrating their rental listings with Google Base. We had certainly heard of Base but had only submitted items manually or in bulk, never through the application programming interface (API). So here it is: a comprehensive guide to inserting and updating items in Google Base.

What is Google Base? From the lads themselves: "Google Base is a place where you can easily submit all types of online and offline content, which we'll make searchable on Google". With the API, this means that we can programmatically submit our information (or any information from a data-driven website) in a strict markup language (i.e. XML) so that it is "more" searchable. What does "more" searchable mean? Well, the trouble with documents on the web is that the relevant elements of a particular page are enclosed, or marked up, in tags which carry no context with reference to other pages. If we take two websites which are essentially about the same thing, it is a fairly safe bet that some things will be bold in one page and not in the other, some things will be italicised, the headings will differ, and generally the pages will look very different from each other despite representing quite similar concepts.

This poses a huge problem for search engines because they have to determine the context of a page and indeed what it is about, e.g. vacation rentals in the south of France. They determine this by, among other things, analysing the page content, analysing the content of pages that link to that page, analysing the link text of pages that link to that page, etc. This would be fine if there were a one-to-one relationship between products/services and websites, i.e. if there were only one place on the web to purchase flights. But search engines need to return the most relevant pages first, which means they need to compare different pages, and as we well know, there's little point in comparing apples with oranges.

This is where XML and Google Base level the playing field and make it much easier for users to find what they are looking for. Content providers are required to publish their information in a standardised format. For the vacation rentals item type, for example, you have to provide the number of bedrooms, the number of bathrooms, the price of the rental property and the location, so any ambiguity is effectively removed. It means that a rental listing from Riviera Rental Guide will look the same as a rental listing from a competitor. This has advantages. From a user's perspective, it means that they can easily retrieve the most relevant items and they see the items in a standardised format. From a content creator's perspective, they can be the smallest player in their field, but if they have the right product they no longer need to compete in terms of having the biggest website or the most expensive technology. In fact, in order to use Google Base one doesn't even need a website.

Anyway, let's get down to business: how do we go about adding listings to Base? First, get yourself a Google account. Second, get yourself an API developer key. I realise that this isn't at all helpful, but I don't know where you get these nowadays; I've had one sitting in my Gmail inbox since the dawn of time, and I use it for everything. Here's what we're going to do:

- Authenticate ourselves for querying using our developer key
- Grant access to our application
- Pull down all our existing Base items
- Serialize our existing items to an external file
- Authenticate ourselves for inserting/updating using our developer key
- Grant access to our application
- Unserialize our existing items
- Set up the XML feed
- Post the information to Google Base

Here's the logic in PHP, we'll look at the functions in more detail later:


if(empty($_GET['token']))
{
   authenticate(true);
}
else
{
   // Second time the page is loaded
   if(!empty($_GET['postStage']))
   {
      // After serialization, post our items
      $mu_token = exchangeToken($_GET['token']);
      $existing_items = unserialize(file_get_contents(".rrg_gbase"));
      $feed = establishFeed();
      $postResponse = postItems($mu_token);
      print $postResponse;
   }
   else
   {
      // Serialize our existing GBase items
      $mu_token = exchangeToken($_GET['token']);
      $existing_items = getItems($mu_token);
      $fh = fopen(".rrg_gbase","w+");
      fwrite($fh,serialize($existing_items));
      fclose($fh);
      authenticate(false);
   }
}
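
The serialize/unserialize round trip used in the logic above can be tried on its own. The following is only a sketch: saveItems and loadItems are hypothetical helper names (they are not part of our script), and the temporary file stands in for .rrg_gbase.

```php
<?php
// Hypothetical helper: writes the id map to disk in serialized form.
function saveItems($path, array $items)
{
    return file_put_contents($path, serialize($items)) !== false;
}

// Hypothetical helper: reads the id map back, falling back to an empty
// array when the file is missing or does not hold a serialized array.
function loadItems($path)
{
    if (!is_readable($path)) { return array(); }
    $items = unserialize(file_get_contents($path));
    return is_array($items) ? $items : array();
}

// Made-up example data: internal id => Google id.
$path = tempnam(sys_get_temp_dir(), 'gbase');
saveItems($path, array(42 => 'http://www.google.com/base/feeds/items/111'));
print_r(loadItems($path));
unlink($path);
```

Falling back to an empty array means a first run, before any file exists, simply treats every listing as new.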


So the first time the page loads, we don't have an authentication token from Google, so we need to get one. We get that using our authenticate function, which goes a little something like this:


function authenticate($query=false)
{
  global $itemsFeedURL, $itemsFeedURLQuery;

  $next_url  = 'http://' . $_SERVER['HTTP_HOST'] . $_SERVER['PHP_SELF'];
  if(!$query) { $next_url .= '?postStage=1'; }
  $redirect_url = 'https://www.google.com/accounts/AuthSubRequest?session=1';
  $redirect_url .= '&next=';
  $redirect_url .= urlencode($next_url);
  $redirect_url .= "&scope=";
  $redirect_url .= ($query) ? urlencode($itemsFeedURLQuery) : urlencode($itemsFeedURL);
  header("Location:".$redirect_url);
  exit();
}


We can pass a flag to the function which determines which URL we want access to and which URL we would like to return to once authentication is successful. The definitions of the two URL variables are as follows:


$itemsFeedURL = "http://www.google.com/base/feeds/items/batch";
$itemsFeedURLQuery = "http://www.google.com/base/feeds/items";
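
To see what the redirect looks like without actually sending a header, the URL construction can be pulled out into a function of its own. This is a sketch only: buildAuthSubUrl is a hypothetical helper, and www.example.com stands in for your own host.

```php
<?php
// Hypothetical helper: builds the AuthSub redirect URL without sending it.
// http_build_query() URL-encodes each value for us.
function buildAuthSubUrl($next_url, $scope_url)
{
    $params = array(
        'session' => 1,
        'next'    => $next_url,
        'scope'   => $scope_url,
    );
    return 'https://www.google.com/accounts/AuthSubRequest?'
         . http_build_query($params, '', '&');
}

// Example values: the return URL is made up, the scope is the batch feed.
$url = buildAuthSubUrl(
    'http://www.example.com/gbase.php?postStage=1',
    'http://www.google.com/base/feeds/items/batch'
);
print $url . "\n";
```

Keeping the URL building separate from the header() call makes it easy to check that the next and scope values survive encoding intact.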


When our page is first requested there is no token, so we ask for one by redirecting to Google. Google prompts the user to log in and then asks, on behalf of our application, for permission to access the user's Base items. Once the user accepts, they are redirected back to the script with an authentication token, which we need in order to do anything useful. Our logic now checks whether the postStage variable is set. If it's not, we know that we haven't serialized our existing items yet. First, though, we need to exchange our single-use token for a multi-use token. This is accomplished with a call to exchangeToken.


function exchangeToken($token)
{
  $ch = curl_init();

  curl_setopt($ch, CURLOPT_URL, "https://www.google.com/accounts/AuthSubSessionToken");
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_FAILONERROR, true);
  curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Authorization: AuthSub token="' . $token . '"'
  ));

  $result = curl_exec($ch);
  curl_close($ch);

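  // curl_exec() returns false when the request fails
  if($result === false) { return false; }

  // split() was removed in PHP 7; explode() does the same job here
  $splitStr = explode("=", $result, 2);
  return isset($splitStr[1]) ? trim($splitStr[1]) : false;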
}
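
If the response ever carries more than one line, taking everything after the first "=" would pick up too much. Below is a slightly more defensive sketch, assuming the body is made up of key=value lines with the token on a line starting with Token=. parseSessionToken is a hypothetical helper and the sample bodies are made up.

```php
<?php
// Hypothetical helper: pulls the session token out of a key=value body.
function parseSessionToken($body)
{
    foreach (preg_split('/\r\n|\r|\n/', trim($body)) as $line) {
        // Limit of 2 so a token that itself contains "=" stays whole.
        $parts = explode('=', $line, 2);
        if (count($parts) === 2 && trim($parts[0]) === 'Token') {
            return trim($parts[1]);
        }
    }
    return false; // no Token line found
}

var_dump(parseSessionToken("Token=abc123\n"));
var_dump(parseSessionToken("not a token response"));
```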


Then we call our function getItems, which will query Google Base for all of Riviera Rental Guide's existing holiday rental listings. The function fires off a request to Base, passing in our authentication token and our developer key. Base returns an XML document describing all of our existing items. We parse the XML document (not using an XML parser because they are desperately slow) and link Google's id for the entry (which is basically a URL) to Riviera Rental Guide's internal id for the rental listing. Every time we create an item, we attach a custom attribute in the custom "c" namespace of the XML document called "iid", i.e. internal id. The getItems function looks like this:


function getItems($token)
{
  $ch = curl_init();
  global $developerKey, $itemsFeedURLQuery;

  curl_setopt($ch, CURLOPT_URL, $itemsFeedURLQuery);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Content-Type: application/atom+xml',
    'Authorization: AuthSub token="' . trim($token) . '"',
    'X-Google-Key: key=' . $developerKey
  ));

  $result = curl_exec($ch);
  curl_close($ch);

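  // Tag literals assumed: Google's id sits in <id>...</id> and our internal
  // id in the custom <c:iid ...>...</c:iid> attribute of each entry.
  // Adjust them to match the markup Base actually returns.
  $existing_items = array();
  if($result === false) return $existing_items;

  // Start at the first entry so we skip the feed-level <id>
  $offset = strpos($result,"<entry");
  while($offset !== FALSE)
  {
     $strpos_a = strpos($result,"<id>",$offset);
     if($strpos_a === FALSE) break;
     $strpos_a += strlen("<id>");
     $strpos_b = strpos($result,"</id>",$strpos_a);
     if($strpos_b === FALSE) break;
     $google_id = substr($result,$strpos_a,($strpos_b-$strpos_a));
     $strpos_c = strpos($result,"<c:iid",$strpos_b);
     if($strpos_c === FALSE) break;
     $strpos_c = strpos($result,">",$strpos_c)+1;
     $strpos_d = strpos($result,"</c:iid>",$strpos_c);
     if($strpos_d === FALSE) break;
     $rrg_id = substr($result,$strpos_c,($strpos_d-$strpos_c));
     $offset = $strpos_d;
     $existing_items[$rrg_id] = $google_id;
  }
  return $existing_items;
}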
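
The same id mapping can also be tried against a hand-written sample. The sketch below swaps the strpos walk for regular expressions and handles one entry at a time, so an entry with a missing internal id cannot shift the pairing. mapEntries is a hypothetical helper. The element names (id and c:iid) and the sample item URLs are assumptions, not output captured from Base.

```php
<?php
// Hypothetical helper: maps internal ids to Google ids, one pair per entry.
function mapEntries($xml)
{
    $items = array();
    preg_match_all('#<entry\b.*?</entry>#s', $xml, $entries);
    foreach ($entries[0] as $entry) {
        // Only record the pair when both ids are present in this entry.
        if (preg_match('#<id>(.*?)</id>#s', $entry, $gid) &&
            preg_match('#<c:iid\b[^>]*>(.*?)</c:iid>#s', $entry, $iid)) {
            $items[trim($iid[1])] = trim($gid[1]);
        }
    }
    return $items;
}

// Made-up sample feed with a feed-level id and two entries.
$sample = "<feed><id>http://www.google.com/base/feeds/items</id>"
        . "<entry><id>http://www.google.com/base/feeds/items/111</id>"
        . "<c:iid type='text'>42</c:iid></entry>"
        . "<entry><id>http://www.google.com/base/feeds/items/222</id>"
        . "<c:iid type='text'>57</c:iid></entry></feed>";
print_r(mapEntries($sample));
```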


The getItems function returns an array with Riviera Rental Guide's internal id as the key and Google's id as the value. We then serialize this array to the file system so that we have a record of it. Once we have serialized the data, we re-authenticate ourselves so that we can update our existing items and add any new rental listings. This time, we don't set the query flag...

authenticate(false);


...so the function will now append a variable called postStage to our return URL. This lets us determine whether we are supposed to be pulling down our existing items or posting our items. To the user who runs our script, it simply looks as if the Google page asking for permission on behalf of our script refreshes and asks twice; they won't even notice that all of the existing items have been serialized in between. Once the user grants access, our page gets another token. Once again, we exchange the token for a multi-use token. It's not strictly necessary here, but it's a handy step if we decide to modify the script in any way. After that, we call establishFeed, which creates the XML data from Riviera Rental Guide's database. Due to client confidentiality we can only show bits and pieces of this function, but in essence all that happens is that we create a data abstraction layer that maps Riviera Rental Guide's database fields to Google's preferred attribute names. Once we have pulled down all the vacation rental listings and we are in the loop, we execute the following lines of code for each entry we would like to add to Google Base:


$entry = new GBaseEntry();
if(array_key_exists($id,$existing_items))
{
  $entry->setParam('id',$existing_items[$id]);
  $entry->setParam('batch_operation',"update");
}
else
{
   $entry->setParam('id',$id);
   $entry->setParam('batch_operation',"insert");
}
$entry->setParam('title', ...);
$entry->setParam('g_agent', ...);
$entry->setParam('g_summary', ...);
$entry->setParam('g_description', ...);
$entry->setParam('g_location', ...);
$entry->setParam('g_latitude', ...);
$entry->setParam('g_longitude', ...);
$entry->setParam('link', ...);
$entry->setParam('g_price', ...); // e.g. 35 EUR
$entry->setParam('g_price_type', ...); // e.g. starting
$entry->setParam('g_price_units', ...); // e.g. night
$entry->setParam('g_property_type', ...);
$entry->setParam('g_bedrooms', ...);
$entry->setParam('g_bathrooms', ...);
$entry->setParam('g_sleeps', ...);
$entry->setParam('g_area', ...); // e.g. 42 sq.m.
$entry->setParam('g_feature', ...); // e.g. BBQ, Swimming Pool, Terrace
$entry->setParam('g_zoning', ...); // e.g. Residential
$entry->setParam('g_listing_status', ...); // e.g. active
$entry->setParam('g_item_type', ...); // e.g. Vacation rentals
$entry->setParam('c_iid', ...); // Custom Internal Id
if(!empty($img_src)) { $entry->setParam('g_image_link', ...); }
$feed->addEntry($entry->XMLize());
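
Since we can't show the real mapping, here is a rough sketch of what such a data abstraction layer might look like. Everything on the database side is hypothetical: the column names (listing_id, headline, town and so on) and the mapRow helper are made up for illustration and are not Riviera Rental Guide's schema.

```php
<?php
// Hypothetical column names on the left; Base parameter names on the right.
$field_map = array(
    'listing_id' => 'c_iid',
    'headline'   => 'title',
    'bedrooms'   => 'g_bedrooms',
    'bathrooms'  => 'g_bathrooms',
    'town'       => 'g_location',
);

// Translates one database row into name => value pairs for setParam(),
// skipping columns that are missing or empty.
function mapRow(array $row, array $field_map)
{
    $params = array();
    foreach ($field_map as $column => $param) {
        if (isset($row[$column]) && $row[$column] !== '') {
            $params[$param] = $row[$column];
        }
    }
    return $params;
}

// Made-up row, as it might come back from the database.
$row = array('listing_id' => 42, 'headline' => 'Villa near Nice',
             'bedrooms' => 3, 'bathrooms' => '', 'town' => 'Nice, France');
print_r(mapRow($row, $field_map));
```

Each resulting pair could then be handed to setParam in a loop, which keeps the knowledge of the database layout in one array.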


We use two classes throughout the process, GBaseFeed and GBaseEntry, to provide solid, extensible functionality to Riviera Rental Guide. The included file containing these two classes looks as follows. Please note that if I weren't so lazy, I probably would've written individual get/set functions for each parameter instead of just the two, getParam and setParam.


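<?php
class GBaseFeed
{
        var $title;
        var $desc;
        var $author;
        var $author_email;
        var $id;
        var $link;
        var $listings = array();

        // Method name assumed by analogy with the other setters in this class
        function setFeedTitle ( $title )
        {
            $this->title = $title;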
        }

        function setFeedDescription ( $desc )
        {
            $this->desc = $desc;
        }

        function setFeedAuthor ( $author , $email)
        {
            $this->author = $author;
            $this->author_email = $email;
        }

        function setFeedId ( $id )
        {
            $this->id = $id;
        }

        function setFeedLink ( $link )
        {
            $this->link = $link;
        }

        function addEntry($item_data)
        {
           $this->listings[] = $item_data;
        }

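        function newBaseFeed ( )
        {
            // Element names and namespaces assumed from the Atom and
            // Google data batch feed conventions
            $feed = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?>\r\n";
            $feed .= "<feed xmlns=\"http://www.w3.org/2005/Atom\""
                   . " xmlns:g=\"http://base.google.com/ns/1.0\""
                   . " xmlns:c=\"http://base.google.com/cns/1.0\""
                   . " xmlns:batch=\"http://schemas.google.com/gdata/batch\">\r\n";

            if(!empty($this->title))
            {
               $feed .= "<title type=\"text\">".$this->title."</title>\r\n";
            }
            if(!empty($this->link))
            {
               $feed .= "<link rel=\"alternate\" type=\"text/html\" href=\"".$this->link."\"/>\r\n";
            }
            if(!empty($this->author))
            {
               $feed .= "<author>\r\n<name>".$this->author."</name>\r\n";
               $feed .= "<email>".$this->author_email."</email>\r\n</author>\r\n";
            }
            if(!empty($this->id))
            {
               $feed .= "<id>".$this->id."</id>\r\n";
            }
            foreach($this->listings as $listing)
            {
               $feed .= $listing;
            }
            $feed .= "</feed>";
            // Escape bare ampersands so the feed stays well-formed
            return str_replace("&","&amp;",$feed);
        }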
}

class GBaseEntry
{
   var $batch_id;
   var $batch_operation;
   var $id;
   var $title;
   var $link;
   var $g_agent;
   var $g_summary;
   var $g_location;
   var $g_price;
   var $g_price_type;
   var $g_image_link;
   var $g_property_type;
   var $g_bedrooms;
   var $g_bathrooms;
   var $g_sleeps;
   var $g_area;
   var $g_feature;
   var $g_zoning;
   var $g_listing_status;
   var $g_listing_type;
   var $g_latitude;
   var $g_longitude;
   var $g_item_type;
   var $g_description;
   var $g_price_units;
   var $c_iid;

   function setParam($name, $value)
   {
           // Variable property access does the job of eval() without
           // executing an arbitrary string
           $this->$name = $value;
   }

   function getParam($name)
   {
           return $this->$name;
   }

   function XMLize()
   {
      // Element names assumed from the parameter names: g_* maps to the
      // g: namespace, c_* to the custom c: namespace, batch_* to batch:
      $entry = '';
      $entry .= "<entry>\r\n";

      if(!empty($this->batch_id))
      $entry .= "<batch:id>".$this->batch_id."</batch:id>\r\n";

      if(!empty($this->batch_operation))
      $entry .= "<batch:operation type=\"".$this->batch_operation."\" />\r\n";

      if(!empty($this->id))
      $entry .= "<id>".$this->id."</id>\r\n";

      if(!empty($this->title))
      $entry .= "<title type=\"text\">".$this->title."</title>\r\n";

      if(!empty($this->g_agent))
      $entry .= "<g:agent>".$this->g_agent."</g:agent>\r\n";

      if(!empty($this->g_summary))
      $entry .= "<g:summary>".$this->g_summary."</g:summary>\r\n";

      if(!empty($this->g_description))
      $entry .= "<g:description>".$this->g_description."</g:description>\r\n";

      if(!empty($this->g_price))
      $entry .= "<g:price>".$this->g_price."</g:price>\r\n";

      if(!empty($this->g_price_type))
      $entry .= "<g:price_type>".$this->g_price_type."</g:price_type>\r\n";

      if(!empty($this->g_price_units))
      $entry .= "<g:price_units>".$this->g_price_units."</g:price_units>\r\n";

      if(!empty($this->g_image_link))
      $entry .= "<g:image_link>".$this->g_image_link."</g:image_link>\r\n";

      if(!empty($this->g_property_type))
      $entry .= "<g:property_type>".$this->g_property_type."</g:property_type>\r\n";

      if(!empty($this->g_bedrooms))
      $entry .= "<g:bedrooms>".$this->g_bedrooms."</g:bedrooms>\r\n";

      if(!empty($this->g_bathrooms))
      $entry .= "<g:bathrooms>".$this->g_bathrooms."</g:bathrooms>\r\n";

      if(!empty($this->g_sleeps))
      $entry .= "<g:sleeps>".$this->g_sleeps."</g:sleeps>\r\n";

      if(!empty($this->g_area))
      $entry .= "<g:area>".$this->g_area."</g:area>\r\n";

      if(!empty($this->g_feature))
      $entry .= "<g:feature>".$this->g_feature."</g:feature>\r\n";

      if(!empty($this->g_zoning))
      $entry .= "<g:zoning>".$this->g_zoning."</g:zoning>\r\n";

      if(!empty($this->g_listing_status))
      $entry .= "<g:listing_status>".$this->g_listing_status."</g:listing_status>\r\n";

      if(!empty($this->g_listing_type))
      $entry .= "<g:listing_type>".$this->g_listing_type."</g:listing_type>\r\n";

      if(!empty($this->g_item_type))
      $entry .= "<g:item_type>".$this->g_item_type."</g:item_type>\r\n";

      if(!empty($this->g_location))
      {
         // Latitude and longitude are nested inside the location element
         $entry .= "<g:location>".$this->g_location;
         if(!empty($this->g_latitude) &&
            !empty($this->g_longitude))
         {
            $entry .= "<g:latitude>".$this->g_latitude."</g:latitude>\r\n";
            $entry .= "<g:longitude>".$this->g_longitude."</g:longitude>\r\n";
         }
         $entry .= "</g:location>\r\n";
      }
      if(!empty($this->link))
      $entry .= "<link rel=\"alternate\" type=\"text/html\" href=\"".$this->link."\"/>\r\n";

      if(!empty($this->c_iid))
      $entry .= "<c:iid type=\"text\">".$this->c_iid."</c:iid>\r\n";

      $entry .= "</entry>\r\n";

      return $entry;
   }
}
?>
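
Both classes build their XML by plain string concatenation, so a listing whose text contains a "<" or a quote could produce malformed markup. One way around that, sketched below, is to escape each value as it is written. xmlElement is a hypothetical helper and not part of the classes above. If you adopt something like it, drop the blanket ampersand replacement in newBaseFeed, otherwise ampersands would be escaped twice.

```php
<?php
// Hypothetical helper: wraps an XML-escaped value in the given element.
// ENT_XML1 and ENT_QUOTES are standard htmlspecialchars() flags.
function xmlElement($tag, $value)
{
    return '<' . $tag . '>'
         . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8')
         . '</' . $tag . ">\r\n";
}

// Made-up feature text containing characters that need escaping.
print xmlElement('g:feature', 'BBQ & Terrace <heated>');
```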


We then return the $feed object, which is posted to Google Base using the postItems function, and voila, we're done and dusted. Here's the postItems function, which finishes off the process of adding Riviera Rental Guide's holiday accommodation listings to Google Base.


function postItems($token)
{
  $ch = curl_init();
  global $developerKey, $itemsFeedURL, $feed;

  curl_setopt($ch, CURLOPT_URL, $itemsFeedURL);
  curl_setopt($ch, CURLOPT_POST, true);
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
  curl_setopt($ch, CURLOPT_FAILONERROR, true);
  curl_setopt($ch, CURLOPT_HTTPHEADER, array(
    'Authorization: AuthSub token="' . $token . '"',
    'X-Google-Key: key=' . $developerKey,
    'Content-Type: application/atom+xml'
  ));
  $feed_data = $feed->newBaseFeed();
  curl_setopt($ch, CURLOPT_POSTFIELDS, $feed_data);

  $result = curl_exec($ch);
  curl_close($ch);

  return $result;
}
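
postItems simply hands back whatever Base returns, so it is worth looking at that response before declaring victory. The sketch below tallies the status codes in a batch response. It assumes each returned entry carries a batch:status element with a code attribute, which is our understanding of the Google data batch format rather than something shown above. countBatchStatuses is a hypothetical helper and the sample response is made up.

```php
<?php
// Hypothetical helper: counts how often each status code appears.
function countBatchStatuses($response)
{
    $counts = array();
    preg_match_all('#<batch:status\b[^>]*\bcode=["\'](\d+)["\']#', $response, $m);
    foreach ($m[1] as $code) {
        $counts[$code] = isset($counts[$code]) ? $counts[$code] + 1 : 1;
    }
    return $counts;
}

// Made-up response fragment: two successful entries and one rejected entry.
$response = "<entry><batch:status code='201' reason='Created'/></entry>"
          . "<entry><batch:status code='201' reason='Created'/></entry>"
          . "<entry><batch:status code='400' reason='Bad request'/></entry>";
print_r(countBatchStatuses($response));
```

Anything other than success codes in the tally is a cue to print the full response and see which listing was rejected.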


As long as someone runs this script every couple of weeks (we set up a cron job to e-mail the site administrators reminding them to go through the process), Google Base will have a reasonably up-to-date copy of all our client's vacation rentals. It's quite a lengthy bit of code but, as always, the more complicated the code, the easier it is for the user. Besides, it's a small price to pay for the absolutely amazing functionality of Google Base. Rest assured that Base is the way forward when it comes to searching for tangible products, and I'm fairly sure you will see items in Base eclipsing regular search results very soon.

As with all articles on Celtic Productions, this article is protected by international copyright laws. It may be linked to (we are of course most grateful for links to our articles); however, it may never be reproduced without the prior express permission of its owners, Celtic Productions. In addition, segments of this article are owned and copyright protected by Riviera Rental Guide.
