Google Base for Vacation Rentals
Posted: Sun, 5th August 2007
Category: PHP
Recently one of our clients, Riviera Rental Guide, a
provider of holiday rental accommodation on the French and
Italian Rivieras, contacted us to discuss the possibility of
integrating their rental listings with Google Base. We had
certainly heard of Base but had only ever submitted items
manually or in bulk, never through the application programming
interface (API). So here it is: a comprehensive guide to
inserting and updating items in Google Base.
What is Google Base? From the lads themselves: "Google Base is a
place where you can easily submit all types of online and offline
content, which we'll make searchable on Google". With the API,
this means that we can programmatically submit our information
(or any information from a data-driven website) in a strict
markup language (i.e. XML) so that it is "more" searchable. What
does "more" searchable mean? Well, the trouble with documents on
the web is that the relevant elements of a page are marked up in
tags that describe how the content looks, not what it means, so
they can't be compared from one page to the next. Take two
websites which are essentially about the same thing: it's a safe
bet that some things will be bold on one page and not on the
other, some things will be italicised, the headings will differ,
and generally the pages will look very different from each other
despite representing quite similar concepts.
This poses a huge problem for search engines because they have to
determine the context of a page and indeed what it is about, e.g.
vacation rentals in the south of France. They determine this by,
among other things, analysing the page content, analysing the
content of pages that link to that page, and analysing the link
text of those pages. This would be fine if there were a
one-to-one relationship between products/services and websites,
i.e. if there were only one place on the web to purchase flights.
But search engines need to return the most relevant pages first,
which means they have to compare different pages, and as we well
know, there's little point in comparing apples with oranges.
This is where XML and Google Base level the playing field and
make it much easier for users to find what they are looking for.
Content providers are required to publish their information in a
standardised format. For the vacation rentals item type, for
example, you have to provide the number of bedrooms and
bathrooms, the price of the rental property and its location, so
any ambiguity is effectively removed. It means that a rental
listing from Riviera Rental Guide will look the same as a rental
listing from a competitor. This has advantages. From a user's
perspective, it means that they can easily retrieve the most
relevant items and they see those items in a standardised format.
From a content creator's perspective, they can be the smallest
player in their field, but if they have the right product they no
longer need to compete in terms of having the biggest website or
the most expensive technology. In fact, in order to use Google
Base one doesn't even need a website.
Anyway, let's get down to business: how do we go about adding
listings to Base? First, get yourself a Google account. Second,
get yourself an API developer key. I realise that this isn't at
all helpful but I don't know where you get these nowadays. I've
had one in my Gmail inbox since the dawn of time which I use for
everything. Here's what we're going to do:
- Authenticate ourselves for querying using our developer key
- Grant access to our application
- Pull down all our existing Base items
- Serialize our existing items to an external file
- Authenticate ourselves for inserting/updating using our
developer key
- Grant access to our application
- Unserialize our existing items
- Set up the XML feed
- Post the information to Google Base
Here's the logic in PHP. We'll look at the functions in more
detail later:
if (empty($_GET['token'])) {
    // First load: no token yet, so ask Google for one (query access)
    authenticate(true);
} else {
    // Second time the page is loaded
    if (!empty($_GET['postStage'])) {
        // After serialization, post our items
        $mu_token = exchangeToken($_GET['token']);
        $existing_items = unserialize(file_get_contents(".rrg_gbase"));
        $feed = establishFeed();
        $postResponse = postItems($mu_token);
        print $postResponse;
    } else {
        // Serialize our existing GBase items
        $mu_token = exchangeToken($_GET['token']);
        $existing_items = getItems($mu_token);
        $fh = fopen(".rrg_gbase", "w+");
        fwrite($fh, serialize($existing_items));
        fclose($fh);
        // Ask for access again, this time for posting
        authenticate(false);
    }
}
So the first time the page loads, we don't have an authentication
token from Google so we need to get one. We get that using our
authenticate function, which goes a little something like this:
function authenticate($query=false) {
    global $itemsFeedURL, $itemsFeedURLQuery;
    // ...
}
We can pass a flag to the function which tells it which URL we
want access to and which URL we would like to return to once
authentication is successful. The two URL variables are defined
as follows:
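The original definitions are not reproduced here, so the following is only a sketch of how the two variables and the redirect URL might be put together. The feed URLs, the AuthSub endpoint and its parameters, and the helper names buildAuthSubUrl and buildReturnUrl are all assumptions for illustration, not the code used for Riviera Rental Guide.

```php
<?php
// Assumed values: the article's real definitions are not shown.
$itemsFeedURL      = 'http://www.google.com/base/feeds/items/batch'; // posting
$itemsFeedURLQuery = 'http://www.google.com/base/feeds/items';       // querying

/**
 * Build the URL we redirect the user to so that Google can ask for access.
 * $scope is the feed we want access to, $next is where Google sends the
 * user back to (with a token appended to the query string).
 */
function buildAuthSubUrl($scope, $next)
{
    $params = array(
        'next'    => $next,
        'scope'   => $scope,
        'session' => 1, // we want to exchange the token for a multi use one
        'secure'  => 0,
    );
    return 'https://www.google.com/accounts/AuthSubRequest?'
         . http_build_query($params, '', '&');
}

/**
 * Pick the return URL: when we are not querying, flag the next page load
 * as the posting stage.
 */
function buildReturnUrl($scriptUrl, $query)
{
    return $query ? $scriptUrl : $scriptUrl . '?postStage=1';
}
```

An authenticate function along these lines would then only need to send a Location header with the result of buildAuthSubUrl and exit.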
When our page is requested, there is no token so we ask for one.
We redirect off to Google. Google prompts the user to log in.
Google asks, on behalf of our application, for permission to
access the user's Base items. Once the user accepts, they are
redirected back to the script with an authentication token which
we need in order to do stuff. Our logic now checks whether the
postStage variable is set. If it's not, we know that we haven't
serialized our existing items yet. First though, we need to
exchange our single use token for a multi use token. This is
accomplished with a call to exchangeToken.
function exchangeToken($token) {
    $ch = curl_init();
    // ...
}
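Since only the first line of exchangeToken is shown, here is a minimal sketch of what the exchange might involve. It assumes the AuthSubSessionToken endpoint, an Authorization header carrying the single use token, and a plain text response containing a Token= line. The function names are hypothetical.

```php
<?php
/** Header that carries the single use token to Google (assumed format). */
function sessionTokenHeaders($singleUseToken)
{
    return array('Authorization: AuthSub token="' . $singleUseToken . '"');
}

/**
 * Pull the multi use token out of a response body such as
 * "Token=abc\nExpiration=...". Returns null when there is no Token line.
 */
function parseSessionToken($body)
{
    foreach (preg_split('/\r?\n/', trim($body)) as $line) {
        if (strpos($line, 'Token=') === 0) {
            return substr($line, strlen('Token='));
        }
    }
    return null;
}

/** Sketch of the full exchange; needs the cURL extension and network access. */
function exchangeTokenSketch($token)
{
    $ch = curl_init('https://www.google.com/accounts/AuthSubSessionToken');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, sessionTokenHeaders($token));
    $body = curl_exec($ch);
    return $body === false ? null : parseSessionToken($body);
}
```

Keeping the parsing separate from the request means the fiddly part can be checked without ever talking to Google.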
Then we call our function getItems which will query
Google Base for all of Riviera Rental Guide's existing holiday
rental listings. The function fires off a request to Base passing
in our authentication token and our developer key. Base returns
an XML document describing all of our existing items. We parse
the XML document (not using an XML parser because they are
desperately slow) and link Google's id for the entry (which is
basically a URL) to Riviera Rental Guide's internal id for the
rental listing. Every time we create an item, we attach a custom
attribute called "iid" (internal id) to it, set through the c_iid
parameter, i.e. in the "c" namespace of the XML document. The
getItems function looks like this:
function getItems($token) {
    $ch = curl_init();
    global $developerKey, $itemsFeedURLQuery;
    // ...
}
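The parsing step is not shown in full, so here is a rough sketch of how the ids could be linked using regular expressions instead of an XML parser. It assumes each entry carries Google's id in an id element and our internal id in a c:iid element (matching the c_iid parameter used when building entries). The function name is made up for the example.

```php
<?php
/**
 * Map our internal ids to Google's ids from a Base items feed.
 * Entries without both an <id> and a <c:iid> are skipped.
 */
function mapInternalIdsToGoogleIds($xml)
{
    $map = array();
    // Grab the body of every <entry>...</entry> block
    if (!preg_match_all('#<entry\b[^>]*>(.*?)</entry>#s', $xml, $entries)) {
        return $map;
    }
    foreach ($entries[1] as $entry) {
        $hasId  = preg_match('#<id>\s*([^<]+?)\s*</id>#', $entry, $id);
        $hasIid = preg_match('#<c:iid\b[^>]*>\s*([^<]+?)\s*</c:iid>#', $entry, $iid);
        if ($hasId && $hasIid) {
            $map[$iid[1]] = $id[1]; // internal id => Google id
        }
    }
    return $map;
}
```

Regular expressions like these are quick but brittle: they would miss entries that use CDATA sections or a different namespace prefix, which is the price paid for skipping the parser.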
This function returns an array with Riviera Rental Guide's
internal id as the key and Google's id as the value. We then
serialize this array off to the file system so that we have a
record of it. Once we have serialized the data, we
re-authenticate ourselves so that we can update our existing
items and add any new rental listings. This time, we don't set
the query flag...
authenticate(false);
...so the function will now append a variable called postStage to
our return URL. This is so we can determine whether we are
supposed to be pulling down our existing items or posting our
items. To the user who accesses our script, it just looks as if
the Google page asking for permission on behalf of our script
refreshes and asks twice. They won't even notice that all of the
existing items have been serialized. Once the user grants access,
our page gets another token. Once again, we exchange the token
for a multi use token. It's not actually necessary but it's a
handy step if we decide to modify the script in any way. After
that, we call establishFeed which creates the XML data from
Riviera Rental Guide's database. Due to client confidentiality we
can only show bits and pieces of this function but at a certain
level, all that happens is that we create a data abstraction
layer that maps Riviera Rental Guide's database fields to
Google's preferred attribute names. Once we have pulled down all
the vacation rental listings and we are in the loop, for each
entry we would like to add to Google Base we execute the
following lines of code:
$entry = new GBaseEntry();
if(array_key_exists($id,$existing_items)) {
    $entry->setParam('id',$existing_items[$id]);
    $entry->setParam('batch_operation',"update");
} else {
    $entry->setParam('id',$id);
    $entry->setParam('batch_operation',"insert");
}
$entry->setParam('title', ...);
$entry->setParam('g_agent', ...);
$entry->setParam('g_summary', ...);
$entry->setParam('g_description', ...);
$entry->setParam('g_location', ...);
$entry->setParam('g_latitude', ...);
$entry->setParam('g_longitude', ...);
$entry->setParam('link', ...);
$entry->setParam('g_price', ...);          // e.g. 35 EUR
$entry->setParam('g_price_type', ...);     // e.g. starting
$entry->setParam('g_price_units', ...);    // e.g. night
$entry->setParam('g_property_type', ...);
$entry->setParam('g_bedrooms', ...);
$entry->setParam('g_bathrooms', ...);
$entry->setParam('g_sleeps', ...);
$entry->setParam('g_area', ...);           // e.g. 42 sq.m.
$entry->setParam('g_feature', ...);        // e.g. BBQ, Swimming Pool, Terrace
$entry->setParam('g_zoning', ...);         // e.g. Residential
$entry->setParam('g_listing_status', ...); // e.g. active
$entry->setParam('g_item_type', ...);      // e.g. Vacation rentals
$entry->setParam('c_iid', ...);            // Custom Internal Id
if(!empty($img_src)) {
    $entry->setParam('g_image_link', ...);
}
$feed->addEntry($entry->XMLize());
We use two classes throughout the process, GBaseFeed
and GBaseEntry, to provide solid, extensible
functionality to Riviera Rental Guide. The included file
containing these two classes looks as follows. Please note that
if I weren't so lazy, I probably would've written individual
get/set functions for each parameter instead of just the two,
getParam and setParam.
class GBaseFeed {
    var $title = '';
    var $author = '';
    var $id = '';
    var $link = '';
    var $desc = '';
    var $author_email = '';
    var $listings = array();

    function GBaseFeed ( ) { }

    function setFeedTitle ( $title ) { $this->title = $title; }

    function setFeedDescription ( $desc ) { $this->desc = $desc; }
    // ...
}

class GBaseEntry {
    var $batch_id;
    var $batch_operation;
    var $id;
    var $title;
    var $link;
    var $g_agent;
    var $g_summary;
    var $g_location;
    var $g_price;
    var $g_price_type;
    var $g_image_link;
    var $g_property_type;
    var $g_bedrooms;
    var $g_bathrooms;
    var $g_sleeps;
    var $g_area;
    var $g_feature;
    var $g_zoning;
    var $g_listing_status;
    var $g_listing_type;
    var $g_latitude;
    var $g_longitude;
    var $g_item_type;
    var $g_description;
    var $g_price_units;
    var $c_iid;
    // ...
}
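The methods of GBaseEntry are not shown, so here is a small sketch of how a generic setParam/getParam pair and an XMLize method might fit together. SketchEntry is a hypothetical stand-in, not the real class, and the element layout (g: and c: prefixes, a batch:operation element, an Atom link element) is an assumption about what Base expects.

```php
<?php
class SketchEntry
{
    private $params = array();

    public function setParam($name, $value)
    {
        $this->params[$name] = $value;
    }

    public function getParam($name)
    {
        return isset($this->params[$name]) ? $this->params[$name] : null;
    }

    /** Turn the stored parameters into an <entry> element. */
    public function XMLize()
    {
        $xml = "<entry>\n";
        foreach ($this->params as $name => $value) {
            $safe = htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
            if ($name === 'batch_operation') {
                $xml .= "  <batch:operation type=\"$safe\"/>\n";
            } elseif ($name === 'link') {
                $xml .= "  <link rel=\"alternate\" type=\"text/html\" href=\"$safe\"/>\n";
            } else {
                // g_price becomes g:price, c_iid becomes c:iid;
                // anything else (title, id) stays a plain element
                $tag = preg_replace('/^(g|c)_/', '$1:', $name);
                $xml .= "  <$tag>$safe</$tag>\n";
            }
        }
        return $xml . "</entry>\n";
    }
}
```

Storing the parameters in one array keeps the two generic accessors honest: any new attribute works without touching the class, at the cost of no checking on attribute names.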
We then return the $feed object, which is posted
to Google Base using the postItems function and
voilà, we're done and dusted. Here's the postItems
function which finishes off the process of adding Riviera Rental
Guide's holiday accommodation listings to Google Base.
function postItems($token) {
    $ch = curl_init();
    global $developerKey, $itemsFeedURL, $feed;
    // ...
}
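As the rest of postItems is not shown, the following sketch illustrates roughly what the posting step could involve: wrapping the entries in a feed element and sending it with the token and developer key as headers. The namespace URIs, the header formats and all three function names are assumptions for illustration.

```php
<?php
/** Wrap already serialized <entry> elements in a batch feed. */
function batchFeedXml(array $entryXml)
{
    return '<?xml version="1.0" encoding="UTF-8"?' . ">\n"
         . '<feed xmlns="http://www.w3.org/2005/Atom"' . "\n"
         . '      xmlns:g="http://base.google.com/ns/1.0"' . "\n"
         . '      xmlns:c="http://base.google.com/cns/1.0"' . "\n"
         . '      xmlns:batch="http://schemas.google.com/gdata/batch">' . "\n"
         . implode('', $entryXml)
         . "</feed>\n";
}

/** Headers identifying the user (token) and the application (key). */
function postHeaders($token, $developerKey)
{
    return array(
        'Authorization: AuthSub token="' . $token . '"',
        'X-Google-Key: key=' . $developerKey,
        'Content-Type: application/atom+xml',
    );
}

/** Sketch of the POST itself; needs the cURL extension and network access. */
function postItemsSketch($url, $token, $developerKey, array $entryXml)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS, batchFeedXml($entryXml));
    curl_setopt($ch, CURLOPT_HTTPHEADER, postHeaders($token, $developerKey));
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    return curl_exec($ch); // response body, or false on failure
}
```

Passing the URL, key and entries in as arguments rather than through globals makes the sketch easier to try out on its own.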
As long as someone executes this script every couple of weeks (we
set up a cron job to e-mail the site administrators reminding
them to go through the process), Google Base will have a
reasonably up to date copy of all our client's vacation rentals.
It's quite a lengthy bit of code but, as always, the more
complicated the code, the easier it is for the user. Plus, it's a
small price to pay for the absolutely amazing functionality of
Google Base. Rest assured that Base is the way forward when it
comes to searching for tangible products and I'm fairly sure you
will see items in Base eclipsing regular search results very
soon.
As with all articles on Celtic Productions, this article is
protected by international copyright laws. It may be linked to
(we are of course most grateful for links to our articles);
however, it may never be reproduced without the prior express
permission of its owners, Celtic Productions. In addition,
segments of this article are owned and copyright protected by
Riviera Rental Guide.