Search Appliance SBE

 

Thunderstone Search Appliance Manual

AJAX Crawlable URLs

Syntax: select Yes or No button

When enabled (the default), the walker supports the Google AJAX crawling scheme. This allows AJAX URLs (which are usually not walkable) to be walked, provided the site being walked also supports the scheme.

AJAX URLs that contain anchors/fragments (#someFragment) are not normally walkable, for two reasons: anchors are never sent in HTTP requests, and the client-side JavaScript support in the Search Appliance does not include AJAX, so the walker does not process the anchor either. Thus AJAX anchor links that differ only in their anchor look like duplicates and are never fetched, or, if fetched, do not return the anchor-specified content.
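As a rough illustration of why such links look like duplicates, the following Python sketch computes the part of a URL that is actually sent in an HTTP request line. The helper name request_target and the example.com URLs are hypothetical, chosen only for this example; this is not the appliance's own code.

```python
from urllib.parse import urlsplit

def request_target(url: str) -> str:
    """Return the path-plus-query portion of a URL, as sent in an HTTP request.

    The fragment (everything after '#') is deliberately left out, because
    HTTP clients do not transmit it to the server.
    """
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return target

# Two AJAX links that differ only in their anchor produce the same request.
inbox = request_target("http://example.com/app#inbox")
settings = request_target("http://example.com/app#settings")
```

Both calls yield the same request target, so the server cannot tell the two links apart.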

With the Google AJAX crawling scheme, walkers temporarily rewrite links that have conforming anchors (those that begin with an exclamation point) by moving the anchor into the query string. Since query strings are sent in HTTP requests, the server sees the anchor and can return the appropriate content. Moreover, the returned content can be static, so that the walker can index it.

Specifically, URLs of the form:

http://example.com/path#!fragment
http://example.com/path?query=present#!fragment

will be requested from the server as (respectively):

http://example.com/path?_escaped_fragment_=fragment
http://example.com/path?query=present&_escaped_fragment_=fragment
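The rewrite shown above can be sketched in Python as follows. This is a minimal illustration, not the walker's actual implementation: the function name to_escaped_fragment_url is hypothetical, and the percent-escaping of special characters in the anchor is an assumption, since this page does not describe how the appliance escapes them.

```python
from urllib.parse import urlsplit, urlunsplit, quote

def to_escaped_fragment_url(url: str) -> str:
    """Rewrite a '#!' AJAX URL into its _escaped_fragment_ fetch form.

    URLs whose anchor does not begin with '!' are returned unchanged.
    """
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # non-conforming anchor: no rewrite
    # Assumption: escape characters such as '&', '#', '%', '+' and spaces
    # so the anchor survives as a single query-string value.
    value = quote(parts.fragment[1:], safe="=/:,;!$'()*@?")
    param = "_escaped_fragment_=" + value
    # Append to an existing query string, or start a new one.
    query = parts.query + "&" + param if parts.query else param
    # Drop the fragment: it has been moved into the query string.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

plain = to_escaped_fragment_url("http://example.com/path#!fragment")
with_query = to_escaped_fragment_url("http://example.com/path?query=present#!fragment")
```

Here plain and with_query hold the two request URLs listed above, while a link such as http://example.com/path#section is left as-is because its anchor does not start with an exclamation point.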

This temporary rewrite is only used for the walker fetch: search results still return the original AJAX-anchor version of the link, so that browsers can still take advantage of the AJAX features of the site.

Note that this scheme requires web site support: the site must respond to any URL containing the _escaped_fragment_ query parameter by returning the appropriate, full, static HTML content that corresponds to the AJAX anchor state of the same value.
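On the site side, the requirement above might be met along the following lines. This is only a sketch under stated assumptions: the SNAPSHOTS table, its contents, and the function name static_content_for are all hypothetical, and a real site would generate or store its static pages in whatever way suits its own framework.

```python
from urllib.parse import urlsplit, parse_qs
from typing import Optional

# Hypothetical table mapping an AJAX anchor state to pre-rendered static HTML.
SNAPSHOTS = {
    "fragment": "<html><body><h1>Fragment view</h1></body></html>",
}

def static_content_for(url: str) -> Optional[str]:
    """Return static HTML for a request carrying _escaped_fragment_.

    Returns None when the parameter is absent (an ordinary request) or
    when no snapshot exists for the requested anchor state.
    """
    query = urlsplit(url).query
    # keep_blank_values so that an empty '_escaped_fragment_=' is still seen.
    params = parse_qs(query, keep_blank_values=True)
    if "_escaped_fragment_" not in params:
        return None
    state = params["_escaped_fragment_"][0]  # parse_qs already unescapes it
    return SNAPSHOTS.get(state)

html = static_content_for(
    "http://example.com/path?query=present&_escaped_fragment_=fragment"
)
```

A request without the parameter falls through to the site's normal AJAX page, so browsers are unaffected.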


Copyright © Thunderstone Software     Last updated: Mar 19 2020