Checking whether a visitor is a search bot is easy. You can inspect the superglobal $_SERVER['HTTP_USER_AGENT'] to see if it contains a bot-like string. For example, many spiders, e.g. the Sogou Web Spider, include 'spider' in their user-agent string. Therefore, the checking functions (in PHP) are pretty simple.
$ZLAI_Search_Engines = array(
    'google',
    'baidu',
    'yahoo',
    'spider',
    'msn',
    'test',
    'http',
    'bot',
    'jeevesteoma',
    'slurp',
    'gulper',
    'linkwalker',
    'validator',
    'webaltbot',
    'wget',
    'feed',
    'bing',
    'websitepulse',
    'sogou',
    'mediapartners',
    'sohu',
    'soso',
    'search',
    'yodao',
    'robozilla'
);
// Build the regex from the list above so the two never drift out of sync.
define('BOTS', '/(' . implode('|', $ZLAI_Search_Engines) . ')/i');
// Returns 1 if the given user-agent string matches a known bot signature.
function checkspider($u)
{
    return preg_match(BOTS, $u);
}
// Returns 1 if the current request's user agent matches a known bot signature.
function spider()
{
    $agent = '';
    if (isset($_SERVER['HTTP_USER_AGENT']))
    {
        $agent = $_SERVER['HTTP_USER_AGENT'];
    }
    return preg_match(BOTS, $agent);
}
// Case-insensitive substring check against the list; an empty user-agent
// string is treated as a bot, since real browsers always send one.
function isSE($BS)
{
    $BS = trim($BS);
    if (!$BS)
    {
        return true;
    }
    global $ZLAI_Search_Engines;
    foreach ($ZLAI_Search_Engines as $v)
    {
        if (stripos($BS, $v) !== false)
        {
            return true;
        }
    }
    return false;
}
The list covers the most common spiders today, such as Google and Baidu; however, you are free to add any other signature. The following page [here] uses these functions and can list the visitors (humans or search engines) crawling the domain steakovercooked.com.
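As a rough, self-contained sketch of how this check might be wired into a page (the abbreviated bot list and the helper name looksLikeBot below are assumptions for illustration; in practice you would reuse $ZLAI_Search_Engines and the BOTS constant from above):

```php
<?php
// Minimal bot list for this sketch; the full list above would be used in practice.
$bots = array('google', 'baidu', 'spider', 'bot', 'slurp', 'sogou');

// Same idea as checkspider()/spider(): one alternation regex over the list.
$pattern = '/(' . implode('|', $bots) . ')/i';

// Hypothetical helper: true when the user agent matches a bot signature.
function looksLikeBot($agent, $pattern)
{
    return preg_match($pattern, trim($agent)) === 1;
}

// $_SERVER['HTTP_USER_AGENT'] may be absent (e.g. on the CLI), so guard it.
$ua = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
echo looksLikeBot($ua, $pattern) ? "bot visitor\n" : "human (or unknown) visitor\n";
```

Note that a substring list like this is deliberately coarse: 'Googlebot' matches both 'google' and 'bot', while user agents can also be freely spoofed, so this is a heuristic rather than a guarantee.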
–EOF (The Ultimate Computing & Technology Blog) —