Search Engine Robots - How They Work, What They Do (Part I)


Automated search engine robots, sometimes called "spiders" or "crawlers", are the seekers of web pages. How do they work? What is it they really do? Why are they important?

You'd think with all the fuss about indexing web pages to add to search engine databases, that robots would be great and powerful beings. Wrong. Search engine robots have only basic functionality like that of early browsers in terms of what they can understand in a web page. Like early browsers, robots just can't do certain things. Robots don't understand frames, Flash movies, images or JavaScript. They can't enter password-protected areas and they can't click all those buttons you have on your website. They can be stopped cold while indexing a dynamically generated URL and slowed to a stop with JavaScript navigation.

How Do Search Engine Robots Work?

Think of search engine robots as automated data retrieval programs, traveling the web to find information and links.

When you submit a web page to a search engine at the "Submit a URL" page, the new URL is added to the robot's queue of websites to visit on its next foray out onto the web. Even if you don't directly submit a page, many robots will find your site because of links from other sites that point back to yours. This is one of the reasons why it is important to build your link popularity and to get links from other topical sites back to yours.

When arriving at your website, the automated robots first check to see if you have a robots.txt file. This file is used to tell robots which areas of your site are off-limits to them. Typically these may be directories containing only binaries or other files the robot doesn't need to concern itself with.
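To make the robots.txt check concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the "ExampleBot" user agent name and the example.com URLs are all made up for illustration; a real robot would download the file from your site before deciding what to fetch.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the directory names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /downloads/
"""


def allowed_urls(robots_txt, user_agent, urls):
    """Return the URLs that robots_txt permits user_agent to fetch."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed here.
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


# Hypothetical URLs on a placeholder domain.
URLS = [
    "http://www.example.com/index.html",
    "http://www.example.com/cgi-bin/search.cgi",
    "http://www.example.com/downloads/setup.exe",
]

print(allowed_urls(ROBOTS_TXT, "ExampleBot", URLS))
```

With these sample rules, only the index.html URL is allowed; the two URLs under the disallowed directories are filtered out.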

Robots collect links from each page they visit, and later follow those links through to other pages. The entire World Wide Web is made up of links, the original idea being that you could follow links from one place to another. This is how robots get around.
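The queue-and-follow behavior described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how any particular search engine's robot is built: the three-page "site" is a hypothetical dictionary held in memory rather than fetched over HTTP, and the names LinkCollector and crawl are invented for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical three-page site, stored in memory instead of on a web server.
PAGES = {
    "http://www.example.com/": '<a href="/about.html">About</a> <a href="/products.html">Products</a>',
    "http://www.example.com/about.html": '<a href="/">Home</a>',
    "http://www.example.com/products.html": '<a href="about.html">About</a>',
}


def crawl(start_url, pages):
    """Visit pages breadth-first, queueing each newly discovered link once."""
    queue = deque([start_url])
    seen = {start_url}
    visited = []
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:
            continue  # page not available; a real robot would handle errors here
        visited.append(url)
        collector = LinkCollector()
        collector.feed(html)
        for href in collector.links:
            target = urljoin(url, href)  # resolve relative links against the page URL
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return visited


print(crawl("http://www.example.com/", PAGES))
```

The "seen" set is what keeps the robot from looping forever when pages link back to each other, as the home and about pages do here.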

The "smarts" about indexing pages online come from the search engine engineers, who devise the methods used to evaluate the information the search engine robots retrieve. Once that information is in the search engine database, it is available to searchers. When a user enters a query, the search engine runs a number of quick calculations to present the set of results most relevant to that query.

You can see which pages on your site the search engine robots have visited by looking at your server logs or the results from your log statistics program. Identifying the robots will show you when they visited your website, which pages they visited and how often they visit. Some robots are readily identifiable by their user agent names, like Google's "Googlebot"; others are a bit more obscure, like Inktomi's "Slurp". Still other robots may be listed in your logs that you cannot readily identify; some of them may even appear to be human-powered browsers.
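If you want to pick robot visits out of a raw log yourself, something like the following sketch may help. It assumes your server writes the common "combined" log format; the sample log lines, IP addresses and user agent strings are invented for illustration, and the list of markers only covers the two robots named above.

```python
import re
from collections import Counter

# Assumes the "combined" access log format; adjust the pattern for other formats.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)

# Substrings to look for in the user agent; extend with other robots as needed.
ROBOT_MARKERS = ("Googlebot", "Slurp")

# Hypothetical sample log lines, not taken from a real server.
SAMPLE_LOG = [
    '192.0.2.10 - - [10/Oct/2004:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.10 - - [10/Oct/2004:13:56:02 -0700] "GET /about.html HTTP/1.0" 200 1204 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '198.51.100.7 - - [10/Oct/2004:14:01:17 -0700] "GET /index.html HTTP/1.0" 200 2326 "-" "Mozilla/3.0 (Slurp/si; slurp@inktomi.com; http://www.inktomi.com/slurp.html)"',
    '203.0.113.5 - - [10/Oct/2004:14:03:44 -0700] "GET /index.html HTTP/1.1" 200 2326 "http://www.example.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"',
]


def robot_visits(log_lines):
    """Count visits per (robot, path) for user agents matching ROBOT_MARKERS."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.match(line.strip())
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        for marker in ROBOT_MARKERS:
            if marker.lower() in agent:
                counts[(marker, match.group("path"))] += 1
                break  # count each request once
    return counts


for (robot, path), hits in sorted(robot_visits(SAMPLE_LOG).items()):
    print(robot, path, hits)
```

Matching on the user agent alone only catches robots that announce themselves, which is why the robots that look like ordinary browsers are harder to spot.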

Along with identifying individual robots and counting the number of their visits, the statistics can also show you aggressive bandwidth-grabbing robots or robots you may not want visiting your website. In the resources section at the end of this article, you will find sites that list names and IP addresses of search engine robots to help you identify them.

How Do They Read The Pages On Your Website?

When the search engine robot visits your page, it looks at the visible text on the page, the content of the various tags in your page's source code (title tag, meta tags, etc.), and the hyperlinks on your page. From the words and the links that the robot finds, the search engine decides what your page is about. There are many factors used to figure out what "matters", and each search engine has its own algorithm to evaluate and process the information. Depending on how the search engine has set up the robot, the information is indexed and then delivered to the search engine's database.
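As a rough illustration of what a robot pulls out of a page, the sketch below uses Python's standard html.parser module to gather the title, the named meta tags, the hyperlinks and the visible text. The PageReader class, the read_page function and the sample page are all hypothetical; how a real search engine weighs these pieces is up to its own algorithm and is not shown here.

```python
from html.parser import HTMLParser


class PageReader(HTMLParser):
    """Gathers the title, named meta tags, links and visible text of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self.links = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and attrs.get("name"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""
        elif tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_title:
            self.title += text
        elif not self._skip_depth:
            self.text_parts.append(text)  # script and style contents are ignored


def read_page(html):
    """Return a dictionary summarizing what the reader found in html."""
    reader = PageReader()
    reader.feed(html)
    return {
        "title": reader.title,
        "meta": reader.meta,
        "links": reader.links,
        "text": " ".join(reader.text_parts),
    }


# Hypothetical sample page.
SAMPLE_PAGE = """
<html><head>
<title>Acme Widgets</title>
<meta name="description" content="Hand-made widgets.">
<script>var menu = 1;</script>
</head><body>
<h1>Welcome</h1>
<p>We sell <a href="/widgets.html">widgets</a>.</p>
</body></html>
"""

print(read_page(SAMPLE_PAGE))
```

Notice that the script content never reaches the text: in this sketch, as with the robots described earlier, anything that lives only in JavaScript is simply not seen.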

The information delivered to the databases then becomes part of the search engine and directory ranking process. When the search engine visitor submits their query, the search engine digs through its database to give the final listing that is displayed on the results page.

The search engine databases update at varying times. Once you are in the search engine databases, the robots keep visiting you periodically, to pick up any changes to your pages, and to make sure they have the latest info. The number of times you are visited depends on how the search engine sets up its visits, which can vary per search engine.

Sometimes visiting robots are unable to access the website they are visiting. If your site is down, or you are experiencing huge amounts of traffic, the robot may not be able to access your site. When this happens, the website may not be re-indexed, depending on the frequency of the robot visits to your website. In most cases, robots that cannot access your pages will try again later, hoping that your site will be accessible then.
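The "try again later" behavior can be pictured with a small retry loop. This is a sketch under stated assumptions, not a description of any real robot's schedule: the function name fetch_with_retries, the number of attempts and the delay are all invented, and the fetch function is passed in so the example runs without touching the network.

```python
import time


def fetch_with_retries(fetch, url, attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Call fetch(url) up to `attempts` times, pausing between failures.

    Returns (content, attempts_used); content is None if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url), attempt
        except OSError:  # ConnectionError and timeouts are OSError subclasses
            if attempt < attempts:
                sleep(delay_seconds)
    return None, attempts


def make_flaky_fetch(failures):
    """Build a hypothetical fetch function that fails `failures` times, then succeeds."""
    remaining = {"count": failures}

    def fetch(url):
        if remaining["count"] > 0:
            remaining["count"] -= 1
            raise ConnectionError("site unavailable")
        return "<html>ok</html>"

    return fetch


# Simulate a site that is down for the first two visits.
print(fetch_with_retries(make_flaky_fetch(2), "http://www.example.com/"))
```

A real robot would likely space its return visits much further apart than a loop like this, but the principle is the same: a failed visit is rescheduled rather than treated as final.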

Resources

*SpiderSpotting - Search Engine Watch. http://searchenginewatch.com/webmasters/spiders.html

*Robotstxt.org - List of robots and protocols for setting up a robots.txt file. http://www.robotstxt.org/

*Spider-Food - Tutorials, forums and articles about Search Engine spiders and Search Engine Marketing. http://spider-food.net/

*Spiderhunter.com - Articles and resources about tracking Search Engine spiders. http://www.spiderhunter.com/

*Sim Spider Search Engine Robot Simulator - Search Engine World has a spider that simulates what the Search Engine robots read from your website. http://www.searchengineworld.com/cgi-bin/sim_spider.cgi

Daria Goetsch is the founder and Search Engine Marketing Consultant for Search Innovation Marketing, a Search Engine Optimization company serving small businesses. She has specialized in Search Engine Promotion since 1998, including three years as the Search Engine Specialist for O'Reilly Media, Inc., a technical book publishing company.

Copyright © 2002-2005 Search Innovation Marketing. http://www.searchinnovation.com All Rights Reserved.

Permission to reprint this article is granted if the article is reproduced in its entirety, without editing, including the bio information. Please include a hyperlink to http://www.searchinnovation.com when using this article in newsletters or online.
