Internet search engines are one of the hottest subjects in the Internet marketing world. Search engines are special sites on the Web that are designed to help people find information stored on other sites. A search engine is a program that searches web pages for specified keywords and returns a list of the pages where the keywords were found. There are differences in the ways various search engines work, but they all perform three basic tasks:
* They search the Internet based on specific words.
* They keep an index of the words they find, and where they find them.
* They allow users to look for words or combinations of words found in that index.
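The three tasks above can be illustrated with a toy inverted index. This is only a sketch under simple assumptions: the `PAGES` data, the URLs, and the function names `build_index` and `search` are made up for illustration, and a real engine would gather the pages by crawling rather than from a hard-coded dictionary.

```python
from collections import defaultdict

# Hypothetical mini "web": URL -> page text (made-up data for illustration).
PAGES = {
    "http://example.com/a": "search engines index the web",
    "http://example.com/b": "spiders crawl the web",
    "http://example.com/c": "directories are edited by humans",
}

def build_index(pages):
    """Record every word found and the set of URLs where it was found."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(PAGES)
print(sorted(search(index, "the web")))
```

A query with several words returns only the pages that contain all of them, which is one simple way to handle "combinations of words".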
When people use the term search engine in relation to the Web, they are usually referring to the actual search forms that search through databases of HTML documents initially gathered by a robot. Early search engines held an index of a few hundred thousand pages and documents, and received maybe one or two thousand queries each day. Today, a top search engine will index hundreds of millions of pages, and respond to tens of millions of queries per day. In this article, we’ll tell you how these major tasks are performed, and how search engines put the pieces together to let you find the information you need on the Web.
To explain the particulars of search engines, we should begin with the notion itself. A search engine is a searchable online database of Internet resources. It has several components:
search engine software, spider software, an index (database), and a relevancy algorithm (rules for ranking). The search engine software runs on a server or a collection of servers dedicated to indexing Internet web pages, storing the results, and returning lists of pages that match user queries. The spidering software constantly crawls the Web, collecting webpage data for the index. The index is a database that stores this data.
Internet search engines fall into four main types: crawler-based engines (traditional or common engines), directories (human-edited catalogs), hybrid engines (META engines and those using other engines’ results), and paid listings (PPC and paid inclusion engines).
Crawler-based search engines use automated software agents (called crawlers) that visit a Web site, read the information on the actual site, read the site’s meta tags, and follow the links that the site connects to, indexing all linked Web sites as well. The crawler returns all that information to a central depository, where the data is indexed. The crawler periodically returns to the sites to check for any information that has changed. How often this happens is determined by the administrators of the search engine. As you can see, spider software belongs to crawler-based search engines. It works as follows: spiders read your page, index it, and rank it. Finally, the page appears on search engine results pages for the words and phrases most common on the indexed webpage.
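The visit, read, and follow-links loop of a crawler can be sketched as follows. This is a minimal illustration, not how any particular engine is implemented: the `WEB` dictionary, its URLs, and the `crawl` function are made up, and an in-memory dictionary stands in for fetching pages over the network.

```python
from collections import deque

# Hypothetical site graph: URL -> (page text, outgoing links). Made-up data.
WEB = {
    "http://example.com/": (
        "home page",
        ["http://example.com/about", "http://example.com/news"],
    ),
    "http://example.com/about": ("about us", ["http://example.com/"]),
    "http://example.com/news": (
        "latest news",
        ["http://example.com/about", "http://example.com/missing"],
    ),
}

def crawl(start_url, web, max_pages=100):
    """Breadth-first crawl: read a page, store its text, queue its links."""
    depository = {}            # stands in for the central depository
    queue = deque([start_url])
    seen = {start_url}         # avoids visiting the same URL twice
    while queue and len(depository) < max_pages:
        url = queue.popleft()
        page = web.get(url)
        if page is None:       # broken link: nothing to read
            continue
        text, links = page
        depository[url] = text
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return depository

collected = crawl("http://example.com/", WEB)
for url, text in collected.items():
    print(url, "->", text)
```

The `seen` set keeps the crawler from looping forever when pages link back to each other, and `max_pages` is an assumed safety limit. Re-running `crawl` on a schedule would correspond to the periodic return visits described above.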
Human-powered search engines, or directories, rely on humans to submit information that is subsequently indexed and catalogued. Only information that is submitted is put into the index. Directories work in the following way: you submit your pages manually to one of the existing categories, and your site is then visited and read by a directory editor. Be ready for a long queue, because review by a human editor takes much longer than automated indexing. Most directories do not have their own ranking mechanism; they sort the URLs by some obvious factor, such as alphabetical order.
Paid inclusion search engines charge a fee to list your page. What the fee buys differs from engine to engine, for example more frequent re-spidering or top ranking for keywords that you choose. Moreover, most major Internet search engines use such schemes as part of their indexing and ranking systems. PPC engines use an auction system in which keywords and phrases are associated with a cost-per-click (CPC) fee. The fundamental principle at the heart of the PPC process is that the higher you bid, the higher your position will be for the particular search terms.
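The "higher bid, higher position" principle can be shown with a few lines of code. This is a simplified sketch: the advertiser names and bid amounts are invented, and breaking ties alphabetically is an assumption made only to keep the result predictable. Real PPC auctions may weigh additional factors.

```python
# Hypothetical CPC bids (in cents) for one keyword. Made-up data.
BIDS = {
    "shop-a": 40,
    "shop-b": 75,
    "shop-c": 40,
    "shop-d": 10,
}

def rank_by_bid(bids):
    """Order advertisers from highest bid to lowest.

    Ties are broken alphabetically by name (an assumption for this sketch).
    """
    return sorted(bids, key=lambda name: (-bids[name], name))

for position, name in enumerate(rank_by_bid(BIDS), start=1):
    print(position, name, BIDS[name])
```

Raising a bid moves an advertiser up the list for that keyword only; each keyword or phrase would have its own set of bids.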
Search engines now apply sophisticated techniques to determine how relevant your pages are to search words and phrases. Major Internet search engines use ranking algorithms to generate the result listing, so even a well-designed site gains little if its pages rank low. To rank a page, the engines examine many on-the-page and off-the-page factors, and only then give the page a certain position, or rank. This position is what users see when results are displayed for a search query. To rank near the top, you should also understand the relevancy of your website: your content should revolve around a particular subject and stay focused on it.
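To make the idea of relevancy concrete, here is a toy ranking function that uses a single on-the-page factor: how often the query words occur in the page text. It is only a sketch. The page data and the names `score` and `rank` are made up, and real ranking algorithms combine many more factors than this.

```python
# Hypothetical pages: URL -> page text. Made-up data for illustration.
PAGES = {
    "http://example.com/india-travel": "india travel guide travel tips for india",
    "http://example.com/recipes": "indian recipes and cooking tips",
    "http://example.com/cars": "used cars for sale",
}

def score(text, query):
    """Count how many times the query words occur in the page text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

def rank(pages, query):
    """Return matching URLs, most relevant first (ties broken by URL)."""
    scored = [(score(text, query), url) for url, text in pages.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [url for points, url in ordered if points > 0]

for url in rank(PAGES, "travel tips"):
    print(url)
```

A page that stays focused on one subject repeats the relevant words more often, which is why it scores higher in this simple model.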
For example, Healweal – India Search Engine (http://www.healweal.com) is a search engine with a database of India-specific websites. It crawls the Indian web and returns India-related web pages for the keywords searched. For more details about this Indian search engine, visit http://www.healweal.com.
In the next article, we will discuss search engine mechanisms and technical particulars.
Mukesh Kumar
(mukesh@healweal.net)
www.healweal.com
Posted by webyst