Search Engines: The Big Picture
MBA 682 – Information Technology in a Global Environment
May 26, 1998
Since the origin of the Internet, finding specific information among the vast resources available has been a problem. In the early years of the Internet, searches were limited to Gopher, a distributed document delivery service. When the World Wide Web became a part of the Internet in the early 1990s, the need arose for a better way to sort through the Internet’s resources. This need led to the advent of the software known as search engines. Search engines are catalogs of the resources available on the Internet.
Search engines are composed of three parts. The first of these is called the “spider” or the “crawler”. The spider visits a web page, reads it, and then follows the links to other pages on the site or on the web. The second part is the index. The index, sometimes called the catalog, is like a giant book containing a copy of every web page that the spider has found. The third part of the search engine is the software that is responsible for sifting through the index to find matches for a particular search.
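The three parts described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PAGES` dictionary is a made-up, in-memory stand-in for the web, the function names are hypothetical, and the index here stores a word-to-page map rather than a full copy of each page.

```python
import re
from collections import defaultdict

# Hypothetical in-memory "web": page name -> (page text, links to other pages).
PAGES = {
    "home": ("welcome to the gopher archive", ["about", "news"]),
    "about": ("this archive lists internet resources", ["home"]),
    "news": ("new search engines index the web", ["about", "missing"]),
}


def crawl(start, pages):
    """Spider: visit a page, read it, then follow its links to other pages."""
    seen, queue, found = set(), [start], {}
    while queue:
        url = queue.pop(0)
        if url in seen or url not in pages:
            continue  # already visited, or a dead link
        seen.add(url)
        text, links = pages[url]
        found[url] = text
        queue.extend(links)
    return found


def build_index(found):
    """Index: map each word to the set of pages that contain it."""
    index = defaultdict(set)
    for url, text in found.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(url)
    return index


def search(index, query):
    """Search software: return the pages that contain every word of the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(crawl("home", PAGES))
print(search(index, "archive"))         # ['about', 'home']
print(search(index, "search engines"))  # ['news']
```

A real spider fetches pages over the network and a real index is far larger, but the division of labor is the same: collect pages, catalog them, then answer queries from the catalog rather than from the live web.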
Types of Search Engines
- Search Engines: Search engines create their listings automatically.
- Directories: A directory depends on humans for its listings.
- Metasearch Engines: A metasearch engine allows the searching of several search engines all at once.
- Site Specific Search Engines: Many of the commercial sites on the web provide visitors the opportunity to search the contents of the site. This search will only bring back “hits” if a hit is found within the site itself.
- Subscription Only Search Engines: These are often available on sites that provide published material. The results of specific searches are only available to those who are members of the service or those who are willing to pay for the documents listed as being “hits” for their particular search.
Types of Searches
The success of a visit to a search engine is measured by the number of returned sites that provide relevant information. Conducting a search goes beyond typing the correct word(s) into the search form; it likely involves some combination of the following:
- Keyword Search: A search for documents containing one or more words that are specified by a user.
- Phrase Search: A search for documents containing an exact sentence or phrase specified by a user.
- Boolean Search: Documents can be included or excluded by using operators such as AND, NOT and OR.
- Concept Search: A search for documents related conceptually to a word, rather than specifically containing the word itself.
- Proximity Search: A search that allows for specifying the nearness of one search term to another.
- Fuzzy Search: A search that will find matches even when words are only partially spelled or misspelled.
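To make the differences between these search types concrete, the following Python sketch runs each of them over a tiny, made-up document collection. The `DOCS` data and all function names are hypothetical, and the fuzzy match relies on the standard library's `difflib` as one possible similarity measure. Concept search is left out because it requires knowledge of word meanings that a short example cannot supply.

```python
import difflib
import re

# Hypothetical document collection: document id -> text.
DOCS = {
    1: "search engines index the world wide web",
    2: "a web directory depends on human editors",
    3: "gopher is a distributed document delivery service",
}


def tokens(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_search(docs, words):
    """Documents containing one or more of the given words."""
    wanted = set(tokens(words))
    return [d for d, text in docs.items() if wanted & set(tokens(text))]


def phrase_search(docs, phrase):
    """Documents containing the exact sequence of words."""
    p = tokens(phrase)
    if not p:
        return []
    hits = []
    for d, text in docs.items():
        t = tokens(text)
        if any(t[i:i + len(p)] == p for i in range(len(t) - len(p) + 1)):
            hits.append(d)
    return hits


def boolean_search(docs, must=(), any_of=(), must_not=()):
    """AND (must), OR (any_of) and NOT (must_not) over document words."""
    hits = []
    for d, text in docs.items():
        t = set(tokens(text))
        if not all(w in t for w in must):
            continue
        if any_of and not any(w in t for w in any_of):
            continue
        if any(w in t for w in must_not):
            continue
        hits.append(d)
    return hits


def proximity_search(docs, a, b, max_gap):
    """Documents where words a and b occur within max_gap positions of each other."""
    hits = []
    for d, text in docs.items():
        t = tokens(text)
        pos_a = [i for i, w in enumerate(t) if w == a]
        pos_b = [i for i, w in enumerate(t) if w == b]
        if any(abs(i - j) <= max_gap for i in pos_a for j in pos_b):
            hits.append(d)
    return hits


def fuzzy_search(docs, word, cutoff=0.8):
    """Documents containing a word that closely resembles the (possibly misspelled) input."""
    return [d for d, text in docs.items()
            if difflib.get_close_matches(word, tokens(text), n=1, cutoff=cutoff)]


print(keyword_search(DOCS, "gopher directory"))                    # [2, 3]
print(phrase_search(DOCS, "world wide web"))                       # [1]
print(boolean_search(DOCS, must=["web"], must_not=["directory"]))  # [1]
print(proximity_search(DOCS, "index", "web", 4))                   # [1]
print(fuzzy_search(DOCS, "gophr"))                                 # [3]
```

In practice, a single query often mixes these forms, for example a phrase combined with a Boolean NOT.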
Evolution of the Search Engine
When search engines were first developed, their primary purpose was merely to catalog information. The way information was cataloged may have varied slightly, but the purpose was always to provide the most helpful information to the user. Since their development nearly four years ago, search engines have become highly commercialized and now compete fiercely for market share. Many of the search engines now offer value-added services such as:
- Building community (e.g., pod members)
- Free email (e.g., Yahoo)
- Free web space for a personal web site (e.g., Lycos)
- Translation (e.g., Alta Vista)
- Alliances with high traffic sites (e.g., Netscape and Yahoo)
- Branding (e.g., Yahoo’s credit card)
In addition, the search engines have formed partnerships with an assortment of retail web outlets. These partnerships, together with the revenue from banner advertisements, allow the search engines to provide searching for new visitors and additional services for more seasoned visitors.
Future of Search Engines
One of the most exciting uses of search engines is on intranets. Presently, many organizations rely on file servers for the storage of information. If a user wants to find a file, they have to know the exact path of the file. With search engine technology, the path of a file becomes transparent. If the user knows the title or something about the contents of a file, the search engine will likely find it. Additionally, as search engines become more sophisticated, they will be able to index additional file types. This will not be limited to spreadsheets and word processing documents; it will also include presentations, graphics, databases, video, and audio files.
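The idea that a file's path becomes transparent can be illustrated with a small Python sketch. It assumes a hypothetical shared folder (simulated here with a temporary directory) and indexes each file by the words in its name and, where the file is readable as text, its contents. The function names `index_files` and `find_files` are made up for this example, and other file types are indexed by name only.

```python
import os
import re
import tempfile
from collections import defaultdict


def index_files(root):
    """Map each word in a file's name or text to the paths of the files containing it."""
    index = defaultdict(set)
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            words = set(re.findall(r"[a-z0-9]+", name.lower()))
            try:
                with open(path, encoding="utf-8") as handle:
                    words.update(re.findall(r"[a-z0-9]+", handle.read().lower()))
            except (UnicodeDecodeError, OSError):
                pass  # binary or unreadable file: fall back to the name alone
            for word in words:
                index[word].add(path)
    return index


def find_files(index, query):
    """Return the paths of files matching every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))


# Demonstration on a throwaway directory standing in for a file server.
with tempfile.TemporaryDirectory() as share:
    deep = os.path.join(share, "finance", "q1")
    os.makedirs(deep)
    with open(os.path.join(deep, "budget.txt"), "w", encoding="utf-8") as f:
        f.write("Quarterly travel forecast")
    file_index = index_files(share)
    # The user supplies words from the contents, not the path.
    print(find_files(file_index, "travel forecast"))
```

The user never types `finance/q1`; the search returns the full path on its own. Supporting spreadsheets, presentations, or audio would require a content extractor for each format, which this sketch does not attempt.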
The next evolutionary step of the search engine will likely be agent technology. An agent is a piece of software that is designed for a specific purpose. Presently, the spiders used to collect information on the web are a specialized type of agent. Agents of the future will be able to travel from site to site to locate specific information for their masters.