UNIVERSITY OF SOUTH CAROLINA BEAUFORT CAMPUS
 
 

Bare Bones Lesson 1: Search Engines

 

 


"As Obi-Wan sez: 'Use the web, Luke!'"  
             --Larry Masinter, Xerox Corporation, on the Net


WHAT ARE SEARCH ENGINES?

Search engines are huge databases of web page files that have been assembled automatically by machine.

There are two types of search engines:

  1. Individual.  Individual search engines compile their own searchable databases of web pages.
  2. Meta.   Metasearchers do not compile databases. Instead, they search the databases of several individual engines simultaneously (see Lesson 2).
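
The difference between the two types can be sketched in a few lines of Python. This is only an illustration: the two "engines" are hypothetical dictionaries standing in for real databases, the URLs are invented, and the engines are queried one after another rather than truly simultaneously.

```python
# Hypothetical stand-ins for two individual engines: each one maps a query
# to its own ranked list of URLs (all names and URLs here are made up).
ENGINE_A = {"connecticut compromise": ["a.example/1", "shared.example/x"]}
ENGINE_B = {"connecticut compromise": ["shared.example/x", "b.example/2"]}


def search_engine(database, query):
    """Look up a query in one individual engine's own database."""
    return database.get(query.lower(), [])


def metasearch(databases, query):
    """Send the query to every engine and merge the results.

    The metasearcher keeps no database of its own; it only combines
    what the individual engines return, dropping duplicate URLs.
    """
    merged = []
    seen = set()
    for database in databases:
        for url in search_engine(database, query):
            if url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


results = metasearch([ENGINE_A, ENGINE_B], "Connecticut Compromise")
print(results)
```

The page found by both engines appears only once in the merged list, which is one reason a metasearch can look shorter than the sum of its parts.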

HOW DO SEARCH ENGINES WORK?

Search engines compile their databases by employing "spiders" or "robots" ("bots") to crawl through web space from link to link, identifying and perusing pages. Because spiders travel by following links, sites that no other pages link to may be missed altogether. Once the spiders get to a web site, they typically index most of the words on the publicly available pages at the site. Web page owners may also submit their URLs to search engines for "crawling" and eventual inclusion in their databases.
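
The link-to-link crawl described above can be sketched as follows. This is a minimal illustration, not how any particular engine works: the "web" is a hypothetical dictionary of page names, and the spider is a simple breadth-first traversal. Note what happens to the page that nothing links to.

```python
from collections import deque

# Hypothetical miniature web: each page maps to the pages it links to.
WEB = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["archive"],
    "archive": [],
    "orphan": ["home"],  # links out, but no other page links to it
}


def crawl(web, start):
    """Follow links from page to page, visiting each page once."""
    visited = []
    seen = {start}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        visited.append(page)
        for link in web.get(page, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


found = crawl(WEB, "home")
missed = sorted(set(WEB) - set(found))
print("Crawled:", found)
print("Missed:", missed)
```

The spider never reaches "orphan" because no crawled page points to it. Submitting the URL directly, as mentioned above, amounts to handing the spider an extra starting point.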

Whenever you search the web using a search engine, you're asking the engine to scan its index of sites and match your keywords and phrases with those in the texts of documents within the engine's database.
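
A rough sketch of this matching step, under some stated assumptions: the pages are three invented snippets, the index is a simple word-to-pages lookup (often called an inverted index), and a page matches only if it contains every keyword. Real engines differ in all of these details.

```python
import re
from collections import defaultdict

# Hypothetical page texts already gathered by a spider.
PAGES = {
    "page1": "The Connecticut Compromise shaped the United States Congress.",
    "page2": "Connecticut is a state in New England.",
    "page3": "A compromise was reached at the convention.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Record, for every word, the set of pages that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in tokenize(text):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the pages that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return matches


INDEX = build_index(PAGES)
print(sorted(search(INDEX, "Connecticut compromise")))
```

The search never touches the pages themselves, only the index built from them earlier.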

It is important to remember that when you are using a search engine, you are NOT searching the entire web as it exists at this moment. You are actually searching a portion of the web, captured in a fixed index created at an earlier date.

How much earlier? It's hard to say. Spiders regularly return to the web pages they index to look for changes. When changes occur, the index is updated to reflect the new information. However, the process of updating can take a while, depending upon how often the spiders make their rounds and how promptly the information they gather is added to the index. Until a page has been both "spidered" AND "indexed," you won't be able to find its new information through the search engine.
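
The gap between a page changing and that change becoming searchable can be illustrated with a small sketch. Everything here is hypothetical: one invented page, a "spider" that simply copies the current text, and an index that maps words to pages.

```python
# Hypothetical live page whose text changes over time.
live_web = {"news.example/today": "council approves budget"}


def spider(web):
    """Fetch a copy of every page as it exists right now."""
    return dict(web)


def update_index(snapshot):
    """Build a word -> pages index from a fetched snapshot."""
    index = {}
    for url, text in snapshot.items():
        for word in text.split():
            index.setdefault(word, set()).add(url)
    return index


# First crawl: the page is spidered and indexed.
index = update_index(spider(live_web))

# The page changes after the crawl, but the index does not.
live_web["news.example/today"] = "mayor vetoes budget"
stale_hit = "vetoes" in index    # new wording is not searchable yet
old_hit = "council" in index     # old wording still is

# Only after the page is spidered AND indexed again does the change show up.
index = update_index(spider(live_web))
fresh_hit = "vetoes" in index

print(stale_hit, old_hit, fresh_hit)
```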

NOTE: While most search engine indexes are not "up to the minute" current, many search engines have partnered with specialized news databases that are. For late-breaking news, look for a "news" tab somewhere on the search engine or directory page.

WHAT ARE THE PROS AND CONS OF SEARCH ENGINES?

PROS:
Search engines provide access to a fairly large portion of the publicly available pages on the Web, which itself is growing exponentially (see "How Big Is the Internet?").

Search engines are the best means yet devised for searching the web. Stranded in the middle of this global electronic library of information without either a card catalog or any recognizable structure, how else are you going to find what you're looking for?

CONS:
On the down side, the sheer number of words indexed by search engines increases the likelihood that they will return hundreds of thousands of responses to simple search requests. Remember, they will return even lengthy documents in which your keyword appears only once.

Additionally, many of these responses will be irrelevant to your search.

ARE SEARCH ENGINES ALL THE SAME?

Search engines use selected software programs to search their indexes for matching keywords and phrases, presenting their findings to you in some kind of relevance ranking. Although software programs may be similar, no two search engines are exactly the same in terms of size, speed and content; no two search engines use exactly the same ranking schemes, and not every search engine offers you exactly the same search options. Therefore, your search is going to be different on every engine you use. The difference may not be a lot, but it could be significant. Recent estimates put search engine overlap at approximately 60 percent and unique content at around 40 percent.
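
To make "overlap" and "unique content" concrete, here is a small sketch comparing the results of two engines for the same query. The result sets are invented and were chosen only to echo the 60/40 figures quoted above; the lesson does not say how those estimates were measured, so the percentage used here (shared results as a share of one engine's results) is an assumption.

```python
# Hypothetical result sets returned by two engines for the same query.
ENGINE_A_RESULTS = {"u1", "u2", "u3", "u4", "u5"}
ENGINE_B_RESULTS = {"u3", "u4", "u5", "u6", "u7"}


def overlap_report(a, b):
    """Compare two result sets: what they share and what is unique to each."""
    shared = a & b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Share of engine A's results that engine B also returned.
        "percent_of_a_shared": 100 * len(shared) / len(a) if a else 0.0,
    }


report = overlap_report(ENGINE_A_RESULTS, ENGINE_B_RESULTS)
print(report)
```

In this made-up case, 60 percent of engine A's results also turn up in engine B, and the remaining 40 percent can only be found by searching engine A.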

HOW DO SEARCH ENGINES RANK WEB PAGES?

In ranking web pages, search engines follow a certain set of rules. These may vary from one engine to another. Their goal, of course, is to return the most relevant pages at the top of their lists. To do this, they look for the location and frequency of keywords and phrases in the web page document and, sometimes, in the HTML META tags. They check out the title field and scan the headers and text near the top of the document. Some of them assess popularity by the number of links that are pointing to sites; the more links, the greater the popularity and, by implication, the value of the page.
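
A toy version of such a ranking scheme might look like the sketch below. It is not any real engine's formula: the pages, the link counts, and the three weights are all invented for illustration, and it considers only keyword hits in the title, keyword hits in the body, and the number of inbound links.

```python
import re

# Hypothetical pages with made-up titles, text, and inbound link counts.
PAGES = [
    {"url": "a.example", "title": "Connecticut Compromise",
     "body": "The Connecticut Compromise of 1787 settled representation.",
     "inbound_links": 12},
    {"url": "b.example", "title": "State history",
     "body": "A long essay that mentions the compromise once near the end.",
     "inbound_links": 3},
    {"url": "c.example", "title": "Compromise",
     "body": "Compromise compromise compromise.",
     "inbound_links": 0},
]

# Assumed weights: a title hit counts more than a body hit,
# and each inbound link adds a small popularity bonus.
TITLE_WEIGHT = 3.0
BODY_WEIGHT = 1.0
LINK_WEIGHT = 0.1


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(page, query):
    """Combine keyword location, keyword frequency, and link popularity."""
    keywords = tokenize(query)
    title_words = tokenize(page["title"])
    body_words = tokenize(page["body"])
    title_hits = sum(title_words.count(k) for k in keywords)
    body_hits = sum(body_words.count(k) for k in keywords)
    return (TITLE_WEIGHT * title_hits
            + BODY_WEIGHT * body_hits
            + LINK_WEIGHT * page["inbound_links"])


def rank(pages, query):
    """Return the pages ordered from highest score to lowest."""
    return sorted(pages, key=lambda p: score(p, query), reverse=True)


ranked = rank(PAGES, "Connecticut Compromise")
for page in ranked:
    print(page["url"], round(score(page, "Connecticut Compromise"), 1))
```

Changing the weights changes the order, which is one way to picture why the same query can produce different rankings on different engines.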

WHEN DO YOU USE SEARCH ENGINES?

Search engines are best at finding unique keywords, phrases, quotes, and information buried in the full-text of web pages. Because they index word by word, search engines are also useful in retrieving tons of documents. If you want a wide range of responses to specific queries, use a search engine.

NOTE: Today, the line between search engines and subject directories (see Lesson 3) is blurring. Search engines no longer limit themselves to a search mechanism alone. Across the Web, they are partnering with subject directories, or creating their own directories, and returning results gathered from a variety of other guides and services as well.

EXAMPLES OF INDIVIDUAL SEARCH ENGINES:

EXAMPLES OF SEARCH ENGINES THAT HAVE PARTNERED WITH SUBJECT DIRECTORIES:


ASSIGNMENT:

Select one of the search engines listed above and search for:

     Connecticut Compromise

Now try searching for the same subject as a phrase, enclosed in quotes:

   "Connecticut Compromise"

[Note: The second search should retrieve far fewer documents than the first search. More on this in Lesson 7.]
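
Why the quoted search returns fewer documents can be seen in a small sketch. The four documents are invented, and the unquoted search is assumed to require every word somewhere in the document; some engines treat unquoted words differently.

```python
import re

# Hypothetical documents.
DOCS = {
    "doc1": "The Connecticut Compromise created a two-house legislature.",
    "doc2": "Connecticut delegates refused to compromise on tariffs.",
    "doc3": "A compromise was reached far from Connecticut.",
    "doc4": "Rhode Island sent no delegates.",
}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def keyword_search(docs, query):
    """Match documents containing every word, in any order or position."""
    words = set(tokenize(query))
    return sorted(d for d, text in docs.items() if words <= set(tokenize(text)))


def phrase_search(docs, phrase):
    """Match documents containing the words side by side, in order."""
    target = tokenize(phrase)
    n = len(target)
    if n == 0:
        return []
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if any(tokens[i:i + n] == target for i in range(len(tokens) - n + 1)):
            hits.append(doc_id)
    return sorted(hits)


loose = keyword_search(DOCS, "Connecticut Compromise")
exact = phrase_search(DOCS, "Connecticut Compromise")
print(loose)
print(exact)
```

Three documents contain both words, but only one contains them together as a phrase.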



Last updated by E. Chamberlain, Thursday October 09, 2014