A Class on the Net for Librarians with Little or No Net Experience

LESSON 27: WWW, PART 2: THE URL AND SEARCH ENGINES

"The Web is like the game of Othello. Remember? 'a moment to learn, a lifetime to master.'"

-- Lou Rosenfeld, University of Michigan, SLIS


HOME PAGES AND HTML

"Home page" is the friendly, non-threatening term for the special document used to serve information systems on the Web. The home page for a WWW server, or for a special collection on a server, is roughly analogous to a gopher root menu or a magazine's table of contents. Usually featuring a flashy identifying graphic or logo, the home page provides links to the collection of online materials the site distributes. The home page for each site is both the welcome mat and the front door to its collections; a well-designed home page will include a mailing address for your comments on the site, information on the person who maintains the page, and perhaps a search engine against the materials stored on the site. Every major university, commercial enterprise, corporation, and government agency on the Web provides a home page to make your visit to their site worthwhile and informative.

Increasingly, individuals are designing their own home pages and serving them on the Web. In fact, the personal home page has become such an electronic status symbol that many commercial Internet providers now offer to serve their subscribers' personal home pages as an additional service (usually for an additional fee, of course!). Folks with a direct Internet connection, and the know-how to run a WWW server package, can serve their home page directly from their own computer.

Additionally, anyone creating WWW pages needs to have a basic knowledge of HTML (Hypertext Markup Language). This markup language, which is really just a set of command tags that indicate how to format a page (first level heading, hot link to here, insert graphic, begin a bulleted list, etc.), is to the Web what PageMaker is to desktop publishing. In "Understanding and Exploring the Internet," _Popular Mechanics_, April 1995, Senior Correspondent Abe Dane describes how pages composed in HTML work:

A Web Browser runs on your computer and acts as a graphical interface between you and the Web. When you click on a link, it issues the necessary commands to request data from other computers, then interprets whatever comes back. Documents written in the standard Hypertext Markup Language (HTML) contain codes that tell Web browsers what typeface to use and how to format it. Most browsers can also display digitized pictures, so that a well-written HTML document appears as an illustrated page.

HTML supports the creation of many features, including headings, highlighting, and links to other pages or to other headings in the current document. The HTML behind Web pages also allows them to support in-line graphics (pictures embedded in the text as they might be in a desktop publishing article), in-line fill-out forms (for database searching, surveys, user feedback, ordering information, etc.), and "hot" maps (where you can click on highlighted portions of a map or graphic to go to that spot, as you might do in a computer game).
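
For readers who would like to see what those "command tags" look like, here is a minimal sketch in Python. The sample page, the file names in it ("catalog.html", "logo.gif"), and the names TagLister and list_tags are all made up for illustration; they do not come from any real site. The sketch uses Python's standard html.parser module to walk through a tiny HTML document and report the tags, links, and graphics it finds.

```python
from html.parser import HTMLParser

# Hypothetical sample page, invented for illustration only.
SAMPLE_PAGE = """<html>
<head><title>Library Home Page</title></head>
<body>
<h1>Welcome to the Library</h1>
<p>Visit our <a href="catalog.html">online catalog</a>.</p>
<img src="logo.gif" alt="Library logo">
<ul>
<li>Hours</li>
<li>Staff</li>
</ul>
</body>
</html>"""


class TagLister(HTMLParser):
    """Collect each opening tag, plus link targets and image sources."""

    def __init__(self):
        super().__init__()
        self.tags = []    # every opening tag, in document order
        self.links = []   # href values from <a> tags (the "hot links")
        self.images = []  # src values from <img> tags (in-line graphics)

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        attributes = dict(attrs)
        if tag == "a" and "href" in attributes:
            self.links.append(attributes["href"])
        if tag == "img" and "src" in attributes:
            self.images.append(attributes["src"])


def list_tags(page):
    """Parse the page and return the parser holding what it found."""
    parser = TagLister()
    parser.feed(page)
    parser.close()
    return parser


if __name__ == "__main__":
    result = list_tags(SAMPLE_PAGE)
    print("Tags used:", result.tags)
    print("Links:", result.links)
    print("Graphics:", result.images)
```

Run against the sample page, the sketch finds one link ("catalog.html") and one graphic ("logo.gif"), along with the heading, paragraph, and bulleted-list tags. A real browser does far more with the same information, of course; this only shows that the formatting instructions are ordinary text sitting in the file.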

URLs

URL stands for Uniform or Universal Resource Locator and it acts as a unique global Internet identifier. Since 1991, URLs have been used as the standard way to locate and to cite multimedia resources on the Web and virtually everything else on the Internet -- links to virtually every Internet protocol are also supported (e.g., gopher, wais, telnet, ftp, http, usenet news, mail, etc.).

URLs are directions to information accessible via the Web. At first, the format of URLs will look different to you from the access addresses we've covered so far for telnet, ftp, and gopher connections; for instance, they can be a *lot* longer. But, if you look at them closely, you'll see familiar Internet "connection goings-on" at work. One thing to keep in mind at all times is that URLs can be case-sensitive (the directory and file names usually are, although the protocol and hostname are not), and, as with other Internet addresses, every character counts: miss one capital letter, a dot, or a slash, and you won't connect.

Each URL describes a particular path to a specific Internet resource. For example, here's a typical URL:

http://www.sc.edu/beaufort/library/index.html

Let's take a look at the basic parts of this Web address:

  1. The part before the colon:
    In the example given above, http (hypertext transfer protocol) tells you that the file is being served from an http, or web, server.

    Since other protocols are also supported, you will also see URLs that begin with "gopher", "telnet", "ftp", etc. which provide links to resources on these types of servers.

  2. The part that follows the two forward slashes (//):
    This gives you the hostname, or Internet address, where the resource is located. In this example, the information is on the server "www.sc.edu".

  3. The remainder of the URL (following the first single "/" after the hostname):
    The remainder describes the directory path to the particular resource (sometimes you must navigate through multiple directories to reach a resource -- remember our FTP travels?), and ultimately the name of the file. Both the directory names and the file name may be *long* and complicated, although not in this case. In our example, the directory path is simply "beaufort/library/", these being the directories created for my campus and library on the www.sc.edu server. The filename for my library home page is "index.html"; the "html" extension is used for HTML files (you may see the three-character extension "htm" used for files being served from a Windows web server). You may see other filetypes, as well: "au" is a sound format, "gif" is a graphic, etc.

The URL in our example just happens to be the home page for my library at the University of South Carolina Beaufort. I have been allotted space on the "www.sc.edu" server under "beaufort/library" and that is where I stash my files. When you visit my home page, you will find many links to other pages and sources on the Web. If you wish to visit any of them, you have the option of accessing them via my library home page or noting the URLs and, later, jumping directly to them on your own. URLs are useful, when travelling on the Web, to get you quickly to where you want to go. Just like in real travel adventures, you won't be at the mercy of road signs if you have specific directions in hand.
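
The three parts described above can also be pulled apart by a program. What follows is a minimal sketch in Python, assuming the standard urllib.parse module; the function name describe_url and the labels it returns ("protocol", "hostname", "directory", "filename") are my own choices for illustration, not official terms.

```python
from urllib.parse import urlsplit

EXAMPLE_URL = "http://www.sc.edu/beaufort/library/index.html"


def describe_url(url):
    """Break a URL into the three parts discussed in the lesson."""
    parts = urlsplit(url)
    # Everything up to the last "/" is the directory path; the rest is the file name.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,      # 1. the part before the colon
        "hostname": parts.hostname,    # 2. the part after the two slashes
        "directory": directory + "/",  # 3a. the directories leading to the file
        "filename": filename,          # 3b. the file itself
    }


if __name__ == "__main__":
    for label, value in describe_url(EXAMPLE_URL).items():
        print(label + ":", value)
```

For the example URL, the sketch reports the protocol as "http", the hostname as "www.sc.edu", the directory as "/beaufort/library/", and the filename as "index.html". Note that urlsplit reports the protocol and hostname in lower case however they were typed, while the directory and file names are left exactly as given; that is one reason to copy those parts of a URL with special care.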

And, don't forget to save those directions. Most Web browsers, as noted in our last lesson, let you create a list of hotlinks or bookmarks to make your return trips to sites a "no-brainer".

But what about the times when you don't have a URL in hand and have no idea how to get where you want to go on the Web? What do you do then?

WEB SEARCHERS AND INDEXES

One of the best ways to explore the Web is to use a Web search tool, or what's called an "intelligent Web agent." These are software programs that roam the Web creating databases for you to search. Each works differently and you may want to try more than one. Please note that most of these search tools search against the *FULL TEXT* of indexed documents, and not just against document titles as did the gopher search tools in earlier lessons. You may elect to try a subject search using one or more of the strong WWW subject tree collections or indexes that are organized into subject categories, such as the following:

Web Subject Directories

Web Search Tools

Or, you may prefer to go directly to one of the powerful search engines operating on the World Wide Web:

Unfortunately, those of you without "forms capability" software are limited to "no-forms" searching, the lowest level of searching and not uniformly available across the Web.

Lastly, here are two search interfaces you'll want to bookmark:

For a more lengthy look at available search engines, and a link to articles comparing the various engines, go to WebReference.com's Search page:
URL: http://webreference.com/search.html
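
To make the difference between title searching and full-text searching concrete, here is a toy sketch in Python. It is not how any particular search engine works; the three sample documents and the function names (tokenize, build_index, search) are invented for illustration. It builds two small word indexes over the same collection, one from titles only and one from titles plus text, and looks up the same word in each.

```python
import re

# Hypothetical mini-collection, invented for illustration only.
DOCUMENTS = [
    {"title": "Library Hours",
     "text": "The library is open weekdays from nine to five."},
    {"title": "Staff Directory",
     "text": "Contact the reference librarian for help with gopher and telnet."},
    {"title": "Internet Guides",
     "text": "Guides to ftp, telnet, and the World Wide Web."},
]


def tokenize(text):
    """Split text into lower-case words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents, full_text):
    """Map each word to the titles of the documents that contain it."""
    index = {}
    for doc in documents:
        source = doc["title"]
        if full_text:
            source += " " + doc["text"]
        for word in set(tokenize(source)):
            index.setdefault(word, []).append(doc["title"])
    return index


def search(index, term):
    """Return the titles indexed under a single search word."""
    return index.get(term.lower(), [])


if __name__ == "__main__":
    title_index = build_index(DOCUMENTS, full_text=False)
    full_index = build_index(DOCUMENTS, full_text=True)
    print("Title search for telnet:", search(title_index, "telnet"))
    print("Full-text search for telnet:", search(full_index, "telnet"))
```

In this toy collection, a title-only search for "telnet" finds nothing, while the full-text search turns up both "Staff Directory" and "Internet Guides". That is the practical payoff of full-text indexing, and also the reason full-text searches tend to return many more hits to sift through.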

YOUR ASSIGNMENT

Finally, for a basic discussion of different types of Internet connections possible via the Web, open David Baker's Guide to URLs:
http://www.netspace.org/users/dwb/url-guide.html

READ MORE ABOUT IT

Here's a text that's been highly recommended by a number of folks on the WEB4LIB discussion list as one of the best books around on creating HTML pages:

      Lemay, Laura. _Teach Yourself Web Publishing with HTML in a Week_.  SAMS Publishing, 1995.

Also, check out:

      Taylor, Dave. _Creating Cool Web Pages With HTML_.  IDG Books, 1995.


* "BCK2SKOL" is a free electronic library classroom created by Ellen Chamberlain, Head Librarian, University of South Carolina Beaufort, and Miriam Mitchell, Sr. Systems Analyst, USC Columbia. Additional support is provided by the Division of Libraries & Information Systems, University of South Carolina Columbia.


Your feedback and support for BCK2SKOL are appreciated; please email link updates, suggestions and comments to: eechambe@gwm.sc.edu

Links checked 7 January 1999. See the BCK2SKOL homepage for course update details.
Copyright © 2000, the Board of Trustees of the University of South Carolina.
URL: http://www.sc.edu/bck2skol/fall/lesson27.html