Tangled in the Web?

Using Search Guides & Engines to Untangle Information Resources:
A Central Florida Library Cooperative Workshop


Scope:   This page will examine both subject searching sites and search engines available for use on the Internet.


Background:   The following sites provide much background information, tutorials, and guides about searching the Internet and the World-Wide Web.

AskScott - Your guide to finding it on the Internet
http://www.askscott.com/
A well-organized site from a librarian trying to organize where to search by type of information needed.
How to Choose a Search Engine or Research Database
http://www.albany.edu/library/internet/choose.html
From the University at Albany Library, an excellent chart in an "if you want..." format linking to selected search engines or subject searching sites by specific features, fields, or options.
Internet Search Tools, A Library of Congress Internet Resource Page
http://lcweb.loc.gov/global/search.html
Provides links to a number of WWW sites organized by subject, meta-search sites, evaluative information about search engines, and individual search engines.
Internet Web Text: Index
http://www.december.com/web/text/index.html
A particularly good outline overview of the Web, which distinguishes between "Subject-Oriented Searching" and "Keyword-Oriented Searching" and links to many of the popular subject sites.
The Matrix of Internet Catalogs and Search Engines
http://www.ambrosiasw.com/~fprefect/matrix/
This whole site rates and describes search engines and subject guides, but its heart is the excellent chart (http://www.ambrosiasw.com/~fprefect/matrix/overview.html) rating and comparing features of many of the popular search engines. (WARNING:  This was last updated in 1996 and does not list some of the most recent engines.) Written by a software engineer with an information sciences degree, this is one person's opinion--but the opinions are quantifiable. Good starting place when looking for specific features (searches URLs, allows proximity, etc.).
Yahoo: Computers and Internet: Internet: World Wide Web: Searching the Web: How to Search the Web
http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/How_to_Search_the_Web/
At last count, lists thirty-one sites with significant information about how to search the Web, most including information about subject guides as well as search engines.

Subject Searching Sites

Definition

Subject searching sites are those where human beings have indexed, and often rated and summarized, Internet sites. For purposes of this page, "subject searching" includes three kinds of sites: subject guides (often called "Webliographies"), i.e., documents that list many types of Internet sources about a single topic; subject directories, i.e., sites that list one type of Internet source about multiple, categorized topics; and the newest entry in the field, "hybrid" sites, i.e., those that attempt to rate Web sites. For broad subject searching, any of these classified sites usually provides fewer but more relevant results than search engines, which rely on computer algorithms that count keyword hits. For narrow topics, a combination of subject searching and search engines can be the most effective way to search.

WWW Subject Searching Sites

Good Starting Points (Alphabetically)

Britannica.com
http://www.britannica.com/
This is an evolving, ever-growing, and increasingly useful site where users can access "the world's most respected encyclopedia, expert reviews of the Web's best sites, timely articles from leading magazines, and Books in Print."  Web sites are rated and summarized; also includes a link to allow searching beyond the site.
INFOMINE: Scholarly Internet Resource Collections
http://lib-www.ucr.edu/
In its own words, "INFOMINE is intended for the introduction and use of Internet/Web resources of relevance to faculty, students, and research staff at the university level." It categorizes 15,000-plus scholarly Internet and Web resources and provides indexing and annotations about the sites listed. Begun in 1994 at the University of California (UC), Riverside, it is now maintained by librarians at all nine UC campuses and Stanford University. A few links are limited to UC patrons, but subscriptions to the same sources may be available at other university libraries--ask!
The Librarians' Index to the Internet
http://lii.org
Begun as one librarian's bookmark file to sites useful in a public library setting, this is now an organization employing 71 librarian-indexers.  An extremely well-organized and well-annotated site searchable in various ways.
Snap!
http://www.snap.com
Begun in 1997, this claims to be "the fastest-growing Internet portal and the first to launch a broadband service."  Now part of NBC Internet and a direct competitor to Yahoo!, this is very similar to Yahoo! in set-up and function; has recently added "Quick Guides" to the familiar structure.  To date, equally reliable, user-friendly, and customizable, though not quite up to the Yahoo! numbers.
Yahoo!
http://www.yahoo.com/
"Yet Another Hierarchical Officious Oracle," Yahoo! is one of the older and better searching sites around. With broad categories determined by human beings, keyword searching can be done across all categories or limited to a specific category. Search results show site title, brief summary, Yahoo! category, and links to other sites and search engines. An excellent place to start if searching for what sort of information can be found on the Web by discipline.

Other Selected Subject Sites

The Argus Clearinghouse
http://www.clearinghouse.net/
Many "netizens" will know this source by its former name: University of Michigan Clearinghouse for Subject Oriented Resources. Provides "a selective collection of topical guides" compiled by librarians and subject experts. Guides identify, describe, and rank sites and are also searchable by keyword.
BUBL WWW Subject Tree - UDC
http://bubl.ac.uk/link/subjects/
Originated as BUlletin Board for Libraries, and while it retains a strong library element, the subject trees (accessible alphabetically or by Universal Decimal Classification number) now provide broader access to research and academic Internet sites.
The BigHub
http://www.thebighub.com/
Formerly known as The Internet Sleuth, this site lists, describes, and provides search forms for over 2000 searchable databases. Slightly less useful than it was before recently being reorganized, this site still gives access to material difficult to locate elsewhere.   (NOTE:  Be sure to go to the part of the page labeled "Specialty Search Categories" in order to get to the specialized databases.)  Some pages contain a "Quick Search" form which allows searching up to 6 databases in that subject area at once.
Internet Resources For...
http://www.ala.org/acrl/resrces.html
This is the archive of a monthly column that appears in College & Research Libraries News. There are few bibliographies here, but those found are extensive and highly authoritative.
NetGuide Live's Best of the Web
http://www.netguide.com
While at first glance this appears similar to most other "portal" sites, it presents most of its information in review articles with embedded links.
The World-Wide Web Virtual Library: Subject Catalogue
http://vlib.stanford.edu/Overview.html or mirror site: http://www.ugems.psu.edu/~owens/VL/
This is probably the oldest search site on the Internet.  Recently revised, it now allows keyword searching plus viewing the subject list either hierarchically or alphabetically.

"Hybrid" Sites: Reviews and Ratings of the Internet

Many of the parent sites listed in this section show up in other portions of this page, as they produce subject directories or search engines, but each of these organizations also employs teams of professionals specifically to review and/or rate Internet sites. In most cases, searching the master site does not automatically include the associated reviewed directory.
The Internet Public Library: Ready Reference Collection
http://ipl.si.umich.edu/ref/RR/
"Not intended to be a comprehensive hotlist to all sites on every subject, but rather an annotated collection, chosen to help answer specific questions quickly and efficiently. Sources are selected according to ease of use, quality and quantity of information, frequency of updating, and authoritativeness." Arranged by broad category then by subcategories, the "collection" (several thousand items) is also searchable by keyword. Entries give title, URL, a review, the site author, and the IPL subject headings and keywords.
Looksmart
http://www.looksmart.com/
This site's displays are not like all the rest, as it has a "user-friendly cascading menu"; subjects are chosen in increasingly narrow categories (horizontally) while the full outline stays on the screen.  (As the screen fills, this can make it tedious to get to the desired subject.)  Claims to give access to "1.5 million Web sites ...indexed into more than 100,000 categories."  The ultimate results are lists of briefly summarized sites.   Has a "LookSmart Live!" feature that allows asking a human being a question and claims to respond within 24 hours.
Magellan Internet Guide
http://magellan.excite.com/
This has the look and feel of a typical Web search engine site, but there is a major difference: included sites are described, rated, and reviewed. Magellan uses something it calls Intelligent Concept Extraction to search concepts (e.g., you search "senior citizens" and it also includes "elderly" in the results). Includes some 50 million sites and provides for either browsing by topics or keyword searching (which can be further limited to sites with reviews or to "green light" sites). One odd feature is the "Search Voyeur," which allows viewing a dozen randomly selected real-time searches being conducted.
The Scout Report
http://wwwscout.cs.wisc.edu/scout/report/index.html
A weekly publication of the Internet Scout Project (part of the InterNIC), this valuable current awareness tool is available by subscription or on the Web. Professional librarians and subject matter experts select, research, and annotate what they judge to be "the best" Internet resources available and more than three years of issues are archived and searchable by keyword, fields, subject category, or LC Classification.
Yahoo Internet Life Reviews
http://www.zdnet.com/yil/filters/channels/reviews.html
This site is different from most, as it serves to point to articles (usually compilations of site reviews) about topics as well as to individual site reviews. Browsing gives article titles and "quick clicks" to related topics, while keyword searching results in a list ranked by presumed relevance. (This becomes problematic when many titles on a list are all "Yahoo! Internet Life / Site Review" with only dates to differentiate one from another.) A relatively small number of sites are included.

"Also-Rans": Recommended With Qualifications

With more time and work, these sites will probably eventually become quite useful, but for now, they fall short of the mark.  Use them for exploring or surfing, but be warned that they are frustrating if you are on an expeditious hunt for information!
Ask Jeeves
http://www.askjeeves.com/
Though really more in competition with search engines than with subject searching sites, this is definitely an "also-ran."  Ask Jeeves uses natural language processing and knowledge bases rather than just Boolean keyword searching, but unfortunately, the "millions of researched answer links" frequently do not contain the specific answers sought. Ask Jeeves responds with the questions for which it does have answers, checks several of the popular search sites, and provides lists of their links.  Still a far cry from fulfilling its claim that "each answer link is guaranteed to be relevant to the question asked," this might be useful for homework help, but not yet for research.
IPL Pathfinders Subject Index
http://ipl.si.umich.edu/ref/QUE/PF/
A relatively new service from the Internet Public Library, this is a limited, but ever-growing, list of guides to starting research on particular topics.  They primarily link to Internet resources, but most also cite books of interest.
Welcome to WebRing!
http://www.webring.com/
WebRing operates from a sound basic premise, i.e., linking like subject sites together so that a searcher can move from one to the next and know that they are related to the desired topic. However, the site organization of and search engine for the 18,000-plus rings leave a lot to be desired, and included pages are predominantly personal and commercial.

World-Wide Web Search Engines

Definition

World-Wide Web search engines are sites that use software (often referred to as spiders, crawlers, worms, or robots) to automatically create searchable databases attempting to "index" the Internet. Many engines "weigh" the results for relevancy, relying on a computer-generated algorithm to compare the number of times the keywords appear in each document. Meta-search sites provide an advantage, in that they allow querying several search engines from a single site (either one or several at a time), but they usually do so without taking full advantage of the features of each search engine. For narrow searches where a specific term is required, search engines are the most effective way of finding those sites which use the term. For broader searches going beyond the keyword into a discipline or subject area, a combination of search engines with subject searching sites, or directories, is more effective.
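
As a rough illustration of the keyword-hit weighing described in the definition above, the following Python sketch ranks a handful of pages by how often the search terms appear in them. It is only a minimal sketch under stated assumptions: the page texts, the example.org URLs, and the function names keyword_hits and rank_pages are all hypothetical, and no actual search engine is claimed to work exactly this way.

```python
import re
from collections import Counter


def keyword_hits(text, terms):
    """Count how many times the query terms appear in text (case-insensitive)."""
    words = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    return sum(words[term.lower()] for term in terms)


def rank_pages(pages, query):
    """Return (url, hits) pairs, highest hit count first; pages with no hits are dropped."""
    terms = query.split()
    scored = [(url, keyword_hits(text, terms)) for url, text in pages.items()]
    scored = [(url, hits) for url, hits in scored if hits > 0]
    # Sort by descending hit count, then by URL so ties come out in a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical pages standing in for an engine's index.
pages = {
    "http://example.org/a": "Library workshops on searching. Searching the Web takes practice.",
    "http://example.org/b": "A library catalog.",
    "http://example.org/c": "Gardening tips.",
}

print(rank_pages(pages, "library searching"))
```

Counting raw hits favors long documents that repeat a term, which is one reason a page can rank highly without being especially relevant.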

Background Information

Comparison of search engine user interface capabilities
http://www.curtin.edu.au/curtin/library/staffpages/gwpersonal/senginestudy/compare.htm
A table by a librarian from Australia listing the search capabilities and commands for Alta Vista, Excite, Fast, Google, HotBot, Infoseek, Lycos, Northern Light, and WebCrawler.
Search Engine Shoot-Out: Top Engines Compared
http://www.cnet.com/Content/Reviews/Compare/Search2/index.html
A January 1998 comparison from C/Net of seven of the most widely used search engines, this examines ease of use, accuracy, advanced searching, and "extras."   Gives pros, cons, and search tips for AltaVista, Excite, HotBot, Infoseek, Lycos, Northern Light, and Open Text, plus a brief look at nine meta-search sites.
Search Engine Watch
http://www.searchenginewatch.com/
For Web surfers and Web developers, this provides massive amounts of information on search engine news, design, tips, statistics, and more.  For added value, true fans may subscribe.
Yahoo: Computers and Internet: Internet: World Wide Web: Searching the Web: Comparing Search Engines
http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/Searching_the_Web/Comparing_Search_Engines/
At last count, lists twenty sites with significant information about search engines and their various functions and features.

WWW Search Engines: Meta-Sites (more than one search engine listed and accessible)

Meta-Sites That Provide Search Forms & Links; Search One At A Time

All-in-One Search Page
http://www.allonesearch.com/
A meta-search site, this is a great place to start if you don't already know which search engine you want to use. The search engines themselves are classified into broad topics (general interest, publications, software, people, etc.). Has forms that access almost all the heavily-used and popular sites (one at a time), and includes some sites that are just for fun.
SEARCH.COM
http://www.search.com/
A search engine to find search engines, this is a very professional meta-index compiled by the staff of CNET: The Computer Network (whose television show, C/NET Central, airs on USA Network and the Sci-Fi Channel). Provides subject menus as well as an A-Z listing of some 500 search engines.
Find-It! - Search tool finds *anything* on the net!
http://www.iTools.com/find-it/
A site with many forms linking to search engines. There is also a companion site, Research-It!, where users can easily use forms to look up language, geographical, financial, and other questions.
Net Search
http://home.netscape.com/escapes/search
One of the easiest search pages to find, as it's part of the menu bar on Netscape. Provides links to the home pages of several major search engines, with forms access on the page to nine services.
W3 Search Engines
http://cuiwww.unige.ch/meta-index.html
Grouping search sites into fourteen broad categories, this provides forms to search approximately 40 different engines and links to over 100 others. Other than the categories, not much information is given about the engines.

Meta-Sites That Search More Than One Engine Simultaneously

Dogpile, the Friendly Multi-Engine Search Tool
http://www.dogpile.com/
Dogpile searches for Web documents (using thirteen popular engines), USENET entries (using four sources), FTP sites (using two engines), plus a variety of specialized areas, including weather, stock quotes, business news and other news wires.  Results are listed by engine, with a small proportion of entries [10-20] from each shown (in the style of that particular engine), but with the option of going to the engine itself for any entries not listed.
Mamma: Mother of All Search Engines
http://www.mamma.com:80/
A relatively new entry into the meta-search engine sites, Mamma uses Alta Vista, Excite, Infoseek, Lycos, WebCrawler, Yahoo, and HotBot simultaneously to generate one set of results.  Items are ranked by percentage, identified by engine used, and give summaries if requested.
MetaCrawler Searching
http://www.metacrawler.com/index.html
A sophisticated meta-site, this sends queries to eleven different search engines simultaneously, compiles the results on one page (attempting to eliminate duplicates), and provides a "relevance ranking" for each hit. "Metaspy" is a new feature great for voyeurs that shows ten real-time search topics; the page refreshes every 15 seconds.  Also has added a subject guide on the home page.
ProFusion
http://www.profusion.com/
Designed by the Center for Research, Inc., at the University of Kansas, this meta-site includes nine search engines, searchable as "the best three," "the fastest three," "all of them," or by manual selection. Searches the Web or Usenet and compiles and ranks the results, eliminating the duplicates.
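
Several of the meta-sites above compile results from more than one engine into a single list and try to eliminate duplicates. The Python sketch below shows one simple way such merging could be done. It is an illustration only, not the method of any site listed here: the engine names, the example.org URLs, the normalize and merge_results functions, and the 1/rank scoring rule are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Reduce trivially different forms of one URL to a single comparable key."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")  # treat /x and /x/ as the same page
    key = f"{parts.scheme.lower()}://{parts.netloc.lower()}{path}"
    if parts.query:
        key += "?" + parts.query
    return key


def merge_results(results_by_engine):
    """Merge ranked result lists from several engines into one duplicate-free list.

    Assumed scoring rule: a URL earns 1/rank from each engine that returns it,
    so pages found near the top of several lists rise to the top of the merged list.
    """
    scores = {}
    found_by = {}
    for engine, urls in results_by_engine.items():
        for rank, url in enumerate(urls, start=1):
            key = normalize(url)
            scores[key] = scores.get(key, 0.0) + 1.0 / rank
            found_by.setdefault(key, []).append(engine)
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return [(k, round(scores[k], 2), found_by[k]) for k in ordered]


# Hypothetical result lists from two hypothetical engines.
results = {
    "EngineA": ["http://example.org/x/", "http://example.org/y"],
    "EngineB": ["http://example.org/y", "http://EXAMPLE.org/x", "http://example.org/z"],
}

for url, score, engines in merge_results(results):
    print(url, score, engines)
```

Because each engine is asked only for a ranked list of URLs, a merge like this cannot use the advanced search features of any one engine, which matches the limitation of meta-search sites noted in the definition of search engines.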

WWW Search Engines: The Largest Single Search Engine Sites (Listed by Size as of 12/1/99)

AltaVista: Main Page
http://www.altavista.digital.com/
This is an extremely popular search engine, as it includes more documents and allows for more sophisticated searching than most (including 25 languages).  Has both SIMPLE and ADVANCED search modes, each with a lengthy help file, but no help is available if "Bad Query" is the system response. Results are not ranked; each lists title, summary, and URL.  Allows searching for media and Usenet posts.
Northern Light Search
http://www.nlsearch.com/
This late 1997 entry into the search engine field is different in that it returns classified results (by concept or by type of site) and it includes both Web documents and  "Special Collection" documents, i.e., book chapters and articles not available free on the Web, but which may be purchased inexpensively from Northern Light.  The company made a  concerted effort in 1999 to greatly expand the size of its index and briefly captured the number one ranking.
Fast Search
http://www.alltheweb.com/
Fast is a new entry onto the search engine scene (May 1999) and for a good portion of 1999 was the largest search engine on the web, the first to top 200 million Web pages indexed.  Given that it is a Dell partner and trying to demonstrate its technology and speed, it is not a fancy search engine, but it is a good alternative if others are not finding relevant results.
Excite
http://www.excite.com/
This site has evolved considerably recently and no longer resembles a typical search engine.  Its "one-stop-shopping" approach to searching makes it much more useful than it used to be.  Depending on the topic, it returns some or all of the following: a box with related searches and suggested words to add to the search term(s); a suggested list of what it considers the most likely to try first; a "Web Site Guide" (multiple sites indexed under the same subject); Web results (lists of single pages); news articles; and discussions.
Google
http://www.google.com/
Google uses a new technique, link analysis, which means that although the engine has indexed about 70 to 100 million Web pages, its searches can actually cover more than what is indexed--up to about 300 million pages on the Web. This is not the same as full-text indexing of 300 million pages, but if sheer numbers are desired, Google is the engine. Results are returned on the basis of link popularity.
HotBot
http://www.hotbot.com/
From HotWired and Inktomi, this mid-1996 award-winning entry into the search engine field is one of the most complete Web indexes online, with close to 120 million Web documents; it refreshes its entire database of documents every three to four weeks. The fairly intuitive search engine accesses Web documents, news groups, and a menu of specialized options. Displays ten results at a time, generally providing title, a confidence rating score (expressed as a percentage), part of the first sentence, URL, file size, and date. One unique feature is that it tries to detect duplicates of the same document and group them together, listing any occurrence(s) after the first as "alternate," providing only URL, file size, and date. Allows searching for media. The text version (http://hotbot.lycos.com/text/) provides an improvement in loading speed without any loss of search features.
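
The Google entry above notes that results are returned on the basis of link popularity. As a very rough sketch of that general idea, the Python example below orders matching pages by how many other pages link to them. This is a deliberately simplified stand-in that just counts inbound links; it should not be read as Google's actual method, and the page names, link data, and function names are hypothetical.

```python
from collections import Counter


def inbound_link_counts(links):
    """Count how many distinct other pages link to each page.

    links maps each page to the list of pages it links to.
    Repeated links from one page count once, and self-links are ignored.
    """
    counts = Counter()
    for source, targets in links.items():
        for target in set(targets):
            if target != source:
                counts[target] += 1
    return counts


def order_by_popularity(matches, links):
    """Order matching pages by inbound link count, most linked-to first."""
    counts = inbound_link_counts(links)
    return sorted(matches, key=lambda page: (-counts[page], page))


# Hypothetical link structure among four pages.
links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c", "c", "d"],
}

print(order_by_popularity(["a", "b", "c", "d"], links))
```

Here page "c" comes first because three other pages point to it, while "d" comes last because its only inbound link is from itself.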

WWW Search Engines: Selective Smaller Single Search Engine Sites (Listed Alphabetically)

Infoseek Guide
http://www2.infoseek.com/
Retrieval is frames-based: one frame gives search results, while another shows a list of "related topics" which can be searched. Sorts results by score, identifies "Infoseek Select Sites," and allows searching for "similar pages." Provides title and site summary information. Allows searching for media and Usenet posts.
Welcome to Lycos
http://www.lycos.com/
Results are given a percentage ranking and show how many of the search terms appear in the hit. Results show title, summary, and URL. Does not allow terribly sophisticated search techniques, but does have pull-down menus that allow searching for sounds, pictures, personal homepages, and UPS tracking numbers.
WebCrawler Searching
http://webcrawler.com/
Could probably be classified as a "hybrid" site, but the number of sites included is quite limited. Allows but doesn't require sophisticated search skills. Also has a "Similar Pages" button that sometimes (but not always) broadens retrieval to other relevant information. Initial results show as a list of titles, but the user can request site summaries. A new "Shortcuts" feature lists Web site reviews from any of its guides that match the search terms. Allows searching for media.

WWW Search Engines: Specialized Search Engines or Lists

About.com's Guide to Search Engines and Directories
http://websearch.about.com/internet/websearch/msubmenu12.htm
This starts as a routine list of links to various common search engines and directories, but it has two outstanding features which bring it quickly out of the norm.  First is the list of Regional Search Engines and Directories, which includes links to search engines focusing on Africa, Asia, Europe, Latin and South America, the Middle East, and Oceania (Australia, New Zealand, etc.).  Second is the Specialized Search Engines and Directories link, leading to a list of (at this writing) 36 unusual specialized search sites, from the esoteric American Sign Language Browser to the 1000 Best and Busiest Sites.
Deja.com
http://www.deja.com/
This search engine was formerly known as Deja News and began its life designed to search Usenet postings exclusively. It is now focusing its efforts on consumer information, though it still maintains a simple interface to search Usenet as far back as March, 1995 and a Power Search page is available where users can create their own query filters.
EuroFerret
http://www.euroferret.com/
A search engine including almost 37 million European pages covering 52 nations, more European domain documents than can be retrieved by either AltaVista or HotBot.  The engine can be accessed in English, French, German, Italian, Spanish, or Swedish.
Govbot
http://eden.cs.umass.edu/Govbot
A specialized engine designed to search only those sites in the .gov and .mil domains.   Results lists are in an unusual format, but worth it when specifically seeking governmental or military information.
Livelink Pinstripe
http://pinstripe.opentext.com/
Formerly Open Text Index, a general search engine, Livelink Pinstripe has evolved into what it calls "the first Internet search site designed specifically for business users," with over 150 subject groupings, all specific to business.   Affords "slice" (or subject) searching, "quick search" and "power search" options, all of which are straightforward. Allows searching for Usenet posts.
Search Engine Colossus
http://www.searchenginecolossus.com/
From Canada, this is a list of more than 1,000 search engines and subject directories, organized by country or broad category.  Indicates the language of the source and includes a brief summary of the resource.

Suzanne E. Holler
Librarian and Trainer
sholler@cflc.net

Central Florida Library Cooperative 


Created 8/23/97; Last revised 1/11/2000