Chapter 1
You have information: product literature, technical specs, financial data, customer and employee records, employment opportunities, schedules of events, reports. The information is in a variety of formats: text, graphics, video, audio. People want that information: customers, vendors, colleagues, friends, family.
How do you get this varied information into the hands of the right people (and keep it out of the hands of the wrong people)? Publish it with WebSite, the powerful, fully featured, easy-to-use Web server from O'Reilly & Associates.
With WebSite you can publish your information directly on the Internet and reach the millions of people who use the World Wide Web every day. Or, you can use WebSite to publish on an internal network and share important company information with the people who need it most. You can even use WebSite's virtual server capability to publish in both environments, with a public web connected to the Internet and a private web running on your internal LAN.
To administer, build, and manage your web, WebSite comes with a complete set of tools. WebView lets you see and develop your web in a graphical environment. Map This! makes creating clickable image maps a breeze, while WebIndex lets you create search indexes for all or any parts of your web. With Server Admin you configure your server through an easy-to-use graphical interface.
This book is dedicated to showing you how WebSite works and how you can use it to meet your information goals. This first chapter sets the stage with ideas for using WebSite, some background on the Web, an overview of the WebSite server and tools, and a look at what's new with Version 1.1 of WebSite.
Running a Web server is essentially a new way to publish information. Publishing on the Web differs from traditional paper-based publishing by giving you the ability to include multimedia elements, to link information from many locations, to update and distribute information quickly, and to create virtual documents from other sources and applications. Taking these general capabilities as a jumping-off point, let's look at some ways you might want to use WebSite in publishing your own information.
These are just a few ideas; you probably have many more for publishing your information. This list referred to several features of the Internet, the World Wide Web, and WebSite. Read on to learn more about them.
The Internet, the World Wide Web, WebSite. How do they all fit together? How do users find your information? To answer these questions, take a few minutes to look at the big picture and some basic concepts.
Figure 1-1 depicts the components of the Internet and particularly the World Wide Web.
Figure 1-1: The World Wide Web
In this illustration, you can see that
Figure 1-2 is quite similar to Figure 1-1. It shows the same Web components, but on an internal network such as a local area network (sometimes called an intranet) instead of on the Internet. You can use WebSite in either situation: on the Internet for public, global use, or on an internal network for local, private use.
Figure 1-2: An internal web
Whether you plan to run a public or a private WebSite server, you should be familiar with the few basic concepts that make it work. As you use WebSite and explore its capability, these concepts will become clear. This section describes the concepts, gives you a bit of history, and introduces the specifications on which the Web relies.
When Tim Berners-Lee decided to use hypertext technology in developing the World Wide Web, he did away with the typical linear approach to published material. You're familiar with that: you open a book, skim the table of contents, and go to a specific topic. To find similar information in the same book, you may have to check the index (or a cross-reference) and turn there.
Hypertext, on the other hand, allows for many relations within a document and between documents. Hypertext links in a web document allow the reader to instantly find additional information, which may be text, graphics, video, or audio and may be located half a world away! Web is an apt term to picture how hypertext works on the Internet, where links can span computers around the globe. By using hypertext as a navigational system, users can move freely from one document to another, regardless of where the documents are located.
The Web is based on client/server architecture. Simply put, a server holds documents that a client requests. On the Web, the client is called a browser. A Web browser takes a user's request (in the form of a URL, explained below), retrieves the document from the proper Web server, interprets the contents, and presents it to the user.
A graphical Web browser, such as Mosaic, can display text and GIF graphics. Web browsers have varying capabilities of manipulating the display of text. We recommend you test your web documents on several browsers to verify their proper appearance.
A Web browser can also be configured to automatically call external viewers, such as Lview, Wham, and MPEGPlay, or other applications to properly display specific types of documents. Web browsers are becoming more sophisticated every day with built-in capabilities. It is important that you keep up on browser improvements so that the documents on your web take full advantage of new features.
The other half of the client/server relationship is the server. On the Web, a server houses documents and returns them to a Web client when requested. As its name implies, a server serves up the document to the client. The WebSite server is a fully featured Web server providing support for multiple IP addresses (virtual servers), mapping capabilities, basic user authentication and access control, automatic directory listing, and logging. These topics are discussed in detail in Section 3 of this book.
Web servers not only return text and multimedia documents to browsers, they can also execute special programs that enable them to act as gateways to other applications or information resources. These programs are called Common Gateway Interface (CGI) programs. WebSite supports CGIs for the Windows environment, the standard (POSIX) environment, and the DOS environment.
For example, a WebSite Windows CGI program can execute a request for data from a Microsoft Access database or a calculation from a Lotus 1-2-3 spreadsheet and return the results to the browser in an HTML document. WebSite's full 32-bit support for Windows CGIs makes it unique among Web servers. Windows CGIs for WebSite can be written with a variety of tools including Visual Basic 4, Visual C++, and Delphi. Section 4 of this book provides a detailed discussion of CGIs.
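To make the gateway idea more concrete, here is a minimal sketch of a program written in the style of the standard CGI convention, in which the server passes the request's query string in the QUERY_STRING environment variable and the program writes a Content-Type header, a blank line, and then the document. Python is used purely for illustration, the `name` parameter is hypothetical, and the details of WebSite's own Windows CGI interface may differ from this generic sketch.

```python
import os
from html import escape
from urllib.parse import parse_qs

def render(environ):
    """Build a CGI-style response (header, blank line, HTML body)."""
    # QUERY_STRING holds the text after "?" in the requested URL.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    # "name" is a hypothetical parameter chosen for this example.
    name = query.get("name", ["world"])[0]
    # Escape user input before placing it in the generated HTML.
    body = f"<HTML><BODY><H1>Hello, {escape(name)}!</H1></BODY></HTML>"
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    # When run by a server as a CGI program, the output goes to the browser.
    print(render(os.environ), end="")
```

The document returned here is "virtual" in the sense used earlier in this chapter: it does not exist as a file, but is generated at the moment of the request.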
Text documents on the Web are ASCII (plain text) files that contain codes of a special tagging language called Hypertext Markup Language (HTML). HTML describes the structure of a document but not its exact formatting. That is, you can identify some text as a top-level heading with a special tag:
<H1>Hypertext Markup Language (HTML)</H1>
However, how that text looks to the user depends on the browser. The head will certainly stand out, but it doesn't mean it will appear as 18-point Helvetica boldface while the body text is 12-point Times Roman. Working with HTML means adjusting your thinking away from WYSIWYG (what you see is what you get) desktop publishing to deciding what role an element plays in your document. Is it a head? Is it text? Is it a list? In many ways, HTML formatting simplifies the life of an author because it concentrates on content, not format.
The most powerful part of HTML is the ability to embed hypertext links in documents. As you guessed, links have special tags. HTML also includes tags that ask the user for input, either with simple questions or complex forms.
The current version of HTML is version 2.0. It is based on some of the concepts of SGML, the Standard Generalized Markup Language, which is an ISO standard used for marking up documents for both print and online publication. The next version of the HTML specification, HTML 3, will move HTML into full SGML compliance and provide more tags. WebSite supports both HTML 2.0 and HTML 3. Chapter 5, HTML Tutorial and Quick Reference, covers HTML.
With documents residing on Web servers around the world, how do Web browsers know where to find a specific document? Every document on the Web has a unique address called a Uniform Resource Locator, or URL. You can think of URLs as a global addressing system that provides several pieces of information to the Web browser.
Let's look at the URL for the list of external viewers at the NCSA at the University of Illinois at Urbana-Champaign:
http://www.ncsa.uiuc.edu/SDG/Software/WinMosaic/viewers.html
The first part of the URL, http://, tells what protocol is used to reach the target server. In this URL the protocol is HTTP, which means this is a Web server. (Yes, you can reach FTP servers and Gopher servers by using URLs with the correct protocol.) The rest of the URL is the path for the document. The server name is first (www.ncsa.uiuc.edu), followed by the full URL path of the document viewers.html.
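As a rough illustration of how a URL breaks into the parts just described, the following sketch uses the `urlsplit` function from Python's standard `urllib.parse` module. Python and the helper name `describe_url` are our choices for illustration only; they are not part of WebSite.

```python
from urllib.parse import urlsplit

def describe_url(url):
    """Split a URL into the protocol, server name, and URL path."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,  # the part before "://"
        "server": parts.netloc,    # the server name
        "path": parts.path,        # the URL path of the document
    }

info = describe_url("http://www.ncsa.uiuc.edu/SDG/Software/WinMosaic/viewers.html")
for name, value in info.items():
    print(name, "=", value)
```

Changing the protocol part (for example to `ftp` or `gopher`) changes only the first field; the server name and path are separated in the same way.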
The URL path is not necessarily the same as the physical path of the file on the server. The URL path is determined by how the web is mapped on the server. Chapter 9, Mapping, covers mapping in detail; the important thing to remember now is that the physical path and the URL path may have absolutely no correlation to each other.
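The idea that a URL path and a physical path can be unrelated may be easier to see in a small sketch. The mapping table below is entirely hypothetical (the directories are invented, not WebSite defaults), and the longest-prefix lookup is one plausible way a server could apply such a table, not a description of WebSite's actual implementation.

```python
# Hypothetical mapping table: URL path prefix -> physical directory.
# These entries are invented for illustration only.
URL_MAP = {
    "/": "C:\\WebSite\\htdocs\\",
    "/specs/": "D:\\engineering\\published\\",
}

def to_physical_path(url_path):
    """Translate a URL path using the longest matching prefix.

    Assumes url_path begins with "/", so the "/" entry always matches.
    """
    best = max(
        (prefix for prefix in URL_MAP if url_path.startswith(prefix)),
        key=len,
    )
    remainder = url_path[len(best):]
    return URL_MAP[best] + remainder.replace("/", "\\")

print(to_physical_path("/index.html"))
print(to_physical_path("/specs/widget/v2.html"))
```

In this sketch the two URL paths look like neighbors in one tree, yet the files they name live on different drives.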
Often URLs don't include filenames. For example, the URL for WebSite Central, the web site dedicated to WebSite information and support, is:
http://website.ora.com/
With this URL, the browser can locate the server and a directory within the server's web, but not a specific document. Which document to return is up to the server. If the server has a default home page or index file defined, and a file by that name exists in the specified URL directory, the server returns that document. If the file doesn't exist, the server returns a directory listing. The server can also be configured to return nothing to protect sensitive information. Automatic directory listing, with its many features, is discussed in Chapter 11, Automatic Directory Listings.
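The decision described above can be summarized in a few lines. This is only a sketch of the logic as the paragraph states it: the function name, the default index filename `index.html`, and the `listing_enabled` switch are assumptions made for the example, not WebSite settings.

```python
def choose_response(files_in_directory, index_name="index.html", listing_enabled=True):
    """Decide what to return for a URL that names a directory, not a file."""
    # 1. A default (index) document exists in the directory: return it.
    if index_name in files_in_directory:
        return ("document", index_name)
    # 2. No index document: return a listing of the directory's contents.
    if listing_enabled:
        return ("listing", sorted(files_in_directory))
    # 3. Listings are turned off: return nothing, protecting the contents.
    return ("nothing", None)

print(choose_response({"index.html", "logo.gif"}))
print(choose_response({"report.html", "logo.gif"}))
print(choose_response({"report.html"}, listing_enabled=False))
```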
The glue that holds the Web together is the Hypertext Transfer Protocol (HTTP). Web browsers and Web servers use HTTP when requesting and returning documents. In HTTP, every document request from a Web browser to a Web server is a new connection. For example, when a Web browser requests an HTML document from a Web server, the connection is opened, the document is transferred, and the connection is closed.
The current version of HTTP is 1.0 (often written as HTTP/1.0). WebSite meets the HTTP/1.0 specification and includes all the required features. If you're interested, the HTTP/1.0 specification is available through the Tech Center at WebSite Central (http://website.ora.com/).
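For readers curious about what travels over the connection, the sketch below builds the text of a minimal HTTP/1.0 GET request and picks apart the first line of a response. The helper names are ours, and the sample status line is a canned example rather than output captured from a real server; no network connection is made.

```python
def build_request(path):
    """Return the text of a minimal HTTP/1.0 GET request for one document."""
    # Request line, then a blank line marking the end of the request.
    return f"GET {path} HTTP/1.0\r\n\r\n"

def parse_status_line(line):
    """Split a response status line into version, numeric code, and reason."""
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

# One request/response exchange; under HTTP/1.0 the connection would
# then be closed, and the next document would need a new connection.
request = build_request("/index.html")
print(repr(request))
print(parse_status_line("HTTP/1.0 200 OK\r\n"))
```

A page with three inline images would therefore involve four such exchanges, one for the HTML document and one for each image.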
The World Wide Web originated in 1989 at the European Particle Physics Laboratory (CERN) in Geneva, Switzerland. Tim Berners-Lee, an Oxford University graduate who came to CERN with a background in text processing and real-time communications, wanted to create a new kind of information system in which researchers could collaborate and exchange information during the course of a project. He saw the need for physicists to collaborate in real time, and not just on one project, but on many.
Tim used hypertext technology to link together a web of documents that could be traversed in any manner to seek out information. In cooperation with others at CERN, Tim defined an Internet-based architecture using open, public specifications and free, sample implementations for both clients and servers. The team at CERN implemented a line-mode browser, which is the lowest common denominator among browsers and can be used from almost any kind of terminal. Lynx, a browser with a full-screen interface, was later developed at the University of Kansas. Although these browsers supported the hypertext environment, they did not support graphic or multimedia elements.
The widespread appeal of the Web did not come until 1993 with the release of Mosaic, a graphical browser. Marc Andreessen, a student at the University of Illinois at Urbana-Champaign (UIUC), was working part-time at the National Center for Supercomputing Applications (NCSA) at the university. His job was to build tools for scientific visualization. Out of that work came Mosaic, a Web browser with an easy-to-use interface that lets you click on a link to navigate the Web, as well as the ability to display graphics. Extending Mosaic with external viewers added multimedia capability.
The World Wide Web and graphical browsers have made the Internet important beyond the scientific or educational community. Most references you see to the Internet in consumer or business-oriented settings are to the World Wide Web. Generally you'll see a picture of a Web browser showing a home page. The Web has made an exciting contribution to the information age. We will continue to see refinements to Web servers and browsers and a major shift in the information publishing paradigm.
WebSite includes the Web server and a full range of tools to manage the server and develop your web. This section briefly describes those components and also touches on security and performance issues, topics which may be of concern to you.
The heart of WebSite, the server handles requests from clients (browsers) for documents, whether they be text, graphic, multimedia, or virtual. The WebSite server is a full 32-bit, multi-threaded HTTP server that runs under Windows NT or Windows 95. The server takes full advantage of the operating system's Registry and multithreading support. Under NT, the server can run as an application or as a service. Usually the server appears as an icon on your desktop or task bar with the status shown as either idle or busy. Configuring the server is the job of Server Admin.
Server Admin
Server Admin lets you configure the WebSite server to meet the needs of your environment. Although the install program handles the basic configuration, you will probably want to enhance your server by changing some settings. Mapping, identities for virtual servers, automatic directory listings, access control, and logging parameters are set through Server Admin. The General settings in Server Admin are covered in Chapter 3, Installing WebSite; the other settings are covered in Section 3 (Chapters 9 to 14).
WebView helps you build and manage your web by graphically depicting it. WebView shows all hypertext links between and within documents, whether internal or external, broken or complete. WebView not only lets you see your web from a bird's-eye view, it also lets you edit individual files. In WebView you can launch an appropriate editing application based on the file's type, or you can drag a file into the desired application using WebView's drag-and-drop capability. If you're building a new web, start with WebView. If you are managing an existing web, use WebView to make improvements and fix problems. WebView diagnoses HTML coding problems and lets you see activity reports on any part of your web. WebView works on any web, not just the local one. WebView also includes HTTP proxy support to work behind a firewall. WebView is the subject of Chapter 4, Managing Your Web Using WebView.
WebIndex and WebFind work together to provide full-text search capability for users of your web. WebIndex appears in the WebSite program group or program list while WebFind is a CGI program. WebIndex lets you create the indexes used in WebFind searches. Before WebFind can work, you must run the WebIndex program and indicate which portion of your Web is to be searchable. You can create multiple indexes with WebIndex and keep them separate or merge them. A variety of preferences allow you to tailor index contents. When users click on a WebFind hypertext link in one of your documents, WebFind first displays a search form for the user to complete and then executes the search. WebIndex and WebFind are covered in Chapter 6, Indexing and Searching Your Documents.
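The split between building an index ahead of time and searching it on request is a general technique, and a toy version fits in a few lines. The sketch below is not how WebIndex or WebFind work internally (the source does not say); the function names and sample documents are invented for illustration.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Indexing step: map each lowercase word to the documents containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index

def search(index, query):
    """Search step: return the documents that contain every query word."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical documents standing in for part of a web.
docs = {
    "specs.html": "Technical specs for the widget",
    "news.html": "Widget schedule of events",
}
idx = build_index(docs)
print(sorted(search(idx, "widget")))
print(sorted(search(idx, "widget specs")))
```

Because the index is built once, ahead of time, each search only has to look up words rather than reread every document.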
Clickable image maps (images that have hotspots that send the user to other locations) are a great addition to any web. Many webs use clickable image maps on the home page to help users navigate through the contents of the web. WebSite Central is a good example of a clickable image map. Point your browser at http://website.ora.com/ and you will see an image map at work. Map This! lets you create clickable image maps easily, in a graphical environment. Map This! supports both Registry-based maps and externally configured maps. Map This! is the topic of Chapter 7, Working with Image Maps.
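Conceptually, an image map is a list of regions, each paired with a destination, plus a rule for clicks that miss every region. The sketch below shows that lookup for rectangular hotspots only. The coordinates, URLs, and default are invented for the example, and it does not represent the map file formats that Map This! or WebSite actually use.

```python
# Hypothetical hotspots: (left, top, right, bottom) in pixels -> target URL.
HOTSPOTS = [
    ((0, 0, 99, 49), "/products/"),
    ((100, 0, 199, 49), "/support/"),
]
DEFAULT_URL = "/"  # used when the click falls outside every hotspot

def url_for_click(x, y):
    """Return the URL of the first hotspot containing the point (x, y)."""
    for (left, top, right, bottom), url in HOTSPOTS:
        if left <= x <= right and top <= y <= bottom:
            return url
    return DEFAULT_URL

print(url_for_click(10, 10))
print(url_for_click(150, 20))
print(url_for_click(300, 300))
```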
Before you make your web available to users either on the Internet or on a local area network, you probably have some questions about performance and security. How much load can the server handle? Will it handle requests fast enough in a busy environment? What will cause performance degradation? Will the rest of my files be safe if browsers have access to my web? Can I impose additional security? If you've asked any of these questions, take a few minutes to read this section.
The WebSite server is as robust as any Web server currently in use. Given the same basic hardware setup (network connection, disk capacity, disk speed, CPU type and speed), the WebSite server performs as well as any other NT- or UNIX-based Web server (and, historically, UNIX has been the platform of choice for Internet servers). In an equal hardware environment, WebSite is as fast or faster and can handle an equal load.
In addition, WebSite fully supports symmetric multi-processing. Running on a computer with multiple 486 CPUs, the server can saturate a T1 line. Running the server on a single Pentium with a fast bus would have the same effect. In short, server performance is limited only by the hardware being used. To improve performance, we recommend upgrading your hardware: both the computer system (particularly RAM) and your Internet connection.
For a more detailed discussion of WebSite's performance capability and the results of performance tests, see the Performance White Paper in the Developers Corner of the Tech Center at WebSite Central.
Security is certainly an issue of concern for server administrators. Unauthorized access to computers and files on the Internet can range from annoying to disastrous depending on the intruder's intent and abilities. Even if your web is on a private network or otherwise protected, you may still have some security concerns.
Although no Internet service is 100% safe, the World Wide Web is safer by design. If you think of the Web as a web, the limited nature of what a user can see or have access to becomes clear. The Web has boundaries, defined by the document links. A Web browser doesn't have the capability to freely browse a server; it can only view documents that are part of the document tree, beginning with the document root. This limitation is controlled by mapping, the topic of Chapter 9, Mapping.
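One common way to express the "document tree" boundary in code is to resolve each requested path against the document root and refuse anything that ends up outside it. The sketch below illustrates that general technique only; the root directory is hypothetical, forward-slash paths are used for simplicity, and nothing here describes WebSite's own checks.

```python
import posixpath

# Hypothetical document root, written with forward slashes for simplicity.
DOCUMENT_ROOT = "/web/htdocs"

def is_inside_web(url_path):
    """Return True if the URL path resolves to a location under the root."""
    # Join the request onto the root, then collapse any "." or ".." parts.
    candidate = posixpath.normpath(
        posixpath.join(DOCUMENT_ROOT, url_path.lstrip("/"))
    )
    return candidate == DOCUMENT_ROOT or candidate.startswith(DOCUMENT_ROOT + "/")

print(is_inside_web("/docs/report.html"))   # within the tree
print(is_inside_web("/../private/pay.txt")) # tries to climb above the root
```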
In addition, WebSite provides two standard (basic) methods of access control, which can be applied to the whole Web or any URL in your web:
Perhaps the best advice we can give to deal with security issues is to keep an eye out for any suspicious activity on your server. You should also regularly check WebSite Central for security updates.
Note: You may need greater security than the basic security provided with WebSite 1.1. For example, if you want to conduct credit card transactions over the Web, you may want to use encryption-based security. Two protocols provide this enhanced security for Web transactions: SSL (Secure Sockets Layer) and S-HTTP (Secure HTTP). Both of these protocols are available in WebSite Pro. Please contact O'Reilly & Associates Customer Service at 800-998-9938 for information on upgrading to WebSite Pro.
If you are upgrading from WebSite 1.0, you'll see several new features in WebSite 1.1. Many of these features were added in response to your suggestions and comments. Other features were added to keep WebSite on the cutting edge of Web server technology. We hope you find these features useful; we welcome your feedback on WebSite as we continue to develop an even better product for you.
Some of the new features you'll see in WebSite Version 1.1 include: