Web Computing

Over the past few years, we have witnessed remarkable advances in the way we think about computers. We are no longer bound to large and cumbersome desktop applications. With the arrival of the Internet and the Web, we can now access information, and even do business, from virtually anywhere. Realizing this promise brings many challenges, such as designing and developing faster, lighter and more robust applications that can be delivered across the Web using the HTTP protocol. Such applications are called Web applications. Below we look at the tools and technologies for developing, delivering and deploying Web applications.

Web Architecture

The architecture is simple and straightforward. To view a web page in our browser, we type the URL for that page or click an existing hyperlink to that URL. Once we submit this request and the web server receives it, the web server locates the web page in its memory or on its hard disk and sends it back to the browser. The browser then displays the page. Each image in the page is also referenced by a URL, and the browser requests each image URL from the server in the same way it requested the main HTML page.

The Web Browser

The web browser can be thought of as a universal graphical user interface. Whether we are doing some simple web browsing or banking online, the web browser is responsible for presenting web content, issuing requests to the web server and handling any results generated by those requests. The two main web browsers on the market are Microsoft Internet Explorer and Netscape Communicator. Both bring considerable power to the client side and have evolved into fully programmable document containers. Each has its own object model, allowing scripts or objects to manipulate the elements of the document itself. Scripting languages like VBScript or JavaScript can be used to perform client-side data validation or provide some interactivity within the document.

Dynamic HTML (DHTML) is a combination of HTML, Cascading Style Sheets (CSS), the document object model, and scripting languages. CSS offers a better way of positioning and formatting HTML elements. Since each style sheet property is exposed to the object model, we can use scripts to manipulate and reposition HTML elements. DHTML as a whole provides a greater level of interactivity within our pages and much more control over their presentation.

The latest must-have browser feature is Extensible Markup Language (XML) support. XML allows us to define our own tag set to characterize our data, and to construct documents and data structures using these tags. XML provides a way for structured data to be self-describing, which means the data can be portable. There is also XHTML, a reformulation of HTML as an XML application. Furthermore, with Extensible Stylesheet Language (XSL), we can select the data we want to view, and even change tag names, allowing us to transform XML tags to HTML. A different stylesheet can be applied for each type of browser, so that the same XML is transformed to HTML for web browsers, to WML for WAP browsers, and so on.
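As a rough illustration of self-describing data, the sketch below parses a small XML document with an invented tag set and re-expresses the selected data as HTML. It is written in plain Python with the standard xml.etree.ElementTree module and only stands in for what an XSL stylesheet would do; the tag names (catalog, book, title, price) and the function name are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: these tag names are our own invention,
# which is exactly what XML allows.
XML_TEXT = """
<catalog>
  <book id="b1"><title>Web Basics</title><price>20</price></book>
  <book id="b2"><title>HTTP in Depth</title><price>35</price></book>
</catalog>
"""


def titles_as_html(xml_text: str) -> str:
    """Select every book title and emit it as an HTML list item."""
    root = ET.fromstring(xml_text.strip())
    items = [f"<li>{book.findtext('title')}</li>" for book in root.iter("book")]
    return "<ul>" + "".join(items) + "</ul>"


if __name__ == "__main__":
    print(titles_as_html(XML_TEXT))
```

The same parsed tree could just as well be rendered into another markup, which is the portability the paragraph above describes.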

The web browser is also capable of executing applications within the same context as the document on view. The two most popular choices for client-side web applications are Microsoft's ActiveX technology and Java applets. ActiveX components are downloaded from the web server, registered with the Windows registry, and executed when called upon by other script elements. A Java applet is a small Java program, also downloaded from the web server, and executed within the browser's own Java Virtual Machine (JVM). Both ActiveX objects and Java applets have full access to the browser's document object model and can exchange data between the browser and themselves. Thus the web browser serves as our 'window to the world'. It is our primary user interface as we browse the web, conduct online business or even play games.

The Web Server

The web server is the heart of any web interaction. It is a software program running on the server machine that listens for incoming requests from web browsers and serves them whenever they arrive. Once the web server receives a request, it springs into action. Depending on the type of request, the web server might look for a web page, or it might execute a program on the server. Either way, it will always return some kind of result to the web browser, even if that result is an error message.
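The listen-and-respond behaviour described above can be sketched with Python's standard http.server module. This is a minimal illustration, not a production server: the in-memory PAGES table, the handler class and the helper names are all assumptions made up for the example. It returns the page when it is found and an error message when it is not.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory "site": request path -> page content.
PAGES = {"/index.html": b"<html><body>Hello from the server</body></html>"}


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            # Even a failed lookup produces a response: an error message.
            self.send_error(404, "Page not found")
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_status(port, path):
    """Play the browser: issue one GET request and return the status code."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def demo():
    """Start the server on a free local port, request two paths, shut down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_status(port, "/index.html"), fetch_status(port, "/missing.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```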

Currently there are two leading web servers: the Apache web server and Microsoft's Internet Information Server (IIS). Apache has been developed as free software, with contributions from programmers around the world. Its power, flexibility, ease of use and availability on multiple platforms have contributed immensely to its rise in popularity over the past few years. Microsoft's IIS, on the other hand, runs on the Windows NT and Windows 2000 operating systems. While it offers a wide range of features, its dependence on the Windows platform is a major drawback. With the advent of the Linux operating system, the usage and popularity of the Apache web server have multiplied. Thus the web server is poised to play a very important role in our server-side applications.

Web Application Architecture

A web application typically follows a three-tiered model. The first tier is the presentation layer which, in the case of a web application, includes not only the web browser but also the web server, which is responsible for assembling the data into a presentable format. The second tier is the application layer. It usually consists of some sort of script or program. Finally, the third tier provides the second tier with the data that it needs. A typical web application will collect data from the user (first tier), send a request to the web server, run the requested server program (second and third tiers), package up the data to be presented in the web browser, and send it back to the browser for display (first tier).

Collecting the Data - The first step of a web application usually involves collecting some kind of data from the user using a simple HTML form. The user types information into the form fields, the information is sent to the web server for processing, and finally the result is sent back to the user. An alternative mechanism is to use a Java applet, which can act as a client to a server-side program by simply opening a socket connection to the web server. This approach can move the majority of data formatting and validation off the server and onto the client, and can substantially reduce the volume of network traffic.

Sending the Web Server a Request - In order for the web server to spring into action and execute a server program, the web browser needs to package up the user data and issue an HTTP request to the web server. HTTP (along with its secure variant, HTTPS) is a simple, stateless, lightweight application-level protocol based on requests and responses, generally implemented over TCP/IP connections. It was originally meant for serving static information. In this paradigm, client applications, such as a Web browser, send requests to servers, such as a Web server, to receive information or to initiate specific processing on the server.

As an application-level protocol, HTTP defines certain types of requests that clients can send to servers. The protocol also specifies how requests and responses are structured. The latest version, HTTP/1.1, has five request methods in addition to GET, POST and HEAD: OPTIONS, PUT, TRACE, DELETE, and CONNECT. Of these, the GET and POST requests meet most common application development needs.

The GET request method is the simplest and most frequently used request method for accessing static resources such as HTML documents, images, etc. GET requests can also be used to retrieve dynamic information by adding query parameters to the request URL. For example, we can send a parameter name=peter appended to a URL as http://www.peterindia.com?name=peter. The Web server can use this parameter name=peter to send the content specific to "peter".
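On the receiving side, a server program has to pull the parameter back out of the URL. A minimal sketch using Python's standard urllib.parse module follows; the function names and the greeting text are assumptions for illustration only, not how any particular server does it.

```python
from urllib.parse import parse_qs, urlsplit


def query_params(url: str) -> dict:
    """Return the query parameters of a URL as {name: [values]}."""
    return parse_qs(urlsplit(url).query)


def content_for(url: str) -> str:
    """Hypothetical per-user content chosen from the 'name' parameter."""
    name = query_params(url).get("name", ["guest"])[0]
    return f"Welcome, {name}"


if __name__ == "__main__":
    print(query_params("http://www.peterindia.com?name=peter"))
    print(content_for("http://www.peterindia.com?name=peter"))
```

Each value is a list because a parameter name may legitimately appear more than once in a query string.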

The POST request method is normally used to access dynamic resources. POST requests are meant to transmit information that is request dependent, and are used when we need to send large amounts of complex information to the Web server. A POST request can encapsulate multi-part messages in the request body. For example, we can use POST requests to upload text or binary files. POST requests can also be used to send serialized Java objects to the Web server from our applets. Thus POST requests offer a wider choice in terms of the contents of a request.

There are certain differences between GET and POST requests. With GET requests, the request parameters are transmitted as a query string appended to the request URL. In the case of POST requests, the request parameters are transmitted within the body of the request. Since a GET request contains the complete request information in the URL itself, browsers can bookmark the page and revisit it later. If the parameters to be passed are sensitive, the GET method may not be suitable, because they remain visible in the URL. Also, some Web servers that do not comply with HTTP/1.1 may restrict the length of the request URL.
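To make the difference concrete, the sketch below builds the raw text of a GET and a POST request carrying the same parameter. It is a simplified illustration: the host and path are placeholders, the helper names are invented, and a real browser would send additional headers.

```python
from urllib.parse import urlencode


def build_get(host: str, path: str, params: dict) -> str:
    """GET: the parameters travel in the URL as a query string."""
    query = urlencode(params)
    return f"GET {path}?{query} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(host: str, path: str, params: dict) -> str:
    """POST: the parameters travel in the request body."""
    body = urlencode(params)  # ASCII only, so len() equals the byte count
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


if __name__ == "__main__":
    print(build_get("example.com", "/search", {"name": "peter"}))
    print(build_post("example.com", "/search", {"name": "peter"}))
```

In the GET form the parameter is part of the first line, which is why it can be bookmarked; in the POST form it appears only after the blank line that ends the headers.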

The GET method, like HEAD, is meant to be safe and idempotent: it is used for pure information retrieval and is not supposed to modify information on the Web server, so the same request can be reapplied without changing any data on the server side. POST carries no such guarantee, since a POST request typically submits data for the server to process, and repeating it may repeat the effect. Along with the type of the request, the client application also specifies the resource that it needs, in the request line at the top of the request header.

HTTP Response - For an HTTP request from a Web browser, the Web server responds with the status of the response and some meta-information describing the response. All of this is part of the response header. Except in the case of a HEAD request, the server also sends the body content that corresponds to the resource specified in the request. The body content is what we actually want from the Web server, but the header fields in the response also carry useful information, such as "Date", "Content-Type", "Expires", etc.

In HTTP, servers and clients use MIME (Multipurpose Internet Mail Extensions) types to indicate the type of content in requests and responses. Examples of MIME types are text/html, image/gif, etc. Here the first part indicates the general type of data, such as text or image, while the second part indicates the specific format (the subtype), such as html for text and gif for image. MIME is an extension of the email protocol that allows different kinds of data files to be exchanged over the Internet. HTTP servers send MIME headers at the beginning of each transmission, and Web browsers use this information to decide how to parse or render the response. Web browsers also use MIME headers when transmitting data in the body of a request, to describe the type of data being sent. That completes the HTTP request-response process.
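The two-part structure of a MIME type can be seen with a few lines of Python. The sketch below uses the standard email.message and mimetypes modules; the function names and the sample file name are assumptions for illustration.

```python
import mimetypes
from email.message import Message


def split_content_type(header_value: str) -> tuple:
    """Split a Content-Type value into (type, subtype), ignoring parameters."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_maintype(), msg.get_content_subtype()


def guess_mime(filename: str):
    """Guess the MIME type a server might send for a file, from its extension."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


if __name__ == "__main__":
    print(split_content_type("text/html; charset=utf-8"))
    print(guess_mime("logo.gif"))
```

The second helper mirrors what many servers do when serving static files: the extension is only a hint for choosing the header, and the browser relies on the header itself.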

Executing the Server Script or Program - An important function of the web server is to pass a request to a specific script or program for processing. The web server first determines which type of operating environment it needs to load by looking at the extension of the requested file or the directory in which the file is located. This is done through mapping: when a web server is configured, it is told how to handle specific file types. For example, anything in the cgi-bin directory will typically be treated as a CGI script, and anything with a .asp or .jsp extension will be treated as an Active Server Page or a JavaServer Page respectively.
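The mapping step can be pictured as a simple lookup. The sketch below is a toy model under stated assumptions: the table contents, the handler labels and the function name are invented for illustration, and real web servers express this mapping in their own configuration files rather than in code like this.

```python
import posixpath

# Hypothetical configuration: which environment handles which extension.
HANDLERS_BY_EXTENSION = {
    ".asp": "Active Server Pages",
    ".jsp": "JavaServer Pages",
}
CGI_DIRECTORY = "/cgi-bin/"


def choose_handler(request_path: str) -> str:
    """Pick a handler from the directory first, then from the file extension."""
    if request_path.startswith(CGI_DIRECTORY):
        return "CGI"
    _, extension = posixpath.splitext(request_path)
    return HANDLERS_BY_EXTENSION.get(extension.lower(), "static file")


if __name__ == "__main__":
    for path in ("/cgi-bin/search.pl", "/shop/cart.jsp", "/index.html"):
        print(path, "->", choose_handler(path))
```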

Once the web server determines the type of the requested file, it loads any runtime environment required to execute the file. For example, if a CGI program were written in Perl, the web server would create a new process and load the Perl interpreter into it. For some types of programs, depending on the web server, there is no need to load a separate runtime environment.

Returning the Results to the Browser - The final step in a web application is to build some kind of response to the operation and return it to the browser. Generally speaking, the server script specifies the content type and then writes the response to an output stream. When the web browser receives the response, it first looks at the response header and determines the MIME type so that it knows how to render the data.
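In the CGI style, "specifies the content type and then writes the response to an output stream" comes down to a header line, a blank line, and then the body. The following is a minimal sketch of that pattern; the function names and the page content are assumptions, and an in-memory stream stands in for the real output stream (a CGI program would write to standard output).

```python
import html
import io


def write_response(out, name: str) -> None:
    """Write a CGI-style response: content type, blank line, then the body."""
    out.write("Content-Type: text/html\r\n\r\n")
    # Escape user-supplied text before placing it in the HTML body.
    out.write(f"<html><body>Hello, {html.escape(name)}</body></html>\n")


def render(name: str) -> str:
    """Capture the response in memory so it can be inspected."""
    buffer = io.StringIO()
    write_response(buffer, name)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render("peter"))
```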

Server-Side Web Programs

A server-side web program differs from a regular program in some important ways. To make a program accessible to a web server, the following conditions must be met:

i. When a user issues a request from a web browser, the web server has to be able to locate and execute the requested program.

ii. There must be a way for the web server to pass any form data to the program.

iii. Once the program is invoked, there has to be a standard entry point.

iv. After the program has processed the input data, it has to package up the results and send them back to the web server which will, in turn, send them back to the web browser.

The exact division of responsibility may vary between web servers, but all web servers talk HTTP.
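The four conditions above can be illustrated with Python's WSGI convention, used here purely as an analogy; it is not one of the technologies this article covers. In this sketch the function application is the standard entry point, the environ dictionary is how the server passes request data to the program, and the returned bytes are the packaged result. The helper call_app plays the web server's part, and all names other than the WSGI conventions themselves are assumptions.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Standard entry point: the server calls this once per request."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    name = params.get("name", ["world"])[0]
    body = f"Hello, {name}".encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def call_app(query_string: str):
    """Stand in for the web server: build the environment, invoke the program."""
    environ = {}
    setup_testing_defaults(environ)
    environ["QUERY_STRING"] = query_string
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_app("name=peter"))
```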

Server-Side Technologies - There are many technologies for developing server-side web programs. Common Gateway Interface (CGI) is one of them. Programs that follow the CGI specification provide a relatively simple way to create a web application that accepts user input, queries a back-end database, and returns results to the browser. A number of proprietary server interfaces, such as ISAPI and NSAPI, were then introduced to overcome some of the issues and complexities associated with CGI. More recently, both Microsoft and Sun have developed their own server-side technologies: Microsoft's Active Server Pages, and Sun's Java Servlets and JavaServer Pages.
