by Anthony Frey
So, you've got your old reliable enterprise database and the
decision has been made to develop Web access for its information.
This could be the reason that you decided you needed Internet
access in the first place, or even more likely, you finally want
to provide Web-based access to your corporate information. So,
where should you begin?
We've seen an incredible array of Web-to-DBMS middleware packages
introduced lately. In an attempt to bring order to these products,
we've broken them down into Perl scripting, embedded queries and
dynamic HTML generation, and tested a few examples of each. First
we established a small networking information database on two
different data sources, Oracle Workgroup Server 7.2 and Microsoft
SQL Server 6.5. We chose Intergraph's TD-300 as our Web server
(Editor's Choice in our May 1 issue, page 46). Our choice of data
access middleware in this case was ODBC. After setting each package
up in our Wisconsin lab, we developed a Web site for each product
to see how the various product types stacked up.
Issues Related to Web-DB Connectivity Before we get into
each of these database access methods, let's talk about some of
the issues you should be concerned about when setting out to connect
your database to the Web. Obviously you are going to need a data
access method, or the middleware between your Common Gateway Interface
(CGI) program and your database. Technically you can use any type
of middleware that you might use for a standard client/server
application. However, you'll find that the CGI environment you
select generally narrows your choices. If you're developing on
the Windows NT platform, ODBC is the de facto data access standard.
Performance between your Web server and database server is crucial.
Because of the data access requirements, the CGIs that many of
these products are implemented with can no longer be thought of
as lightweight "scripts." Every time an HTML page containing
a database query is accessed, it must establish and close a session
with the database server. This is in addition to the overhead
of the query itself. A couple of products have worked around this
by using a session identifier (possibly the same one you use to
maintain state) to cache database sessions for queries that are
developed across HTML pages.
Also essential for most database applications is a method of maintaining
state information during a remote user's session. If you are defining
a transaction across a series of Web pages, how do you tie the
user you authenticated in the first page with a table you are
about to update on the following pages? This, of course, is an
inherent problem of the HTTP protocol. The most common answer
of course is a session identifier or "magic cookie."
A good Web development package should provide some assistance
with this.
How flexible is the query engine? The Microsoft Internet Database
Connector (MS-IDC) only allows a single query per MS-IDC file.
NetScheme InterMart Toolkit doesn't allow inserts or updates.
The more complex your data model is, the more flexibility you
will need. Finally, you're going to want to consider what these
methods offer in the way of HTML page maintenance. This is not
an issue for DBMS access alone, but it is a part of the environment
you will be working with when you choose a package.
Perl Scripts and Oracle WebServer In the early stages of
Web development, Perl scripts were the most common method of accessing
your data. At the time, data sources being accessed weren't always
RDBMSes. Often they were tabular structured data stored on a server's
file system. Perl is well suited to these environments due to
its text-processing capabilities. But Perl scripts are also great
because they provide a fairly sophisticated environment and a
rich set of libraries from which you can build data access routines
and maintain state information.
We tried this approach with two freely available packages, Oraperl
and ODBC Extensions for Win32 Perl (both are readily found on
the Internet by searching for oraperl and ntodbc at your favorite
search site). Oraperl is an exception to our ODBC middleware choice,
since it requires a local instance of Oracle to run (some versions
have been extended for SQL*Net). These are both general-purpose
packages for data access, but again the Perl environment allows
relatively easy HTML generation.
The likable characteristic of this method is that you have complete
procedural control over your data access and HTML generation.
A large drawback, however, is having to write a large number of
specialized scripts--usually one or more for each page. You will
have to come up with your own libraries for generating HTML and
maintaining state information. Make no mistake about it, this
is programming. It's not simple and it's not quick, but you can
certainly get the job done.
We've included Oracle with Perl scripts because the flavor of
development is much the same. Oracle has integrated an HTTP server
into its Workgroup Server and added an HTML generation library
to its standard procedural language PL/SQL. This places you into
the Oracle environment where you can make full use of the stored
packages, procedures and triggers you've already developed for
existing Oracle-based application. This method is a good choice
if you already have a large investment in Oracle development.
Embedded Queries This approach seems to be the method of
choice among most of the new tools we've seen. This basic technique
allows you to dynamically generate HTML pages by embedding queries
and additional logic within an extended HTML syntax. These pages
are not referenced directly as a hyperlink. Instead, you call
the provided CGI, which preprocesses the page, performs the database
queries and carries out the logic before returning the HTML text.
It's important to remember that the HTML text is dynamic and not
the actual queries.
Microsoft's Internet Database Connector, available with the Microsoft
Internet Information Server, is a good example of embedded queries.
Other products include WebBase from ExperTelligence, WebDBC from
Nomad Development and Cold Fusion from Allaire.
Generally, embedded queries are the most logical Web-to-database
development method because there's a one-to-one relationship between
what you're writing and the final output. This is unlike Perl
scripts, which require you to programmatically generate the HTML
output. The entire embedded query process feels like a natural
extension of HTML writing.
For example, let's look at how we implemented our Web site with
the Cold Fusion package. Our simple data model included three
tables: IP addresses, machines and contacts. Using the Cold Fusion
administrator, we assigned a user name and password to an ODBC
data source. Administering access in this way is better than specifying
the information in each query file, the method employed by MS-IDC
and WebBase, because you have a single point of maintenance for
authentication information.
Web pages with database queries are typically designed using two
pages--a normal HTML page (see IPADDR.HTM) that sets up the query
and a template page (see IPADDR.DBM) that contains the extended
HTML markup. Cold Fusion uses the DBM extension to label extended
HTML database markup files. The IPADDR.HTM file asks for two IP
addresses to specify a range of addresses to display. These are
assigned the identifiers lowip and highip, respectively. The Cold
Fusion CGI and template file is specified in the ACTION field
of the form tag, ACTION="/cgi-bin/ dbml.exe?template=ipaddr.
dbm". When the submit button on the form is activated, these
parameters are passed as variables to the CGI, which are subsequently
used as parameters for the query.
Notice the <DBQUERY Name=<P> "qryIPaddr" tag
in the IPADDR.DBM template page. This identifier represents the
row set that will be returned by the query. The DBTABLE tag uses
this row set identifier to loop through the rows and produce the
HTML table. You can also specify multiply queries with Cold Fusion,
something the MS-IDC doesn't allow. Cold Fusion's extensions are
also fairly robust, allowing procedural logic within the HTML
page. This allows you to conditionally modify the HTML output
based on input parameters or query results.
With HTML you have to rethink some of your usual techniques of
writing typical client/server application features. In a form-building
4GL you might create a list box that contains the detail items
of a master-detail relationship and the detail items need to be
selected for further editing. Using HTML, however, you need to
present the detail output in a HTML table, wrapping each item
in a hyperlink to a CGI that processes the row selection.
What about data validation? Although this has always been cumbersome
with HTML forms, Cold Fusion is no exception; it is slightly easier.
If our IP addresses were stored as four separate fields we could
check to make sure each octet was in the range 0..255 by appending
"_range" to the input widget name, for example,<INPUT
TYPE="text" NAME="ipa_range" MIN=0 MAX=255
SIZE="20">. The range validation is then carried
out on the server side by the Cold Fusion CGI.
Both Cold Fusion and WebDBC have ISAPI DLLs available to narrow
the performance gap between the Web server and the gateway. Despite
being an HTTP server itself, WebBase was noticeably slower in
its delivery of HTML documents and query template processing.
We expected the opposite to be true and suspect it's a result
of a less-optimized HTTP server. Therefore, we found the DLLs
to have comparable, if not better, performance.
Dynamic HTML Table Generation This last method takes dynamic
HTML generation a step further. In this case, HTML coding and
static queries are entirely eliminated. Instead, you use a modeling
application to pick the tables, views and stored procedures that
you want visible to the Web browser. The CGI uses this model file
to generate the entire HTML tree on the fly. The result is dynamically
generated HTML pages of database tables, replete with headers,
footers, navigation links and even predefined query pages. Typically
there is an additional server component that communicates with
the CGI to perform the actual database query. Both NetScheme's
Intermart Toolkit and DataCraft's DataWave extension to the 4th
Dimension RDBMS currently use this technique.
The Dynamic Table Generation method requires no HTML coding whatsoever.
With these tools you can publish your existing database schemas
almost instantly. If your data model includes a foreign key relationship,
your drill-down detail is automatically linked and generated.
Maintaining state becomes a non-issue as session IDs are automatically
embedded in the generated HTML pages. This also allows them to
easily cache user connections.
Obviously, the trade-off here is flexibility. NetScheme has removed
the tedious work of developing repetitive Web pages, but in the
version we tested it's strictly publish only. We couldn't define
pages with any inserts, updates, deletes or with any custom features.
DataWave is more complex as it sits on top of the full-featured
4th Dimension client/server development environment. It will export
any 4D application in its entirety, and it will allow you to define
custom headers and footers for the resulting Web pages. If you
can live with the feeling that this isn't quite what you wanted,
but it satisfies your users, then this is the way to go.
In terms of performance, this method has a stroke for and against
it. On one hand, the database connection can be maintained for
the life of the session ID, eliminating the overhead associated
with logging in a user. On the other hand, HTML must be regenerated
for each page and session.
Making a Choice Between configuring ODBC data sources,
installing the CGIs, coordinating access rights and debugging
cryptic HTML error messages, configuration of Web-to-DBMS connectivity
can be extremely difficult. And to top it off, generally you can't
use HTML editors because you'll make heavy use of extended HTML.
As always, the first step is to define your requirements. One
of these methods is going to be more appropriate for your application
than others. We use three general categories to narrow which tools
to use for any given project: highly interactive database access,
heavy transactional database access and database publishing (which
roughly correspond to using Perl scripts, embedded queries or
dynamic table generation). It's also important to remember that
you aren't completely limited to using only one of these methods.
If you're going to use a dynamic HTML generator, for example,
but need a complex task accomplished, then by all means throw
in a Perl script.
Anthony Frey can be reached at afrey@nwc.com
.
<HTML>
<HEAD>
<TITLE> IP Address Query </TITLE>
</HEAD>
<BODY>
<H4>Search For IP Addresses</H4>
<FORM METHOD+POST ACTION="/cgi-bin/dbml.exe?template=lpadder.dbm">
<PRE>
Low IP Address: <INPUT TYPE="text" NAME="lowip"
SIZE="20">
High IP Address: <INPUT TYPE="text" NAME="highip"
SIZE="20">
<INPUT TYPE="submit" VALUE="Search">
</PRE>
</FORM>
<HR>
</BODY>
</HTML>
<!-- Define query to lookup IP Address range-->
<DBQUERY Name="qryIPaddr" DataSource="NetDB"
MAXROWS=10
SQL="SELECT ipaddr, hwaddr, contact_id FROM tblIPaddresses
WHERE ipaddr > '#lowip#' AND ipaddr <'#highip#'
ORDER BY ipaddr>
<!-- Format HTML results --> <HTML>
<HEAD>
<TITLE>IP Address Search Results</TITLE> </HEAD>
<BODY>
<H1>IP Address Search Results</H1> <HR>
<!-- Table display of selected IP addresses -->
<DBTABLE Query="qryIPaddr" COLHEADERS>
<DBCOL Header="IP Address" Width=18 Text="#ipaddr#">
<DBCOL Header="HW Address" Width=20 Text="#hwaddr#">
<DBCOL Header="Contact ID" Width=15 Text="#contact_id#">
</DBTABLE>
</BODY>
</HTML>