College of
Engineering
EEL4781, Summer 2002
SQL
is a recognized standard language designed to organize, access and protect
valuable data. Proper use allows the user to do things with their data that
could not be done before. In some cases it is the only means of accessing data
that would otherwise be unavailable. SQL Networking is designed to operate on
relational databases. Data items, such as a customer's name, address, and
telephone number, are gathered together into tables. Looser relationships
connect the tables to each other. For instance, the CUSTOMER table might be
related to the PRODUCT table by way of the SALES INVOICE table.
The
question is, what is SQL, and how is it different from other computer
languages?
First
of all SQL is not a procedural language like FORTRAN, Basic, C, COBOL, or PAL.
Usually, to solve a problem in one of those languages, one must write a
procedure that performs a sequence of operations in a predefined manner until
the task is complete. In such case the programmer has to specify the order of
execution. SQL is nonprocedural. To solve a problem with SQL, you tell the
system what you want, instead of telling it how to get what you want. The
database management system then decides the best way to do it.
Let
suppose you are the personnel manager for a small company, and you have an
employee table with 200 employees in it. You want to retrieve from it the rows
corresponding to all your senior people, where you define a senior person as
anyone over age 40, or anyone earning more than $50,000 per year. You could do
so with the following SQL query:
SELECT *FROM EMPLOYEEWHERE AGE > 40OR SALARY > 50000;
This SQL statement will retrieve all rows from the
EMPLOYEE table where either the value in the AGE column is greater than 40 or
the value in the SALARY column is greater than 50000. With SQL, you don't have
to know which is true. The database engine examines what is available and
decides for itself. 5
The father of SQL, is Dr. E.F. "Ted" Codd
who worked for IBM. After Codd described a relational model for databases in
1970, IBM spent a lot of time and money researching how to implement his ideas.
But other companies had formed and created relational database products before
IBM was ready to release System/R. Oracle came out first and the second was Relational
Technology's Ingres. IBM then released improved products in 1982 named SQL/DS
and DB2. SQL is a standard, open language without corporate ownership. The
commercial acceptance of SQL was precipitated by the formation of SQL Standards
committees by the American National Standards Institute and the International
Standards Organization in 1986 and 1987. Two years later they published a
specification known as SQL-89. An improvement and expansion (to some 600 pages)
to the standard gave the world SQL-92. We now have the third generation
standard, SQL 99. The existence of standards is important for the general
portability of SQL statements.2
The vast majority of the language has not changed
through the years. Most DBMS are designed to meet the SQL92 standard. Since
many of the advanced features of SQL92 have yet to be implemented by DBMS
vendors, there has been little pressure for a new version of the standard.
Nevertheless a SQL99 standard was developed to address advanced issues in SQL.
All of the core functions of SQL, such as adding, reading and modifying data,
are the same. As of early 2001, no vendor has implemented the SQL99 standard.
SQL is a language that enables the user to process
anywhere from very little data to as much data as is required. It can be used
to read from and write to databases located in computer systems. 6 More
specifically, it can be used to access, define, delete and edit data in
database systems such as Oracle or Access among others. 1 It is an
innovation in the management of relational databases. From experience we know
that most languages are procedural in nature. That is, the programmer writes a
set of instructions that will tell the computer what to do in a step-by-step
manner. On the other hand, SQL is a declarative language that allows the
"programmer" to make a declarative statement to the computer in order
to request specific information. An advantage that is apparent with this type
of a language is that queries do not depend on the representation of the data.
Data is gathered by means of a declaration that somehow specifies what data is
desired. Another advantage is that "bugs" can be easily found and
fixed in an SQL language statement as opposed to finding them in the code for
procedural languages. Another advantage is that SQL's ease of use makes it an
ideal language for people to write in and understand it.
We have described so far some of the advantages of
SQL in accessing databases, but what are databases? Databases can simply be
defined as a conglomeration of information assembled into tables. These tables
can hold any type of information that is suitable to be maintained in an
organized fashion. As can be expected, these tables can become enormously large
and increasingly cumbersome to access the data and/or information that one
desires to obtain. SQL can be utilized to retrieve only the data that is
pertinent for a particular operation and/or use in an easy and efficient
manner. In other words, it allows us to manipulate the data within databases in
a manner that is beneficial for a particular application.
Picture obtained from [2]
In a network environment, different
computers can hold different databases. For example, one computer can have a
database maintaining customer's First and last names, address, phone numbers
etc. Another computer can contain databases on product information that the
customer might be interested in ordering or has ordered in the past. The first
computer is located at a location far from the other computer but the need to
reference one database with the other arises. Microsoft® SQL Server has been
created for this and other reasons. "Microsoft® SQL Server™ 2000 is a
family of products that meet the data storage and analysis requirements of the
largest data processing systems and commercial Web sites." 4 Programs
exist that manage databases but SQL is the language that allows the user to
access this information efficiently.
As specified earlier, SQL is a language, not
a software product, so the real question is best phrased as, when does one need
to use a heavy-duty DBMS, which would then require to start communicating in
SQL? There are several factors to consider which we'll look at in turn below.
Many small desktop systems such as Access will
perform well if five users are on a LAN. When you get to ten or twenty
concurrent users, problems begin. The primary problem is that it was never
designed to support many users at once. Database software designed for the
desktop generally fails when more than a few people try to use the data at the
same time. Large numbers of concurrent hits from a web site would of course
create problems to a desktop database such as Microsoft Access. Lower scale
systems also lack the security that more powerful systems offer. They are quite
easy to copy and walk away with making it easy for companies to move from
Access type desktop systems to a more robust DBMS. Almost all of these heavy
duty DBMS rely on SQL as the main form of communication. Although there is
nothing about SQL itself that concerns scaling, but there is a need to use it
in order to communicate with DBMS that scale well.2
SQL itself does not cure speed problems, the
implementation of faster DBMS does, and those faster DBMS will probably require
communications in SQL. There are heated contests between the major vendors for
bragging rights to the fastest machine with the lowest transaction price. But
the intense competition drives the vendors to produce faster, more robust DBMS
that work at lower and lower costs per transaction.2
It is more expensive to run a DBMS that only speaks
SQL than it is to use a desktop system such as Paradox, and Access. Not only the software is expensive, but in
most cases more expensive operating systems have to be run in order to support
the DBMS. Last, you will need personnel with more expensive qualifications. As
the data center grows, at some point the reliability, performance and
standardization benefits of a DBMS that uses SQL will outweigh the cost.2
An alternative to using SQL statements is to write
code in a procedural language like C++. The problem is if there is a change in
the structure of the tables the code must change. Also, if a new DBMS is
installed, most of the code must be revised to match with the new DBMS system.
But by using SQL statements almost all changes are handled by the DBMS, behind
the scenes from the programmer.2
An advanced feature of SQL is the emergence of
analytical tools that allow managers to extract business knowledge from large
amounts of data. These tasks frequently require the constructions of
multi-dimensional aggregates of data. These tools are referred to by the
generic names of Online Analytical Processing (OLAP), Decision Support Systems
(DSS) and Executive Information Systems (EIS). All of these tools are only
available in full-scale DBMS that require SQL for their means of communication.
As much as we have talked about the power of SQL, we
must be honest in describing those situations where an alternative system would
be better. You are probably better off staying with software that does not
require SQL in the situation where only one or a few people will use the
database at once, or where the data can be stored within 50% of the disk
capacity.2
The power of SQL in a network environment becomes
apparent when the language is used to manage a database. The language allows
you to query a database through statements that are very similar to writing a
full syntax sentence in English. Previous solutions to querying databases were
based on indexing them. This was efficient if the size of the database was
small3. As databases grow larger in size, the need arises for a more suitable
way of managing all the information that is amassed. If we were to add the
number of people requesting the same or different information from the same
sources of information, servers would slow down to a crawl3. A better way was
needed to allow access to multiple users to the same database in a structured
and easy manner. SQL, and more specifically SQL Server, was developed just for
that reason.
To access all the available data within the databases
a database manager can type a command much like the following:
SELECT
FirstName, LastName, City, PhoneNumber FROM Customer
WHERE
State = 'Floridia' AND Age > 21,
ORDER
BY LastName;
This command is all that is needed to obtain a
customer's first name, last name, the city the customer lives in, and their
phone number from the table called "Customer". The statement also
includes some filters that will provide very specific results. In the previous
example, only the customers that live in the state of
There is a need for sharing information through SQL
over a network between databases and users. "SQL is designed to operate on
relational databases. In a relational database, closely related data items,
such as a customer's name, address, and telephone number, are gathered together
into tables. Looser relationships connect the tables to each other. For example
the CUSTOMER table might be related to the PRODUCT table by way of the SALES
INVOICE table 5."
We have learned over the years that if we connect
computers together they are able to operate more efficiently than computers
that are standalones. The real problem arises in finding an easy to understand,
easy to write and easy to debug language that will allow access to the power of
information at the fingertips. The problem does not stop there either. In a
network setup, the possibility exists that a database needed for an operation
resides on another computer totally different from yours and most likely running
a different database management system. In order to make these databases
available to all users, a standard is required for communicating to the
databases. This standard is found through the SQL language. It provides the power to effortlessly
access different databases on different platforms. "SQL is the language
that virtually all database servers understand, regardless of hardware,
operating system, or database engine5."
This project will connect to a database located in a
different computer from where the information is being requested and obtain the
requested information via a network connection between the computers.
Specifically, the project will involve the design of this database and the
application of the SQL language to access them through a network environment.
Our project will assume a company requesting a database system that tracks the company's customers, employees, products, orders, payments and factories. The data requirements that we will utilize for this system are discussed next.
The company is organized into stores and factories. The stores receive and order the products that the factories produce. This products will ultimately be distributed to the customers by the employees. All the products are stored in the database along with its product name, description and unit price. Some company factories can produce only one product while any one factory can produce more than one product. Using a database, the company would like to keep track of all the orders that will be placed by each store. All orders must include the date when the order was placed, the list of products, the customer, the employee name, the store identifier and the total of money spent for each order. For billing purposes, the company would like to keep track of all the payments made per order. Partial payments are accepted, so any one order can be paid in full or in installments. All payments are stored with their date, order ID, amount and the remaining balance. Payment numbers are to be created automatically as they are being made or received. The balance shown will represent the amount still owed in order to pay in full the requested product order. The amount shown will represent the payment being received. Furthermore the company wants to save each store's name, address, phone, and the city and state where the company is located. The company would also like to store the customer's name, address, phone, city, and state. Employee information is also desired so that the employee's name, address, phone, city and state will also have to be stored. Through discussion of these requirements, there will have to be a way in which we can inter-relate all the databases without giving or accessing redundant information. There has to be a way to bring all the information stored in the database tables back together again. The most efficient way of accomplishing this task is to define relationships between the tables. The fields in all the tables must be coordinated in such a way that they show information about the same order. As we said before, this can be accomplished by setting up relationships between the tables.
A
relationship is an association established between the columns in two
tables. For this purpose, we have to
set up what are called primary keys in all the database tables. Primary keys
are defined as one or more fields whose value or values within a table uniquely
identify a record within the table.7 Primary keys can further be
described as matching fields between tables; they can also become the foreign
keys in other tables. A foreign key is defined as "one or more table fields (columns) that refer to the primary key field
or fields in another table."7 A foreign key is used to specify
how the tables are related and we must ascertain that the data in the
foreign key and primary key fields match. Nevertheless, the field names do not
have to be the same. For example, an OrderProducts table might contain the
foreign key called ProdID, relating to the primary key ProductID in the
Products table. Three different types of relationships can be established as
follows:
ü One-to-many (most common)
ü Many-to-many
ü One-to-one
One-to-many relationships allow a record in one table to have many matching records in another table. Table 1 can have one or more matching records in Table 2, but the reverse is not true. Table 2 cannot have but one matching record in Table 1. For our example, one factory can produce more than one product, but each product can only be produced by one factory. So the relationship between the Factory table and the Products table would be "one-to-many".
Many-to-many relationships allow Tables to have a record with many matching records in other tables. In order to use the many-to-many relationship, another table is required. This table's primary key would consist of two fields – the foreign keys from both Tables 1 and 2. In essence, this type of relationship is two 'one-to-many' relationships with a third table. For example, the Orders table and the Products table have a many-to-many relationship that is defined by developing two, one-to-many relationships with the Order details table. One order can have many products ordered, and each product can appear on many orders. The unifying table used would be the Order Details table that includes the Order ID, the Product ID and the unit price.
A one to one relationship refers to the relationship where each record in any given table can have only one matching record in another table. Table 1 records can only have a matching record in Table 2, and Table 2 records can have only one matching record in Table 1. SQL has several commands that can be used to access the information within these tables in an efficient manner. Examples of these commands can be found above in section 1.1 and section 2.
The
relationships for this project can best be described by means of an
illustration. The following illustration
establishes the relationships between the tables of the database we will build:
Figure A
The application of the SQL language will be
performed by means of scripts. These scripts provided in the following links will
be used to implement the database. The scripts that will be used are as
follows:
1. CloseTables.sql
– this script is used to close the tables once they exist in the calling system.
This script will also be instrumental in dropping all the sequential objects
used when the "CreateTables" script is called.
2. CreateTables.sql
– This script is utilized to create all the tables needed in order to realize
the project.
3. CreateViews.sql
– This script is utilized to identify the views that will be used for querying.
4. InsertData.sql –
This script is used to insert data into the tables that have been created. This
process is usually called populating the tables.
5. DeleteTables.sql
– This script is used to clear all the data that exists in all the tables.
Furthermore, an algorithm that could be used
to implement the database can be obtained from the following flow chart:

The algorithm that could
be implemented could look as follows:
1.
Start
2.
Do the tables
exist?
If(true)
{
//run the CloseTables script
CloseTables.sql /*@ c:\…\CloseTables.sql */
}
else /*tables do not exist*/
{
/*create
them, modify them and populate them*/
CreateTables.sql
/* @c:\…\CreateTables.sql */
ModifyTables.sql
/* @c:\…\ModifyTables.sql */
InsertData.sql /* @c:\…\InsertData.sql */
}
3. Stop / Run queries
/* Once the tables are created, modified and
populated, we are ready to run queries */
In step number three of this algorithm we
commented that queries could be performed after the tables have been created, modified
and populated. As part of this project's design solution we will run the
following queries that will ultimately perform the functionality that the
company requires:
ü
Locate the
number of customers that live in the state of Florida
ü
Count the number
of orders made by store number 4
ü
Find all the
employees who work on a particular store by using the Orders table.
ü
Find the
customers and employees located in different cities.
ü
Find a list of
the products starting with the least expensive to the most expensive.
ü
Find all
employees that do not have an address listed in the database.
ü
Find all the
products with a unit price less than 50 dollars
ü
An employee
would like to know which of the customers live in the state of Florida.
ü
List all the
names of the customers and the employees.
The Queries that we ran can be found in the following links:
· Product_Price_Per_Unit_greater_than.sql
· Product_Price_Per_Unit_less_than.sql
· Product_Qty_Total_greater_than.sql
· Product_Qty_Total_less_than.sql
a.)
The JDBC driver must be
loaded
b.)
A connection to the
database server must be established
After this connection has been established, we can then run the SQL commands written above and obtain the results. The application interface for JDBC already includes classes that perform these functionalities. In order to load up the JDBC driver and register the driver in the client, we can use the DriverManager. The connection to the database is actually performed through the Connection class. The Connection class will in effect perform authentication of the database server along with the connection by means of a supplied uniform resource locator.
In order to perform the SQL commands, the Statement class
will load the commands into the pre-processor by moving them into the database
engine. To obtain the results of the commands, the class ResultSet is included
as a means to view the results of our select statements. Each of theses must be
performed in the order specified:
1) DriverManager – load database driver.
2) Connection – connect to the database.
3) Statement – run the SQL statements.
4) ResultSet – obtain the results.
Out of three different ways of loading the device driver, the preferred way is to call the class method Class.forName(). For our example we can call the class method giving the parameter "sun.jdbc.odbc.JdbcOdbcDriver" in order to load the JDBC-ODBC bridge.
After successfully loading the driver, the connection can be established by supplying a uniform resource locator to the getConnection class method. The Driver Manager automatically looks at the supplied URL and searches through the drivers that have been registered until it finds one that will be suitable to connect to the database supplied within the URL. Database URLs must be supplied in a pre-defined format. For JDBC, the URL must be supplied as follows:
jdbc:<subprotocol>:<subname>
The first portion of this format defines the
protocol, the second part defines the sub-protocol, which in our case would be
"odbc". The last part, the sub-name, is the local data source. The
sub-name should provide all the necessary information in order to locate the
data source. One neat thing about the ODBC sub-protocol, is that it will
provide all the extra information neede to locate databases. What that means
for our application is that is the database to be accessed is the Employee
database table; all that needs to be supplied is the sub-name
"employee".
The next step involves running the commands. The Connection
class includes the class method, createStatement which will create a Statement
object. In order to create this object the following two commands must be
included:
Connection dbconnect = DriverManager.getConnection("jdbc:oracle:employees", "scott", "tiger");
Statement
dbstatemnt = con.createStatement();
Where "scott" and "tiger" are the
username and password, respectively, of the database.
Through dbstatemnt, access is obtained to three methods used to execute SQL statements. Primarily we obtain access to the following statements: "execute", "executeQuery" and "executeUpdate".
The method execute, can be used to execute the statement and return the results. The executeQuery method can be utilized for SQL statements that return single result sets such as the Select statement. Lastly, the executeUpdate method can be used to insert data, delete and update data by means of the INSERT, DELETE, AND UPDATE statements. These three statements will return an integer representing the number of rows that were affected by the statements. Through the executeUpdate method, it is also possible to create, and drop tables by using the CREATE TABLE and DROP TABLE statements.
The results can then be obtained by creating an
object of the ResultSet class as follows:
ResultSet
dbResltset=dbStatmnt.executeQuery("SELECT PRODUCTNAME, PRICEPERU FROM
Products");
This statement will go out and return all the rows (and columns) included in the Products table. The results will look like this:

To
execute an SQL statement, the following can be implemented using java:
java.sql.Statement dbstatemnt= dbconnect.createStatement();
ResultSet dbResltset=
dbstatemnt.executeQuery("SELECT PRODUCTNAME, PRICEPERU FROM Products");
while (dbResltset.next())
{
// Display
the values of the current row in the record set.
String s =
dbResltset.getString("PRODUCTNAME");
int i = dbResltset.getInt("PRICEPERU");
System.out.println("ROW
= " + s + " " + i + ");
}
As a
simple example, we can join all of the above to connect to part of our
database, more specifically we will create the Employee table, insert data and
retrieve data from it through the following code:
//Create the Employee table and
insert data into it
import
java.sql.*;
public
class CreateEmployeeTable
{
static
String[ ] SQL = {
"create
table EmployeeTable("+
"EMPLOYEEID
varchar (6),"+
"EMPLOYEENAME
varchar (15),"+
"EMPLOYEEPHONE
varchar (12),"+
"EMPLOYEEADRESS
varchar (20),"+
"EMPLOYEECITY
varchar (15),"+
"EMPLOYEESTATE
varchar (15);",
"insert
into EmployeeTable values ('E1',
'Mark', '407-759-4562', '', 'Oviedo', 'FL');",
"insert
into EmployeeTable values ('E2',
'Bill', '352-123-7412', '', 'Eustis', 'FL');",
"insert
into EmployeeTable values ('E3',
'Greg', '111-148-7898', '', 'Paris', 'FR');",
"insert
into EmployeeTable values ('E4',
'Cindy', '352-897-4454', '', 'Tavares', 'FL');",
"insert
into EmployeeTable values ('E5',
'Rosa', '787-447-4545', '', 'San Juan', 'PR');",
};
public static void main(String[]
args)
{
String URL
= "jdbc:odbc:Employee";
String
username = "scott";
String password = "tiger";
try
{
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
}
catch
(Exception e)
{
System.out.println("JDBC/ODBC driver
failed to load.");
return;
}
Statement dbstatemnt = null;
Connection
dbconnect =null;
try
{
dbconnect = DriverManager.getConnection
(URL,username,password);
dbstatemnt= dbconnect.createStatement( );
}
catch
(Exception e)
{
System.err.println("Could not connect
to "+ URL);
}
try
{
// run the
SQL commands in order to create the table and insert data
for (int
i=0; i<SQL.length; i++)
{
dbstatemnt.execute(SQL[i]);
}
dbconnect.close();
}
catch
(Exception e)
{
System.err.println("SQL sent to
"+ URL + ": "+ e.getMessage( )
+ "has problems.");
}
}
}
//Report
the obtained data
import java.sql.*;
public class EmployeeReport
{
public static void main (String args[]) {
String URL = "jdbc:odbc:Employee";
String username = "scott";
String password = "tiger";
try
{
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver");
}
catch (Exception e)
{
System.out.println("JDBC/ODBC
driver failed to load.");
return;
}
Statement dbstatemnt = null;
Connection dbconnect=null;
try
{
dbconnect = DriverManager.getConnection
(URL,username,password);
dbstatemnt = dbconnect.createStatement( );
}
catch (Exception e)
{
System.err.println("Encountered problems connecting to
"+ URL);
}
try
{
ResultSet dbResltset = dbstatemnt.executeQuery(
"SELECT
EMPLOYEEID, EMPLOYEENAME, EMPLOYEESTATE FROM
EmployeeTable ORDER BY EMPLOYEEID asc;");
dbResltset.next(); // move to next row
int id = dbResltset.getInt("EMPLOYEEID");
String name =
dbResltset.getString("EMPLOYEENAME");
String state = dbResltset.getString("EMPLOYEESTATE
");
System.out.println(id + " is the id for employee,
"+ name +" from the state of "+ state + ".");
dbResltset =
dbstatemnt.executeQuery(
"SELECT EMPLOYEENAME FROM EmployeeTable;");
// for each row search for the following name
name = "Mark";
while(dbResltset.next( ) )
{
name += dbResltset.getString("EMPLOYEENAME ");
}
System.out.println("Employees named "+ name +
" .");
dbResltset = dbstatemnt.executeQuery("SELECT soda, day,
sales FROM SodaTable;");
while(result.next())
{
String city = dbResltset.getString("EMPLOYEECITY
");
name = result.getString("EMPLOYEENAME");
state = result.getString("EMPLOYEESTATE ");
System.out.println(name + " " + "works
in" + city + "," + state);
}
dbconnect.close();
}
catch (Exception e)
{
e.printStackTrace( );
}
}
}
In the previous sections, we have given you some general information about the SQL language and we have created a scenario where SQL could be implemented. We then went ahead and used SQL to create, modify, populate and delete tables that form a database. By doing so, we hope to have demonstrated the usefulness and simplicity of the SQL language. Although not explicitly demonstrated, we hope to have shown you how SQL transcends platforms. In other words, SQL is a non-platform specific language that can be utilized in all or most systems. In our case, we used the Oracle database system, but any other could have used just as well. A small example using the JDBC driver was implemented in order to provide an understanding of how SQL can be used in a network environment. On a more user-friendly level, SQL Plus was used to demonstrate how simple a connection to a server can be made by supplying the host string of the server were the database tables are located. Then it becomes a matter of running the SQL scripts as was done for the case of the local connection to a database.
http://www.codefixer.com/sql_tutorial/what_is_sql.asp
http://philip.greenspun.com/sql/
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/startsql/getstart_4fht.asp