|
  
3.
DBMS Architecture
3.1
Logical DBMS Architecture
3.2
Physical DBMS Architecture
3.
DBMS Architecture
Database
Management Systems are very complex, sophisticated software applications
that provide reliable management of large amounts of data. To better
understand general database concepts and the structure and capabilities
of a DBMS, it is useful to examine the architecture of a typical database
management system.
There
are two different ways to look at the architecture of a DBMS: the logical
DBMS architecture and the physical DBMS architecture. The
logical architecture deals with the way data is stored and presented
to users, while the physical architecture is concerned with the software
components that make up a DBMS.
3.1
Logical DBMS Architecture
The logical
architecture describes how data in the database is perceived by users.
It is not concerned with how the data is handled and processed by the
DBMS, but only with how it looks. Users are shielded from the way data
is stored on the underlying file system, and can manipulate the data
without worrying about where it is located or how it is actually stored.
This results in the database having different levels of abstraction.
The majority
of commercial Database Management Systems available today are based
on the ANSI/SPARC generalized DBMS architecture, as proposed by the
ANSI/SPARC Study Group on Data Base Management Systems.
The ANSI/SPARC
architecture divides the system into three levels of abstraction: the
internal or physical level, the conceptual level,
and the external or view level. The diagram below shows
the logical architecture for a typical DBMS.
Figure
3-1 Logical DBMS Architecture
The
Internal or Physical Level
The collection
of files permanently stored on secondary storage devices is known as
the physical database. The physical or internal level is the one closest
to physical storage, and it provides a low-level description of the
physical database, and an interface between the operating system's file
system and the record structures used in higher levels of abstraction.
It is at this level that record types and methods of storage are defined,
as well as how stored fields are represented, what physical sequence
the stored records are in, and what other physical structures exist.
The
Conceptual Level
The conceptual
level presents a logical view of the entire database as a unified whole,
which allows you to bring all the data in the database together and
see it in a consistent manner. The first stage in the design of a database
is to define the conceptual view, and a DBMS provides a data definition
language for this purpose.
It is
the conceptual level that allows a DBMS to provide data independence.
The data definition language used to create the conceptual level must
not specify any physical storage considerations that should be handled
by the physical level. It should not provide any storage or access details,
but should define the information content only.
The
External or View Level
The external
or view level provides a window on the conceptual view which allows
the user to see only the data of interest to them. The user can be either
an application program or an end user. Any number of external schema
can be defined and they can overlap each other.
The System
Administrator and the Database Administrator are special cases. Because
they have responsibilities for the design and maintenance for the design
and maintenance of the database, they at times need to be able to see
the entire database. The external and the conceptual view are functionally
equivalent for these two users.
Mappings
Between Levels
Obviously,
the three levels of abstraction in the database do not exist independently
of each other. There must be some correspondence, or mapping, between
the levels. There are actually two mappings: the conceptual/internal
mapping and the external/conceptual mapping.
The conceptual/internal
mapping lies between the conceptual and internal levels, and defines
the correspondence between the records and the fields of the conceptual
view and the files and data structures of the internal view. If the
structure of the stored database is changed, then the conceptual/ internal
mapping must also be changed accordingly so that the view from the conceptual
level remains constant. It is this mapping that provides physical data
independence for the database.
The external/conceptual
view lies between the external and conceptual levels, and defines the
correspondence between a particular external view and the conceptual
view. Although these two levels are similar, some elements found in
a particular external view may be different from the conceptual view.
For example, several fields can be combined into a single (virtual)
field, which can also have different names from the original fields.
If the structure of the database at the conceptual level is changed,
then the external/conceptual mapping must change accordingly so the
view from the external level remains constant. It is this mapping that
provides logical data independence for the database.
It is
also possible to have another mapping, where one external view is expressed
in terms of other external views (this could be called an external/external
mapping). This is useful if several external views are closely related
to one another, as it allows you to avoid mapping each of the similar
external views directly to the conceptual level.
3.2
Physical DBMS Architecture
The physical
architecture describes the software components used to enter and process
data, and how these software components are related and interconnected.
Although it is not possible to generalize the component structure of
a DBMS, it is possible to identify a number of key functions which are
common to most database management systems. The components that normally
implement these functions are shown in the diagram on the following
page, which depicts the physical architecture for a typical DBMS. At
its most basic level the physical DBMS architecture can be broken down
into two parts: the back end and the front end.
The back
end is responsible for managing the physical database and providing
the necessary support and mappings for the internal, conceptual, and
external levels described earlier. Other benefits of a DBMS, such as
security, integrity, and access control, are also the responsibility
of the back end.
The front
end is really just any application that runs on top of the DBMS. These
may be applications provided by the DBMS vendor, the user, or a third
party. The user interacts with the front end, and may not even be aware
that the back end exists.
Figure
3-2 Physical DBMS Architecture
Both the
back end and front end can be further broken down into the software
components that are common to most types of DBMS. These components are
examined in detail in the following sections.
Applications
and Utilities
Applications
and utilities are the main interface to the DBMS for most users. There
are three main sources of applications and utilities for a DBMS: the
vendor, the user, and third parties.
Vendor
applications and utilities are provided for working with or maintaining
the database, and usually allow users to create and manipulate a database
without the need to write custom applications. However, these are usually
general-purpose applications and are not the best tools to use for doing
specific, repetitive tasks.
User applications
are generally custom-made application programs written for a specific
purpose using a conventional programming language. This programming
language is coupled to the DBMS query language through the application
program interface (API). This allows the user to utilize the power
of the DBMS query language with the flexibility of a custom application.
Third
party applications may be similar to those provided by the vendor, but
with enhancements, or they may fill a perceived need that the vendor
hasn't created an application for. They can also be similar to user
applications, being written for a specific purpose they think a large
majority of users will need.
The most
common applications and utilities used with a database can be divided
into several well-defined categories. These are:
-
Command
Line Interfaces-these
are character-based, interactive interfaces that let you use the
full power and functionality of the DBMS query language directly.
They allow you to manipulate the database and perform ad-hoc queries
and see the results immediately. They are often the only method
of exploiting the full power of the database without creating programs
using a conventional programming language.
-
Graphical
User Interface (GUI) tools-these
are graphical, interactive interfaces that hide the complexity of
the DBMS and query language behind an intuitive, easy to understand,
and convenient interface. This allows casual users the ability to
access the database without having to learn the query language,
and it allows advanced users to quickly manage and manipulate the
database without the trouble of entering formal commands using the
query language. However, graphical interfaces usually do not provide
the same level of functionality as a command line interface because
it is not always possible to implement all commands or options using
a graphical interface.
-
Backup/Restore
Utilities-these are designed to minimize the effects of a database
failure and ensure a database is restored to a consistent state
if a failure does occur. Manual backup/restore utilities require
the user to initiate the backup, while automatic utilities will
back up the database at regular intervals without any intervention
from the user. Proper use of a backup/restore utility allows a DBMS
to recover from a system failure correctly and reliably.
-
Load/Unload
Utilities-these
allow the user to unload a database or parts of a database and reload
the data on the same machine, or on another machine in a different
location. This can be useful in several situations, such as for
creating backup copies of a database at a specific point in time,
or for loading data into a new version of the database or into a
completely different database. These load/unload utilities may also
be used for rearranging the data in the database to improve performance,
such as clustering data together in a particular way or reclaiming
space occupied by data that has become obsolete.
-
Reporting/Analysis
Utilities-these
are used to analyze and report on the data contained in the database.
This may include analyzing trends in data, computing values from
data, or displaying data that meets some specified criteria, and
then displaying or printing a report containing this information.
The
Application Program Interface
The application
program interface (API) is a library of low-level routines which operate
directly on the database engine. The API is usually used when creating
software applications with a general-purpose programming language such
as C++ or Visual Basic. This allows you to write custom software applications
to suit the needs of your business, without having to develop the storage
architecture as well. The storage of the data is handled by the database
engine, while the input and any special analysis or reporting functions
are handled by the custom application.
An API
is specific to each DBMS, and a program written using the API of one
DBMS cannot be used with another DBMS. This is because each API usually
has its own unique functions calls that are tied very tightly to the
operation of the database. Even if two databases have the same function,
they may use different parameters and function in different ways, depending
on how the database designer decided to implement the function in each
database. One exception to this is the Microsoft Open Database Connectivity
API, which is designed to work with any DBMS that supports it.
The
Query Language Processor
The query
language processor is responsible for receiving query language statements
and changing them from the English-like syntax of the query language
to a form the DBMS can understand. The query language processor usually
consists of two separate parts: the parser and the query optimizer.
The parser
receives query language statements from application programs or command-line
utilities and examines the syntax of the statements to ensure they are
correct. To do this, the parser breaks a statement down into basic units
of syntax and examines them to make sure each statement consists of
the proper component parts. If the statements follow the syntax rules,
the tokens are passed to the query optimizer.
The query
optimizer examines the query language statement, and tries to choose
the best and most efficient way of executing the query. To do this,
the query optimizer will generate several query plans in which operations
are performed in different orders, and then try estimate which plan
will execute most efficiently. When making this estimate, the query
optimizer may examine factors such as: CPU time, disk time, network
time, sorting methods, and scanning methods.
The
DBMS Engine
The DBMS
engine is the heart of the DBMS, and it is responsible for all of the
data management in the DBMS. The DBMS engine usually consists of two
separate parts: the transaction manager and the file manager.
The transaction
manager maintains tables of authorization and currency control information.
The DBMS may use authorization tables to allow the transaction manager
to ensure the user has permission to execute the query language statement
on the database. The authorization tables can only be modified by properly
authorized user commands, which are themselves checked against the authorization
tables. In addition, a database may also support concurrency control
tables to prevent conflicts when simultaneous, conflicting commands
are executed. The DBMS checks the concurrency control tables before
executing a query language statement to ensure that it is not locked
by another statement.
The file
manager is the component responsible for all physical input/output operations
on the database. It is concerned with the physical address of
the data on the disk, and is responsible for any interaction (reads
or writes) with the host operating system.
   
|