Previous PageTop Of PageTable Of Contents


16. Distributed Databases

16.1 Introduction to Distributed Databases
16.2 DBMaker Distribution Structure
16.3 Distributed Database Environment Setup

16.4 Access Distributed Database Objects

16.5 Distributed Transaction Control


16. Distributed Databases

This chapter introduces the Distributed Database management functions provided by DBMaker, including distributed databases, the distributed architecture of DBMaker, distributed data access, distributed database object management, and distributed transaction management.

16.1 Introduction to Distributed Databases

Traditional client-server database management systems, as shown in figure 16-1, locate the database in one specific computer in network, and this computer is responsible for handling all client requests.

Figure 16-1: Traditional client/server database management system.

Distributed databases, as shown in figure 16-2, locate a copy of the database in several computers in the network, and each can independently support database clients. The distributed database management system manages the databases in these computers, so you can access your data transparently, without any concern for where the data is physically located.

DBMaker 3.5 supports a true distributed architecture to provide a complete and robust distributed database management system (Distributed DBMS). It provides remote database connections, distributed queries, and distributed transaction management. DBMaker also provides table and database replication to keep data up-to-date automatically.

Figure 16-2: Distributed database in client-server

In the DBMaker distributed database environment, you can write application programs using the DBMaker ODBC 3.0 compatible API or perform ad-hoc SQL queries that access data from different parts of the distributed database. DBMaker will transparently integrate the data and return the results to you, just as if they all came from the local database.

In this chapter, we will briefly describe the system architecture and basic functions of distributed database management using DBMaker 3.5. This includes configuring the distributed environment, managing remote data links and distributed transactions, and performing distributed queries. Whether you are a database administrator or an application developer, this chapter will provide you with a thorough overview of the simplicity and power of the DBMaker distributed architecture.

16.2 DBMaker Distribution Structure

The DBMaker distributed database environment builds on the traditional client/server architecture, effectively linking multiple client applications and multiple database servers. Client applications process user requests and display the results, and the database servers handle data management. Each client has a direct connection to a single database server, which is known as the Coordinator Database to that client. Through the Coordinator Database, the client can connect to other remote databases, which are known as Participant Databases.

DBMaker uses a hierarchical distribution structure to connect to remote databases. This allows DBMaker to access data from a remote database with no direct connection to the Coordinator Database by going through one of the Participant Databases. When this happens, the Participant Database acts as coordinator for any child databases beneath it in the hierarchy, and is known as the Local Coordinator Database. The DBMaker distribution structure is illustrated in figure 16-3.

Figure 16-3: DBMaker distribution structure

In figure 16-3, the client application program connects to the database server in New York, which makes the database in New York the Coordinator Database. If you use the database in New York to access data from London and Hong Kong, then both the London and Hong Kong databases are Participant Databases. Some of the data you are looking for in Hong Kong might actually be in the databases in Tokyo or Taipei, so the databases in Tokyo and Taipei are Child Participant Databases. This makes the database in Hong Kong a Coordinator Database for the databases in Tokyo and Taipei, so the database in Hong Kong is not only a Participant Database, but also acts as a Local Coordinator Database.

16.3 Distributed Database Environment Setup

Setting up a distributed database environment using DBMaker 3.5 is very simple. All you need to do is add some keywords to the dmconfig.ini configuration file to set the distributed database configuration options.

You must provide values for the following keywords when setting up a distributed database environment in DBMaker 3.5. Keywords with the prefix DB_ are for the client/server connection between the client and the Coordinator Database, and keywords with the prefix DD_ are for the distributed database connections between the Coordinator Database and the Participant Databases.

DB_SvAdr = <ip_address/host name> - specifies the IP address or host name of the Coordinator Database. This is used so the client application knows where it should look for the Coordinator Database to connect.

DB_PtNum = <port number> - specifies the port number that the client application and the Coordinator Database should use to communicate.

DD_DDBMD = <0/1> - enables distributed database mode for the Coordinator Database. The default value is 0, which means that distributed database mode is disabled.

DD_CTIMO = <number of seconds> - specifies the time in seconds that the Coordinator Database should wait when trying to establish a connection to a Participant Database. The default value is 5 seconds.

DD_LTIMO = <number of seconds> - specifies the time in seconds that the Coordinator Database should wait when trying to establish a lock on the requested data in a Participant Database. The default value is 5 seconds.

DD_GTSVR = <0/1> - enables the global transaction recovery daemon (GTRECO). The default value is 1, which means the global transaction recovery daemon is enable.

DD_GTITV = <YYYY/MM/DD hh:mm:ss> - specifies the time interval that the global transaction recovery daemon (GTRECO) should wait when processing pending global transactions.

DBMaker supports an automatic recovery mechanism for distributed transactions that have failed due to network problems or errors at the Participant Database. The automatic recovery mechanism is handled by the GTRECO daemon, which will check whether a distributed database server has any problems with pending global transactions. If any problems are detected, the GTRECO daemon will attempt to recover the pending global transactions. The GTRECO daemon is enabled using the DD_GTSVR keyword in the dmconfig.ini configuration file.

To better understand how DBMaker manages distributed databases, take an example where the ABC Bank has two branch offices. One branch office is in Los Angeles, and the other one is in Seattle. Each branch maintains their own customer and business data, but the government and financial database is under the central control of the Los Angeles branch.

LA branch database server dmconfig.ini configuration file:

[BankTranx]                     ;LA branch business database
DB_DbDir = c:\database
DB_SvAdr = 192.168.0.1
DB_PtNum = 21000
DD_DDBMD = 1
[BankMIS] ;government and financial database
DB_DbDir = c:\database
DB_SvAdr = 192.168.0.1
DB_PtNum = 30000
DD_DDBMD = 1
[BankTranx@Seattle] ;Seattle branch business database
DB_SvAdr = 192.168.0.2
DB_PtNum = 21000
DD_CTIMO = 20
DD_LTIMO = 10

Seattle branch database server dmconfig.ini configuration file:

[BankTranx]                     ;Seattle branch business database
DB_DbDir = c:\database
DB_SvAdr = 192.168.0.2
DB_PtNum = 21000
DD_DDBMD = 1
[BankMIS] ;government and financial database
DB_SvAdr = 192.168.0.1
DB_PtNum = 30000
DD_CTIMO = 20
[BankTranx@La] ;LA branch business database
DB_SvAdr = 192.168.0.1
DB_PtNum = 21000
DD_CTIMO = 20
DD_LTIMO = 10

LA branch client application dmconfig.ini configuration file:

[BankTranx]                     ;LA branch business database
DB_SvAdr = 192.168.0.1
DB_PtNum = 21000

Seattle branch client application dmconfig.ini configuration file:

[BankTranx]                     ;Seattle branch business database
DB_SvAdr = 192.168.0.2
DB_PtNum = 21000

In the configuration files shown above, you must set DD_DDBMD=1 in the configuration section for the local database to enable distributed database support. In this example, you would put this keyword in the BankTranx configuration section of both the Los Angeles and Seattle dmconfig.ini configuration files.

In addition, you must include a database configuration section for the Participant Database in the Coordinator Database configuration file, and for the Coordinator Database in the Participant Database configuration file. In this case, both the Los Angeles branch database and the Seattle branch database use the same database name. If you use the remote database name for the name of the database configuration section, it will cause a conflict with the local database name in the dmconfig.ini configuration file.

To avoid this type of problem when using distributed databases, DBMaker allows you to distinguish the remote database name from the local database name by attaching a server host description to the remote database name in the local dmconfig.ini configuration file. In this case, the remote database name would look like this:

database_name@server_host_description

The server host description can be any identifying name, such as the IP address or host name of the database server, the domain name, or almost any other descriptive text. In this example, the Los Angeles branch client application would use BankTranx@Seattle when it wants to access data in the Seattle branch database, and the Seattle branch client application would use BankTranx@La when it wants to access data in the Los Angeles branch database.

You must also set up the server address and port name for both the local database and the remote database in their respective configuration sections in the configuration files at both the Los Angeles and Seattle branches. In this example, the Los Angeles branch configuration file would contain the local server address in the BankTranx configuration section, and would contain the Seattle branch server address in the BankTranx@Seattle configuration section. Similarly, the Seattle branch configuration file would contain the Los Angeles branch server address in the BankTranx configuration section, and would contain the Seattle branch server address in the BankTranx@La configuration section.

You should also set the DD_CTIMO and DD_LTIMO remote connection parameters. These parameters go in the configuration section for the Participant Database in the Coordinator Database configuration file, and for the Coordinator Database in the Participant Database configuration file.

Every database server in the network can operate on distributed database objects. You can access any of these database servers through your Coordinator Database, in a manner similar to the normal client/server architecture. The SQL commands that reference a remote database are be passed to remote database server through the Coordinator Database. The Coordinator database will decompose this SQL command into the local and remote portions, and send the appropriate commands to the remote database server. The Coordinator Database will wait for the remote database to return its results, and then merge all local and remote data and return the combined results to you.

16.4 Access Distributed Database Objects

DBMaker provides several different methods for you to access a Participant Database:

Specify the Participant Database name directly.
Using database links defined in the Coordinator Database.
Through remote object mapping such as views or synonyms.

The difference between the first two approaches is that database links contain security information in addition to the remote database name. This allows you to specify the user name and password that you want to use in the database link when you access the remote database.

There is no obvious difference between the statements of a distributed query and a normal query except in the way database objects are specified. However, when using a remote database, the only remote database objects you can access are tables, views, or synonyms. To access a remote database object, you have to provide the remote database name or database link when you specify the name of the database object. This provides two ways to identify a remote database object:

remote_database_name:object_owner.object_name
database_link:object_owner.object_name

The following are examples of how to specify a remote database object in a query.

dmSQL> SELECT * FROM Bank:EmpTaple;
dmSQL> DELETE FROM Bank:EmpTable WHERE id = 101;
dmSQL> INSERT INTO Link1:mis.account VALUES (2003,'Kevin Liu','2327-0021');

The following example shows you how to access remote database objects in two different Participant Databases in the same query.

dmSQL> SELECT * FROM ABCBank@La:account a,
ABCBankMIS@Seattle:account b
WHERE a.name = b.name;

Connecting to Remote Databases Using Database Names

You can connect to remote databases with the database name of the Coordinator Database Server. To apply this way, you have to know the remote database name which is defined in the dmconfig.ini file in the Coordinator Database Server. For example, a client application in the Los Angeles branch of the ABC bank can access the database which is located in the Taipei branch of the ABC bank with the following SQL command.

dmSQL> CONNECT TO BankTranx SYSADM aa;
dmSQL> SELECT * FROM BankTranx@Taipei:SYSADM.Account ORDER BY AccID;

In this example, it appears like you are connecting to the Taipei branch with the user name SYSADM and the password aa. In reality you are connecting to the Coordinator Database, which is the Los Angeles branch database. The Coordinator Database then connects to the remote database with the account and password you used when connecting to the Coordinator Database.

More advanced SQL requests, such as joins, can also access remote database objects in same way.

dmSQL> SELECT * FROM BankMIS:SYSADM.Personnel ORDER BY PID;
dmSQL> SELECT Personnel.* FROM BankTranx@Taipei:Account A,
BankMIS:Personnel B
WHERE A.CustID = B.CustID;

Connecting to Remote Databases Using Database Links

A database link creates a connection to a remote database, and contains the login information and password necessary for connecting to the remote database. This allows you to connect to a remote database with a different user name than you are using in the Coordinator Database, or to connect to a remote database that you do not have an account in using a public link. It also makes data in a distributed database environment location transparent. The link definition, which also contains the login information and password, is stored in the Coordinator Database.

Creating Database Links

Below is the SQL syntax for creating a database link.

Only database administrators can create public database links that can be used by all users of that database. Any user can create private database links that only they can use. Different users may create private database links with same name. If you create a private database link whose name is identical to a public database link, the private database link will override the public database link.

If you do not explicitly specify the type of link you want to create, DBMaker will create your link as a private link by default. If you do not specify the login account and password in the IDENTIFY BY clause, your current login name and password will be used by default.

Accessing Remote Database Objects with Database Links

The following is an example to show how to access remote database objects using database links. In this example the SYSADM connects to the database and creates a public link named Bank_Seattle that connects to the Seattle branch database using the SYSADM account. The SYSADM updates some values and disconnects. Then user1 connects and performs a query on the Account table.

dmSQL> CONNECT TO BankTranx SYSADM;
dmSQL> CREATE PUBLIC DATABASE LINK Bank_Seattle
2> CONNECT TO BankTranx@Seattle IDENTIFIED BY SYSADM;
dmSQL> UPDATE Bank_Seattle:Account SET balance = balance + 100
2> WHERE id = 1001;
dmSQL> DISCONNECT;
dmSQL> CONNECT TO BankTranx user1 pwd1;
dmSQL> SELECT * FROM Bank_Seattle:Account;

The following example is similar to the example above, but the SYSADM does not specify the account to use when connecting to the public link. This means that when user1 uses the public link to connect to the Bank_Seattle database, there must be an account for user1 in the remote database, and the user1 account must have the authority to query the SYSADM.Account table. Otherwise this example will cause an error to occur.

dmSQL> CONNECT TO BankTranx SYSADM;
dmSQL> CREATE PUBLIC DATABASE LINK Bank_Seattle 2> CONNECT TO BankTranx@Seattle;
dmSQL> SELECT * FROM Bank_Seattle:Account;
dmSQL> DISCONNECT;
dmSQL> CONNECT TO BankTranx user1 pwd1;
dmSQL> SELECT * FROM Bank_Seattle:SYSADM.Account;

If a database link name is the same as the remote database name, DBMaker will use the database link name in preference to the remote database name. If you want to access the remote database directly, you must explicitly specify the remote database name in the form dbname@"" to force DBMaker to access the remote database directly instead of through database link.

The following example shows two different ways to access a remote databases, one through a database link and the other by explicitly specifying the remote database name in the form dbname@"".

dmSQL> CONNECT TO BankTranx SYSADM;
dmSQL> CREATE PUBLIC DATABASE LINK BankMIS CONNECT TO BankMIS
2> IDENTIFIED BY SYSADM;
dmSQL> DISCONNECT;
dmSQL> CONNECT TO BankTranx user1 pwd1;
dmSQL> SELECT * FROM BankMIS:Personnel; //database link
dmSQL> SELECT * FROM BankMIS@"":SYSADM.Personnel; //remote db name

Deleting Database Links

Below is the SQL syntax for deleting a database link.

Only database administrators can delete public database links, and only the owner of a private database link can delete that link. If you try to delete a database link when both a public and a private database link with the same name exist, DBMaker will assume you want to delete the private database link if you do not specify whether the link you are deleting is a public or private database link.

The following example deletes a public database link named BankMIS.

dmSQL> DROP PUBLIC DATABASE LINK BankMIS;

Accessing Remote Database Objects Using Database Object Mapping

Database Object Mapping provides better location transparency in a distributed database environment. There is no difference between the way users access remote database objects with Database Object Mapping and the way they access local database objects. This type of Database Object Mapping includes using views and synonyms.

Accessing Remote Database Objects Using Synonyms

Using a synonym to define a remote database object means that you give the remote database object an alias name. The privileges you have in the remote database depend on the privileges of the currently connected user, and not the user who originally defined the synonym.

The following example shows the what happens when you access a remote database object using synonyms. The comments indicate the equivalent SQL expression, the database being connected to, and the account used to connect.

dmSQL> CONNECT TO BankTranx user1;
dmSQL> CREATE DATABASE LINK LK1 CONNECT TO BankMIS IDENTIFIED BY user2;
dmSQL> CREATE SYNONYM s1 FOR BankTranx:Account;
dmSQL> CREATE SYNONYM s2 FOR LK1:user2.Personnel;
dmSQL> SELECT * FROM s1;
// SELECT * FROM BankTranx:user1.Account; (BankTranx, user1)
dmSQL> SELECT * FROM s2;
// SELECT * FROM LK1:user2.Personnel; (BankMIS, user2)
dmSQL> DISCONNECT;
dmSQL> CONNECT TO BankTranx user3;
dmSQL> CREATE DATABASE LINK LK1 CONNECT TO BankMIS IDENTIFIED BY user4;
dmSQL> SELECT * FROM s1;
// SELECT * FROM BankTranx:user3.Account; (BankTranx, user3)
dmSQL> SELECT * FROM s2;
// SELECT * FROM LK1:user2.Personnel; (BankMIS, user4)

Accessing Remote Database Objects Using Views

Using a view to define a remote database object is a bit different than using a synonym. The view is not just an alias, but includes the database name, user account, password, object owner, and object name as part of the view definition. The privileges you have in the remote database depend on the privileges of the user whose account was used when the view was created, and not the currently connected user.

The following example shows the what happens when you access a remote database object using views. The comments indicate the equivalent SQL expression, the database being connected to, and the account used to connect.

dmSQL> CONNECT TO BankTranx user1;
dmSQL> CREATE DATABASE LINK LK1 CONNECT TO BankMIS IDENTIFIED BY user2;
dmSQL> CREATE VIEW v1 FOR BankTranx:Account;
dmSQL> CREATE VIEW v2 FOR LK1:user3.Personnel;
dmSQL> SELECT * FROM v1;
// SELECT * FROM BankTranx:user1.Account; (BankTranx, user1)
dmSQL> SELECT * FROM v2;
// SELECT * FROM BankMIS:user3.Personnel; (BankMIS, user2)
dmSQL> DISCONNECT;
dmSQL> CONNECT TO BankTranx user3;
dmSQL> CREATE DATABASE LINK LK1 CONNECT TO BankMIS IDENTIFIED BY user4;
dmSQL> SELECT * FROM v1;
// SELECT * FROM BankTranx:user1.Account; (BankTranx, user1)
dmSQL> SELECT * FROM v2;
// SELECT * FROM LK1:user3.Personnel; (BankMIS, user2)

Closing Database Links

Once you access a remote database with an SQL command, the Coordinator Database will build a remote connection to the Participant Database. The remote connection will remain open until all users disconnect from the Coordinate Database or until the link is explicitly closed with the CLOSE DATABASE LINK command. Since DBMaker provides at most eight remote connections for each database, it is a good idea to close remote connections that are no longer required to free these connections.

Below is the SQL syntax for closing a database link.

The following example closes a database link using the remote database name BankMIS.

dmSQL> close database link BankMIS;

The following example closes a database link using the remote database link BankLink1.

dmSQL> close database link BankLink1;

Actually, when you issue a CLOSE DATABASE LINK command, DBMaker will decrease the remote connection counter by one. When the counter reaches zero, the connection is really closed and the occupied resources freed. Otherwise the connection remains open.

The following example closes all database links and frees the connection and any occupied resources.

dmSQL> CLOSE DATABASE LINK ALL

The following command closes all remote connections that are no longer being used in the current transaction.

dmSQL> CLOSE DATABASE LINK NONACTIVE

Database Link System Catalog Tables

There are two system catalogs tables related to database links: SYSDBLINK and SYSOPENLINK. SYSDBLINK records all database link names and their definitions, while SYSOPENLINK records open connections between databases.

16.5 Distributed Transaction Control

DBMaker supports a distributed transaction mechanism that is transparent to users. You do not need to become worry about Participant Databases committing their part of a distributed transaction, since this is handled automatically by DBMaker. As far as you are concerned, all transactions behave the same way as in the traditional client/server architecture.

The following example shows how distributed transaction control works.

dmSQL> CONNECT TO BankTranx user1;               // ABC Bank in Taipei
dmSQL> SET AUTOCOMMIT OFF;
dmSQL> UPDATE BankTranx:Customer SET money=money-1000 where id=123;
dmSQL> UPDATE BankTranx@"Bank_in_Seattle":Customer SET money=money+1000
2> WHERE id=123;
dmSQL> COMMIT;

Since all database operations from the client application are handled by the Coordinator Database, the Coordinator Database knows the scope of the instructions via Distributed System Catalog Manager. Transactions belonging to the local site will handled by the Coordinator Database in the same manner as a regular client/server transaction. Transactions belonging to remote sites need to reference the appropriate remote database. The Coordinator Database will exchange information with every Participant Database and coordinate the whole transaction until it is either rolled back or committed.

Two-Phase Commit

Database management systems need to maintain data integrity, and this requires that all transactions are atomic. All operations in the transaction must commit or roll back together. In the traditional client/server architecture, a journal is used to make sure the changes are either rolled back or committed.

In the distributed database architecture, two-phase commit protocol with presumed abort is used as the mechanism for controlling distributed transactions that span multiple database servers. A transaction that modifies data on two or more databases must complete the two-phase commit protocol before it is committed. The two-phase commit mechanism guarantees that all sites commit or roll back globally. It also protects data manipulation operations performed by remote synonyms, integrity constraints, and triggers. To commit a transaction, you have to ensure every sub-transaction has finished, otherwise the transaction will be aborted. For the same reason, if any sub-transaction cannot commit, the other sub-transactions must be aborted as well.

Distributed Transaction Recovery

DBMaker uses the two-phase commit protocol to inform all Participant Databases to commit a global transaction. Before entering the commit phase, the Coordinator Database will check the status of the Participant Databases to ensure there are no server or network problems. If the Coordinator Database finds a problem with any of the Participant Databases, it informs the other Participant Databases to roll back their part of the transaction, and returns an error indicating the global transaction has failed. If the two-phase commit protocol has finished the preparation phase, but there is a server or network problem with a Participant Database, the global transaction is regarded as a success. DBMaker records which Participant Database server cannot commit its part of the transaction in the SYSGLBTRANX system catalog table, and also records which database contains a pending transaction in the SYSPENDTRANX system catalog table for the crashed database.

DBMaker also provides an automatic recovery mechanism to handle network or site failure during the execution of a distributed transaction. In the Coordinator Database you start the global transaction recovery daemon (GTRECO). This daemon scans the SYSGLBTRANX system catalog table and periodically recovers any pending global transactions. Then it tries to connect to the crashed Participant Database and inform it to commit or roll back its part of the global transaction.

Heuristic End Global Transaction

After a network or site failure occurs during the two-phase commit, pending transactions continue to hold some resources such as locks or journal blocks. The pending transaction will occupy these resources until the problem is solved by the global recovery daemon (GTRECO). If the network or site failure cannot be solved immediately, then some of the users in the Participant site may be blocked by the held resources. To solve this problem, DBMaker supports heuristic end global transaction. Heuristic end transaction is an independent action taken by the database administrator to force a pending transaction in a Participant Database to commit or roll back. The database administrator can use DBA Tool to solve this problem.

To solve a pending transaction manually, you can do the following:

1. In the Participant Database, browse the SYSPENDTRANX table to find out whether there are transactions that have been pending for a long time.

2. In the Coordinator Database, determine the commitment status of the pending transaction. For example, if there are two pending transactions "DB_1-3376aafd" and "DB_2-3376aafd:DB_3-3376ab0f#1", you should ask the administrator of DB_1 to determine the status of "DB_1-3376aafd" and ask the administrator of DB_3 to determine the status of "DB_2-3376aafd:DB_3-3376ab0f#1".

3. In the Coordinator Database, when the administrator receives the transaction status query he can examine SYSGLBTRANX to determine the transaction status. If the STATE is 2 (COMMIT) or 3 (PENDCOM), he may reply `commit'. If the STATE is 4 (PENDABO), he may reply `abort'. But if the STATE is 1 (PREPARE), this transaction branch is pending in this site too, and he may ask the administrator of the parent site to determine its status.

4. In Participant Database, the administrator uses Server Manager to perform a heuristic commit or abort the pending transaction based on the reply.

If the administrator initiates a heuristic end transaction on a pending transaction which is different from action taken in the Coordinator Database, the distributed data will be inconsistent.

Previous PageTop Of PageTable Of Contents

Copyright 2002 SYSCOM Computer Engineering Co. All rights reserved.