Previous PageTop Of PageTable Of ContentsNext Page


8. Large Object Management

8.1 Managing BLOBs

8.2 Managing File Objects


8. Large Object Management

A Large Object (LO) is any variable length data object, such as document text, images, sound, or video. DBMaker has a great deal of flexibility when dealing with large objects and provides an excellent large object mechanism for unstructured data.

DBMaker does not limit the number of LOs that can be in one table, and there is no aggregate size limit on LO columns. This means the capacity of each LO column can be up to 2GB. DBMaker can use extensions to the SQL language to directly access Large Objects, eliminating the need for users to learn any special syntax. All access to LO columns is transparent in SQL statements, which makes using large objects easy to learn. Furthermore, users can input or output LO data to and from a file using SQL commands or the ODBC interface.

If the same LO data is contained in many tuples, DBMaker will store only a single LO and share it between the tuples instead of duplicating a copy for each tuple. This can decrease disk utilization dramatically. However, from the user's view there is always a dedicated LO item for each tuple. For example, if one user updates a shared LO in one tuple, the other tuples that share that LO are not influenced and users still see the unchanged LO data. The LO that was updated will be stored as a new LO in the database. All of this will be transparent to the users.

An LO is always written to disk as a single unit. However, users can read all or part of an LO. The SELECT, UPDATE, INSERT and DELETE statements are permitted with LOs. LO items can only be used in boolean expressions if users would like to test them for NULL values. DBMaker also provides the MATCH function for use with LOs to perform searches with pattern-matching. The MATCH function is similar to the LIKE function except it only works on LO columns and does not permit the use of wildcard characters.

DBMaker does not permit the operation of arithmetic or string expressions on LO items, nor can the LO items be used in any of the following ways:

With aggregate functions
With the IN, ANY, EXIST or LIKE predicates
With the GROUP BY clause
With the ORDER BY clause

There are two kinds of LOs: Binary Large OBjects (BLOBs), which are stored in database files, and File Objects (FOs), which are stored as external files on your host file system.

Figure 8-1: Large Objects supported by DBMaker

A BLOB, stored in database files, can only be accessed through DBMaker and insists on the data integrity provided by DBMaker, such as transaction controls, logging and recovery. A BLOB can only be shared between tuples in the same table. However, a FO can be shared between tables in a database. Also, when the data needs to be shared by the other non-database applications, using FOs will be more flexible.

8.1 Managing BLOBs

There are two types of BLOB items-LONG VARCHAR and LONG VARBINARY. Data of the LONG VARCHAR type stores any kind of text data such as memos, long text, HTML source files, or program source files. Data of the LONG VARBINARY type stores any kind of binary data such as images, sound, spreadsheets, program modules, etc.

A BLOB may be stored in a data file or a BLOB file, depending on its size. Although the format of a data file is fixed, you should customize the format of BLOB files in your database to obtain better performance and disk utilization.

The choice of BLOB logging is optional because it occupies a large amount of the journal space and can pull down performance. To save logging space and improve performance, you can turn off the BLOB journal. However, if you turn off BLOB logging, DBMaker does not ensure the BLOB contents are correct after you restore a database from a backup. If the BLOB journal is turned on, you should make sure the capacity of the journal space is large enough to accommodate the BLOB data.

Customizing BLOB Space

DBMaker can intelligently decide where the best location to store BLOB data is. If the size of a BLOB column is small and the total length of a tuple does not exceed the limitation for the maximum tuple size, DBMaker will put the BLOB data in a table column together with the other data in the database. This increases efficiency because the BLOB data is also fetched when DBMaker fetches a tuple, and an extra disk access to fetch the BLOB data from another file in not needed. However, when the total length of a data tuple exceeds the limitation of the maximum tuple size, DBMaker will store the BLOB data separately. In such a situation, getting the BLOB data (called an indirect BLOB) requires two disk operations, one to fetch the data tuple and one to fetch the BLOB data.

According to the size of a BLOB, an indirect BLOB may be stored in a DATA file or in a BLOB file in the same tablespace as the table. The data in an indirect BLOB column is stored in a data file when its size is equal to or less than 3950 bytes. Otherwise it is stored in a BLOB file.

Data files contain pages, and BLOB files contain frames. There are two major differences between pages and frames:

The size of a page is fixed at 4KB, but the size of a frame can be customized by a user.

A page can contain more than one tuple, but a frame can only contain a single BLOB.

You can customize the frame size of a BLOB file to increase performance and disk utilization. To customize the frame size, you need to specify the value in kilobytes of the DB_BFRSZ keyword in dmconfig.ini before creating a database. The default value of DB_BFRSZ is 16. To specify the BLOB frame size, add the following line to dmconfig.ini.

DB_BFRSZ = 10                       ; BLOB frame size = 10K bytes

The valid range of DB_BFRSZ is 8 to 256, except in the Microsoft Windows 3.1 environment, the frame size is fixed to 8KB and cannot be altered.

The frame size of all BLOB files in a database are the same. Once you create a database, the BLOB frame size cannot be changed from the initial setting. DBMaker will keep this value in the database system information page. When you restart the database the next time, DBMaker will get the value from the system information page and ignore the DB_BFRSZ keyword in dmconfig.ini.

Determining the frame size is a trade-off between disk utilization and performance. If you frequently retrieve an entire BLOB, adjusting the frame size to contain an entire BLOB will result in better performance because only one disk access is required. However, there may be large variations in the size of the BLOB data. If you make the frame size large enough to contain the largest BLOB, it will waste disk space as other frames which contain smaller BLOBs will contain unused disk space. On the contrary, if you adopt the policy of using frames only large enough to contain the smallest BLOBs, performance will be degraded when fetching larger BLOBS that are stored in multiple frames.

Each frame contains a header to record frame information. If the frame size is 8KB, for example, the space occupied by the BLOB will be less than 8192 bytes. In addition, the useful space in the first frame of a BLOB is much less than the size of a frame because there are about 1.8KB reserved to store information for each BLOB item, such as where other frames are. Thus, if the size of a BLOB is 8192 bytes, it will occupy 2 frames: the first 6.2KB of the BLOB are stored in the first frame and the remaining bytes of the BLOB are stored in the second frame.

Generating BLOBs

A BLOB column is the same as other columns except that it's type is LONG VARCHAR or LONG VARBINARY. For example, you can use the following SQL command to create two BLOB columns called note and photo:

dmSQL> CREATE TABLE employee (id INTEGER,
                            note LONG VARCHAR,
                           photo LONG VARBINARY);

You can input the BLOB data with constant values or from a file. For example, the contents of the file aa.txt is `1234567'. You can use the following SQL statement to insert BLOB data with data from the file aa.txt and the constant `1234567'.

dmSQL> INSERT INTO employee VALUES(1, 'aa.txt', '1234567');

You can insert BLOB data from the aa.txt file using the following ISQL statements. The contents of the note column is `1234567' and the contents of the photo column is `3456'.

dmSQL> INSERT INTO employee VALUES(2,?,?);
dmSQL/Val> &aa.txt, &aa.txt(2,4); dmSQL/Val> END;

The result of a LONG VARBINARY column is represented in hexadecimal format. When you browse the table, you will get results as follows:

dmSQL> SELECT * FROM employee;

 id           note              photo
=====      ===========     =================
   1       aa.txt          31323334353637
   2       1234567         33343536

DBMaker also supports fetching BLOB data to a user-specified file. For more information on how to insert and fetch BLOB data, refer to the DBATool User's Guide and the ODBC Programmer's Guide.

Updating BLOBs

A BLOB item is always written to disk as a whole. Thus when you update a BLOB column, DBMaker will drop the original BLOB item and then insert the new data as a new BLOB item. To update contents of a BLOB column, use the SQL UPDATE command:

dmSQL> UPDATE employee SET note = 'Hello !' WHERE id > 0;
dmSQL> SELECT * FROM employee;

 id           note               photo
=====      ===========     =================
   1       Hello !         31323334353637
   2       Hello !         33343536

From the user's viewpoint, there must be a BLOB for each tuple. However, to save disk space, DBMaker only creates a single copy of the BLOB data shared by all tuples with an ID greater than 0. DBMaker maintains an internal counter to record how many tuples reference this BLOB. When you update the BLOB column of a tuple which links to the shared BLOB, DBMaker will generate a new BLOB item for you and decrease the counter of the shared BLOB by one. This prevents any changes you make to the BLOB column from influencing other tuples. In DBMaker this is known as loose coupling. This makes disk utilization more efficient, but a BLOB item only can be shared by tuples which are in the same table. When a BLOB is not linked by any tuples, DBMaker will drop the BLOB automatically.

Predicate Operations on BLOB Columns

You can only use BLOB objects in Boolean expressions when testing for NULL values. For example, to fetch all data from the employee table where the note column is NOT NULL, you can use the following SQL statement:

dmSQL> SELECT * FROM employee WHERE note IS NOT NULL;

DBMaker provides pattern matching for BLOBs. The MATCH function is similar to the LIKE function except that wildcard characters are not supported. For example, if you want to find all employees whose note column contains the 'Database Administrator' phrase, you can use the following SQL commands:

dmSQL> SELECT * FROM employee WHERE note MATCH 'Database Administrator';

BLOB Journals

Because logging BLOB data requires large amounts of disk space and causes poor performance, it is optional and DBMaker does not log the content of BLOB data by default. During the period between starting and shutting down the database, DBMaker promises the consistency of BLOB data. Even in the event of a system crash, BLOB data in the database is consistent after recovery from the crash. However, when you restore a database from a full backup to a specific timestamp, DBMaker does not make ensure the correction of BLOB data. Thus, when you need to back up your database and you know you will need to restore it in the future, you should turn on the BLOB journal to ensure the consistency of your database.

To record BLOB data in a journal, set the value of the DB_BMODE keyword in the dmconfig.ini file to 2. Note that before you turn on the BLOB journal, you should check if the journal file is large enough. Otherwise, you will quickly get a journal full message. To turn on BLOB logging, add following line to the dmconfig.ini file.

DB_BMODE = 2    ;journal all data include BLOB

8.2 Managing File Objects

Each File Object column is stored as an external file. Using FOs is beneficial when the data is also used by other applications, since they can access the file directly. Most of the current multimedia tools can only process multimedia data when it is stored as a complete file of the required type. If the multimedia data is stored as a BLOB, you need to fetch it from DBMaker and redirect it to a file which can then be processed by the tools. However, when you store it as a FO you can get the file name from DBMaker and pass the name to the multimedia tool you are using.

The other advantage of using FOs is that DBMaker can link a column to an existing file directly. It does not need to duplicate data that already exists in a file, such as a file on a CD-ROM. This can result in a large saving in disk space.

DBMaker also generates unique file names automatically. You need not spend any additional time managing them.

There are two kinds of FOs: SYSTEM and USER. SYSTEM file objects are created when a user inputs column data and DBMaker stores it in an external file. This file is created by DBMaker and can be recognized by the .FOB file name extension. USER file objects are existing external files that are linked to a column in the database. The USER file object may be a file on any device.

The major difference between SYSTEM and USER file objects is a SYSTEM FO is generated by DBMaker automatically. This means a SYSTEM FO will be deleted when no column references it. Because they were created by a user, a USER FO will not be deleted when there are no more references to it. A file linked as a USER FO must open its read permission. Users can get the file name and file size of a FO by using the built-in functions FILENAME() and FILELEN().

Customizing The File Object Path

All SYSTEM FOs belonging to a database are put in the same directory. Although you can set up a FO directory shared by more than one databases to simplify FO management, DBMaker does not suggest you to do this because it becomes inconvenient when backing up a database.

Users can specify where to put SYSTEM FOs by setting the value of DB_FODIR in dmconfig.ini. The value of DB_FODIR is the full path of an existing directory. Furthermore, DBMaker must own the write-permission on that directory. For example, when a user wants all SYSTEM FOs created in the /disk1/usr/fo directory, he must specify the following statement in dmconfig.ini.

DB_FODIR = /disk1/usr/fo

USER FOs are existing files that are accessible from the database. They can be scattered in many directories on the server side. Instead of specifying a USER FO directory, users need to set the DB_USRFO keyword to 1 or TRUE in dmconfig.ini to enable the use of USER file objects. USER FOs are disabled by default.

DB_USRFO = 1          ; enable USER file objects

Generating File Objects

To create a FO column, set the column type to FILE when creating the table. The following SQL statement can be used to create a table named person with a file object column called photo:

dmSQL> CREATE TABLE person (name CHAR(10), photo FILE);

You can input FO data with the filename of a file on either the server side or the client side. In the first case, DBMaker will link the FO column to the existing file and generate a USER FO. In the second case, DBMaker will create a SYSTEM FO by copying the file on the client side to the FO directory on the server side. The following dmSQL statements show how to insert FO data:

dmSQL> INSERT INTO person VALUES (`cathy','/disk1/image/cathy.bmp')
       // stored as a USER FO
dmSQL> INSERT INTO person VALUES (`jeff',?);
dmSQL/Val> &jeff.gif; // stored as a SYSTEM FO
dmSQL/Val> END;

There are three varieties of fetching methods for a FO column: content, file name, and file size. The following SQL statement shows the usage of each:

dmSQL> SELECT photo, FILENAME(photo), FILELEN(photo) FROM person ;

  photo          filename(photo)           filelen(photo)
=========    =========================     ==============
012034451    /disk1/image/cathy.bmp                 21100
349045821    /disk1/usr/fo/ZZ000000.FOB             12034

For more information about manipulation on FO columns, refer to the DBATool User's Guide and the ODBC Programmer's Guide.

Updating File Objects

When you want to update the contents of a FO column, you can use the SQL UPDATE command. DBMaker will replace the FO column with a new file.

Similar to the method for inserting FOs, you can update a FO column to a new SYSTEM FO or link to a USER FO. The following SQL statement will link the photo column to an existing file, /disk2/image/common.bmp. Again, no file is created, and you must specify the full path for a USER FO.

dmSQL> UPDATE person SET photo = '/disk2/image/common.bmp'
                    WHERE name = `cathy';

Of course, you can input new data from a file on the client side. For more information, refer to the DBATool User's Guide and the ODBC Programmer's Guide.

If the results of the UPDATE operation contain more than one tuple, only one file will be created. This file will be shared among the tuples to save disk space. DBMaker will maintain an internal counter to record how many tuples reference that file. Also, if a user modifies the contents of the file through an external application program rather than DBMaker, all tuples will see the modification.

When no tuples retain links to a SYSTEM FO after UPDATE or DELETE operations, DBMaker will automatically delete the file after that transaction is committed. However, DBMaker never removes any USER FO even though no tuple references it, since the file was not generated by DBMaker.

Renaming File Objects

Sometimes users need to change the positions or names of FOs because of full disks or a reorganization of the disk layout. DBMaker permits users to use the MOVE FILE OBJECT statement to change the name or path of the FO. Before using the MOVE FILE OBJECT command, you need to move the files to the new location because DBMaker will make sure the new files exist before allowing the move. You can use the FILENAME() function to get the names of the files which will be moved.

dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/ZZ000000.FOB'
                     TO '/disk3/pub/photo1.bmp';

DBMaker also permits users to move FOs from one directory to another. Note that DBMaker permits using only one * character for the specified file name in the source directory, but does not allow using any * character in the destination directory. DBMaker does not support recursively moving files. If you want to move all the files, not including subdirectories, from one directory to another, you must specify the former directory by adding the `/' or `/*' characters at the end of the directory.

For example, if there are four files in /disk1/usr/fo named ABC1.FOB, ABC2.FOB, ABC3.FOB,and ABC4.FOB, and the following SQL commands are executed:

dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/ ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/* ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/*.FOB ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/A* ' TO '/disk3/pub/ ';

DBMaker will move ABC1.FOB, ABC2.FOB, ABC3.FOB, ABC4.FOB from /disk1/usr/fo to /disk3/pub. If the following SQL commands are executed:

dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/*1.FOB ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/A*1.FOB ' TO '/disk3/pub/ ';

DBMaker will only move ABC1.FOB from /disk1/usr/fo to /disk3/pub.

Predicate Operations on File Objects

Similar to BLOBs, you can test FOs for NULL values and use the MATCH function to perform pattern searches. Furthermore, you can use the FO item in arithmetic expressions with the FILELEN() built-in function and string expressions with the FILENAME() built-in function.

For example, if you want to select tuples in which .gif appears in the file name of the photo column, you can use the following SQL command.

dmSQL> SELECT * FROM person WHERE FILENAME(photo) LIKE '%.gif';

If you want to fetch tuples where the size of the photo column is greater than 100KB, you can use the following SQL command.

dmSQL> SELECT * FROM person WHERE FILELEN(photo) > 102400;

Journal Logging of File Objects

DBMaker does not support journal logging of FOs. When you back up the database, don't forget to back up all FOs belonging to the database, too. To back up FOs, you need to get the filenames of all FOs by querying the SYSFILEOBJ table.

dmSQL> SELECT FILE_NAME FROM SYSFILEOBJ;

Then copy all FOs to the backup storage. When you restore the database from a backup, copy all FOs, too. If the file paths or file names have changed, you can use the MOVE FILE OBJECT command to update the file names in the SYSFILEOBJ table.

Previous PageTop Of PageTable Of ContentsNext Page

Copyright 2002 SYSCOM Computer Engineering Co. All rights reserved.