|
   
8.
Large Object Management
8.1
Managing BLOBs
8.2
Managing File Objects
8. Large Object Management
A Large
Object (LO) is any variable length data object, such as document text,
images, sound, or video. DBMaker has a great deal of flexibility when
dealing with large objects and provides an excellent large object mechanism
for unstructured data.
DBMaker
does not limit the number of LOs that can be in one table, and there
is no aggregate size limit on LO columns. This means the capacity of
each LO column can be up to 2GB. DBMaker can use extensions to the SQL
language to directly access Large Objects, eliminating the need for
users to learn any special syntax. All access to LO columns is transparent
in SQL statements, which makes using large objects easy to learn. Furthermore,
users can input or output LO data to and from a file using SQL commands
or the ODBC interface.
If the
same LO data is contained in many tuples, DBMaker will store only a
single LO and share it between the tuples instead of duplicating a copy
for each tuple. This can decrease disk utilization dramatically. However,
from the user's view there is always a dedicated LO item for each tuple.
For example, if one user updates a shared LO in one tuple, the other
tuples that share that LO are not influenced and users still see the
unchanged LO data. The LO that was updated will be stored as a new LO
in the database. All of this will be transparent to the users.
An LO is
always written to disk as a single unit. However, users can read all
or part of an LO. The SELECT, UPDATE, INSERT and DELETE statements are
permitted with LOs. LO items can only be used in boolean expressions
if users would like to test them for NULL values. DBMaker also provides
the MATCH function for use with LOs to perform searches with pattern-matching.
The MATCH function is similar to the LIKE function except it only works
on LO columns and does not permit the use of wildcard characters.
DBMaker
does not permit the operation of arithmetic or string expressions on
LO items, nor can the LO items be used in any of the following ways:
With
aggregate functions
With
the IN, ANY, EXIST or LIKE predicates
With
the GROUP BY clause
With the ORDER BY clause
There are
two kinds of LOs: Binary Large OBjects (BLOBs), which are stored
in database files, and File Objects (FOs), which are stored as
external files on your host file system.

Figure
8-1: Large Objects supported by DBMaker
A BLOB,
stored in database files, can only be accessed through DBMaker and insists
on the data integrity provided by DBMaker, such as transaction controls,
logging and recovery. A BLOB can only be shared between tuples in the
same table. However, a FO can be shared between tables in a database.
Also, when the data needs to be shared by the other non-database applications,
using FOs will be more flexible.
8.1 Managing BLOBs
There are
two types of BLOB items-LONG VARCHAR and LONG VARBINARY. Data of the
LONG VARCHAR type stores any kind of text data such as memos, long text,
HTML source files, or program source files. Data of the LONG VARBINARY
type stores any kind of binary data such as images, sound, spreadsheets,
program modules, etc.
A BLOB
may be stored in a data file or a BLOB file, depending on its size.
Although the format of a data file is fixed, you should customize the
format of BLOB files in your database to obtain better performance and
disk utilization.
The choice
of BLOB logging is optional because it occupies a large amount of the
journal space and can pull down performance. To save logging space and
improve performance, you can turn off the BLOB journal. However, if
you turn off BLOB logging, DBMaker does not ensure the BLOB contents
are correct after you restore a database from a backup. If the BLOB
journal is turned on, you should make sure the capacity of the journal
space is large enough to accommodate the BLOB data.
Customizing BLOB Space
DBMaker
can intelligently decide where the best location to store BLOB data
is. If the size of a BLOB column is small and the total length of a
tuple does not exceed the limitation for the maximum tuple size, DBMaker
will put the BLOB data in a table column together with the other data
in the database. This increases efficiency because the BLOB data is
also fetched when DBMaker fetches a tuple, and an extra disk access
to fetch the BLOB data from another file in not needed. However, when
the total length of a data tuple exceeds the limitation of the maximum
tuple size, DBMaker will store the BLOB data separately. In such a situation,
getting the BLOB data (called an indirect BLOB) requires two disk operations,
one to fetch the data tuple and one to fetch the BLOB data.
According
to the size of a BLOB, an indirect BLOB may be stored in a DATA file
or in a BLOB file in the same tablespace as the table. The data in an
indirect BLOB column is stored in a data file when its size is equal
to or less than 3950 bytes. Otherwise it is stored in a BLOB file.
Data files
contain pages, and BLOB files contain frames. There are two major differences
between pages and frames:
The size of a page is fixed at 4KB, but the size of a frame can be customized
by a user.
A page can contain more than one tuple, but a frame can only contain
a single BLOB.
You can
customize the frame size of a BLOB file to increase performance and
disk utilization. To customize the frame size, you need to specify the
value in kilobytes of the DB_BFRSZ keyword in dmconfig.ini before creating
a database. The default value of DB_BFRSZ is 16. To specify the BLOB
frame size, add the following line to dmconfig.ini.
DB_BFRSZ = 10 ; BLOB frame size = 10K bytes
|
The valid
range of DB_BFRSZ is 8 to 256, except in the Microsoft Windows 3.1 environment,
the frame size is fixed to 8KB and cannot be altered.
The frame
size of all BLOB files in a database are the same. Once you create a
database, the BLOB frame size cannot be changed from the initial setting.
DBMaker will keep this value in the database system information page.
When you restart the database the next time, DBMaker will get the value
from the system information page and ignore the DB_BFRSZ keyword in
dmconfig.ini.
Determining
the frame size is a trade-off between disk utilization and performance.
If you frequently retrieve an entire BLOB, adjusting the frame size
to contain an entire BLOB will result in better performance because
only one disk access is required. However, there may be large variations
in the size of the BLOB data. If you make the frame size large enough
to contain the largest BLOB, it will waste disk space as other frames
which contain smaller BLOBs will contain unused disk space. On the contrary,
if you adopt the policy of using frames only large enough to contain
the smallest BLOBs, performance will be degraded when fetching larger
BLOBS that are stored in multiple frames.
Each frame
contains a header to record frame information. If the frame size is
8KB, for example, the space occupied by the BLOB will be less than 8192
bytes. In addition, the useful space in the first frame of a BLOB is
much less than the size of a frame because there are about 1.8KB reserved
to store information for each BLOB item, such as where other frames
are. Thus, if the size of a BLOB is 8192 bytes, it will occupy 2 frames:
the first 6.2KB of the BLOB are stored in the first frame and the remaining
bytes of the BLOB are stored in the second frame.
Generating BLOBs
A BLOB
column is the same as other columns except that it's type is LONG VARCHAR
or LONG VARBINARY. For example, you can use the following SQL command
to create two BLOB columns called note and photo:
dmSQL> CREATE TABLE employee (id INTEGER,
note LONG VARCHAR,
photo LONG VARBINARY);
|
You can
input the BLOB data with constant values or from a file. For example,
the contents of the file aa.txt is `1234567'. You can use the
following SQL statement to insert BLOB data with data from the file
aa.txt and the constant `1234567'.
dmSQL> INSERT INTO employee VALUES(1, 'aa.txt', '1234567');
|
You can
insert BLOB data from the aa.txt file using the following ISQL
statements. The contents of the note column is `1234567' and
the contents of the photo column is `3456'.
dmSQL> INSERT INTO employee VALUES(2,?,?); dmSQL/Val> &aa.txt, &aa.txt(2,4);
dmSQL/Val> END;
|
The result
of a LONG VARBINARY column is represented in hexadecimal format. When
you browse the table, you will get results as follows:
dmSQL> SELECT * FROM employee;
id note photo
===== =========== =================
1 aa.txt 31323334353637
2 1234567 33343536
|
DBMaker
also supports fetching BLOB data to a user-specified file. For more
information on how to insert and fetch BLOB data, refer to the DBATool
User's Guide and the ODBC Programmer's Guide.
Updating BLOBs
A BLOB
item is always written to disk as a whole. Thus when you update a BLOB
column, DBMaker will drop the original BLOB item and then insert the
new data as a new BLOB item. To update contents of a BLOB column, use
the SQL UPDATE command:
dmSQL> UPDATE employee SET note = 'Hello !' WHERE id > 0;
dmSQL> SELECT * FROM employee;
id note photo
===== =========== =================
1 Hello ! 31323334353637
2 Hello ! 33343536
|
From the
user's viewpoint, there must be a BLOB for each tuple. However, to save
disk space, DBMaker only creates a single copy of the BLOB data shared
by all tuples with an ID greater than 0. DBMaker maintains an internal
counter to record how many tuples reference this BLOB. When you update
the BLOB column of a tuple which links to the shared BLOB, DBMaker will
generate a new BLOB item for you and decrease the counter of the shared
BLOB by one. This prevents any changes you make to the BLOB column from
influencing other tuples. In DBMaker this is known as loose coupling.
This makes disk utilization more efficient, but a BLOB item only can
be shared by tuples which are in the same table. When a BLOB is not
linked by any tuples, DBMaker will drop the BLOB automatically.
Predicate Operations on BLOB Columns
You can
only use BLOB objects in Boolean expressions when testing for NULL values.
For example, to fetch all data from the employee table where
the note column is NOT NULL, you can use the following SQL statement:
dmSQL> SELECT * FROM employee WHERE note IS NOT NULL;
|
DBMaker
provides pattern matching for BLOBs. The MATCH function is similar to
the LIKE function except that wildcard characters are not supported.
For example, if you want to find all employees whose note column
contains the 'Database Administrator' phrase, you can use the following
SQL commands:
dmSQL> SELECT * FROM employee WHERE note MATCH 'Database Administrator';
|
BLOB Journals
Because
logging BLOB data requires large amounts of disk space and causes poor
performance, it is optional and DBMaker does not log the content of
BLOB data by default. During the period between starting and shutting
down the database, DBMaker promises the consistency of BLOB data. Even
in the event of a system crash, BLOB data in the database is consistent
after recovery from the crash. However, when you restore a database
from a full backup to a specific timestamp, DBMaker does not make ensure
the correction of BLOB data. Thus, when you need to back up your database
and you know you will need to restore it in the future, you should turn
on the BLOB journal to ensure the consistency of your database.
To record
BLOB data in a journal, set the value of the DB_BMODE keyword in the
dmconfig.ini file to 2. Note that before you turn on the BLOB journal,
you should check if the journal file is large enough. Otherwise, you
will quickly get a journal full message. To turn on BLOB logging, add
following line to the dmconfig.ini file.
DB_BMODE = 2 ;journal all data include BLOB
|
8.2 Managing File Objects
Each File
Object column is stored as an external file. Using FOs is beneficial
when the data is also used by other applications, since they can access
the file directly. Most of the current multimedia tools can only process
multimedia data when it is stored as a complete file of the required
type. If the multimedia data is stored as a BLOB, you need to fetch
it from DBMaker and redirect it to a file which can then be processed
by the tools. However, when you store it as a FO you can get the file
name from DBMaker and pass the name to the multimedia tool you are using.
The other
advantage of using FOs is that DBMaker can link a column to an existing
file directly. It does not need to duplicate data that already exists
in a file, such as a file on a CD-ROM. This can result in a large saving
in disk space.
DBMaker
also generates unique file names automatically. You need not spend any
additional time managing them.
There are
two kinds of FOs: SYSTEM and USER. SYSTEM file objects are created when
a user inputs column data and DBMaker stores it in an external file.
This file is created by DBMaker and can be recognized by the .FOB file
name extension. USER file objects are existing external files that are
linked to a column in the database. The USER file object may be a file
on any device.
The major
difference between SYSTEM and USER file objects is a SYSTEM FO is generated
by DBMaker automatically. This means a SYSTEM FO will be deleted when
no column references it. Because they were created by a user, a USER
FO will not be deleted when there are no more references to it. A file
linked as a USER FO must open its read permission. Users can get the
file name and file size of a FO by using the built-in functions FILENAME()
and FILELEN().
Customizing The File Object Path
All SYSTEM
FOs belonging to a database are put in the same directory. Although
you can set up a FO directory shared by more than one databases to simplify
FO management, DBMaker does not suggest you to do this because it becomes
inconvenient when backing up a database.
Users can
specify where to put SYSTEM FOs by setting the value of DB_FODIR in
dmconfig.ini. The value of DB_FODIR is the full path of an existing
directory. Furthermore, DBMaker must own the write-permission on that
directory. For example, when a user wants all SYSTEM FOs created in
the /disk1/usr/fo directory, he must specify the following statement
in dmconfig.ini.
USER FOs
are existing files that are accessible from the database. They can be
scattered in many directories on the server side. Instead of specifying
a USER FO directory, users need to set the DB_USRFO keyword to 1 or
TRUE in dmconfig.ini to enable the use of USER file objects. USER FOs
are disabled by default.
DB_USRFO = 1 ; enable USER file objects
|
Generating File Objects
To create
a FO column, set the column type to FILE when creating the table. The
following SQL statement can be used to create a table named person
with a file object column called photo:
dmSQL> CREATE TABLE person (name CHAR(10), photo FILE);
|
You can
input FO data with the filename of a file on either the server side
or the client side. In the first case, DBMaker will link the FO column
to the existing file and generate a USER FO. In the second case, DBMaker
will create a SYSTEM FO by copying the file on the client side to the
FO directory on the server side. The following dmSQL statements show
how to insert FO data:
dmSQL> INSERT INTO person VALUES (`cathy','/disk1/image/cathy.bmp')
// stored as a USER FO dmSQL> INSERT INTO person VALUES (`jeff',?); dmSQL/Val> &jeff.gif;
// stored as a SYSTEM FO dmSQL/Val> END;
|
There are
three varieties of fetching methods for a FO column: content, file name,
and file size. The following SQL statement shows the usage of each:
dmSQL> SELECT photo, FILENAME(photo), FILELEN(photo) FROM person ;
photo filename(photo) filelen(photo)
========= ========================= ==============
012034451 /disk1/image/cathy.bmp 21100
349045821 /disk1/usr/fo/ZZ000000.FOB 12034
|
For more
information about manipulation on FO columns, refer to the DBATool
User's Guide and the ODBC Programmer's Guide.
Updating File Objects
When you
want to update the contents of a FO column, you can use the SQL UPDATE
command. DBMaker will replace the FO column with a new file.
Similar
to the method for inserting FOs, you can update a FO column to a new
SYSTEM FO or link to a USER FO. The following SQL statement will link
the photo column to an existing file, /disk2/image/common.bmp.
Again, no file is created, and you must specify the full path for a
USER FO.
dmSQL> UPDATE person SET photo = '/disk2/image/common.bmp'
WHERE name = `cathy';
|
Of course,
you can input new data from a file on the client side. For more information,
refer to the DBATool User's Guide and the ODBC Programmer's
Guide.
If the
results of the UPDATE operation contain more than one tuple, only one
file will be created. This file will be shared among the tuples to save
disk space. DBMaker will maintain an internal counter to record how
many tuples reference that file. Also, if a user modifies the contents
of the file through an external application program rather than DBMaker,
all tuples will see the modification.
When no
tuples retain links to a SYSTEM FO after UPDATE or DELETE operations,
DBMaker will automatically delete the file after that transaction is
committed. However, DBMaker never removes any USER FO even though no
tuple references it, since the file was not generated by DBMaker.
Renaming File Objects
Sometimes
users need to change the positions or names of FOs because of full disks
or a reorganization of the disk layout. DBMaker permits users to use
the MOVE FILE OBJECT statement to change the name or path of the FO.
Before using the MOVE FILE OBJECT command, you need to move the files
to the new location because DBMaker will make sure the new files exist
before allowing the move. You can use the FILENAME() function to get
the names of the files which will be moved.
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/ZZ000000.FOB'
TO '/disk3/pub/photo1.bmp';
|
DBMaker
also permits users to move FOs from one directory to another. Note that
DBMaker permits using only one * character for the specified file name
in the source directory, but does not allow using any * character in
the destination directory. DBMaker does not support recursively moving
files. If you want to move all the files, not including subdirectories,
from one directory to another, you must specify the former directory
by adding the `/' or `/*' characters at the end of the directory.
For example,
if there are four files in /disk1/usr/fo named ABC1.FOB,
ABC2.FOB, ABC3.FOB,and ABC4.FOB,
and the following SQL commands are executed:
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/ ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/* ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/*.FOB ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/A* ' TO '/disk3/pub/ ';
|
DBMaker
will move ABC1.FOB, ABC2.FOB, ABC3.FOB, ABC4.FOB
from /disk1/usr/fo to /disk3/pub. If the following SQL
commands are executed:
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/*1.FOB ' TO '/disk3/pub/ ';
dmSQL> MOVE FILE OBJECT '/disk1/usr/fo/A*1.FOB ' TO '/disk3/pub/ ';
|
DBMaker
will only move ABC1.FOB from /disk1/usr/fo to /disk3/pub.
Predicate Operations on File Objects
Similar
to BLOBs, you can test FOs for NULL values and use the MATCH function
to perform pattern searches. Furthermore, you can use the FO item in
arithmetic expressions with the FILELEN() built-in function and string
expressions with the FILENAME() built-in function.
For example,
if you want to select tuples in which .gif appears in the file
name of the photo column, you can use the following SQL command.
dmSQL> SELECT * FROM person WHERE FILENAME(photo) LIKE '%.gif';
|
If you
want to fetch tuples where the size of the photo column is greater
than 100KB, you can use the following SQL command.
dmSQL> SELECT * FROM person WHERE FILELEN(photo) > 102400;
|
Journal Logging of File Objects
DBMaker
does not support journal logging of FOs. When you back up the database,
don't forget to back up all FOs belonging to the database, too. To back
up FOs, you need to get the filenames of all FOs by querying the SYSFILEOBJ
table.
dmSQL> SELECT FILE_NAME FROM SYSFILEOBJ;
|
Then copy
all FOs to the backup storage. When you restore the database from a
backup, copy all FOs, too. If the file paths or file names have changed,
you can use the MOVE FILE OBJECT command to update the file names in
the SYSFILEOBJ table.
   
|