TransX is a data transfer utility that allows you to populate your database with multilingual data. It uses XML to specify the data so you can take advantage of easy data transfer from XML to the database, a simple data format that is intuitive for both developers and translators, and validation capability that makes it less error prone.
TransX has the following features:
The Oracle TransX utility needs the following components in order to function:
TransX is packaged with Oracle9i.
The TransX utility is made up of three executable files:
By default, the Oracle9i installer installs TransX on your hard drive in the locations specified above.
Choosing Where to Keep Translatable Data
To build an internationalized system, it is essential to decouple localizable resources from the business logic. A typical example of such a resource is translated text information. Data specific to a particular region and shares a common language and cultural conventions needs to be organized using some kind of resource management facility that allows you to retrieve locale-specific information. A database is often used to store them because of easy maintenance and flexibility. Typically they are stored in a table like the following:
MESSAGE_ID LANGUAGE_ID MESSAGE
---------- ----------- ----------------------------------
1 us Welcome to System X
2 us Please enter username and password
: : :
Notice that the column LANGUAGE_ID is included in this
table so that applications can retrieve messages based on the preferred
language of the end user. It contains abbreviations of language names to
identify the language of messages. Oracle language abbreviations can be
found in the appendix of the Oracle9i Globalization Support Guide. Translated
messages can be added to this table.
Arrangement in the Predefined XML Format
An XML document that represents the table above would look something like the following example:
<?xml version="1.0"?>As shown above in the predefined XML format, the way your dataset is arranged in XML should reflect the structure of the target table. The table is defined as in Example 1-2, and all of the columns are to be populated.
<table name="translated_messages"><lookup-key>
<column name="message_id" />
<column name="language_id" />
</lookup-key><columns>
<column name="message_id" type="number"/>
<column name="language_id" type="string" constant="us" translate="yes"/>
<column name="message" type="string" translate="yes"/>
</columns><dataset>
<row>
<col name="message_id">1</col>
<col name="message" translation-note="dnt'X'">Welcome to System X</col>
</row>
<row>
<col name="message_id">2</col>
<col name="message">Please enter username and password</col>
</row>
<!-- ... -->
</dataset></table>
SQL> desc translated_messagesAs always, this XML example starts with the declaration. Its root element table encloses all the other elements. Inside the table element there are three sections, one for the list of lookup keys, one for the table description and one for the dataset. lookup-keys are those columns that are used to evaluate rows if they are already existing in the database. Because we want a pair of message and language ID to identify a unique string, we list the corresponding columns. The columns section should mirror the table's structure because it specifies which piece of data in the dataset section should go to which column in the database. The column names should be consistent throughout your XML dataset and database. The valid values for the type attribute are string, number, date or dateTime. They correspond to the datatypes defined in the XML schema standard, so each piece of data should conform to the respective datatype definition. In particular, it is important to use the ISO 8601 format for date and dateTime datatypes.Name Null? Type
----------- -------- -------------
MESSAGE_ID NOT NULL NUMBER(4)
LANGUAGE_ID VARCHAR2(3)
MESSAGE VARCHAR2(200)
Datatype | Format | Example |
date | CCYY-MM-DD | 2009-05-20 |
dateTime | CCYY-MM-DDThh:mm:ss | 2009-05-20T16:01:37 |
Example 1-3 shows how a row with a dateTime data would look:
<row>Back to our example, the constant attribute of a column is used to specify a value to be stored into the corresponding column for every row that appears in the dataset section. As we are working on the original language, the language_id column is set to be 'us'.
<col name="article_id">12345678</col>
<col name="author_id">10500</col>
<col name="submission">2002-03-09T16:01:37</col>
<col name="title">...</col>
<!-- some columns follows -->
</row>
Translating the Data in XML
When translating messages for applications, it is often the case that certain words or phrases should be left untranslated. As shown in this example, the translation-note attribute may be added to elements with translatable string to help improve the quality of translation. It is used to tell if the column is going to contain translated data. In this example, there are two columns with this attribute and they use it with a slight difference in their meanings. The first one for the language_id means the value of the constant attribute should be translated. The other one for the message column means those data in the dataset section that has a name that matches this column, which is "message", needs to be translated, such as <col name="message">Welcome to System X</col>, <col name="message">Please enter...</col>, and so on.
By using XML transformation, this format can be converted into another format for exchanging translation data among localization service providers for use with XML-based translation tools. This transformation insulates application developers from the inherent complexity of the format for translation data exchange that includes information for keeping track of revisions, categorizing translatable strings into some units, and so on.
If a plain text editor or a traditional text-based translation tool is used during the translation process, it is important to maintain the encoding of the document. After a document is translated, it is likely that it is encoded in a different encoding than the original one. If the translated document is in an encoding other than Unicode, the encoding declaration should be added to the XML declaration on the first line. A declaration for non-Unicode encoding looks like this:
<?xml version="1.0" encoding="ISO-8859-15"?>
To ensure that the translation process did not lose the syntactic integrity, it is important to process the document as XML. If that is not the case, the validation option of the command line option is handy for anybody to check the format. If any syntactic error was brought in, the option prints out the location and what the error is. Errors must be fixed for the data transfer to succeed.
A translated document for our example is shown in Example 1-4:
<?xml version="1.0"?>
<table name="translated_messages">
<lookup-key>
<column name="message_id" />
<column name="language_id" />
</lookup-key>
<columns>
<column name="message_id" type="number"/>
<column name="language_id" type="string"
constant="e" translate="yes"/>
<column name="message"
type="string" translate="yes"/>
</columns>
<dataset>
<row>
<col name="message_id">1</col>
<col name="message" translation-note="dnt'System
X'">Bienvenido al Sistema X</col>
</row>
<row>
<col name="message_id">2</col>
<col name="message">Porfavor
entre su nombre de usuario y su contraseña</col>
</row>
<!-- ... -->
</dataset>
</table>
Data Transfer with TransX
Now that you have the original and translated versions of XML, it is time to execute TransX. The following command is an example to connect to the database mydb at port 1521 on myhost as scott/tiger and transfer the data in the XML files.
transx "myhost:1521:mydb" scott tiger example.xml example_e.xmlThe above command transfers two files of data at the same time.
In building a multilingual software system, translations usually become
available at a later stage of development.
They also tend to evolve over a period of time. If for any reason some
messages need to be added, they can be added as new rows in your dataset
definition simply by running TransX again. As we discussed earlier, TransX
recognizes which rows are new and it inserts only the new messages based
on the column(s) specified in the lookup-key section. If some messages
are updated, run TransX with the -u option. Then it updates existing rows
with the data specified in XML.
After running TransX with the command in Example 1-5, your table would look like the following:
MESSAGE_ID LANGUAGE_ID MESSAGEYour application can retrieve a message in a language with a SQL query like this:
---------- ----------- ----------------------------------
1 us Welcome to System X
1 e Bienvenido al Sistema X
2 us Please enter username and password
2 e Porfavor entre su nombre de usuario y su contraseña
SELECT message FROM translated_messages WHERE message_id = 2 AND LANGUAGE_ID = 'e' ;
TransX comes with a simple command line which gives users
quick access to TransX's functionality.
The command line options are provided through the java class loader. Invoke it by calling:
transx
To use the TransX command line, you first need to specify
where the executable is located. To do this, add the TransX java library
(transx.zip) to your CLASSPATH.
Because the TransX depends on the the other XDK components,
for the TransX to run, you need to make known the location of these components.
To do this, your CLASSPATH needs to include the locations of the Oracle
XML Parser java library (xmlparserv2.jar), the Oracle XML Schema Processor(xschema.jar)
and the JDBC library (ojdbc5.jar).
Here is the syntax of TransX command line:
transx [ <option> ... ] <connect_string> <username> <password> <datasource> [ <datasource> ... ]
where arguments are specified as follows:
<option> | One of the TransX options found in Table 2-2 |
<connect_string> | A JDBC connect string |
<username> | A database username |
<password> | The password for the database user |
<datasource> | An XML data source |
Example:
transx "jdbc:oracle:oci:@mydb" scott tiger foo.xml
If the type of your JDBC driver is Thin, the connect string does not have to be fully specified. Host name, port number, and the SID separated by colons connects you as if you specified the whole connect string.
transx "myhost:1521:mydb" scott tiger foo.xml
transx -v <datasource> [ <datasource> ... ]
Example:
transx -v foo.xml
This example validates the format of foo.xml and exit without database access.
When you specify the -s(unloading) option, the argument list looks like this:
transx -s <connect_string> <username> <password> <filename> <table> [ <column> ... ]
Example:
transx -s "dlsun9999:1521:mydb" scott tiger emp.xml emp ename job
This example shows how columns in a table(ename and job in the table
emp) are formatted into an XML file(emp.xml) in the predefined format.
Column names are optional. By default, all columns are unloaded.
<?xml version="1.0"?>
<table name="i18n_messages">
<lookup-key>
<column name="message_code" />
</lookup-key><columns>
<column name="message_id" type="number" sequence="i18n_message_seq" useforupdate="no"/>
<column name="message_code" type="number" />
<column name="message_name" type="string" translate="yes" />
<column name="message_description" type="string" translate="yes" />
<column name="version_created" type="number" constant="0" />
<column name="version_updated" type="number" constant="0" />
<column name="message_type_code" type="string" virtual="yes" />
<column name="message_type_id" type="number" >
<query text="select message_type_id from i18n_message_type where message_type_code = :1" >
<parameter id="1" col="message_type_code" />
</query>
</column>
</columns><dataset>
<row>
<col name="message_code" >100</col>
<col name="message_name" trans-key="stts-name-1" >Continue</col>
<col name="message_description" trans-key="stts-desc-1" >
The client should continue with its request.</col>
<col name="message_type_code" >INFO</col>
</row><row>
<col name="message_code" >101</col>
<col name="message_name" trans-key="stts-name-2" >Switching Protocols</col>
<col name="message_description" trans-key="stts-desc-2" >
The server understands and is willing to comply with the client''s
request (via the Upgrade message header field) for a change in the
application protocol being used on this connection.</col>
<col name="message_type_code" >INFO</col>
</row><row>
<col name="message_code" >200</col>
<col name="message_name" trans-key="stts-name-3" >OK</col>
<col name="message_description" trans-key="stts-desc-3" >
The request has succeeded.</col>
<col name="message_type_code" >SUCCESS</col>
</row><row>
<col name="message_code" >201</col>
<col name="message_name" trans-key="stts-name-4" >Created</col>
<col name="message_description" trans-key="stts-desc-4" >
The request has been fulfilled and resulted in a new resource being
created.</col>
<col name="message_type_code" >SUCCESS</col>
</row><row>
<col name="message_code" >202</col>
<col name="message_name" trans-key="stts-name-5" >Accepted</col>
<col name="message_description" trans-key="stts-desc-5" >
The request has been accepted for processing, but the processing has
not been completed.</col>
<col name="message_type_code" >SUCCESS</col>
</row><row>
<col name="message_code" >203</col>
<col name="message_name" trans-key="stts-name-6" >Non-Authoritative Information</col>
<col name="message_description" trans-key="stts-desc-6" >
The returned metainformation in the entity-header is not the
definitive set as available from the origin server, but is gathered
from a local or a third-party copy.</col>
<col name="message_type_code" >SUCCESS</col>
</row><row>
<col name="message_code" >204</col>
<col name="message_name" trans-key="stts-name-7" >No Content</col>
<col name="message_description" trans-key="stts-desc-7" >
The server has fulfilled the request but does not need to return an
entity-body, and might want to return updated metainformation.</col>
<col name="message_type_code" >SUCCESS</col>
</row><!-- ... -->
</dataset>
</table>
The following two classes make up the XML-SQL Utility Java API:
The TransX interface makes up the core part of the TransX's
API.
Figure
1 illustrates the basic steps in the usage of TransX API.
Perform these steps when loading XML:
1. getLoader() -----------+
|
V
+------+------------+ ________
2. transx.open() -->| : | / \
|TransX: TransX | \________/
3. transx.load() -->| : | =====> | |
-->| : Core | | |
-->| API : | | |
: | : Powerd by | \________/
| : XDK |
4. transx.close() -->| : |
+------+------------+
String datasrc[] = {"data1.xml", "data2.xml", "data3.xml"}; // instantiate a transx class TransX transx = loader.getLoader(); // start a data loading session transx.open( "jdbc:oracle:oci8:@", "scott", "tiger" ); // specify operation modes transx.setLoadingMode( LoadingMode.SKIP_DUPLICATES ); transx.setValidationMode( false ); // load the dataset(s) for ( int i = 0 ; i < datasrc.length ; i++ ) { transx.load( datasrc[i] ); } // cleanup transx.close();
Here, the connection is made using the Oracle OCI driver.
You can connect to the scott schema supplying the password tiger. It
connects to the current database (identified by the ORACLE_SID environment
variable). You can also use the JDBC Thin driver to connect to the database.
The thin driver is written in pure Java and can be called from within applets
or any other Java program.
Here's an example of connecting using the thin driver.
// start a data loading session
with thin driver
transx.open( "jdbc:oracle:thin:@myhost:1521:ORCL",
"scott","tiger");
If you are just validating your data format, you don't need to establish a database connection, since the validation is totally performed by TransX. Consequently your command line for validation only should not have connect information.
transx -v foo.xml
What
makes TransX different from other data loading utilities like SQL*Loader,
Import, XSU or DataPump?
TransX is suitable for loading translated data, as it is designed to meet today's globalization needs like storing application messages in various languages. With TransX multilingual data can be transferred without having to switch NLS_LANG to set the client to an appropriate NLS environment. It also makes TransX unique that it takes its input in a simple predefined XML format that represents your dataset which can contain expressions such as constants, sequences and stored procedure calls.
Is
it possible to unload existing data in the database into the canonical
XML format?
Yes,
it is. This functionality is provided through the -s option to make it
easy to switch from loading .sql files to .xml files. After migrating to
.xml format, it will no longer be necessary to configure the environment
to a different NLS setting each time the character set of the data file
to be loaded is changed to another.
Do
I need to have tables in the database before I can use TransX?
Yes. TransX does not create them for you.
|
Copyright
© 1996-2003 Oracle Corporation.
All Rights Reserved. |