Edit D:\app\Administrator\product\11.2.0\dbhome_1\oc4j\j2ee\oc4j_applications\applications\em\em\online_help\tdpdw\tdpdw_system.htm
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> <head> <meta http-equiv="Content-Type" content="text/html; charset=us-ascii" /> <meta http-equiv="Content-Style-Type" content="text/css" /> <meta http-equiv="Content-Script-Type" content="text/javascript" /> <title>Setting Up Your Data Warehouse System</title> <meta name="generator" content="Oracle DARB XHTML Converter (Mode = ohj/ohw) - Version 5.1" /> <meta name="date" content="2009-04-27T11:26:23Z" /> <meta name="robots" content="noarchive" /> <meta name="doctitle" content=" Setting Up Your Data Warehouse System" /> <meta name="relnum" content="11g Release 2 (11.2)" /> <meta name="partnum" content="E10578-01" /> <meta name="topic-id" content="TDPDW003" /> <link rel="copyright" href="./dcommon/html/cpyr.htm" title="Copyright" type="text/html" /> <link rel="stylesheet" href="./dcommon/css/blafdoc.css" title="Oracle BLAFDoc" type="text/css" /> <link rel="contents" href="toc.htm" title="Contents" type="text/html" /> <link rel="prev" href="tdpdw_intro.htm" title="Previous" type="text/html" /> <link rel="next" href="tdpdw_owb_sources.htm" title="Next" type="text/html" /> <script src="./callback.js" type="text/javascript"></script> <noscript>Your browser does not support JavaScript. This help page requires JavaScript to render correctly.</noscript> </head> <body> <div class="zz-skip-header"><a href="#BEGIN">Skip Headers</a></div> <table class="simple oac_no_warn" summary="" cellspacing="0" cellpadding="0" width="100%"> <col width="86%" /> <col width="*" /> <tr valign="bottom"> <td align="left"></td> <td align="center"><a href="tdpdw_intro.htm"><img src="./dcommon/gifs/leftnav.gif" alt="Previous" /><br /> <span class="icon">Previous</span></a> </td> <td align="center"><a href="tdpdw_owb_sources.htm"><img src="./dcommon/gifs/rightnav.gif" alt="Next" /><br /> <span class="icon">Next</span></a></td> </tr> </table> <p><a id="CHDBFDGI" name="CHDBFDGI"></a><a id="TDPDW003" name="TDPDW003"></a></p> <h1><a name="TDPDW003|Setting up Your Data Warehouse"></a>Setting Up Your Data Warehouse System</h1> <p><a name="BEGIN" id="BEGIN"></a></p> <p>This section discusses how to initially configure your data warehouse environment. It includes the following topics:</p> <ul> <li> <p><a href="#CBBDIBJE">General Steps for Setting up a Data Warehouse System</a></p> </li> <li> <p><a href="#CBBBGIBF">Preparing the Environment</a></p> </li> <li> <p><a href="#CFHDHBGD">Setting Up a Database for a Data Warehouse</a></p> </li> <li> <p><a href="#BABJEBDG">Accessing Oracle Warehouse Builder</a></p> </li> </ul> <p><a id="CBBDIBJE" name="CBBDIBJE"></a><a id="TDPDW0027" name="TDPDW0027"></a></p> <div class="sect1"><!-- infolevel="all" infotype="General" --> <h2><a name="TDPDW00171" id="TDPDW00171"></a>General Steps for Setting up a Data Warehouse System</h2> <p>The procedures in this section describe how to configure Oracle Database for use as a data warehouse. Subsequently, you configure Oracle Warehouse Builder, which leverages Oracle Database and provides graphical user interfaces for designing data management strategies.</p> <a id="sthref9" name="sthref9"></a> <p class="subhead2">To set up a data warehouse system, complete the following steps:</p> <ol> <li> <p>Size and configure your hardware as described in <a href="#CBBBGIBF">"Preparing the Environment"</a>.</p> </li> <li> <p>Install the Oracle Database software.</p> </li> <li> <p>Optimize the Database for use as a data warehouse as described in <a href="#CFHDHBGD">"Setting Up a Database for a Data Warehouse"</a>.</p> </li> <li> <p>Access the Oracle Warehouse Builder software.</p> <p>Follow the instructions in <a href="#BABJEBDG">"Accessing Oracle Warehouse Builder"</a>. Subsequently, you can install a demonstration to assist you in learning how to complete common data warehousing tasks using Warehouse Builder.</p> </li> </ol> </div> <!-- class="sect1" --> <p><a id="CBBBGIBF" name="CBBBGIBF"></a><a id="TDPDW00172" name="TDPDW00172"></a></p> <div class="sect1"><!-- infolevel="all" infotype="General" --> <h2><a name="TDPDW00172" id="TDPDW00172"></a>Preparing the Environment</h2> <p>The basic<a id="sthref10" name="sthref10"></a> components for a data warehouse architecture are almost the same as for an online transaction processing (OLTP) system. However, because of the sheer size and volume of data, the hardware configuration and data throughput requirements for a data warehouse are unique. The starting point for sizing a data warehouse is the throughput that you require from the system. This can be one or both of the following criteria:</p> <ul> <li> <p>The amount of data that is being accessed by queries hitting the system at peak time, in conjunction with the acceptable response time. You may be able to use throughput numbers and experience from an existing application to estimate the required throughput.</p> </li> <li> <p>The amount of data that is loaded within a window of time.</p> </li> </ul> <p>In general, you must estimate the highest throughput you need at any given point.</p> <p>Hardware vendors can recommend balanced configurations for a data warehousing application and can help you with the sizing. Contact your preferred hardware vendor for more details.</p> <a id="CFHFJEDD" name="CFHFJEDD"></a><a id="TDPDW0031" name="TDPDW0031"></a> <div class="sect2"><!-- infolevel="all" infotype="General" --> <h3><a name="TDPDW0031|Ensuring a Balanced Hardware System"></a>Balanced Hardware Configuration</h3> <p><a id="sthref11" name="sthref11"></a>A properly sized and balanced hardware configuration is required to maximize data warehouse performance. The following sections discuss some important considerations in achieving this balance:</p> <ul> <li> <p><a href="#CHDHAEGE">How Many CPUs and What Clock Speed Do I Need?</a></p> </li> <li> <p><a href="#CHDDEFBD">How Much Memory Do I Need?</a></p> </li> <li> <p><a href="#CHDFIGHF">How Many Disks Do I Need?</a></p> </li> <li> <p><a href="#CHDBEEGJ">How Do I Determine Sufficient I/O Bandwidth?</a></p> </li> </ul> <a id="CHDHAEGE" name="CHDHAEGE"></a><a id="TDPDW00311" name="TDPDW00311"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --> <h4><a name="TDPDW00311|How Many CPUs and what Clock speed"></a>How Many CPUs and What Clock Speed Do I Need?</h4> <p>Central<a id="sthref12" name="sthref12"></a> processing units (CPUs) provide the calculation capabilities in a data warehouse. You must have sufficient CPU power to perform the data warehouse operations. Parallel operations are more CPU-intensive than the equivalent serial operations.</p> <p>Use the estimated highest throughput as a guideline for the number of CPUs you need. As a rough estimate, use the following formula:</p> <ul> <li> <p><code><number of CPUs> = <maximum throughput in MB/s> / 200</code></p> </li> </ul> <p>This formula assumes that a CPU can sustain up to about 200 MB per second. For example, if a system requires a maximum throughput of 1200 MB per second, then the system needs <code><number of CPUs> = 1200/200 = 6 CPUs</code>. A configuration with 1 server with 6 CPUs can service this system. A 2-node clustered system could be configured with 3 CPUs in both nodes.</p> </div> <!-- class="sect3" --> <a id="CHDDEFBD" name="CHDDEFBD"></a><a id="TDPDW00312" name="TDPDW00312"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --> <h4><a name="TDPDW00312|How much Memory Needed"></a>How Much Memory Do I Need?</h4> <p>Memor<a id="sthref13" name="sthref13"></a><a id="sthref14" name="sthref14"></a>y in a data warehouse is particularly important for processing memory-intensive operations such as large sorts. Access to the data cache is less important in a data warehouse because most of the queries access vast amounts of data. Data warehouses do not have the same memory requirements as mission-critical OLTP applications.</p> <p>The number of CPUs provides a good guideline for the amount of memory you need. Use the following simplified formula to derive the amount of memory you need from the CPUs that you select:</p> <ul> <li> <p><code><amount of memory in GB> = 2 * <number of CPUs></code></p> </li> </ul> <p>For example, a system with 6 CPUs needs <code>2 * 6 = 12 GB</code> of memory. Most standard servers fulfill this requirement.</p> </div> <!-- class="sect3" --> <a id="CHDFIGHF" name="CHDFIGHF"></a><a id="TDPDW00313" name="TDPDW00313"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --> <h4><a name="TDPDW00313|How Many Disks Needed"></a>How Many Disks Do I Need?</h4> <p>A commo<a id="sthref15" name="sthref15"></a>n mistake in data warehouse environments is to size the storage based on the maximum capacity needed. Sizing that is based exclusively on storage requirements will likely create a throughput bottleneck.</p> <p>Use the maximum throughput you require to find out how many disk arrays you need. Use the storage provider's specifications to find out how much throughput a disk array can sustain. Note that storage providers measure in Gb per second, and your initial throughput estimate is based on MB per second. An average disk controller has a maximum throughput of 2 Gb per second, which translates to a sustainable throughput of about <code>(70% * 2 Gbit/s) /8 = 180 MB/s</code>.</p> <p>Use the following formula to determine the number of disk arrays you need:</p> <ul> <li> <p><code><number of disk controllers> = <throughput in MB/s> / <individual controller throughput in MB/s></code></p> </li> </ul> <p>For example, our system with 1200 MB per second throughput requires at least 1200 / 180 = 7 disk arrays.</p> <p>Make sure you have enough physical disks to sustain the throughput you require. Ask your disk vendor for the throughput numbers of the disks.</p> </div> <!-- class="sect3" --> <a id="CHDBEEGJ" name="CHDBEEGJ"></a><a id="TDPDW00314" name="TDPDW00314"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --> <h4><a name="TDPDW00314|Ensuring Sufficient I/O Bandwidth"></a>How Do I Determine Sufficient I/O Bandwidth?</h4> <p>The end-to-<a id="sthref16" name="sthref16"></a>end I/O system consists of more components than just the CPUs and disks. A well-balanced I/O system has to provide approximately the same bandwidth across all components in the I/O system. These components include:</p> <ul> <li> <p>Host Bus Adapters (HBAs), the connectors between the server and the storage.</p> </li> <li> <p>Switches, in between the servers and a Storage Area Network (SAN) or Network Attached Storage (NAS).</p> </li> <li> <p>Ethernet adapters for network connectivity (GigE NIC or Infiniband). In a clustered environment, you need an additional private port for the interconnect between the nodes that you should not include when sizing the system for I/O throughput. The interconnect must be sized separately, taking into account factors such as internode parallel execution.</p> </li> <li> <p>Wires that connect the individual components.</p> </li> </ul> <p>Each of the components has to be able to provide sufficient I/O bandwidth to ensure a well-balanced I/O system. The initial throughput you estimated and the hardware specifications from the vendors are the basis to determine the quantities of the individual components you need. Use the conversion in the following table to translate the vendors' maximum throughput numbers in bits into sustainable throughput numbers in bytes.</p> <div class="tblformal"><a id="sthref17" name="sthref17"></a><a id="BGBFGGHA" name="BGBFGGHA"></a> <p class="titleintable">Throughput Performance Conversion</p> <table class="Formal" title="Throughput Performance Conversion" summary="Throughput Performance" dir="ltr" border="1" width="100%" frame="hsides" rules="groups" cellpadding="3" cellspacing="0"> <col width="*" /> <col width="29%" /> <col width="35%" /> <thead> <tr align="left" valign="top"> <th align="left" valign="bottom" id="r1c1-t2">Component</th> <th align="left" valign="bottom" id="r1c2-t2">Bits</th> <th align="left" valign="bottom" id="r1c3-t2">Bytes Per Second</th> </tr> </thead> <tbody> <tr align="left" valign="top"> <td align="left" id="r2c1-t2" headers="r1c1-t2"> <p>HBA</p> </td> <td align="left" headers="r2c1-t2 r1c2-t2"> <p>2 Gbit</p> </td> <td align="left" headers="r2c1-t2 r1c3-t2"> <p>200 MB</p> </td> </tr> <tr align="left" valign="top"> <td align="left" id="r3c1-t2" headers="r1c1-t2"> <p>16 Port Switch</p> </td> <td align="left" headers="r3c1-t2 r1c2-t2"> <p>8 * 2 Gbit</p> </td> <td align="left" headers="r3c1-t2 r1c3-t2"> <p>1200 MB</p> </td> </tr> <tr align="left" valign="top"> <td align="left" id="r4c1-t2" headers="r1c1-t2"> <p>Fibre Channel</p> </td> <td align="left" headers="r4c1-t2 r1c2-t2"> <p>2 Gbit</p> </td> <td align="left" headers="r4c1-t2 r1c3-t2"> <p>200 MB</p> </td> </tr> <tr align="left" valign="top"> <td align="left" id="r5c1-t2" headers="r1c1-t2"> <p>GigE NIC</p> </td> <td align="left" headers="r5c1-t2 r1c2-t2"> <p>1 Gbit</p> </td> <td align="left" headers="r5c1-t2 r1c3-t2"> <p>80 MB</p> </td> </tr> <tr align="left" valign="top"> <td align="left" id="r6c1-t2" headers="r1c1-t2"> <p>Inf-2 Gbit</p> </td> <td align="left" headers="r6c1-t2 r1c2-t2"> <p>2 Gbit</p> </td> <td align="left" headers="r6c1-t2 r1c3-t2"> <p>160 MB</p> </td> </tr> </tbody> </table> <br /></div> <!-- class="tblformal" --> <p>In addition to having sufficient components to ensure sufficient I/O bandwidth, the layout of data on the disk is also key to success or failure. If you configured the system for sufficient throughput across all disk arrays, and all of the data that a query is going to retrieve still resides on one disk, then you will not be able to get the required throughput. This is because your one disk will be the bottleneck. In order to avoid such a situation, stripe data across as many disks as possible, ideally all disks. A stripe size of 256 KB to 1 MB provides a good balance between multiblock read operations and data spread across multiple disks.</p> <a id="sthref18" name="sthref18"></a> <p class="subhead2"><a name="TDPDW0034|Using Automatic Storage Management (ASM)"></a>About Automatic Storage Management (ASM)</p> <p>ASM is a co<a id="sthref19" name="sthref19"></a><a id="sthref20" name="sthref20"></a>mponent of Oracle Database that you can use to stripe data across disks in a disk group. ASM ensures the data is balanced across all disks. Disks can be added or removed while ASM is operational, and ASM will automatically rebalance the storage across all available disks. ASM can also be used to mirror data on the file system, to avoid loss of data in case of disk failure. The default stripe size for ASM is 1 MB. You can lower the stripe size to 128 KB.</p> <p>You can p<a id="sthref21" name="sthref21"></a>erform storage operations without ASM, but this increases the chances of making a mistake. Thus, Oracle recommends that you use ASM whenever possible.</p> </div> <!-- class="sect3" --></div> <!-- class="sect2" --> <a id="CFHDHJCH" name="CFHDHJCH"></a><a id="TDPDW0032" name="TDPDW0032"></a> <div class="sect2"><!-- infolevel="all" infotype="General" --> <h3><a name="TDPDW0032|Verifying the Data Warehouse Hardware Configuration"></a>Verifying the Data Warehouse Hardware Configuration</h3> <p>Before you install O<a id="sthref22" name="sthref22"></a>racle Database, you should verify your setup on the hardware and operating-system level. The key point to understand is that if the operating system cannot deliver the performance and throughput you need, Oracle Database will never be able to perform according to your requirements. Two tools for verifying throughput are the <code>dd</code> utility and Orion, an Oracle-supplied tool.</p> <a id="TDPDW00321" name="TDPDW00321"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --><a id="sthref23" name="sthref23"></a> <h4><a name="TDPDW00321|Using the dd Utility"></a>About the dd Utility</h4> <p>A very bas<a id="sthref24" name="sthref24"></a>ic way to validate the operating system throughput, on UNIX or Linux systems, is to use the <code>dd</code> utility. The <code>dd</code> utility is a very basic way to read data blocks directly from disk and, because there is almost no overhead involved, the output from the <code>dd</code> utility provides a reliable calibration. Oracle Database will reach a maximum throughput of approximately 90 percent of what the <code>dd</code> utility can achieve.</p> </div> <!-- class="sect3" --> <div class="sect3"><a id="sthref25" name="sthref25"></a> <h4><a name="TDPDW0036|Example: Using the dd Utility"></a>Example: Using the dd Utility</h4> <a id="sthref26" name="sthref26"></a> <p class="subhead2">To use the <code>dd</code> utility:</p> <p>First, the most important options for using <code>dd</code> are:</p> <pre xml:space="preserve" class="oac_no_warn"> bs=BYTES: Read BYTES bytes at a time; use 1 MB count=BLOCKS: copy only BLOCKS input blocks if=FILE: read from FILE; set to your device of=FILE: write to FILE; set to /dev/null to evaluate read performance; write to disk would erase all existing data!!! skip=BLOCKS: skip BLOCKS BYTES-sized blocks at start of input </pre> <p>To estimate the maximum throughput Oracle Database will be able to achieve, you can mimic a workload of a typical data warehouse application, which consists of large, random sequential disk access.</p> <p>The following <code>dd</code> command performs random sequential disk access across two devices reading a total of 2 GB. The throughput is 2 GB divided by the time it takes to finish the following command:</p> <pre xml:space="preserve" class="oac_no_warn"> dd bs=1048576 count=200 if=/raw/data_1 of=/dev/null & dd bs=1048576 count=200 skip=200 if=/raw/data_1 of=/dev/null & dd bs=1048576 count=200 skip=400 if=/raw/data_1 of=/dev/null & dd bs=1048576 count=200 skip=600 if=/raw/data_1 of=/dev/null & dd bs=1048576 count=200 skip=800 if=/raw/data_1 of=/dev/null & dd bs=1048576 count=200 if=/raw/data_2 of=/dev/null & dd bs=1048576 count=200 skip=200 if=/raw/data_2 of=/dev/null & dd bs=1048576 count=200 skip=400 if=/raw/data_2 of=/dev/null & dd bs=1048576 count=200 skip=600 if=/raw/data_2 of=/dev/null & dd bs=1048576 count=200 skip=800 if=/raw/data_2 of=/dev/null & </pre> <p>In your test, you should include all the storage devices that you plan to include for your database storage. When you configure a clustered environment, you should run <code>dd</code> commands from every node.</p> </div> <!-- class="sect3" --> <a id="TDPDW00322" name="TDPDW00322"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --><a id="sthref27" name="sthref27"></a> <h4><a name="TDPDW00322|Using the Orion Utility"></a>About the Orion Utility</h4> <p>Orion is a tool th<a id="sthref28" name="sthref28"></a>at Oracle provides to mimic an Oracle-like workload on a system in order to calibrate the throughput. Compared to the <code>dd</code> utility, Orion provides the following advantages:</p> <ul> <li> <p>Orion's simulation is closer to the workload the database will produce.</p> </li> <li> <p>Orion enables you to perform reliable write and read simulations within one simulation.</p> </li> </ul> <p>Oracle recommends you use Orion to verify the maximum achievable throughput, even if a database has already been installed.</p> <p>The types of supported I/O workloads are as follows:</p> <ul> <li> <p>small and random</p> </li> <li> <p>large and sequential</p> </li> <li> <p>large and random</p> </li> <li> <p>mixed workloads</p> </li> </ul> <p>For each type of workload, Orion can run tests at different levels of I/O load to measure performance metrics such as MB per second, I/O per second, and I/O latency. A data warehouse workload is typically characterized by sequential I/O throughput, issued by multiple processes. You can run different I/O simulations depending upon which type of system you plan to build. Examples are the following:</p> <ul> <li> <p>daily workloads when users or applications query the system</p> </li> <li> <p>the data load when users may or may not access the system</p> </li> <li> <p>index and materialized view builds</p> </li> <li> <p>backup operations</p> </li> </ul> <a id="sthref29" name="sthref29"></a> <p class="subhead2">To download Orion software, point your browser to the following:</p> <p><a href="http://www.oracle.com/technology/software/tech/orion/index.html"><code>http://www.oracle.com/technology/software/tech/orion/index.html</code></a></p> <p>Note that Orion is Beta software, and unsupported.</p> </div> <!-- class="sect3" --> <a id="TDPDW0037" name="TDPDW0037"></a> <div class="sect3"><!-- infolevel="all" infotype="General" --><a id="sthref30" name="sthref30"></a> <h4><a name="TDPDW0037|Example: Using the Orion Utility"></a>Example: Using the Orion Utility</h4> <p>To invoke Orion:</p> <pre xml:space="preserve" class="oac_no_warn"> $ orion -run simple -testname mytest -num_disks 8 </pre> <p>Typical output is as follows:</p> <pre xml:space="preserve" class="oac_no_warn"> Orion VERSION 10.2 Command line: -run advanced -testname orion14 -matrix point -num_large 4 -size_large 1024 -num_disks 4 -type seq -num_streamIO 8 -simulate raid0 -cache_size 0 -verbose This maps to this test: Test: orion14 Small IO size: 8 KB Large IO size: 1024 KB IO Types: Small Random IOs, Large Sequential Streams Number of Concurrent IOs Per Stream: 8 Force streams to separate disks: No Simulated Array Type: RAID 0 Stripe Depth: 1024 KB Write: 0% Cache Size: 0 MB Duration for each Data Point: 60 seconds Small Columns:, 0 Large Columns:, 4 Total Data Points: 1 Name: /dev/vx/rdsk/asm_vol1_1500m Size: 1572864000 Name: /dev/vx/rdsk/asm_vol2_1500m Size: 1573912576 Name: /dev/vx/rdsk/asm_vol3_1500m Size: 1573912576 Name: /dev/vx.rdsk/asm_vol4_1500m Size: 1573912576 4 FILEs found. Maximum Large MBPS=57.30 @ Small=0 and Large=4 </pre> <p>In this example, the maximum throughput for this particular workload is 57.30 MB per second.</p> </div> <!-- class="sect3" --></div> <!-- class="sect2" --></div> <!-- class="sect1" --> <p><a id="CFHDHBGD" name="CFHDHBGD"></a><a id="TDPDW0033" name="TDPDW0033"></a></p> <div class="sect1"><!-- infolevel="all" infotype="General" --> <h2><a name="TDPDW0033|Setting up a Database for a Data Warehouse"></a>Setting Up a Database for a Data Warehouse</h2> <p>After <a id="sthref31" name="sthref31"></a>you set up your environment and install Oracle Database software, ensure that you have the database parameters set correctly. Note that there are not many database parameters that have to be set.</p> <p>As a general guideline, avoid changing a database parameter unless you have good reason to do so. You can use Oracle Enterprise Manager to set up your data warehouse. To view various parameter settings, navigate to the Database page, then click <span class="bold">Server</span>. Under Database Configuration, click <span class="bold">Memory</span> <span class="bold">Parameters</span> or <span class="bold">All</span> <span class="bold">Inititalization</span> <span class="bold">Parameters</span>.</p> <a id="TDPDW00331" name="TDPDW00331"></a> <div class="sect2"><!-- infolevel="all" infotype="General" --><a id="sthref32" name="sthref32"></a> <h3><a name="TDPDW00331|How Should I Set the Memory Management Parameters"></a>How Should I Set the Memory Management Parameters?</h3> <p>On a high<a id="sthref33" name="sthref33"></a><a id="sthref34" name="sthref34"></a> level, there are two memory segments:</p> <ul> <li> <p>Shared memory: Also called the system global area (SGA), this is the memory used by the Oracle instance.</p> </li> <li> <p>Session-based memory: Also called program global area (PGA), this is the memory that is occupied by sessions in the database. It is used to perform database operations, such as sorts and aggregations.</p> </li> </ul> <p>Oracle Database can automatically tune the distribution of the memory components in two memory areas. As a result, you must set only the following parameters:</p> <ul> <li> <p><code>SGA_TARGET</code></p> <p>The <code>SGA_TARGET</code> parameter is the amount of memory you want to allocate for shared memory. For a data warehouse, the SGA can be relatively small compared to the total memory consumed by the PGA. To get started, assign 25% of the total memory you allow Oracle Database to use to the SGA. The SGA, at a minimum, should be 100 MB.</p> </li> <li> <p><code>PGA_AGGREGATE_TARGET</code></p> <p>The <code>PGA_AGGREGATE_TARGET</code> parameter is the target amount of memory that you want the total PGA across all sessions to consume. As a starting point, you can use the following formula to define the <code>PGA_AGGREGATE_TARGET</code> value:</p> <ul> <li> <p><code>PGA_AGGREGATE_TARGET = 3 * SGA_TARGET</code> If you do not have enough physical memory for the <code>PGA_AGGREGATE_TARGET</code> to fit in memory, then reduce <code>PGA_AGGREGATE_TARGET</code>.</p> </li> </ul> </li> <li> <p><code>MEMORY_TARGET</code> and <code>MEMORY_MAX_TARGET</code></p> <p>The <code>MEMORY_TARGET</code> parameter enables you to set a target memory size and the related initialization parameter, <code>MEMORY_MAX_TARGET</code>, sets a maximum target memory size. The database then tunes to the target memory size, redistributing memory as needed between the system global area (SGA) and aggregate program global area (PGA). Because the target memory initialization parameter is dynamic, you can change the target memory size at any time without restarting the database. The maximum memory size serves as an upper limit so that you cannot accidentally set the target memory size too high. Because certain SGA components either cannot easily shrink or must remain at a minimum size, the database also prevents you from setting the target memory size too low.</p> </li> </ul> </div> <!-- class="sect2" --> <a id="TDPDW00332" name="TDPDW00332"></a> <div class="sect2"><!-- infolevel="all" infotype="General" --><a id="sthref35" name="sthref35"></a> <h3><a name="TDPDW00332|Example: Setting an Initialization Parameter"></a>Example: Setting an Initialization Parameter</h3> <p>You can set an initialization parameter by issuing an <code>ALTER</code> <code>SYSTEM</code> statement, as follows:</p> <pre xml:space="preserve" class="oac_no_warn"> ALTER SYSTEM SET SGA_TARGET = 1024M; </pre></div> <!-- class="sect2" --> <a id="TDPDW00333" name="TDPDW00333"></a> <div class="sect2"><!-- infolevel="all" infotype="General" --><a id="sthref36" name="sthref36"></a> <h3><a name="TDPDW00333|What Initialization Parameter Settings are Important"></a>What Other Initialization Parameter Settings Are Important?</h3> <p>A good starting<a id="sthref37" name="sthref37"></a><a id="sthref38" name="sthref38"></a> point for a data warehouse is the data warehouse template database that you can select when you run the Database Configuration Assistant (DBCA). However, any database will be acceptable as long as you make sure you take the following initialization parameters into account:</p> <ul> <li> <p><code>COMPATIBLE</code></p> <p>The <code>COMPATIBLE</code> parameter identifies the level of compatibility that the database has with earlier releases. To benefit from the latest features, set the <code>COMPATIBLE</code> parameter to your database release number.</p> </li> <li> <p><code>OPTIMIZER_FEATURES_ENABLE</code></p> <p>To benefit from advanced cost-based optimizer features such as <a id="sthref39" name="sthref39"></a><a id="sthref40" name="sthref40"></a>query rewrite, make sure this parameter is set to the value of the current database version.</p> </li> <li> <p><code>DB_BLOCK_SIZE</code></p> <p>The default value of 8 KB is appropriate for most data warehousing needs. If you intend to use table compression, consider a larger block size.</p> </li> <li> <p><code>DB_FILE_MULTIBLOCK_READ_COUNT</code></p> <p>The <code>DB_FILE_MULTIBLOCK_READ_COUNT</code> parameter enables reading several database blocks in a single operating-system read call. Because a typical workload on a data warehouse consists of many sequential I/Os, make sure you can take advantage of fewer large I/Os as opposed to many small I/Os. When setting this parameter, take into account the block size as well as the maximum I/O size of the operating system, and use the following formula:</p> <ul> <li> <p><code>DB_FILE_MULTIBLOCK_READ_COUNT * DB_BLOCK_SIZE = <maximum operating system I/O size></code></p> </li> </ul> <p>Maximum operating-system I/O sizes vary between 64 KB and 1 MB.</p> </li> <li> <p><code>PARALLEL_MAX_SERVERS</code></p> <p>The <code>PARALLEL_MAX_SERVERS</code> parameter sets a resource limit on the maximum number of processes available for parallel execution. Parallel operations need at most twice the number of query server processes as the maximum degree of parallelism (DOP) attributed to any table in the operation.</p> <p>Oracle Database sets the <code>PARALLEL_MAX_SERVERS</code> parameter to a default value that is sufficient for most systems. The default value for <code>PARALLEL_MAX_SERVERS</code> is as follows:</p> <ul> <li> <p><code>(CPU_COUNT x PARALLEL_THREADS_PER_CPU x (2 if PGA_AGGREGATE_TARGET > 0; otherwise 1) x 5)</code></p> </li> </ul> <p>This might not be enough for parallel queries on tables with higher DOP attributes. Oracle recommends users who expect to run queries of higher DOP to set <code>PARALLEL_MAX_SERVERS</code> as follows:</p> <ul> <li> <p><code>2 x DOP x NUMBER_OF_CONCURRENT_USERS</code></p> </li> </ul> <p>For example, setting the <code>PARALLEL_MAX_SERVERS</code> parameter to 64 will allow you to run four parallel queries simultaneously, assuming that each query is using two slave sets with a DOP of eight for each set.</p> <p>If the hardware system is neither CPU-bound nor I/O bound, then you can increase the number of concurrent parallel execution users on the system by adding more query server processes. When the system becomes CPU- or I/O-bound, however, adding more concurrent users becomes detrimental to the overall performance. Careful setting of the <code>PARALLEL_MAX_SERVERS</code> parameter is an effective method of restricting the number of concurrent parallel operations.</p> </li> <li> <p><code>PARALLEL_ADAPTIVE_MULTI_USER</code></p> <p>The <code>PARALLEL_ADAPTIVE_MULTI_USER</code> parameter, which can be <code>TRUE</code> or <code>FALSE</code>, defines whether or not the server will use an algorithm to dynamically determine the degree of parallelism for a particular statement depending on the current workload. To take advantage of this feature, set <code>PARALLEL_ADAPTIVE_MULTI_USER</code> to <code>TRUE</code>.</p> </li> <li> <p><code>QUERY_REWRITE_ENABLED</code></p> <p>To take advantage of <a id="sthref41" name="sthref41"></a><a id="sthref42" name="sthref42"></a>query rewrite against materialized views, you must set this parameter to <code>TRUE</code>. This parameter defaults to <code>TRUE</code>.</p> </li> <li> <p><code>QUERY_REWRITE_INTEGRITY</code></p> <p>The default for the <code>QUERY_REWRITE_INTEGRITY</code> parameter is <code>ENFORCED</code>. The database will rewrite queries against only fully up-to-date materialized views, if it can base itself on enabled and validated primary, unique, and foreign key constraints.</p> <p>In <code>TRUSTED</code> mode, the optimizer trusts that the data in the materialized views is current and the hierarchical relationships declared in dimensions and <code>RELY</code> constraints are correct.</p> </li> <li> <p><code>STAR_TRANSFORMATION_ENABLED</code></p> <p>To take advantage of highly optimized star transformations, make sure to set this parameter to <code>TRUE</code>.</p> </li> </ul> </div> <!-- class="sect2" --></div> <!-- class="sect1" --> <p><a id="BABJEBDG" name="BABJEBDG"></a><a id="TDPDW00173" name="TDPDW00173"></a></p> <div class="sect1"><a id="BABEJIBH" name="BABEJIBH"></a> <h2><a name="TDPDW00173" id="TDPDW00173"></a>Accessing Oracle Warehouse Builder</h2> <p>Oracle Warehouse Builder (OWB) enables you to design and deploy various types of data management strategies, including traditional data warehouses.</p> <a id="sthref43" name="sthref43"></a> <p class="subhead2">To enable Warehouse Builder, complete the following steps:</p> <ol> <li> <p>Ensure that you have access to either Enterprise or Standard Edition of Oracle Database.</p> <p>Oracle Database 11<span class="italic">g</span> comes with Warehouse Builder server components pre-installed. This includes a schema for the Warehouse Builder repository.</p> </li> <li> <p>To utilize the default Warehouse Builder schema installed in Oracle Database, first unlock the schema as follows:</p> <p>Connect to SQL*Plus as the SYS or SYSDBA user. Enter the following commands:</p> <p><code>SQL> ALTER USER OWBSYS ACCOUNT UNLOCK;</code></p> <p><code>SQL> ALTER USER OWBSYS IDENTIFIED BY</code> owbsys_passwd;</p> </li> <li> <p>Launch the Warehouse Builder Design Center.</p> <p>For Windows, select <span class="bold">Start, Programs, Oracle, Warehouse Builder</span> and then select <span class="bold">Design Center.</span></p> <p>For UNIX, locate <code><span class="codeinlineitalic">owb home</span></code>/owb/bin/unix and then run <code>owbclient.sh</code></p> </li> <li> <p>Define a workspace and assign a user to the workspace.</p> <p>In the single Warehouse Builder repository, you can define multiple workspaces with each workspace corresponding to a set of users working on related projects. For instance, you could create a workspace for each of the following environments: development, test, and production.</p> <p>For simplicity, create one workspace called MY_WORKSPACE and assign a user.</p> <p>In the Design Center dialog box, click <span class="bold">Show Details</span> and then <span class="bold">Workspace Management.</span></p> <p>The Repository Assistant appears.</p> <p>Follow the prompts and accept the default settings in the Repository Assistant to create a workspace and assign a user as the workspace owner.</p> </li> <li> <p>Log in to the Design Center with the user name and password you created.</p> </li> </ol> <div class="helpinfonotealso"> <h2>Related Topics</h2> <a href="javascript:open('http://www.oracle.com/pls/db112/lookup?id=WBINS','newWindow').focus()"><span class="italic">Oracle Warehouse Builder Installation and Administration Guide for Windows and UNIX</span></a></div> <a id="BABBHBGF" name="BABBHBGF"></a><a id="TDPDW00174" name="TDPDW00174"></a> <div class="sect2"><a id="BABDCDFG" name="BABDCDFG"></a> <h3><a name="TDPDW00174" id="TDPDW00174"></a>Installing the Oracle Warehouse Builder Demonstration</h3> <p>In subsequent topics, this guide uses exercises from OBEs for Oracle Warehouse Builder to illustrate how to consolidate data from multiple flat file sources, transform the data, and load it into a new relational target. (OBE stands for Oracle By Example.)</p> <p>The exercises and examples are available on Oracle Technology Network (OTN) at <code><a href="http://www.oracle.com/technology/obe/admin/owb_main.htm">http://www.oracle.com/technology/obe/admin/owb_main.htm</a></code>. To facilitate your learning of the product, the demonstration provides you with flat file data and scripts that create various Warehouse Builder objects. The OBE pages provide additional information about OWB and the latest information about the exercises.</p> <a id="sthref44" name="sthref44"></a> <p class="subhead2">To perform the Warehouse Builder exercises presented in this guide, complete the following steps:</p> <ol> <li> <p>Download the demonstration. (The OTN</p> <ul> <li> <p>Navigate to the location for OWB examples, which is available on OTN from the following location:</p> <p><code><a href="http://www.oracle.com/technology/obe/admin/owb_main.htm">http://www.oracle.com/technology/obe/admin/owb_main.htm</a></code></p> </li> <li> <p>Click the link for the Oracle By Example (OBE) set for the latest release.</p> </li> </ul> <p>The demonstration comprises a set of files in a zip archive called <code>owbdemo_files.zip</code>.</p> <p>The zip archive includes a SQL script, two files in comma-separated values format, and scripts written in Tcl.</p> </li> <li> <p>(Optional) Download <code>xsales.zip</code> from the same link, which includes XSALES table data.</p> </li> <li> <p>Edit the script <code>owbdemoinit.tcl.</code></p> <p>The script <code>owbdemoinit.tcl</code> defines and sets variables used by the other tcl scripts. Edit the following variables to match the values in your computer environment:</p> <ul> <li> <p>set tempspace TEMP</p> </li> <li> <p>set owbclientpwd <code><span class="codeinlineitalic">workspace_owner</span></code></p> </li> <li> <p>set sysuser <code><span class="codeinlineitalic">sys</span></code></p> </li> <li> <p>set syspwd pwd</p> </li> <li> <p>set host hostname</p> </li> <li> <p>set port portnumber</p> </li> <li> <p>set service servicename</p> </li> <li> <p>set project <code>owb_project_name</code></p> </li> <li> <p>set owbclient <code><span class="codeinlineitalic">workspace_owner</span></code></p> </li> <li> <p>set sourcedir drive:/<code>newowbdemo</code></p> </li> <li> <p>set indexspace USERS</p> </li> <li> <p>set dataspace USERS</p> </li> <li> <p>set snapspace USERS</p> </li> <li> <p>set sqlpath <code><span class="codeinlineitalic">drive:/</span></code><code>oracle/11.1.0/db_1/BIN</code></p> </li> <li> <p>set sid servicename</p> </li> </ul> </li> <li> <p>Execute the Tcl scripts from the Warehouse Builder scripting utility, OMB Plus.</p> <p>For Windows, select <span class="bold">Start, Programs, Oracle, Warehouse Builder</span> and then select <span class="bold">OMB Plus</span>.</p> <p>For UNIX, locate <code><span class="codeinlineitalic">owb home</span></code>/owb/bin/unix and then execute <code>OMBPlus.sh</code></p> <p>At the OMB+> prompt, enter the following command to change to the directory containing the scripts:</p> <p><code>cd</code> <code><span class="codeinlineitalic">drive</span></code><code>:\\newowbdemo\\</code></p> <p>Run all of the Tcl scripts in desired sequence by entering the following command:</p> <p><code>source loadall.tcl</code></p> </li> <li> <p>Launch the Design Center and log in to it as the workspace owner, using the credentials you specified in the script <code>owbdemoinit.tcl</code>.</p> </li> <li> <p>Verify that you successfully set up the Warehouse Builder client to follow the demonstration.</p> <p>In the Design Center, expand the Locations node, which is on the right side and in the Connection Explorer. Expand <span class="bold">Databases</span> and then <span class="bold">Oracle</span>. The Oracle node should include the follow locations:</p> <p>OWB_REPOSITORY</p> <p>SALES_WH_LOCATION</p> </li> </ol> <p>When you successfully install the Warehouse Builder demonstration, the Design Center displays with an Oracle module named <code>EXPENSE_WH</code>.</p> </div> <!-- class="sect2" --></div> <!-- class="sect1" --> <!-- Start Footer --> <div class="footer"> <table class="simple oac_no_warn" summary="" cellspacing="0" cellpadding="0" width="100%"> <col width="86%" /> <col width="*" /> <tr> <td align="left"><span class="copyrightlogo">Copyright © 2007, 2009, Oracle and/or its affiliates. All rights reserved.</span><br /> <a href="./dcommon/html/cpyr.htm"><span class="copyrightlogo">Legal Notices</span></a></td> <td align="center"><a href="tdpdw_intro.htm"><img src="./dcommon/gifs/leftnav.gif" alt="Previous" /><br /> <span class="icon">Previous</span></a> </td> <td align="center"><a href="tdpdw_owb_sources.htm"><img src="./dcommon/gifs/rightnav.gif" alt="Next" /><br /> <span class="icon">Next</span></a></td> </tr> </table> </div> <!-- class="footer" --> </body> </html>
Ms-Dos/Windows
Unix
Write backup
jsp File Browser version 1.2 by
www.vonloesch.de