To optimize the creation of an index on a big table, there are several procedures you should take in consideration. This document provides a background about the elements that are involved during an index creation.

Also, a good document to read to understand informix tunning:

Informix tunning basics

1 Considerations

Some considerations to pay attention to:

  • LOG_INDEX_BUILDS: this onconfig parameter should be off to avoid a large consumption of logical log. That speeds up the creation process but consumes much more logical log files. This argument needs to be active when RSS replication is also present. Note that when LOG_INDEX_BUILDS is disabled, index building causes locks on the table, this means that if the table is in use there will be performance issues.
  • HDR: Whenever possible it is recommended to disable HDR replication and recreate it after index creation.
  • ER: Whenever possible it is recommended to disable ER replication and recreate it after index creation.
  • Database logging: Consider when possible to create the index without database transactions. This will have implications on replication which will need to be re-established.
  • Table in mode RAW: Another possiblity is to convert the table to RAW mode and create index on it. While in RAW mode the table cannot be used. This will have implications on replication which will need to be re-established.

2 Calculate index size

First of all, you should have a proper location (dbspace) with enought free space to store the index. So you need to anticipate index size and prepare a location (dbspace).

You can estimate the size of an R-tree index in pages by performing a series of calculations based on the number of rows in the table.

  1. Determine the size, in bytes, of each index entry in the leaf page with the following formula that incorporates the overhead: $$ leafentrysize = \sum_{colsizes} + 5 $$
    For data types other than VARCHAR, the length of an index item is calculated by adding the length of the key value plus 5 bytes for each rowid information associated with the key value.
  2. Estimate the number of entries per index-leaf page with the following formula: $$ leafpagents = \frac{pagesize}{leafentrysize} * \frac{60}{100} $$

    The value leafpagents is multiplied by 60 % because index leaf pages are usually just over half full.

  3. Estimate the number of leaf pages need with the following formula $$ leaves = \frac{rows}{leafpagents} $$ and finally $$ leaves = \frac{rows}{\frac{pagesize}{leafentrysize} * \frac{60}{100}} $$

Remember you should indicate a location during index creation or it will placed in rootdbs.

3 Index fill factor

On a very large index use the FILLFACTOR option to specify the degree of index-page fullness.

3.1 Providing a Low Percentage Value

If you provide a low percentage value, such as 50, you allow room for growth in your index. The nodes of the index initially fill to a certain percentage and contain space for inserts. The amount of available space depends on the number of keys in each page as well as the percentage value.

For example, with a 50-percent FILLFACTOR value, the page would be half full and could accommodate doubling in size. A low percentage value can result in faster inserts and can be used for indexes that you expect to grow.

3.2 Providing a High Percentage Value

If you provide a high percentage value, such as 99, indexes are compacted, and any new index inserts result in splitting nodes. The maximum density is 100 percent. With a 100-percent FILLFACTOR value, the index has no room available for growth; any addition to the index results in splitting the nodes.

A 99-percent FILLFACTOR value allows room for at least one insertion per node. A high percentage value can result in faster queries and is appropriate for indexes that you do not expect to grow, or for mostly read-only indexes.

4 Verify temporary space

To estimate the amount of temporary space needed for an index build, perform the following steps:

  1. Add the total widths of the indexed columns or returned values from user-defined functions. This value is referred to as colsize.
  2. Estimate the size of a typical item to sort with one of the following formulas, depending on whether the index is attached or not:
    • For a nonfragmented table and a fragmented table with an index created without an explicit fragmentation strategy, use the following formula: $$ \texttt{sizeof_sort_item} = keysize + 4 $$
    • For fragmented tables with the index explicitly fragmented, use the following formula: $$ \texttt{sizeof_sort_item} = keysize + 8 $$
  3. Estimate the number of bytes needed to sort with the following formula: $$ \texttt{temp_bytes} = 2 * (\texttt{rows} * \texttt{sizeof_sort_item}) $$

This formula uses the factor 2 because everything is stored twice when intermediate sort runs use temporary space. Intermediate sort runs occur when not enough memory exists to perform the entire sort in memory.

If need, use the onspaces -t utility to create large temporary dbspaces and specify them in the DBSPACETEMP configuration parameter or the DBSPACETEMP environment variable.

4.1 Determine temp space available

It is important to verify you have enought pages avaiable to sort the index in temp space.

You can use onstat -d to look for temporary spaces and sum it's pages. Or you can use the following SQL commands.

Copy
SELECT   sum(syschunks.chksize) pages, 
         format_units(sum(syschunks.chksize * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) total,
         format_units(sum(syschunks.nfree   * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) free,
         format_units(sum(syschunks.chksize * (SELECT sh_pagesize FROM sysshmvals)) -
                                  sum(syschunks.nfree   * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) used,
         round (100 - ((sum(syschunks.nfree)) / (sum(syschunks.chksize)) * 100), 2) pct_used
    FROM sysdbspaces,syschunks
   WHERE sysdbspaces.dbsnum = syschunks.dbsnum
     AND sysdbspaces.is_temp = 1
     AND sysdbspaces.is_sbspace = 0
pages total      free       used               pct_used

         8192000 15.6 GB    13.5 GB    2.10 GB               13,42

To see all temporary dbspaces in detail

Copy
SELECT              
         sysdbspaces.name[1,15] name,
         sum(syschunks.chksize) pages,
         format_units(sum(syschunks.chksize * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) total,
         format_units(sum(syschunks.nfree   * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) free,
         format_units(sum(syschunks.chksize * (SELECT sh_pagesize FROM sysshmvals)) -
                                  sum(syschunks.nfree   * (SELECT sh_pagesize FROM sysshmvals)))::CHAR(10) used,
         round (100 - ((sum(syschunks.nfree)) / (sum(syschunks.chksize)) * 100), 1)::DECIMAL(5,1) pct_used
    FROM sysdbspaces,syschunks
   WHERE sysdbspaces.dbsnum = syschunks.dbsnum
     AND sysdbspaces.is_sbspace = 0
     AND sysdbspaces.is_temp = 1
GROUP BY 1
ORDER BY 1
name                       pages total      free       used       pct_used
         
t_tmp1                   2048000 3.91 GB    3.38 GB    537 MB         13,4
t_tmp2                   2048000 3.91 GB    3.38 GB    537 MB         13,4
t_tmp3                   2048000 3.91 GB    3.38 GB    537 MB         13,4
t_tmp4                   2048000 3.91 GB    3.38 GB    537 MB         13,4

If you don't have enought temp space you may see the following message on Informix logs.

Copy
$ onstat -m
20:51:33  WARNING: Not enough temp space for parallel index build.
          Space required = 18750290 pages; space available = 8163188 pages.
          Partial index build started.

5 Prevent long transaction

If your database uses logging, you should ensure to have enogth logical logs available index creation (including other database server activity).

You can determine your log size capacity by looking the column total_logsize_mb (capacity in megabytes) and total_logsize_pg (number of pages) on the following SQL statement from sysmaster database.

Copy
DATABASE sysmaster;
-- log space is either used by ongoing transactions or because certain logs have not been backed up yet (not news)
SELECT "Used", ROUND(SUM(size * (select sh_pagesize from sysshmvals)/1024/1024),2) mb
  FROM syslogs
 WHERE uniqid >=
       (SELECT min(tx_logbeg) FROM  systrans WHERE tx_logbeg > 0)
    OR (is_backed_up = 0 and is_new=0)
UNION ALL
-- total log space
SELECT "Total", ROUND(SUM(size * (select sh_pagesize from sysshmvals)/1024/1024), 2)
  FROM syslogs;
used                  97,66
free                8789,06

6 Paralellization

Whenever possible, the database server uses parallel processing to improve the response time of index builds. There are three configuration variables to take in consideration:

  • The number of processors available for database server operations are defined in VPCLASS.
  • The number of parallel processes is based on the number of fragments in the index (and the value of the PSORT_NPROCS environment variable).
  • The PDQPRIORITY value of degree of parallelism used in the SQL session before index creation.

6.1 VPCLASS

The cpu and num options of the VPCLASS parameter specifies the number of CPU VPs that the database server brings up initially. Do not allocate more CPU VPs than there are CPUs available to service them.

For multiprocessor systems with four or more CPUs that are primarily used as database servers, it is recommended that you set the VPCLASS num option to one less than the total number of processors. For example, if you have four CPUs, use the following specification:

Copy
VPCLASS cpu,num=3

To see cpu's configure in your instance use (in the example, 8 CPU VP's + 2 Java VP are configured):

Copy
onstat -g sch
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line -- Up 00:31:25 -- 7768612 Kbytes

VP Scheduler Statistics:
 vp    pid       class       semops    busy waits  spins/wait  bsy lspins
 1     15334     cpu         25        39          8820        0         
 2     15335     adm         0         0           0           0         
 3     15336     lio         2         0           0           0         
 4     15337     pio         2         0           0           0         
 5     15338     aio         171       0           0           0         
 6     15339     msc         5         0           0           0         
 7     15340     fifo        2         0           0           0         
 8     15341     cpu         2054      2071        9948        0         
 9     15342     cpu         2220      2234        9959        0         
 10    15343     cpu         1929      1947        9944        0         
 11    15344     cpu         1938      1956        9950        0         
 12    15345     cpu         1911      1998        9668        0         
 13    15346     cpu         1893      1909        9947        0         
 14    15347     cpu         1953      1985        9894        0         
 15    15348     jvp         1         1           10001       0         
 16    15349     jvp         1         1           10001       0         
 17    15350     soc         2         2           10000       0         
 18    15351     aio         7         0           0           0         
 19    15352     aio         7         0           0           0         
 20    15353     aio         6         0           0           0         
 21    15354     aio         5         0           0           0 
 ...

6.2 PSORT_NPROCS

The PSORT_NPROCS environment variable enables the database server to improve the performance of the parallel-process sorting package by allocating more threads for sorting.

Setup PSORT_NPROCS in linux shell before starting dbaccess command to the number of CPU cores minus 2 (leave at least 2 CPU cores for regular operations). On a 8 CPU cores, we will set tot 6.

Copy
export PSORT_NPROCS=6

6.3 PDQPRIORITY

The PDQPRIORITY environment variable determines the degree of parallelism that the database server uses and affects how the database server allocates resources, including memory, processors, and disk reads.

Is an integer in the range 0 to 100. The value 1 is the same as LOW, and 100 is the same as HIGH. Values lower than 0 are set to 0 (OFF), and values greater than 100 are set to 100 (HIGH).

If you set PDQPRIORITY to a positive value, you enable parallel processing overall and the Memory Grant Manager and parallel query manager determine the number of sort threads in an "optimal" way based on two sort threads per sort with the number of sorts equal to the number of partitions in the index for an index build. PDQPRIORITY overrides the PSORT_NPROCS setting.

The percentage of resources represented by the session's effective PDQPRIORITY (throttled by MAX_PDQPRIORITY) is used to determine the amount of memory available for in-memory sorting during index builds and update statistics runs.

This option is only available on Informix Enterprise Edition.

7 Replication impact

If replication is enabled and database is logged, index creation will be propagated to the replica. This may affect performance.

The LOG_INDEX_BUILDS configuration parameter enables or disables index page logging.

7.1 HDR disabled

You should disable LOG_INDEX_BUILDS (which you can do using onmode -wm) until after the index is built

Copy
$ onmode -wf LOG_INDEX_BUILDS=0
Current value for LOG_INDEX_BUILDS (0) was saved in config file.

7.2 HDR enabled

With the default settings (LOG_INDEX_BUILDS=0), once an index has been created on a primary server, it is sent directly (not via logical logs) afterwards to the HDR secondary server with text appended to the Informix message logs as follows:

Copy
Primary
	HH:MM:SS  DR: Sending index database:"owner".table#index : Started
	HH:MM:SS  DR: Sending index database:"owner".table#index : Completed.
	
Secondary
	HH:MM:SS  DR: Receiving index database:"owner".table#index : Started
	HH:MM:SS  DR: Receiving index database:"owner".table#index : Completed.

If the index is large, this can take some time. The table is meanwhile exclusively locked on both primary and secondary, even though control has already been returned to the calling application. Unless the table is empty, if it tries to use the table right away, even to create a second index, it will most likely fail with:

Copy
243: Count not position within a table (owner.table)
113: ISAM error: the file is locked.

A solution is set LOG_INDEX_BUILDS=1 so that new pages are instead passed to the secondary via the logical logs when indexes are created on the primary:

Copy
$ onmode -wf LOG_INDEX_BUILDS=1
Current value for LOG_INDEX_BUILDS (1) was saved in config file.

With the new setting in force, a table is no longer locked and is immediately available for use after an index has been built. You can check whether it is currently enabled as in the example below:

Copy
$ onstat -g ipl

IBM Informix Dynamic Server Version ...
Index page logging status: Enabled
Index page logging was enabled at: 2017/08/29 16:52:22

7.3 Special caveats

  • The documentation advises: "If LOG_INDEX_BUILDS is enabled, logical log file space consumption will increase, depending on the size of the indexes".
  • It will take longer to create an index due to the logging overhead.
  • Most importantly, a large index build might fail with "Long transaction aborted" if the database engine decides it has consuming too big a proportion of the total log space available.

Experienced DBAs therefore temporarily disable this parameter during outages for major schema changes with replication stopped.

Note that index page logging is mandatory for the alternative Informix Remote Secondary Server (RSS) replication method.

8 Memory considerations

Index creation will user memory from the shared memory pool (resident portion, not buffers).

8.1 Estimating memory needed for sorting

To calculate the amount of virtual shared memory that the database server might need for sorting, estimate the maximum number of sorts that might occur concurrently and multiply that number by the average number of rows and the average row size.

For example, if you estimate that 30 sorts could occur concurrently, the average row size is 200 bytes, and the average number of rows in a table is 400, you can estimate the amount of shared memory that the database server needs for sorting as follows: 30 sorts * 200 bytes * 400 rows = 2,400,000 bytes

You can use the DS_NONPDQ_QUERY_MEM configuration parameter to configure the amount sort memory available for non-PDQ queries. Important

You can only use this parameter if the PDQ priority is set to zero. Its setting has no effect if the PDQ priority is greater than zero.

The minimum and default value of DS_NONPDQ_QUERY_MEM is 128 kilobytes. The maximum supported value is 25 percent of DS_TOTAL_MEMORY.

If the PDQ priority is greater than 0, the maximum amount of shared memory that the database server allocates for a sort is controlled by the memory grant manager (MGM). The MGM uses the settings of PDQ priority and the following configuration parameters to determine how much memory to grant for the sort:

  • DS_TOTAL_MEMORY
  • DS_MAX_QUERIES
  • MAX_PDQPRIORITY

8.2 BUFFERPOOL

Even not used directly for the indexing process, the BUFFERPOOL is used for reading data from indexed table.

Ensure to setup a BUFFERPOOL entry for the page size. On large index build you may change to 16K page size and default BUFFERPOOL is normally tunned for 2K page sizes.

Starting with 12.10.xC3 the BUFFERPOOL configuration parameter has now two formats:

  1. Use the BUFFERPOOL configuration parameter with the memory field if you want to specify the size of your buffer pool in units of memory like KB, MB or GB.
  2. Use the BUFFERPOOL configuration parameter with the buffers field if you want to specify the size of your buffer pool in units of pages, or to retain settings from a previous release.

Copy
BUFFERPOOL      size=2K, start_memory=2gb,memory=4gb,extendable=1,cache_hit_ratio=98,lrus=8,lru_min_dirty=50,lru_max_dirty=60
BUFFERPOOL      size=16K,start_memory=2gb,memory=6gb,extendable=1,cache_hit_ratio=98,lrus=8,lru_min_dirty=50,lru_max_dirty=60

8.3 Memory Grant Manager

The Memory Grant Manager (MGM) is a database server component that coordinates the use of memory, CPU virtual processors (VPs), disk I/O, and scan threads among decision-support queries. The MGM uses the DS_MAX_QUERIES, DS_TOTAL_MEMORY, DS_MAX_SCANS, and MAX_PDQPRIORITY configuration parameters to determine the quantity of these PDQ resources that can be granted to a decision-support query.

The MGM dynamically allocates the following resources for decision-support queries:

  • The number of scan threads that are started for each decision-support query
  • The number of threads that can be started for each query
  • The amount of memory in the virtual portion of database server shared memory that the query can reserve

The MGM grants memory to a query for such activities as sorts, hash joins, and processing of GROUP BY clauses. The amount of memory that decision-support queries use cannot exceed DS_TOTAL_MEMORY. In our case, the MGM may be used by index sort operation.

The amount of memory that decision-support queries use cannot exceed DS_TOTAL_MEMORY. The MGM grants memory to queries in quantum increments. To calculate the approximate size of the quantum, use the following formula: $$ \texttt{memory quantum} = \frac{\texttt{DS_TOTAL_MEMORY}}{\texttt{DS_MAX_QUERIES}} $$

For example, if DS_TOTAL_MEMORY is 12 MB and DS_MAX_QUERIES is 4, the quantum is 3 MB (12/4). Thus, with these values in effect, a quantum of memory equals 3 MB. The database server can adjust the size of the quantum dynamically when it grants memory. In general, memory is allocated more efficiently when quanta are smaller. You can often improve performance of concurrent queries by increasing DS_MAX_QUERIES to reduce the size of a quantum of memory. In constrast, for low concurrency decrease DS_MAX_QUERIES to have a bigger quantum for each query.

To monitor resources that the MGM allocates, run the onstat -g mgm command. This command shows only the amount of memory that is used; it does not show the amount of memory that is granted.

The MGM also grants a maximum number of scan threads per query that is based on the values of the DS_MAX_SCANS and the DS_MAX_QUERIES parameters.

The following formula yields the maximum number of scan threads per query: $$ \texttt{scan_threads} = min (nfrags, \texttt{DS_MAX_SCANS} * (\frac{pdqpriority}{100}) * (\frac{\texttt{MAX_PDQPRIORITY}}{100})) $$

You can monitor MGM by using onstat -g mgm command:

Copy
$ onstat -g mgm
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 17:33:49 -- 7881924 Kbytes

Memory Grant Manager (MGM) 
--------------------------

MAX_PDQPRIORITY:  100
DS_MAX_QUERIES:    1536
DS_MAX_SCANS:      1048576
DS_NONPDQ_QUERY_MEM: 256 KB
DS_TOTAL_MEMORY:   1024000 KB

Queries:   Active     Ready   Maximum
                0         0      1536 

Memory:     Total      Free   Quantum 
(KB)       1024000   1024000       664 

Scans:      Total      Free   Quantum 
           1048576   1048576         1 

Load Control:    (Memory)      (Scans)  (Priority)  (Max Queries)   (Reinit)
                   Gate 1       Gate 2      Gate 3         Gate 4     Gate 5
(Queue Length)          0            0           0              0          0

Active Queries:  None 

Ready Queries:  None 

Free Resource        Average #        Minimum # 
--------------    ---------------     --------- 
Memory             512.0 +- 0.0           512
Scans             1048575.5 +- 0.0          1048575 

Queries              Average #        Maximum #    Total #
--------------    ---------------     ---------    -------
Active               1.0 +- 0.0             1        107
Ready                0.0 +- 0.0             0          0

Resource/Lock Cycle Prevention count:  0

TO DO

This section is incomplete and will be concluded as soon as possible.

8.4 Give maximum memory for index build

TO DO

This section is incomplete and will be concluded as soon as possible.

9 Fragmentation

When index is fragmented by range, it's not parelized. Only one thread is used.

Copy
CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber)
  FRAGMENT BY RANGE(l_orderkey) 
            INTERVAL(100000) 
            STORE IN (d_cons, d_data, d_auto)
  PARTITION lineitem_pk_frag VALUES < 1 IN d_cons

To solve it, use FRAGMENT BY EXPRESSION instead. By using this fragmention, the index is created using paralellization.

Copy
CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) 
    FRAGMENT BY EXPRESSION 
    (l_orderkey between       0 and 1000000) in d_data,
    (l_orderkey between 1000000 and 2000000) in d_auto,
    REMAINDER IN d_cons

Or use FRAGMENT BY ROUND ROBIN.

TO DO

This section is incomplete and will be concluded as soon as possible.

10 Concurrency

By default, CREATE INDEX attempts to place an exclusive lock on the indexed table to prevent all other users from accessing the table while the index is being created. The CREATE INDEX statement fails if another user already has a lock on the table, or is currently accessing the table at the Dirty Read isolation level

10.1 Concurrent index creation

You can use the CREATE INDEX ONLINE and DROP INDEX ONLINE statements to create and drop an index in an online environment, when the database and its associated tables are continuously available.

The CREATE INDEX ONLINE statement enables you to create an index without having an exclusive lock placed over the table during the duration of the index build. You can use the CREATE INDEX ONLINE statement even when reads or updates are occurring on the table. This means index creation can begin immediately.

When you create an index online, the database server logs the operation with a flag, so data recovery and restore operations can recreate the index. When you create an index online, you may use the ONLIDX_MAXMEM configuration parameter to limit the amount of memory that is allocated to the preimage log pool and to the updator log pool in shared memory. You might want to do this step if you plan to complete other operations on a table column while executing the CREATE INDEX ONLINE statement on the column.

The advantages of creating indexes using the CREATE INDEX ONLINE statement are:

  • If a new index is necessary to improve the performance of queries on a table, you can immediately create the index without a lock placed over the table.
  • The database server can create an index while a table is being updated.
  • The table is available for the duration of the index build. The query optimizer can establish better query plans because the optimizer can update statistics in unlocked tables.

The advantages of dropping indexes using the DROP INDEX ONLINE statement are:

  • You can drop an inefficient index without disturbing ongoing queries that are using that index.
  • When the index is flagged, the query optimizer will not use the index for new SELECT operations on tables.

If you initiate a DROP INDEX ONLINE statement for a table that is being updated, the operation does not occur until after the table update is completed. After you issue the DROP INDEX ONLINE statement, no one can reference the index, but concurrent operations can use the index until the operations terminate. The database server waits to drop the index until all users have finished accessing the index.

An example of creating an index in an online environment is:

Copy
CREATE INDEX idx_1 ON table1(col1) ONLINE

An example of dropping an index in an online environment is:

Copy
DROP INDEX idx_1 ONLINE

11 Sample case

We will use the TPCH database as a target environment to test index creation using scale factors 1, 10, 100, 2000 and 1000.

The lineitem table as the largest numbers of rows (60M at scale factor 1) and has a primary key based on columns l_orderkey, l_linenumber. As both columns are integers we can expect index leaf size is 4+4 = 8 bytes.

Copy
SELECT sum(collength)
  FROM syscolumns, systables
 WHERE syscolumns.tabid = systables.tabid
   AND systables.tabname = "lineitem"
   AND syscolumns.colname in ("l_orderkey", "l_linenumber")
8

Use the formula aplying the SQL using 8 as the index leaf size (plus 5).

Copy
SELECT trunc(sh_pagesize / (8 + 5)) * 60/100 FROM sysmaster:sysshmvals;
94.2

$$ \frac{60000000}{94.2} = 636942\texttt{ 2K pages} $$

So you need to place index in a dbspace with at least 636942 free pages.

11.1 Informix Enterprise Edition

if you are running Informix Enterprise edition you can use paralellization. To see how paralellization works to speedup index creation, we will test diferent cases.

Setup server to accept MAX_PDQPRIORITY=100

Copy
onmode -wf MAX_PDQPRIORITY=100
Current value for MAX_PDQPRIORITY (100) was saved in config file.

11.1.1 PDQPRIORITY = 0

First, we can test index creation with no PDQPRIORITY and no PSORT_NPROCS environment variable defined.

If PDQPRIORITY is zero, index builds use DS_NONPDQ_QUERY_MEM to determine how much memory to use for in-memory sorting. This can be up to 25% of DS_TOTAL_MEMORY.

Copy
SET PDQPRIORITY 0;
SELECT sid FROM sysmaster:syssessions WHERE sid = DBINFO('sessionid');
CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in DBSPACE;
time = 2m6.477s

We can see the session is running with two SORT processors.

Copy
onstatt -g ses 5301
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 12:00:08 -- 7881924 Kbytes

session           effective                            #RSAM    total      used       dynamic 
id       user     user      tty      pid      hostname threads  memory     memory     explain 
5301     informix -         1        9995     dbsrv3   4        397312     347040     off 

Program :
/home/informix/bin/dbaccess

tid      name     rstcb            flags    curstk   status
5407     sqlexec  576b1398         Y-BP---  20816    cond wait  opened_up -
5408     mb_colle 576a8ef0         --B----  1408     sleeping secs: 1     -
5409     xchg_1.0 576ad5b0         Y-B----  944      cond wait  opened_up -
5410     xchg_2.0 576c1410         --B----  3328     running-

Memory pools    count 3
name         class addr              totalsize  freesize   #allocfrag #freefrag 
5301         V     8e384040         208896     7816       200        9         
5301_SORT_0  V     8e404040         155648     15984      12         4         
5301_SORT_0  V     8e43a040         32768      26472      30         11        

name           free       used           name           free       used      
...

sqscb info
scb              sqscb            optofc   pdqpriority optcompind  directives
581211c0         8e377028         0        0           0           1         

Sess       SQL            Current            Iso Lock       SQL  ISAM F.E. 
Id         Stmt type      Database           Lvl Mode       ERR  ERR  Vers  Explain    
5301       CREATE INDEX   tpch               CR  Not Wait   0    0    9.24  Off        

Current SQL statement (4) :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in
    d_cons

Last parsed SQL statement :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in
    d_cons

And no memory is used. Notice that server is configured with:

Parameter Value
MAX_PDQPRIORITY 100
DS_MAX_QUERIES 1536
DS_MAX_SCANS 1048576 (1024*1024)
DS_NONPDQ_QUERY_MEM 256 KB
DS_TOTAL_MEMORY 1024000 KB (1 GB)

Copy
onstat -g mgm
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 12:09:39 -- 7881924 Kbytes

Memory Grant Manager (MGM) 
--------------------------

MAX_PDQPRIORITY:  100
DS_MAX_QUERIES:    1536
DS_MAX_SCANS:      1048576
DS_NONPDQ_QUERY_MEM: 256 KB
DS_TOTAL_MEMORY:   1024000 KB

Queries:   Active     Ready   Maximum
                0         0      1536 

Memory:     Total      Free   Quantum 
(KB)       1024000   1024000       664 

Scans:      Total      Free   Quantum 
           1048576   1048576         1 

Load Control:    (Memory)      (Scans)  (Priority)  (Max Queries)   (Reinit)
                   Gate 1       Gate 2      Gate 3         Gate 4     Gate 5
(Queue Length)          0            0           0              0          0

Active Queries:  None 

Ready Queries:  None 

Free Resource        Average #        Minimum # 
--------------    ---------------     --------- 
Memory             512.0 +- 0.0           512
Scans             1048575.0 +- 0.0          1048575 

Queries              Average #        Maximum #    Total #
--------------    ---------------     ---------    -------
Active               1.0 +- 0.0             1          2
Ready                0.0 +- 0.0             0          0

Resource/Lock Cycle Prevention count:  0

11.1.2 PDQPRIORITY=100

If you set PDQPRIORITY to a positive value, you enable parallel processing overall and the Memory Grant Manager and parallel query manager determine the number of sort threads in an "optimal" way based on two sort threads per sort with the number of sorts equal to the number of partitions in the index for an index build.

PSORT_NPROCS not set

Run index creation with default server PSORT_NPROCS selection.

  1. Remove PSORT_NPROCS system variable.
    Copy
    $ unset PSORT_NPROCS
  2. Run dbaccess command
    Copy
    SET PDQPRIORITY 100;
    SELECT sid FROM sysmaster:syssessions WHERE sid = DBINFO('sessionid');
    CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in d_cons;
    time = 0m46.380s

Checking session activity we can see backend psortpro threads 0 to 1 twice and one running.

Copy
$ onstat -g ses 5470
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 12:17:52 -- 7881924 Kbytes

session           effective                            #RSAM    total      used       dynamic 
id       user     user      tty      pid      hostname threads  memory     memory     explain 
5470     informix -         1        14231    dbsrv3   13       5480448    5409648    off 

Program :
/home/informix/bin/dbaccess

tid      name     rstcb            flags    curstk   status
5614     sqlexec  576b2e20         Y-BP---  20816    cond wait  opened_up -
5615     mb_colle 576b1398         --B----  1408     sleeping secs: 1     -
5616     xchg_1.0 576a8ef0         Y-B----  944      cond wait  opened_up -
5617     xchg_2.0 576ad5b0         Y-B----  832      cond wait  packet_con-
5618     xchg_2.1 576c1410         Y-B----  832      cond wait  packet_con-
5619     xchg_2.2 576b2548         Y-B----  832      cond wait  packet_con-
5620     xchg_2.3 576b36f8         Y-B----  832      cond wait  packet_con-
5621     xchg_3.0 576b3fd0         --B-R--  1008     ready-
5622     psortpro 576bd628         Y-B----  944      cond wait  backend:0 -
5623     psortpro 576bcd50         Y-B----  944      cond wait  backend:1 -
5624     psortpro 576bc478         Y-B----  944      cond wait  backend:0 -
5625     psortpro 576bbba0         Y-B----  944      cond wait  backend:1 -
5626     psortpro 576bb2c8         --B----  1616     running-

Memory pools    count 7
name         class addr              totalsize  freesize   #allocfrag #freefrag 
5470         V     8e3f4040         495616     33480      339        21        
5470*O0      V     ade11040         4096       768        1          1         
5470_SORT_0  V     ade0f040         81920      8848       8          3         
5470_SORT_0  V     8e3e9040         1708032    3136       15080      1         
5470_SORT_0  V     ade03040         204800     20312      12         6         
5470_SORT_0  V     c3c9e040         1708032    3136       15080      1         
5470_SORT_0  V     adfa6040         1277952    480        10576      3         

name           free       used           name           free       used      
....

sqscb info
scb              sqscb            optofc   pdqpriority optcompind  directives
5807f1c0         8dec0028         0        100         0           1         

Sess       SQL            Current            Iso Lock       SQL  ISAM F.E. 
Id         Stmt type      Database           Lvl Mode       ERR  ERR  Vers  Explain    
5470       CREATE INDEX   tpch               CR  Not Wait   0    0    9.24  Off        

Current SQL statement (5) :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in
    d_cons

Last parsed SQL statement :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in d_cons

PSORT_NPROCS=6

Run index creation with PSORT_NPROCS=6.

  1. Set environment variable PSORT_NPROCS before starting dbaccess.
    Copy
    $ export PSORT_NPROCS=6
  2. Run dbaccess command
    Copy
    SET PDQPRIORITY 100;
    SELECT sid FROM sysmaster:syssessions WHERE sid = DBINFO('sessionid');
    CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in d_cons;
    time = 0m38.686s

Checking session activity we can see psortpro backend threads 0 to 5 (twice) and two running.

Copy
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 12:29:18 -- 7881924 Kbytes

session           effective                            #RSAM    total      used       dynamic 
id       user     user      tty      pid      hostname threads  memory     memory     explain 
5512     informix -         1        15665    dbsrv3   22       6144000    6051808    off 

Program :
/home/informix/bin/dbaccess

tid      name     rstcb            flags    curstk   status
5671     sqlexec  576b1c70         Y-BP---  20816    cond wait  opened_up -
5672     mb_colle 576b2e20         --B----  1408     sleeping secs: 1     -
5673     xchg_1.0 576b1398         Y-B----  944      cond wait  opened_up -
5674     xchg_2.0 576a8ef0         Y-B----  832      cond wait  packet_con-
5675     xchg_2.1 576ad5b0         Y-B----  832      cond wait  packet_con-
5676     xchg_2.2 576c1410         Y-B----  832      cond wait  packet_con-
5677     xchg_2.3 576b2548         Y-B----  832      cond wait  packet_con-
5678     xchg_3.0 576b36f8         --B-R--  1664     ready-
5680     psortpro 576bbba0         Y-B----  944      cond wait  backend:0 -
5682     psortpro 576bc478         Y-B----  944      cond wait  backend:1 -
5683     psortpro 576ba118         Y-B----  944      cond wait  backend:2 -
5685     psortpro 576ba9f0         Y-B----  944      cond wait  backend:3 -
5686     psortpro 576bcd50         Y-B----  944      cond wait  backend:4 -
5687     psortpro 576b9840         Y-B----  944      cond wait  backend:5 -
5692     psortpro 576bb2c8         Y-B----  944      cond wait  backend:0 -
5694     psortpro 576bd628         Y-B----  944      cond wait  backend:1 -
5695     psortpro 576b3fd0         Y-B----  944      cond wait  backend:2 -
5696     psortpro 576b48a8         Y-B----  944      cond wait  backend:3 -
5698     psortpro 576b74e0         Y-B----  944      cond wait  backend:4 -
5699     psortpro 576b8f68         Y-B----  944      cond wait  backend:5 -
5701     psortpro 576b6330         --B----  1616     running-
5703     psortpro 576c0b38         --B----  1616     running-

Memory pools    count 7
name         class addr              totalsize  freesize   #allocfrag #freefrag 
5512         V     ae0a9040         749568     54344      413        40        
5512*O0      V     8dd8f040         4096       768        1          1         
5512_SORT_0  V     ade55040         155648     15984      12         4         
5512_SORT_0  V     8e3eb040         2097152    1328       15032      1         
5512_SORT_0  V     c3c9f040         868352     432        5230       4         
5512_SORT_0  V     8e458040         172032     17072      12         6         
5512_SORT_0  V     ae03d040         2097152    1544       15029      1         

name           free       used           name           free       used      
...

sqscb info
scb              sqscb            optofc   pdqpriority optcompind  directives
580491c0         8dff4028         0        100         0           1         

Sess       SQL            Current            Iso Lock       SQL  ISAM F.E. 
Id         Stmt type      Database           Lvl Mode       ERR  ERR  Vers  Explain    
5512       CREATE INDEX   tpch               CR  Not Wait   0    0    9.24  Off        

Current SQL statement (5) :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in
    d_cons

Last parsed SQL statement :
  CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in
    d_cons

DSS Memory

The Memory Grant Manager is using memory in both cases. You can see the Quantum is 664 and it the result of dividing DS_TOTAL_MEMORY by DS_MAX_QUERIES.

Copy
$ onstat -g mgm
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 12:17:45 -- 7881924 Kbytes

Memory Grant Manager (MGM) 
--------------------------

MAX_PDQPRIORITY:  100
DS_MAX_QUERIES:    1536
DS_MAX_SCANS:      1048576
DS_NONPDQ_QUERY_MEM: 256 KB
DS_TOTAL_MEMORY:   1024000 KB

Queries:   Active     Ready   Maximum
                1         0      1536 

Memory:     Total      Free   Quantum 
(KB)       1024000      4096       664 

Scans:      Total      Free   Quantum 
           1048576   1048575         1 

Load Control:    (Memory)      (Scans)  (Priority)  (Max Queries)   (Reinit)
                   Gate 1       Gate 2      Gate 3         Gate 4     Gate 5
(Queue Length)          0            0           0              0          0

Active Queries:  
---------------
Session          Query  Priority         Thread   Memory  Scans      Gate 
   5470       8e3ff8f0    100          8e3f2028    0/127488    0/1       - 

Ready Queries:  None 

Free Resource        Average #        Minimum # 
--------------    ---------------     --------- 
Memory             512.0 +- 0.0           512
Scans             1048575.2 +- 0.0          1048575 

Queries              Average #        Maximum #    Total #
--------------    ---------------     ---------    -------
Active               1.0 +- 0.0             1          4
Ready                0.0 +- 0.0             0          0

Resource/Lock Cycle Prevention count:  0

11.2 Informix Workgroup Edition

With Workgroup edition you can not increase the degree of parallelism using PDQPRIORITY. If you do not or cannot set PDQPRIORITY to a positive value the setting PSORT_NPROCS is the only way to get multiple sort threads running.

11.2.1 DS_NONPDQ_QUERY_MEM

Use the DS_NONPDQ_QUERY_MEM configuration parameter to increase the amount of memory that is available for a query that is not a Parallel Database Query (PDQ).

12 Monitor activity

Once index creation is executed, you can monitor it's progress.

12.1 Monitor session environment

You can monitor a session id to get it's current environment parameters. This could be useful, for example, to ensure we have properly set a variable.

Copy
onstat -g env SESSIONID | grep PSORT
PSORT_NPROCS        6

12.2 Monitor session activity

You can determine the session and it's activity using onstat.

Copy
$ onstat -g ses 6
...

12.3 Monitor memory grant manager

You can the activity in memory grant manager to see PDQ usage.

Copy
$ onstat -g mgm
...

12.4 Monitor logical log usage

  1. Force a logical log backup to monitor logical log consumption. And inmeditatelly show logical logs.
    Monitor logical log usage will only be reliable if other transaction activity is low compared with index creation.
    Copy
    $ onmode -l
    $ onstat -l
    IBM Informix Dynamic Server Version 12.10.FC7W1WE -- On-Line (Prim) -- Up 17 days 04:17:17 -- 7881924 Kbytes
    
    ....      
    
    address          number   flags    uniqid   begin                size     used    %used
    57712f88         1        U-B----  2056     1:25263             50000    50000   100.00
    56d78f68         2        U-B----  2057     1:75263             50000    50000   100.00
    56e4cf40         3        U-B----  2058     1:125263            50000    50000   100.00
    56e4cfa8         4        U-B----  2059     6:53                50000    50000   100.00
    576f6c90         5        U-B----  2060     6:50053             50000    50000   100.00
    576f6cf8         6        U-B----  2061     6:100053            50000    50000   100.00
    576f6d60         7        U-B----  2062     6:150053            50000    50000   100.00
    576f6dc8         8        U-B----  2063     6:200053            50000    50000   100.00
    576f6e30         9        U-B----  2064     6:250053            50000    50000   100.00
    576f6e98         10       U-B----  2065     6:300053            50000    50000   100.00
    576f6f00         11       U-B----  2066     6:350053            50000    50000   100.00
    576f6f68         12       U-B----  2067     6:400053            50000    50000   100.00
    58a5c028         13       U-B----  2068     6:450053            50000    50000   100.00
    58a5c090         14       U-B----  2069     6:500053            50000    50000   100.00
    58a5c0f8         15       U-B----  2070     6:550053            50000    50000   100.00
    58a5c160         16       U-B----  2071     6:600053            50000    50000   100.00
    58a5c1c8         17       U-B----  2072     6:650053            50000    50000   100.00
    58a5c230         18       U-B----  2073     6:700053            50000    50000   100.00
    58a5c298         19       U-B----  2074     6:750053            50000    50000   100.00
    58a5c300         20       U-B----  2075     6:800053            50000    50000   100.00
    58a5c368         21       U-B----  2076     6:850053            50000    50000   100.00
    58a5c3d0         22       U-B----  2077     6:900053            50000    50000   100.00
    58a5c438         23       U-B----  2078     6:950053            50000    50000   100.00
    58a5c4a0         24       U-B----  2079     6:1000053           50000    50000   100.00
    58a5c508         25       U-B----  2080     6:1050053           50000    50000   100.00
    58a5c570         26       U-B----  2081     6:1100053           50000    50000   100.00
    58a5c5d8         27       U-B---L  2082     6:1150053           50000    34476    68.95
    58a5c640         28       U-B----  2083     6:1200053           50000       18     0.04
    58a5c6a8         29       U---C--  2084     6:1250053           50000        8     0.02
    58a5c710         30       U-B----  2036     6:1300053           50000    50000   100.00
    58a5c778         31       U-B----  2037     6:1350053           50000    50000   100.00
    ....
  2. Now create the index in the appropriate dbspace and examine logical log usage.

12.5 Monitor time for rollback

Use the following command to minitor rollback time column (est. rb_time)

Copy
$ onstat -x
IBM Informix Dynamic Server Version 12.10.FC9 -- On-Line (Prim) -- Up 17:25:41 -- 7881924 Kbytes

Transactions
                                                                                           est.    
address          flags userthread       locks  begin_logpos      current logpos    isol    rb_time  retrys coord
576e1028         A---- 5769a028         0      -                 -                 COMMIT  -        0      
576e1398         A---- 5769a900         0      -                 -                 COMMIT  -        0      
576e1708         A---- 5769b1d8         0      -                 -                 COMMIT  -        0      
576e1a78         A---- 5769bab0         0      -                 -                 COMMIT  -        0      
576e1de8         A---- 5769c388         0      -                 -                 COMMIT  -        0      
576e2158         A---- 5769cc60         0      -                 -                 COMMIT  -        0      
576e24c8         A---- 5769d538         0      -                 -                 COMMIT  -        0      
576e2838         A---- 5769de10         0      -                 -                 COMMIT  -        0      
576e2ba8         A---- 5769e6e8         0      -                 -                 COMMIT  -        0      
...

12.6 Monitor chunks activity

You can use onstat to check for disk I/O

Copy
$ onstat -g iof

You can also inspect the sysmaster database to get a more human readable view.

Copy
SELECT 
        name[1,10],      
        chknum,    
        round((reads/(select sum(reads) from syschktab)) *100, 1)::DECIMAL(5,1) r_pct,
        round((writes/(select sum(writes) from syschktab)) *100, 1)::DECIMAL(5,1) w_pct,
        round((pagesread/(select sum(pagesread) from syschktab)) *100, 1)::DECIMAL(5,1) pg_r_pct,
        round((pageswritten/(select sum(pageswritten) from syschktab)) *100, 1)::DECIMAL(5,1) pg_w_pct,
        format_units(pagesread,(SELECT os_pagesize FROM sysmachineinfo))::CHAR(10) pg_r,
        format_units(pageswritten,(SELECT os_pagesize FROM sysmachineinfo))::CHAR(10) pg_w
  FROM sysdbstab
      ,syschktab
 WHERE 
       sysdbstab.dbsnum = syschktab.dbsnum
ORDER BY 
       r_pct desc
name       chknum   r_pct   w_pct pg_r_pct pg_w_pct pg_r       pg_w
        
i_data         12    88,0    49,7      5,4      3,1 28.2 GB    3.09 GB
t_tmp2          8     2,6    13,6      4,8     26,2 25.1 GB    26.0 GB
d_data         11     2,5     0,0     77,0      0,0 404 GB     32.0 KB
t_tmp1          7     2,4    12,5      4,4     24,1 23.1 GB    23.9 GB
t_tmp3          9     2,3    11,9      4,2     23,0 22.0 GB    22.8 GB
t_tmp4         10     2,3    12,2      4,3     23,6 22.6 GB    23.4 GB
s_sbstd         2     0,0     0,0      0,0      0,0 60.0 KB    8.00 KB
s_sbsys         3     0,0     0,0      0,0      0,0 60.0 KB    8.00 KB
s_sbtmp         4     0,0     0,0      0,0      0,0 64.0 KB    92.0 KB
d_plog          5     0,0     0,0      0,0      0,0 16.0 KB    4.00 KB
rootdbs         1     0,0     0,0      0,0      0,0 11.3 MB    10.3 MB
d_llog          6     0,0     0,0      0,0      0,0 2.16 MB    148 KB

12.7 Inspect extents size

After index is created, you can indpect it's fragments and location.

Copy
$ oncheck -pt tpch:lineitem
....

		  Index lineitem_pk fragment partition d_cons in DBspace d_cons

    Physical Address               31:569
    Creation date                  08/08/2017 17:42:17
    TBLspace Flags                 802        Row Locking
                                              TBLspace use 4 bit bit-maps
    Maximum row size               146       
    Number of special columns      0         
    Number of keys                 1         
    Number of extents              1         
    Current serial value           1         
    Current SERIAL8 value          1         
    Current BIGSERIAL value        1         
    Current REFID value            1         
    Pagesize (k)                   2         
    First extent size              4         
    Next extent size               65536     
    Number of pages allocated      589824    
    Number of pages used           571114    
    Number of data pages           0         
    Number of rows                 0         
    Partition partnum              17825862  
    Partition lockid               17826216  

    Extents                       
         Logical Page     Physical Page        Size Physical Pages
                    0        32:1027899      589824     589824

12.8 Estimate time to complete

There is no simple way to estimate time to complete index creation once started.

A trick may include whatching the number of bytes read from dbspace chunks where table is located (only valid if table has it's own dbspace or dbspaces).

  • Determine number of pages of table beeing indexed ( 9763204 in this example).
    Copy
    $ oncheck -pt tpch:lineitem
    TBLspace Report for tpch:informix.lineitem
    
        Physical Address               11:272
        Creation date                  08/13/2017 18:51:48
        TBLspace Flags                 8902       Row Locking
                                                  TBLspace contains VARCHARS
                                                  TBLspace use 4 bit bit-maps
        Maximum row size               146       
        Number of special columns      1         
        Number of keys                 0         
        Number of extents              1         
        Current serial value           1         
        Current SERIAL8 value          1         
        Current BIGSERIAL value        1         
        Current REFID value            1         
        Pagesize (k)                   16        
        First extent size              10000000  
        Next extent size               500000    
        Number of pages allocated      10000000  
        Number of pages used           9763204   
        Number of data pages           9760782   
        Number of rows                 1200018434
  • Reset statistics
    Copy
    $ onmode -z
  • Start index creation.
  • Monitor IO on chunks belonging the data table
    Copy
    $ onstat -g iof
    AIO global files:
    
    gfd pathname         bytes read     page reads  bytes write    page writes io/s
    
    13  d_data_01        159960334336   78105984    245760         120         1672.5
    ....

On the following example for TPCH 200, we have al data located in d_data_01 chunk. After index is created, we can see that 159960334336 bytes have been read during index creation and as we have a 16 K pagesize. And notice that: $$ \frac{159960334336}{16384} = 9763204 $$ witch is exactly the number of pages used by lineitem table.

After this data is read, the process of writing the index from the temporary spaces have need to be completed.

So, whatching the SUM of I/O on chunks that belong to the table beeing indexed we can estimate index progression.

13 Test results

The following tables shows a comparision of different configurations on different sizes. The example uses the TPCH table lineitem to create a primary key index on columns (l_orderkey, l_linenumber).

13.1 Times

The following onconfig parameters has been used as initial setup (modifications are shown when applied):

Variable Value Size
SHMTOTAL 16777216 16 GB
SHMVIRTSIZE 8388608 8 GB
DS_TOTAL_MEMORY 2560000 2.4 GB
DS_NONPDQ_QUERY_MEM 256 256 KB
DS_MAX_QUERIES 64 -
DS_MAX_SCANS 1048576 -
MAX_PDQPRIORITY 100 -
BUFFERPOOL default,start_memory=1gb,memory=2gb,extendable=1,cache_hit_ratio=98,...
BUFFERPOOL size=2K,start_memory=2gb,memory=3gb,extendable=1,cache_hit_ratio=98,...
BUFFERPOOL size=16K,start_memory=3gb,memory=4gb,extendable=1,cache_hit_ratio=98,...
DBSPACETEMP 4 dbspaces on same SSD disk (same disk holds rootdbs, logical log and physical log)

TPCH
Scale
Rows
M
Table
GB
Temp
GB
CPU
VP
PDQ=0
PSORT_NPROC
PDQ=100
PSORT_NPROC
0 6 12 0 6 12
DS_TOTAL_MEMORY=2560000, DS_NONPDQ_QUERY_MEM=256
10 60 8 1.8 3 00:02:11 00:03:27 00:02:49 00:01:58 00:00.38 ?
7 ? ... ... ? ? ?
15 ? ... ... ? ? ?
100 600 74 18 3 00:12:02 ... ... 00:03:41 00:04:06 ?
7 ? ... ... ? ? ?
15 ... ... ? ? ?
200 1200 148 36 3 01:28:30 01:19:23 01:02:15 00:14:06 00:10:41 00:12:48
7 01:41:07 01:37:29 01:41:01 00:13:28 00:11:06 00:11:20
15 01:27:08 01:10:09 01:27:43 00:12:20 00:11:56 00:12:05
1000 6000 .. 179 3 ... ... ... ... ... ...
PSORT_DBTEMP on a 40GB ramdisk.
200 1200 148 36 3 00:45:14 01:05:05 01:01:23 00:13:10 00:10:15 00:11:09
7 00:57:32 00:57:31 00:58:33 00:10:15 00:09:58 00:11:10
DS_TOTAL_MEMORY=2560000, DS_NONPDQ_QUERY_MEM=640000
200 1200 148 36 3 00:28:49 00:10:22 00:11:49 n/a
parameter only applies to non PDQ
7 00:18:11 00:11:05 00:11:38
15 00:13:25 00:11:48 00:33:21
DS_TOTAL_MEMORY=10240000, DS_NONPDQ_QUERY_MEM=2560000
200 1200 148 36 3 00:14:05 00:11:38 00:12:37 00:14:25 00:12:01 00:11:53
7 00:13:38 00:11:02 00:11:16 00:20:36 00:23:39 00:11:18
15 00:22:27 00:11:31 00:11:28 00:14:06 00:11:30 00:23:09

13.2 onstat -g iof + ses

The following table shows the activity on dbspaces (table dbspace data + temp dbspaces) plus the #RSAM threads used on session after 60 seconds (the number varies during the process) and the memory pools allocated.

TPCH
Scale
Rows
M
Table
GB
Temp
GB
CPU
VP
GB I/O PDQ=0
PSORT_NPROC
PDQ=100
PSORT_NPROC
0 6 12 0 6 12
DS_NONPDQ_QUERY_MEM=256 (default)
200 1200 148 36 3 Table read
Temp R/W
#RSAM
Memory pools
595
200
5
4
595
200
5
4
595
200
5
4
148
13
9
7
148
13
14
7
148
13
20
7
7 Table read
Temp R/W
#RSAM
Memory pools
893
200
5
4
893
200
5
4
893
200
5
4
148
9
12
9
148
9
21
9
148
9
32
9
15 Table read
Temp R/W
#RSAM
Memory pools
893
200
5
4
893
200
5
4
893
200
5
4
148
13
12
9
148
13
17
9
148
13
26
9
DS_NONPDQ_QUERY_MEM=640000 (DS_TOTAL_MEMORY=2560000)
200 1200 148 36 3 Table read
Temp R/W
#RSAM
Memory pools
148
13
10
7
148
13
14
7
148
13
20
7
n/a
parameter only applies on non PDQ
7 Table read
Temp R/W
#RSAM
Memory pools
148
13
14
9
148
13
22
9
148
14
34
9
15 Table read
Temp R/W
#RSAM
Memory pools
148
13
12
9
148
13
16
9
148
14
22
9
DS_NONPDQ_QUERY_MEM=2560000 (DS_TOTAL_MEMORY=1024000)
200 1200 148 36 3 Table read
Temp R/W
#RSAM
Memory pools
148
13
10
7
148
13
16
7
148
13
24
7
n/a n/a n/a
7 Table read
Temp R/W
#RSAM
Memory pools
148
13
14
9
148
13
22
9
148
14
34
9
n/a n/a n/a
15 Table read
Temp R/W
#RSAM
Memory pools
148
13
12
9
148
13
22
9
148
14
28
9
n/a n/a n/a
RSAM threads samples
5
Copy
tid      name     rstcb            flags    curstk   status
68       sqlexec  61d3b7c8         Y-BP---  20816    cond wait  opened_up -
69       mb_colle 61d3c0a0         --B----  1408     sleeping secs: 1     -
70       xchg_1.0 61d3c978         Y-B----  944      cond wait  opened_up -
71       xchg_2.0 61d3d250         Y-B----  832      cond wait  packet_con-
72       xchg_3.0 61d3db28         --B-R--  1344     IO Wait-
9
Copy
tid      name     rstcb            flags    curstk   status
68       sqlexec  61d3aef0         Y-BP---  20816    cond wait  opened_up -
69       mb_colle 61d3b7c8         --B----  1408     sleeping secs: 1     -
70       xchg_1.0 61d3c978         Y-B----  944      cond wait  opened_up -
71       xchg_2.0 61d3d250         Y-B----  832      cond wait  packet_con-
72       xchg_2.1 61d3db28         Y-B----  832      cond wait  packet_con-
73       xchg_2.2 61d3e400         Y-B----  832      cond wait  packet_con-
74       xchg_2.3 61d3ecd8         Y-B----  832      cond wait  packet_con-
75       xchg_3.0 61d3f5b0         --B-R--  1344     IO Wait-
81       psortpro 61d38b90         --B----  1616     running-
12
Copy
tid      name     rstcb            flags    curstk   status
75       sqlexec  61d3aef0         Y-BP---  20816    cond wait  opened_up -
80       mb_colle 61d3b7c8         --B----  1408     sleeping secs: 1     -
81       xchg_1.0 61d3d250         Y-B----  944      cond wait  opened_up -
82       xchg_2.0 61d3db28         Y-B----  832      cond wait  packet_con-
83       xchg_2.1 61d3e400         Y-B----  832      cond wait  packet_con-
84       xchg_2.2 61d3ecd8         Y-B----  832      cond wait  packet_con-
85       xchg_2.3 61d3f5b0         Y-B----  832      cond wait  packet_con-
86       xchg_2.4 61d3fe88         Y-B----  832      cond wait  packet_con-
87       xchg_2.5 61d40760         Y-B----  832      cond wait  packet_con-
88       xchg_3.0 61d41038         --B-R--  1344     IO Wait-        
91       psortpro 61d38b90         Y-B----  896      cond wait  incore    -
92       psortpro 61d42ac0         --B----  1616     running-
14
Copy
tid      name     rstcb            flags    curstk   status
73       sqlexec  61d3d250         Y-BP---  20816    cond wait  opened_up -
74       mb_colle 61d3db28         --B----  1408     sleeping secs: 1     -
75       xchg_1.0 61d3e400         Y-B----  944      cond wait  opened_up -
76       xchg_2.0 61d3ecd8         Y-B----  832      cond wait  packet_con-
77       xchg_2.1 61d3f5b0         Y-B----  832      cond wait  packet_con-
78       xchg_2.2 61d3fe88         Y-B----  832      cond wait  packet_con-
79       xchg_2.3 61d40760         Y-B----  832      cond wait  packet_con-
80       xchg_3.0 61d41038         --B-R--  1344     IO Wait-
81       psortpro 61d38b90         Y-B----  944      cond wait  backend:0 -
82       psortpro 61d41910         Y-B----  944      cond wait  backend:1 -
83       psortpro 61d421e8         Y-B----  944      cond wait  backend:2 -
84       psortpro 61d42ac0         Y-B----  944      cond wait  backend:3 -
85       psortpro 61d43398         Y-B----  944      cond wait  backend:4 -
86       psortpro 61d43c70         --B----  1616     running-
20
Copy
tid      name     rstcb            flags    curstk   status
68       sqlexec  61d3b7c8         Y-BP---  20816    cond wait  opened_up -
69       mb_colle 61d3aef0         --B----  1408     sleeping secs: 1     -
70       xchg_1.0 61d3c978         Y-B----  944      cond wait  opened_up -
71       xchg_2.0 61d3d250         Y-B----  832      cond wait  packet_con-
72       xchg_2.1 61d3db28         Y-B----  832      cond wait  packet_con-
73       xchg_2.2 61d3e400         Y-B----  832      cond wait  packet_con-
74       xchg_2.3 61d3ecd8         Y-B----  832      cond wait  packet_con-
75       xchg_3.0 61d3f5b0         --B-R--  1344     IO Wait-
81       psortpro 61d41910         Y-B----  944      cond wait  backend:0 -
82       psortpro 61d38b90         Y-B----  944      cond wait  backend:1 -
83       psortpro 61d421e8         Y-B----  944      cond wait  backend:2 -
84       psortpro 61d42ac0         Y-B----  944      cond wait  backend:3 -
85       psortpro 61d43398         Y-B----  944      cond wait  backend:4 -
86       psortpro 61d43c70         Y-B----  944      cond wait  backend:5 -
87       psortpro 61d44548         Y-B----  944      cond wait  backend:6 -
88       psortpro 61d44e20         Y-B----  944      cond wait  backend:7 -
89       psortpro 61d456f8         Y-B----  944      cond wait  backend:8 -
90       psortpro 61d45fd0         Y-B----  944      cond wait  backend:9 -
91       psortpro 61d468a8         Y-B----  944      cond wait  backend:10-
92       psortpro 61d47180         Y-B----  944      cond wait  backend:11-
Memory pool samples
4
Copy
Memory pools    count 4
name         class addr              totalsize  freesize   #allocfrag #freefrag
5            V     62d7d040         266240     22200      245        13
5*O0         V     63300040         4096       768        1          1
5_SORT_0     V     63326040         81920      8848       8          3
5_SORT_0     V     633c1040         401408     8664       4487       78
7
Copy
Memory pools    count 7
name         class addr              totalsize  freesize   #allocfrag #freefrag
5            V     62d7d040         409600     28040      303        18
5*O0         V     6330c040         4096       768        1          1
5_SORT_0     V     6339f040         81920      8848       8          3
5_SORT_0     V     6354b040         3948544    208        37944      1
5_SORT_0     V     63599040         278528     21840      12         6
5_SORT_0     V     635dd040         278528     21840      12         6
5_SORT_0     V     63651040         278528     21840      12         6
9
Copy
Memory pools    count 9
name         class addr              totalsize  freesize   #allocfrag #freefrag
5            V     62e18040         524288     33112      349        22    
5*O0         V     63765040         4096       768        1          1     
5_SORT_0     V     637d4040         81920      8848       8          3     
5_SORT_0     V     636dd040         237568     20840      12         6     
5_SORT_0     V     636df040         2723840    3008       25292      1     
5_SORT_0     V     6375b040         237568     20840      12         6     
5_SORT_0     V     6338f040         237568     20840      12         6     
5_SORT_0     V     633ec040         237568     20840      12         6     
5_SORT_0     V     63473040         2723840    3152       25290      1

13.3 Test system

Test was done in a Dell 730, virtualized machine with 64GB RAM, 16 CPU (32 cores) Intel Xeon E312xx 2.7 Ghz, Samsung SSD 1TB disks. Informix version was 12.10.FC9 configured with.

  • Database has been limited to consume not more than 16GB to impose a pressure on large scale tests.
  • A 40 GB ram disk is configured to test PSORT_TEMP for test.

13.4 Test shells

The provided shell generates an output with the metrics of creation of index on a database using a different configuration for the parameters DS_TOTAL_MEMORY and DS_NONPDQ_QUERY_MEM. The shell creates a directory with the pattern

Copy
TPCH200.DS_TOTALM=[P_DS_TOTAL_MEMORY].DS_NONPDQ=[P_DS_NONPDQ_QUERY_MEM].DS_MAXSCANS=[P_DS_MAX_SCANS]

for example

Copy
TPCH200.DS_TOTALM=10240000.DS_NONPDQ=2560000.DS_MAXSCANS=1048576

according with the setted values on the shell

Copy
onmode -wf _DS_TOTAL_MEMORY=2560000
onmode -wf _DS_NONPDQ_QUERY_MEM=256

You can change this values to perform different set of tests.

The shell executes a creation of index for each value of this parameters and creates the associated log file:

Argument Value 1 Value 2 Value 3
VPCLASS cpu 3 7 15
psort_nprocs 0 6 12

create_index.sh

Copy
#!/bin/sh
#
# create_index.sh
#

# ==============================================================
# Perform onstat at regular intervals (called from index loop)
# ==============================================================

tpch_onstat()
{
	SES=`onstat -g sql | grep "CREATE INDEX" | awk '{ print $1 }'`
        if [ x$SES != "x" ]; then
	    onstat -g ses $SES > $FILE.onstat
#	    onmode -z $SES
	fi
	echo "$SES"
}


# ==============================================================
# Generate performance metrics
# ==============================================================
tpch_status()
{
	date
	oncheck -pt tpch$SCALE:lineitem
	onstat -p
	onstat -g sch
	onstat -g iov
	onstat -g iof
	onstat -g iob
	onstat -g mem
	onstat -g mgm
	onstat -g cfg
}

# ==============================================================
# Create the INDEX
# ==============================================================
tcph_index()
{
	echo "-------------------------------------------------------------------"
	echo INDEX PDQ = $1
        echo PSORT_DBTEMP=$PSORT_DBTEMP
        echo PSORT_NPROCS=$PSORT_NPROCS
	echo "-------------------------------------------------------------------"

	onstat -z
	time dbaccess - - <<!
		database tpch$SCALE;
		drop index lineitem_pk;
		set PDQPRIORITY $1;
		SELECT sid FROM sysmaster:syssessions WHERE sid = DBINFO('sessionid');
		CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in i_data_$SCALE;
!
	tpch_status
}


# ==============================================================
# START
#
# Setup INFORMIX
#
# ==============================================================

# Determine server onconfig
config=${ONCONFIG-$INFORMIXDIR/etc/onconfig}

#         DS_TOTAL_MEMORY  DS_NONPDQ_QUERY_MEM
# DEFAULT   1024000             256
# TEST      2560000		640000
# TEST     10240000		2560000

onmode -wf DS_TOTAL_MEMORY=2560000
onmode -wf DS_NONPDQ_QUERY_MEM=256

# The DS_MAX_QUERIES configuration parameter specifies a maximum number of
# decision-support queries that can run at any one time.
# In other words, DS_MAX_QUERIES controls only queries whose PDQ priority is nonzero.
# The database server uses the value of DS_MAX_QUERIES with DS_TOTAL_MEMORY to calculate
#  quantum units of memory to allocate to a query
# DEFAULT 1536
onmode -wf DS_MAX_QUERIES=64

# default 1048576
#onmode -wf DS_MAX_SCANS=1024
onmode -wf DS_MAX_SCANS=1048576



# ==============================================================
# Setup SCALE
# ==============================================================
SCALE=200
DS_TOTALM=`onstat -g cfg | grep DS_TOTAL_MEMORY | awk '{ print $2 }'`
DS_NONPDQ=`onstat -g cfg | grep DS_NONPDQ_QUERY_MEM | awk '{ print $2 }'`
DS_MAXSCANS=`onstat -g cfg | grep DS_MAX_SCANS | awk '{ print $2 }'`


# ==============================================================
# Define directory where output will be placed
# Dir will have a form like
#     
#   TPCH200.DS_TOTALM=2560000.DS_NONPDQ=256.DS_MAXSCANS=1048576
#   TPCH200.DS_TOTALM=10240000.DS_NONPDQ=2560000.DS_MAXSCANS=1048576
# ==============================================================
TPCH_DIR=TPCH$SCALE.DS_TOTALM=$DS_TOTALM.DS_NONPDQ=$DS_NONPDQ.DS_MAXSCANS=$DS_MAXSCANS
mkdir -p $TPCH_DIR

echo $TPCH_DIR

# ==============================================================
# foreach VCPU 3 7 15 ...
# ==============================================================

for cpu in 3  7 15  ; do

   # --------------------------------------------------------
   # setup VPCLASS=cpu,num=3,noage
   # --------------------------------------------------------
   echo "Setting $cpu VPCLASS"
   sed -i -e "s/VPCLASS.*cpu.*/VPCLASS cpu,num=${cpu},noage/" $INFORMIXDIR/etc/onconfig
   onstat -g cfg | grep VPCLASS
   onstat -g cfg | grep DS_TOTAL_MEMORY

   # --------------------------------------------------------
   # foreach PDQPRIORITY 100 0
   # for pdq in 100 0; do
   # --------------------------------------------------------
   for pdq in 100 ; do
      for psort in 0 6 12; do
        for temp in 0; do
    	   if [ $temp == "0" ]; then
		unset PSORT_DBTEMP
	   else
		PSORT_DBTEMP=/mnt/ramdisk
		export PSORT_DBTEMP
	   fi

   	   if [ $psort == "0" ]; then
               unset PSORT_NPROCS
	   else
	       PSORT_NPROCS=$psort
	       export PSORT_NPROCS
           fi

  	   FILE=`printf "%s/index-RAM%d-CPU%02d-PDQ%03d-PSORT%02d.log" $TPCH_DIR $temp $cpu $pdq $psort`
	   echo "Generating $FILE `date`"
	   echo `date` > $FILE
	   env >> $FILE

   	   echo "Stopping informix"
	   onmode -ky
   	   echo "Starting informix"
	   oninit
   	   echo "Starting test"

   	   # --------------------------------------------------------
	   # start index in separate process, then wait to end and
	   # acum statistics every 60 seconds. 
	   # --------------------------------------------------------   	   

       tcph_index $pdq >> $FILE 2>&1 &

	   echo "Sleeping (depends tpch! scale) ..."
	   sleep 60
	   echo "Onstat ..."
	   tpch_onstat >> $FILE 2>&1

	   echo "Waiting process ..."
	   wait
	   echo "Process completed"
	done
      done
   done
done

Use the following shell to show the statistics for the create index operation:

show_stats.sh

Copy
#!/bin/sh
#
# show_stats.sh
#

echo "       +--------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------+"
echo "            00       d     TR    TW      06                           12                           00                           06                           12"
# CPU 03     [[01:28:30][ 595][ 201][ 201] [01:19:23][ 595][ 199][ 200] [01:02:15][ 595][ 204][ 205] [00:19:58][ 148][  13][  13] [00:10:41][ 148][  13][  13] [00:12:48][ 148][  13][  13] ]

for temp in 0 1; do
    echo "RAMDISK $temp"
    for cpu in 3 7 15; do
        echo -n `printf "CPU %02d [ " $cpu`
        for pdq in 0 100; do
        	for psort in 0 6 12; do

	            FILE=`printf "index-RAM%d-CPU%02d-PDQ%03d-PSORT%02d.log" $temp $cpu $pdq $psort`

	            if [ -f "$FILE" ]; then
	                if grep -q real $FILE; then
	                    TIME=`grep real $FILE |
	                        awk '{ print $2 }' |
	                        awk -F . '{ print $1 }' |
	                        awk -F m '{ printf("[%02d:%02d:%02d]", int($1/60), ($1 - (int($1/60) * 60)), $2) }'`

				# SUM(d_data_100, d_data_200)
				DATA_R=`grep d_data $FILE | awk '{ s+=$3; } END { printf("[%4d]", s/(1024*1024*1024)) }'`
				TEMP_R=`grep t_tmp  $FILE | awk '{ s+=$3; } END { printf("[%4d]", s/(1024*1024*1024)) }'`
				TEMP_W=`grep t_tmp  $FILE | awk '{ s+=$5; } END { printf("[%4d]", s/(1024*1024*1024)) }'`

				echo -n "$TIME$DATA_R$TEMP_R$TEMP_W "
	                else
				echo -n "[??:??:??][    ][    ][    ] "
	                fi
	             else
	            	echo -n "[  :  :  ][    ][    ][    ] "
	             fi
           	done
        done
        echo "]"
   done
done

14 Recomendations

You can often improve the performance of an index build by considering the following rules:

14.1 CPU

  • Setup the number of CPU VP's according your hardware CPU power.
  • Setup MAX_PDQPRIORITY onconfig variable to 100 to allow use of PDQ priority to max value for a session.
  • Set PDQ priority to a value greater than 0 to obtain more memory than the default 128 KB. When you set PDQ priority to greater than 0 (zero), the index build can take advantage of the additional memory for parallel processing. To set PDQ priority, use either the PDQPRIORITY environment variable or the SET PDQPRIORITY statement in SQL.
  • IBM Informix documentation says:

    A good practice is not to set the PSORT_NPROCS environment variable. If you have a computer with multiple processors, the database server uses two threads per sort when it sorts index keys and PSORT_NPROCS is not set. The number of sorts depends on the number of fragments in the index, the number of keys, the key size, and the values of the PDQ memory configuration parameters.

    You can see in some cases you can get a little inprovement by setting PSORT_NPROCS. But setting it too high may degrade performance significatively.

14.2 Memory

Allocate enough memory to build the entire index with the following activities:

  • Specify more memory with the DS_TOTAL_MEMORY and tune DS_MAX_QUERIES configuration parameters to the acceptable number of parallel queries.
  • If not enough memory is available, estimate the amount of temporary space necessary for an entire index build.
  • Adjust DS_NONPDQ_QUERY_MEM to 25% of DS_TOTAL_MEMORY.

According the available memory size of the database instance, for SHMMAX

Parameter SHMMAX
Default 16GB 32GB 64GB
DS_TOTAL_MEMORY 1024000 2560000 10240000 20480000
DS_NONPDQ_QUERY_MEM 256 640000 2560000 5120000

DS_NONPDQ_QUERY_MEM default value is too low. Database will have not enought memory for sorting and will force intensive use of temporay space for sorting. Use recommended values from previous table according available memory for database instance.

14.3 Temp space

  • Ensure to have enough temporary space for parallel index build. This is need even with no PDQ priority.

    If you have not enough space you may get a warning in Informix logs and parallel index build is disables. Performance will be degraded.

    Copy
    20:51:33  WARNING: Not enough temp space for parallel index build.
              Space required = 18750290 pages; space available = 8163188 pages.
              Partial index build started.
  • Use the onspaces -t utility to create large temporary dbspaces and specify them in the DBSPACETEMP configuration parameter or the DBSPACETEMP environment variable.

14.4 Concurrency

Use the CREATE INDEX ONLINE modifier if you don't want to lock table during index build.

14.5 Sample of shell

This is a simple sample of shell to export the variables and do the creation of indexes.

Create the file index.sql with the content of SQL instructions:

Copy
SET PDQPRIORITY 100;
CREATE UNIQUE INDEX lineitem_pk ON lineitem (l_orderkey, l_linenumber) in DBSPACE;

A sample of shell:

Copy
# Accelerating 'Create index' using PDQ

# <decision support memory in kbytes>
# use 32Gb 32*1024*1000 Kb
#onmode -M 32768000
onmode -wm DS_TOTAL_MEMORY=32768000

# <max # decision support scans>
#onmode -S 100
onmode -wm DS_MAX_SCANS=100

# <% of MAX_PDQPRIORITY>
export PDQPRIORITY=100

# <Number NUMCPUVPS>
export PSORT_NPROCS=16

ontape -s -N mydb -t /dev/null

echo `date`> ./debug.log
echo "Using PDQPRIORITY=$PDQPRIORITY and PSORT_NPROCS=$PSORT_NPROCS"  >> debug.log
dbaccess mydb ./index.sql >> debug.log 2>&1

ontape -s -U mydb -t /dev/null

echo `date` >> debug.log