Commit 1f12ebf1, authored by Thomas Mueller

Documentation.

Parent 790a5ccd
...@@ -18,7 +18,8 @@ Change Log ...@@ -18,7 +18,8 @@ Change Log
<h1>Change Log</h1> <h1>Change Log</h1>
<h2>Next Version (unreleased)</h2> <h2>Next Version (unreleased)</h2>
<ul><li>- <ul><li>Database-level connection settings could only be set in the database URL,
but not using the Properties parameter of DriverManager.getConnection(String url, Properties info).
</li></ul> </li></ul>
<h2>Version 1.3.152 Beta (2011-03-01)</h2> <h2>Version 1.3.152 Beta (2011-03-01)</h2>
......
...@@ -28,10 +28,10 @@ Performance ...@@ -28,10 +28,10 @@ Performance
Application Profiling</a><br /> Application Profiling</a><br />
<a href="#database_profiling"> <a href="#database_profiling">
Database Profiling</a><br /> Database Profiling</a><br />
<a href="#storage_and_indexes">
How Data is Stored and How Indexes Work</a><br />
<a href="#explain_plan"> <a href="#explain_plan">
Statement Execution Plans</a><br /> Statement Execution Plans</a><br />
<a href="#storage_and_indexes">
How Data is Stored and How Indexes Work</a><br />
<a href="#fast_import"> <a href="#fast_import">
Fast Database Import</a><br /> Fast Database Import</a><br />
...@@ -611,6 +611,92 @@ following profiling data (results vary): ...@@ -611,6 +611,92 @@ following profiling data (results vary):
-- 0% 100% 0 1 0 SET TRACE_LEVEL_FILE 3; -- 0% 100% 0 1 0 SET TRACE_LEVEL_FILE 3;
</pre> </pre>
<h2 id="explain_plan">Statement Execution Plans</h2>
<p>
The SQL statement <code>EXPLAIN</code> displays the indexes and optimizations the database uses for a statement.
The following statements support <code>EXPLAIN</code>: <code>SELECT, UPDATE, DELETE, MERGE, INSERT</code>.
The following query shows that the database uses the primary key index to search for rows:
</p>
<pre>
EXPLAIN SELECT * FROM TEST WHERE ID=1;
SELECT
TEST.ID,
TEST.NAME
FROM PUBLIC.TEST
/* PUBLIC.PRIMARY_KEY_2: ID = 1 */
WHERE ID = 1
</pre>
<p>
For joins, the tables in the execution plan are sorted in the order they are processed.
The following query shows the database first processes the table <code>INVOICE</code> (using the primary key).
For each row, it will additionally check that the value of the column <code>AMOUNT</code> is larger than zero,
and for those rows the database will search in the table <code>CUSTOMER</code> (using the primary key).
The query plan contains some redundancy so that it remains a valid SQL statement.
</p>
<pre>
CREATE TABLE CUSTOMER(ID IDENTITY, NAME VARCHAR);
CREATE TABLE INVOICE(ID IDENTITY,
CUSTOMER_ID INT REFERENCES CUSTOMER(ID),
AMOUNT NUMBER);
EXPLAIN SELECT I.ID, C.NAME FROM CUSTOMER C, INVOICE I
WHERE I.ID=10 AND AMOUNT>0 AND C.ID=I.CUSTOMER_ID;
SELECT
I.ID,
C.NAME
FROM PUBLIC.INVOICE I
/* PUBLIC.PRIMARY_KEY_9: ID = 10 */
/* WHERE (I.ID = 10)
AND (AMOUNT > 0)
*/
INNER JOIN PUBLIC.CUSTOMER C
/* PUBLIC.PRIMARY_KEY_5: ID = I.CUSTOMER_ID */
ON 1=1
WHERE (C.ID = I.CUSTOMER_ID)
AND ((I.ID = 10)
AND (AMOUNT > 0))
</pre>
<h3>Displaying the Scan Count</h3>
<p>
<code>EXPLAIN ANALYZE</code> additionally shows the scanned rows per table and pages read from disk per table or index.
This will actually execute the query, unlike <code>EXPLAIN</code> which only prepares it.
The following query scanned 1000 rows, and to do that had to read 85 pages from the data area of the table.
Running the query a second time will not list the pages read from disk, because they are then in the cache.
The <code>tableScan</code> means this query doesn't use an index.
</p>
<pre>
EXPLAIN ANALYZE SELECT * FROM TEST;
SELECT
TEST.ID,
TEST.NAME
FROM PUBLIC.TEST
/* PUBLIC.TEST.tableScan */
/* scanCount: 1000 */
/*
total: 85
TEST.TEST_DATA read: 85 (100%)
*/
</pre>
<h3>Special Optimizations</h3>
<p>
For certain queries, the database doesn't need to read all rows, or doesn't need to sort the result even if <code>ORDER BY</code> is used.
</p><p>
For queries of the form <code>SELECT COUNT(*), MIN(ID), MAX(ID) FROM TEST</code>, the query plan includes the line
<code>/* direct lookup */</code> if the data can be read from an index.
</p><p>
For queries of the form <code>SELECT DISTINCT CUSTOMER_ID FROM INVOICE</code>, the query plan includes the line
<code>/* distinct */</code> if there is a non-unique or multi-column index on this column, and if this column has a low selectivity.
</p><p>
For queries of the form <code>SELECT * FROM TEST ORDER BY ID</code>, the query plan includes the line
<code>/* index sorted */</code> to indicate there is no separate sorting required.
</p><p>
For queries of the form <code>SELECT * FROM TEST GROUP BY ID ORDER BY ID</code>, the query plan includes the line
<code>/* group sorted */</code> to indicate there is no separate sorting required.
</p>
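<p>
As a rough illustration of why the direct lookup and index sorted cases avoid reading or sorting all rows, the following Python sketch models an index on <code>ID</code> as a sorted list. This is a conceptual model only, not how the database is implemented; the ID values and the function names are hypothetical.
</p>

```python
# Hypothetical ID values; a real index is a tree structure, modeled here as a sorted list.
id_index = sorted([42, 7, 19, 3])


def count_min_max(index):
    """Answer COUNT(*), MIN(ID), MAX(ID) from the index alone (index must be non-empty)."""
    # The smallest and largest keys sit at the two ends of a sorted index,
    # so no row of the table has to be read.
    return len(index), index[0], index[-1]


def select_order_by_id(index):
    """Return the IDs in ORDER BY ID order by walking the index; no separate sort step."""
    return list(index)


print(count_min_max(id_index))
print(select_order_by_id(id_index))
```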
<h2 id="storage_and_indexes">How Data is Stored and How Indexes Work</h2> <h2 id="storage_and_indexes">How Data is Stored and How Indexes Work</h2>
<p> <p>
Internally, each row in a table is identified by a unique number, the row id. Internally, each row in a table is identified by a unique number, the row id.
...@@ -650,6 +736,8 @@ FROM PUBLIC.ADDRESS ...@@ -650,6 +736,8 @@ FROM PUBLIC.ADDRESS
/* PUBLIC.ADDRESS.tableScan */ /* PUBLIC.ADDRESS.tableScan */
WHERE NAME = 'Miller'; WHERE NAME = 'Miller';
</pre> </pre>
<h3>Indexes</h3>
<p> <p>
An index internally is basically just a table that contains the indexed column(s), plus the row id: An index internally is basically just a table that contains the indexed column(s), plus the row id:
</p> </p>
...@@ -713,7 +801,13 @@ WHERE FIRST_NAME = 'John'; ...@@ -713,7 +801,13 @@ WHERE FIRST_NAME = 'John';
</pre> </pre>
<p> <p>
If your application often queries the table for a phone number, then it makes sense to create If your application often queries the table for a phone number, then it makes sense to create
an additional index on it, which then contains the following data: an additional index on it:
</p>
<pre>
CREATE INDEX IDX_PHONE ON ADDRESS(PHONE);
</pre>
<p>
This index contains the phone number, and the row id:
</p> </p>
<table> <table>
<tr><th>PHONE</th><th>_ROWID_</th></tr> <tr><th>PHONE</th><th>_ROWID_</th></tr>
...@@ -721,92 +815,31 @@ an additional index on it, which then contains the following data: ...@@ -721,92 +815,31 @@ an additional index on it, which then contains the following data:
<tr><td>123 456 789</td><td>1</td></tr> <tr><td>123 456 789</td><td>1</td></tr>
</table> </table>
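<p>
The idea that an index is a sorted table of indexed value plus row id can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the database's actual data structure: the rows, the names <code>rows</code>, <code>idx_phone</code> and <code>lookup_by_phone</code> are all hypothetical.
</p>

```python
from bisect import bisect_left

# Hypothetical table rows keyed by row id: (name, phone).
rows = {
    1: ("Miller", "123 456 789"),
    2: ("Meier", "234 567 890"),
    3: ("Smith", "345 678 901"),
}

# The index: (indexed value, row id) pairs kept in sorted order.
idx_phone = sorted((phone, rowid) for rowid, (_, phone) in rows.items())


def lookup_by_phone(phone):
    """Return the rows with the given phone number using the index, not a table scan."""
    # Binary search for the first index entry whose key is >= phone.
    # The 1-tuple (phone,) sorts before every (phone, rowid) pair with the same phone.
    pos = bisect_left(idx_phone, (phone,))
    result = []
    # Walk forward while the key still matches, fetching each row by its row id.
    while pos < len(idx_phone) and idx_phone[pos][0] == phone:
        result.append(rows[idx_phone[pos][1]])
        pos += 1
    return result


print(lookup_by_phone("123 456 789"))
```

<p>
Only the matching index entries and their rows are visited; the other rows of the table are never read.
</p>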
<h2 id="explain_plan">Statement Execution Plans</h2> <h3>Using Multiple Indexes</h3>
<p> <p>
The SQL statement <code>EXPLAIN</code> displays the indexes and optimizations the database uses for a statement. Within a query, only one index per logical table is used.
The following statements support <code>EXPLAIN</code>: <code>SELECT, UPDATE, DELETE, MERGE, INSERT</code>. Using the condition <code>PHONE = '123 567 789' OR CITY = 'Berne'</code>
The following query shows that the database uses the primary key index to search for rows: would use a table scan instead of first using the index on the phone number and then the index on the city.
It makes sense to write two queries and combine them using <code>UNION</code>.
In this case, each individual query uses a different index:
</p> </p>
<pre> <pre>
EXPLAIN SELECT * FROM TEST WHERE ID=1; EXPLAIN SELECT NAME FROM ADDRESS WHERE PHONE = '123 567 789'
SELECT UNION SELECT NAME FROM ADDRESS WHERE CITY = 'Berne';
TEST.ID,
TEST.NAME
FROM PUBLIC.TEST
/* PUBLIC.PRIMARY_KEY_2: ID = 1 */
WHERE ID = 1
</pre>
<p>
For joins, the tables in the execution plan are sorted in the order they are processed.
The following query shows the database first processes the table <code>INVOICE</code> (using the primary key).
For each row, it will additionally check that the value of the column <code>AMOUNT</code> is larger than zero,
and for those rows the database will search in the table <code>CUSTOMER</code> (using the primary key).
The query plan contains some redundancy so that it remains a valid SQL statement.
</p>
<pre>
CREATE TABLE CUSTOMER(ID IDENTITY, NAME VARCHAR);
CREATE TABLE INVOICE(ID IDENTITY,
CUSTOMER_ID INT REFERENCES CUSTOMER(ID),
AMOUNT NUMBER);
EXPLAIN SELECT I.ID, C.NAME FROM CUSTOMER C, INVOICE I (SELECT
WHERE I.ID=10 AND AMOUNT>0 AND C.ID=I.CUSTOMER_ID; NAME
FROM PUBLIC.ADDRESS
SELECT /* PUBLIC.IDX_PHONE: PHONE = '123 567 789' */
I.ID, WHERE PHONE = '123 567 789')
C.NAME UNION
FROM PUBLIC.INVOICE I (SELECT
/* PUBLIC.PRIMARY_KEY_9: ID = 10 */ NAME
/* WHERE (I.ID = 10) FROM PUBLIC.ADDRESS
AND (AMOUNT > 0) /* PUBLIC.INDEX_PLACE: CITY = 'Berne' */
*/ WHERE CITY = 'Berne')
INNER JOIN PUBLIC.CUSTOMER C
/* PUBLIC.PRIMARY_KEY_5: ID = I.CUSTOMER_ID */
ON 1=1
WHERE (C.ID = I.CUSTOMER_ID)
AND ((I.ID = 10)
AND (AMOUNT > 0))
</pre>
<h3>Displaying the Scan Count</h3>
<p>
<code>EXPLAIN ANALYZE</code> additionally shows the scanned rows per table and pages read from disk per table or index.
This will actually execute the query, unlike <code>EXPLAIN</code> which only prepares it.
The following query scanned 1000 rows, and to do that had to read 85 pages from the data area of the table.
Running the query a second time will not list the pages read from disk, because they are then in the cache.
The <code>tableScan</code> means this query doesn't use an index.
</p>
<pre>
EXPLAIN ANALYZE SELECT * FROM TEST;
SELECT
TEST.ID,
TEST.NAME
FROM PUBLIC.TEST
/* PUBLIC.TEST.tableScan */
/* scanCount: 1000 */
/*
total: 85
TEST.TEST_DATA read: 85 (100%)
*/
</pre> </pre>
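<p>
The difference between the <code>OR</code> condition and the <code>UNION</code> rewrite can be modeled in Python as follows. This is a sketch under simplifying assumptions: each index is a plain dictionary from value to row ids rather than a tree, and the rows and function names are hypothetical. Because <code>UNION</code> removes duplicate rows, both functions return sets.
</p>

```python
# Hypothetical ADDRESS rows keyed by row id.
rows = {
    1: {"NAME": "Miller", "PHONE": "123 567 789", "CITY": "Zurich"},
    2: {"NAME": "Meier", "PHONE": "234 567 890", "CITY": "Berne"},
    3: {"NAME": "Miller", "PHONE": "345 678 901", "CITY": "Berne"},
}


def build_index(column):
    """Map each value of the column to the row ids that contain it."""
    index = {}
    for rowid, row in rows.items():
        index.setdefault(row[column], []).append(rowid)
    return index


idx_phone = build_index("PHONE")
idx_city = build_index("CITY")


def names_by_table_scan(phone, city):
    """PHONE = ? OR CITY = ? as a single query: every row is visited."""
    return {r["NAME"] for r in rows.values() if r["PHONE"] == phone or r["CITY"] == city}


def names_by_union(phone, city):
    """Two queries combined with UNION: each one uses its own index."""
    by_phone = {rows[rowid]["NAME"] for rowid in idx_phone.get(phone, [])}
    by_city = {rows[rowid]["NAME"] for rowid in idx_city.get(city, [])}
    return by_phone | by_city


print(names_by_union("123 567 789", "Berne"))
```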
<h3>Special Optimizations</h3>
<p>
For certain queries, the database doesn't need to read all rows, or doesn't need to sort the result even if <code>ORDER BY</code> is used.
</p><p>
For queries of the form <code>SELECT COUNT(*), MIN(ID), MAX(ID) FROM TEST</code>, the query plan includes the line
<code>/* direct lookup */</code> if the data can be read from an index.
</p><p>
For queries of the form <code>SELECT DISTINCT CUSTOMER_ID FROM INVOICE</code>, the query plan includes the line
<code>/* distinct */</code> if there is a non-unique or multi-column index on this column, and if this column has a low selectivity.
</p><p>
For queries of the form <code>SELECT * FROM TEST ORDER BY ID</code>, the query plan includes the line
<code>/* index sorted */</code> to indicate there is no separate sorting required.
</p><p>
For queries of the form <code>SELECT * FROM TEST GROUP BY ID ORDER BY ID</code>, the query plan includes the line
<code>/* group sorted */</code> to indicate there is no separate sorting required.
</p>
<h2 id="fast_import">Fast Database Import</h2> <h2 id="fast_import">Fast Database Import</h2>
<p> <p>
To speed up large imports, consider using the following options temporarily: To speed up large imports, consider using the following options temporarily:
......
...@@ -549,8 +549,11 @@ You can rename it to <code>H2Dialect.java</code> and include this as a patch in ...@@ -549,8 +549,11 @@ You can rename it to <code>H2Dialect.java</code> and include this as a patch in
or upgrade to a version of Hibernate where this is fixed. or upgrade to a version of Hibernate where this is fixed.
</p> </p>
<p> <p>
When using compatibility modes such as <code>MODE=MySQL</code> when using Hibernate When using Hibernate, try to use the <code>H2Dialect</code> if possible.
is not supported when using <code>H2Dialect</code>. When using the <code>H2Dialect</code>,
compatibility modes such as <code>MODE=MySQL</code> are not supported.
When using such a compatibility mode, use the Hibernate dialect for the
corresponding database instead of the <code>H2Dialect</code>.
</p> </p>
<h2 id="using_toplink">Using TopLink and Glassfish</h2> <h2 id="using_toplink">Using TopLink and Glassfish</h2>
......