Documentation.

fa2f2473 · Thomas Mueller · 0fec2501
--- a/h2/src/docsrc/html/changelog.html
+++ b/h2/src/docsrc/html/changelog.html
@@ -18,7 +18,13 @@ Change Log
 <h1>Change Log</h1>

 <h2>Next Version (unreleased)</h2>
-<ul><li>Improved error detection when starting a server with invalid arguments,
+<ul><li>MVCC: the probability of lock timeouts is now lower if multiple threads try to update the same rows.
+</li><li>Building only the documentation (without compiling all classes) didn't work, in particular: ./build.sh clean javadocImpl.
+</li><li>New documentation on how data is stored internally and how indexes work (in the performance section).
+</li><li>Some people reported NullPointerException in FileObjectDiskMapped.
+    The most likely explanation is that multiple threads access the same object at the same time.
+    Therefore, the public methods in this class are now synchronized.
+</li><li>Improved error detection when starting a server with invalid arguments,
    such as "-tcpPort=9091" or "-tcpPort 9091" (as one parameter) instead of "-tcpPort", "9091".
 </li><li>The function STRINGDECODE ignored characters after a non-escaped double quote.
    This is no longer the case.

--- a/h2/src/docsrc/html/performance.html
+++ b/h2/src/docsrc/html/performance.html
@@ -28,6 +28,8 @@ Performance
    Application Profiling</a><br />
 <a href="#database_profiling">
    Database Profiling</a><br />
+<a href="#storage_and_indexes">
+    How Data is Stored and How Indexes Work</a><br />
 <a href="#explain_plan">
    Statement Execution Plans</a><br />
 <a href="#fast_import">
@@ -609,6 +611,116 @@ following profiling data (results vary):
 --   0% 100%       0       1       0 SET TRACE_LEVEL_FILE 3;
 </pre>

+<h2 id="storage_and_indexes">How Data is Stored and How Indexes Work</h2>
+<p>
+Internally, each row in a table is identified by a unique number, the row id.
+The rows of a table are stored with the row id as the key.
+The row id is a number of type long.
+If a table has a single-column primary key of type <code>INT</code> or <code>BIGINT</code>,
+then the value of this column is the row id; otherwise, the database generates the row id automatically.
+There is a (non-standard) way to access the row id: using the <code>_ROWID_</code> pseudo-column:
+</p>
+<pre>
+CREATE TABLE ADDRESS(FIRST_NAME VARCHAR, NAME VARCHAR, CITY VARCHAR, PHONE VARCHAR);
+INSERT INTO ADDRESS VALUES('John', 'Miller', 'Berne', '123 456 789');
+INSERT INTO ADDRESS VALUES('Philip', 'Jones', 'Berne', '123 012 345');
+SELECT _ROWID_, * FROM ADDRESS;
+</pre>
+<p>
+The data is stored in the database as follows:
+</p>
+<table>
+<tr><th>_ROWID_</th><th>FIRST_NAME</th><th>NAME</th><th>CITY</th><th>PHONE</th></tr>
+<tr><td>1</td><td>John</td><td>Miller</td><td>Berne</td><td>123 456 789</td></tr>
+<tr><td>2</td><td>Philip</td><td>Jones</td><td>Berne</td><td>123 012 345</td></tr>
+</table>
+<p>
+Access by row id is fast because the data is sorted by this key.
+If the query condition does not contain the row id (and if no other index can be used), then all rows of the table are scanned.
+A table scan iterates over all rows in the table, in the order of the row id.
+To find out what strategy the database uses to retrieve the data, use <code>EXPLAIN SELECT</code>:
+</p>
+<pre>
+SELECT * FROM ADDRESS WHERE NAME = 'Miller';
+
+EXPLAIN SELECT PHONE FROM ADDRESS WHERE NAME = 'Miller';
+SELECT
+    PHONE
+FROM PUBLIC.ADDRESS
+    /* PUBLIC.ADDRESS.tableScan */
+WHERE NAME = 'Miller';
+</pre>
+<p>
+Internally, an index is essentially a table that contains the indexed column(s) plus the row id:
+</p>
+<pre>
+CREATE INDEX INDEX_PLACE ON ADDRESS(CITY, NAME, FIRST_NAME);
+</pre>
+<p>
+In the index, the data is sorted by the indexed columns.
+So this index contains the following data:
+</p>
+<table>
+<tr><th>CITY</th><th>NAME</th><th>FIRST_NAME</th><th>_ROWID_</th></tr>
+<tr><td>Berne</td><td>Jones</td><td>Philip</td><td>2</td></tr>
+<tr><td>Berne</td><td>Miller</td><td>John</td><td>1</td></tr>
+</table>
+<p>
+When the database uses an index to query the data, it searches the index for the given data,
+and (if required) reads the remaining columns in the main data table (retrieved using the row id).
+An index on city, name, and first name allows the database to quickly find rows when the city, name, and first name are known.
+If only the city and name, or only the city, is known, then the index is also used, because these columns form a leading prefix of the index.
+This index is also used when reading all rows sorted by the indexed columns.
+However, if only the first name is known, then this index is not used, because the first name is not a leading column of the index:
+</p>
+<pre>
+EXPLAIN SELECT PHONE FROM ADDRESS WHERE CITY = 'Berne' AND NAME = 'Miller' AND FIRST_NAME = 'John';
+SELECT
+    PHONE
+FROM PUBLIC.ADDRESS
+    /* PUBLIC.INDEX_PLACE: FIRST_NAME = 'John'
+        AND CITY = 'Berne'
+        AND NAME = 'Miller'
+     */
+WHERE (FIRST_NAME = 'John')
+    AND ((CITY = 'Berne')
+    AND (NAME = 'Miller'));
+
+EXPLAIN SELECT PHONE FROM ADDRESS WHERE CITY = 'Berne';
+SELECT
+    PHONE
+FROM PUBLIC.ADDRESS
+    /* PUBLIC.INDEX_PLACE: CITY = 'Berne' */
+WHERE CITY = 'Berne';
+
+EXPLAIN SELECT * FROM ADDRESS ORDER BY CITY, NAME, FIRST_NAME;
+SELECT
+    ADDRESS.FIRST_NAME,
+    ADDRESS.NAME,
+    ADDRESS.CITY,
+    ADDRESS.PHONE
+FROM PUBLIC.ADDRESS
+    /* PUBLIC.INDEX_PLACE */
+ORDER BY 3, 2, 1
+/* index sorted */;
+
+EXPLAIN SELECT PHONE FROM ADDRESS WHERE FIRST_NAME = 'John';
+SELECT
+    PHONE
+FROM PUBLIC.ADDRESS
+    /* PUBLIC.ADDRESS.tableScan */
+WHERE FIRST_NAME = 'John';
+</pre>
+<p>
+If your application often queries the table by phone number, then it makes sense to create
+an additional index on the <code>PHONE</code> column, which then contains the following data:
+</p>
+<table>
+<tr><th>PHONE</th><th>_ROWID_</th></tr>
+<tr><td>123 012 345</td><td>2</td></tr>
+<tr><td>123 456 789</td><td>1</td></tr>
+</table>
+
 <h2 id="explain_plan">Statement Execution Plans</h2>
 <p>
 The SQL statement <code>EXPLAIN</code> displays the indexes and optimizations the database uses for a statement.

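Note on the new "How Data is Stored and How Indexes Work" section: the leading-column behaviour it describes can be reproduced with a small toy model. The sketch below is only an illustration in Python and makes no claim about how H2 implements its indexes. It assumes an index can be modelled as a sorted list of `(CITY, NAME, FIRST_NAME, row id)` tuples. The helper names `lookup_by_prefix` and `scan_by_first_name` are hypothetical and do not exist in H2.

```python
from bisect import bisect_left

# Rows keyed by row id, mirroring the ADDRESS table from the documentation.
# Column order: FIRST_NAME, NAME, CITY, PHONE.
ROWS = {
    1: ("John", "Miller", "Berne", "123 456 789"),
    2: ("Philip", "Jones", "Berne", "123 012 345"),
}

# Toy model of INDEX_PLACE: (CITY, NAME, FIRST_NAME, row id), kept sorted.
INDEX_PLACE = sorted(
    (city, name, first_name, row_id)
    for row_id, (first_name, name, city, _phone) in ROWS.items()
)


def lookup_by_prefix(index, prefix):
    """Return the row ids of all index entries that start with `prefix`."""
    # Binary search for the first entry that is >= prefix.
    start = bisect_left(index, prefix)
    row_ids = []
    for entry in index[start:]:
        if entry[: len(prefix)] != prefix:
            break  # entries are sorted, so no later entry can match
        row_ids.append(entry[-1])
    return row_ids


def scan_by_first_name(rows, first_name):
    """Fallback when no leading index column is known: visit every row."""
    return [
        row_id
        for row_id, row in sorted(rows.items())  # row id order, like a table scan
        if row[0] == first_name
    ]
```

In this model, `lookup_by_prefix(INDEX_PLACE, ("Berne",))` and `lookup_by_prefix(INDEX_PLACE, ("Berne", "Miller"))` can both jump straight to the matching entries because the known values form a leading prefix of the sort key. A search by first name alone has no such starting point, so `scan_by_first_name(ROWS, "John")` has to look at every row, which corresponds to the `tableScan` plan in the last `EXPLAIN` example.
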
--- a/h2/src/docsrc/html/roadmap.html
+++ b/h2/src/docsrc/html/roadmap.html
@@ -557,6 +557,7 @@ See also <a href="build.html#providing_patches">Providing Patches</a>.
 </li><li>Compatibility with IBM DB2: SQL cursors.
 </li><li>Single-column primary key values are always stored explicitly. This is not required.
 </li><li>Compatibility with MySQL: support CREATE TABLE TEST(NAME VARCHAR(255) CHARACTER SET UTF8).
+</li><li>CALL is incompatible with other databases because it returns a result set, so that CallableStatement.execute() returns true.
 </li></ul>

 <h2>Not Planned</h2>