advanced.html 47.1 KB
Newer Older
1
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
2 3
<!--
Copyright 2004-2007 H2 Group. Licensed under the H2 License, Version 1.0 (http://h2database.com/html/license.html).
4 5
Initial Developer: H2 Group
-->
6 7
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head><meta http-equiv="Content-Type" content="text/html;charset=utf-8" /><title>
8
Advanced Topics
9
</title><link rel="stylesheet" type="text/css" href="stylesheet.css" />
10 11 12 13 14
<script type="text/javascript" src="navigation.js"></script>
</head><body onload="frameMe();">
<table class="content"><tr class="content"><td class="content"><div class="contentDiv">

<h1>Advanced Topics</h1>
15
<a href="#result_sets">
16
    Result Sets</a><br />
17
<a href="#large_objects">
18
    Large Objects</a><br />
19
<a href="#linked_tables">
20
    Linked Tables</a><br />
21
<a href="#transaction_isolation">
22
    Transaction Isolation</a><br />
23 24
<a href="#mvcc">
    Multi-Version Concurrency Control (MVCC)</a><br />
25
<a href="#clustering">
26
    Clustering / High Availability</a><br />
27
<a href="#two_phase_commit">
28
    Two Phase Commit</a><br />
29
<a href="#compatibility">
30
    Compatibility</a><br />
31
<a href="#windows_service">
32
    Run as Windows Service</a><br />
33
<a href="#odbc_driver">
34
    ODBC Driver</a><br />
35
<a href="#acid">
36
    ACID</a><br />
37
<a href="#durability_problems">
38
    Durability Problems</a><br />
39
<a href="#using_recover_tool">
40
    Using the Recover Tool</a><br />
41
<a href="#file_locking_protocols">
42
    File Locking Protocols</a><br />
43
<a href="#sql_injection">
44
    Protection against SQL Injection</a><br />
45 46
<a href="#restricting_classes">
    Restricting Class Loading and Usage</a><br />
47
<a href="#security_protocols">
48
    Security Protocols</a><br />
49
<a href="#uuid">
50
    Universally Unique Identifiers (UUID)</a><br />
51
<a href="#system_properties">
52
    Settings Read from System Properties</a><br />
53 54
<a href="#server_bind_address">
    Setting the Server Bind Address</a><br />
55
<a href="#glossary_links">
56
    Glossary and Links</a><br />
57

58
<br /><a name="result_sets"></a>
59 60 61
<h2>Result Sets</h2>

<h3>Limiting the Number of Rows</h3>
62
<p>
63 64 65 66 67 68
Before the result is returned to the application, all rows are read by the database.
Server side cursors are not supported currently.
If only the first few rows are interesting for the application, then the
result set size should be limited to improve the performance.
This can be done using LIMIT in a query (example: SELECT * FROM TEST LIMIT 100),
or by using Statement.setMaxRows(max).
69
</p>
70 71

<h3>Large Result Sets and External Sorting</h3>
72
<p>
73 74 75
For result set larger than 1000 rows, the result is buffered to disk. If ORDER BY is used,
the sorting is done using an external sort algorithm. In this case, each block of rows is sorted using
quick sort, then written to disk; when reading the data, the blocks are merged together.
76
</p>
77

78
<br /><a name="large_objects"></a>
79 80 81
<h2>Large Objects</h2>

<h3>Storing and Reading Large Objects</h3>
82
<p>
83 84 85 86 87 88 89 90
If it is possible that the objects don't fit into memory, then the data type
CLOB (for textual data) or BLOB (for binary data) should be used.
For these data types, the objects are not fully read into memory, by using streams.
To store a BLOB, use PreparedStatement.setBinaryStream. To store a CLOB, use
PreparedStatement.setCharacterStream. To read a BLOB, use ResultSet.getBinaryStream,
and to read a CLOB, use ResultSet.getCharacterStream.
If the client/server mode is used, the BLOB and CLOB data is fully read into memory when
accessed. In this case, the size of a BLOB or CLOB is limited by the memory.
91
</p>
92

93
<br /><a name="linked_tables"></a>
94
<h2>Linked Tables</h2>
95
<p>
96 97
This database supports linked tables, which means tables that don't exist in the current database but
are just links to another database. To create such a link, use the CREATE LINKED TABLE statement:
98
</p>
99 100 101
<pre>
CREATE LINKED TABLE LINK('org.postgresql.Driver', 'jdbc:postgresql:test', 'sa', 'sa', 'TEST');
</pre>
102
<p>
103
It is then possible to access the table in the usual way.
104
There is a restriction when inserting data to this table: When inserting or updating rows into the table,
105
NULL and values that are not set in the insert statement are both inserted as NULL.
106
This may not have the desired effect if a default value in the target table is other than NULL.
107
</p>
108 109 110 111 112 113 114 115 116
<p>
For each linked table a new connection is opened. This can be a problem for some databases when using
many linked tables. For Oracle XE, the maximum number of connection can be increased.
Oracle XE needs to be restarted after changing these values:
</p>
<pre>
alter system set processes=100 scope=spfile;
alter system set sessions=100 scope=spfile;
</pre>
117

118
<br /><a name="transaction_isolation"></a>
119
<h2>Transaction Isolation</h2>
120 121 122 123
<p>
This database supports the following transaction isolation levels:
</p>
<ul>
124
<li><b>Read Committed</b><br />
125
    This is the default level.
126
    Read locks are released immediately.
127
    Higher concurrency is possible when using this level.<br />
128
    To enable, execute the SQL statement    'SET LOCK_MODE 3'<br />
129
    or append ;LOCK_MODE=3 to the database URL: jdbc:h2:~/test;LOCK_MODE=3
130 131 132 133
</li><li>
<b>Serializable</b><br />
    To enable, execute the SQL statement    'SET LOCK_MODE 1'<br />
    or append ;LOCK_MODE=1 to the database URL: jdbc:h2:~/test;LOCK_MODE=1
134
</li><li><b>Read Uncommitted</b><br />
135 136 137
    This level means that transaction isolation is disabled.<br />
    To enable, execute the SQL statement    'SET LOCK_MODE 0'<br />
    or append ;LOCK_MODE=0 to the database URL: jdbc:h2:~/test;LOCK_MODE=0
138 139 140 141 142
</li>
</ul>
<p>
When using the isolation level 'serializable', dirty reads, non-repeatable reads, and phantom reads are prohibited.
</p>
143
<ul>
144
<li><b>Dirty Reads</b><br />
145 146
    Means a connection can read uncommitted changes made by another connection.<br />
    Possible with: read uncommitted
147
</li><li><b>Non-Repeatable Reads</b><br />
148
    A connection reads a row, another connection changes a row and commits,
149 150
    and the first connection re-reads the same row and gets the new result.<br />
    Possible with: read uncommitted, read committed
151
</li><li><b>Phantom Reads</b><br />
152 153
    A connection reads a set of rows using a condition, another connection
    inserts a row that falls in this condition and commits, then the first connection
154
    re-reads using the same condition and gets the new row.<br />
155
    Possible with: read uncommitted, read committed
156 157
</li>
</ul>
158 159

<h3>Table Level Locking</h3>
160
<p>
161
The database allows multiple concurrent connections to the same database.
162
To make sure all connections only see consistent data, table level locking is used by default.
163 164 165 166 167 168 169 170 171 172
This mechanism does not allow high concurrency, but is very fast.
Shared locks and exclusive locks are supported.
Before reading from a table, the database tries to add a shared lock to the table
(this is only possible if there is no exclusive lock on the object by another connection).
If the shared lock is added successfully, the table can be read. It is allowed that
other connections also have a shared lock on the same object. If a connection wants
to write to a table (update or delete a row), an exclusive lock is required. To get the
exclusive lock, other connection must not have any locks on the object. After the
connection commits, all locks are released.
This database keeps all locks in memory.
173
</p>
174 175

<h3>Lock Timeout</h3>
176
<p>
177 178 179 180 181 182
If a connection cannot get a lock on an object, the connection waits for some amount
of time (the lock timeout). During this time, hopefully the connection holding the
lock commits and it is then possible to get the lock. If this is not possible because
the other connection does not release the lock for some time, the unsuccessful
connection will get a lock timeout exception. The lock timeout can be set individually
for each connection.
183
</p>
184

185 186 187
<br /><a name="mvcc"></a>
<h2>Multi-Version Concurrency Control (MVCC)</h2>
<p>
188 189 190
The MVCC feature allows higher concurrency than using (table level or row level) locks.
When using MVCC in this database, delete, insert and update operations will only issue a
shared lock on the table. Table are still locked exclusively when adding or removing columns,
191 192 193 194 195
when dropping the table, and when using SELECT ... FOR UPDATE. Connections
only 'see' committed data, and own changes. That means, if connection A updates
a row but doesn't commit this change yet, connection B will see the old value.
Only when the change is committed, the new value is visible by other connections
(read committed). If multiple connections concurrently try to update the same row, this
196
database fails fast: a concurrent update exception is thrown.
197 198 199 200 201 202 203 204
</p>
<p>
To use the MVCC feature, append MVCC=TRUE to the database URL:
<pre>
jdbc:h2:~/test;MVCC=TRUE
</pre>
</p>

205
<br /><a name="clustering"></a>
206
<h2>Clustering / High Availability</h2>
207
<p>
208 209 210 211 212 213
This database supports a simple clustering / high availability mechanism. The architecture is:
two database servers run on two different computers, and on both computers is a copy of the
same database. If both servers run, each database operation is executed on both computers.
If one server fails (power, hardware or network failure), the other server can still continue to work.
From this point on, the operations will be executed only on one server until the other server
is back up.
214
</p><p>
215 216 217 218
Clustering can only be used in the server mode (the embedded mode does not support clustering).
It is possible to restore the cluster without stopping the server, however it is critical that no other
application is changing the data in the first database while the second database is restored, so
restoring the cluster is currently a manual process.
219
</p><p>
220
To initialize the cluster, use the following steps:
221
</p>
222 223
<ul>
<li>Create a database
224
</li><li>Use the CreateCluster tool to copy the database to another location and initialize the clustering.
225
    Afterwards, you have two databases containing the same data.
226 227 228
</li><li>Start two servers (one for each copy of the database)
</li><li>You are now ready to connect to the databases with the client application(s)
</li></ul>
229 230

<h3>Using the CreateCluster Tool</h3>
231
<p>
232 233 234
To understand how clustering works, please try out the following example.
In this example, the two databases reside on the same computer, but usually, the
databases will be on different servers.
235
</p>
236 237 238
<ul>
<li>Create two directories: server1 and server2.
    Each directory will simulate a directory on a computer.
239
</li><li>Start a TCP server pointing to the first directory.
240
    You can do this using the command line:
241
<pre>
242 243 244 245
java org.h2.tools.Server
    -tcp -tcpPort 9101
    -baseDir server1
</pre>
246
</li><li>Start a second TCP server pointing to the second directory.
247 248
    This will simulate a server running on a second (redundant) computer.
    You can do this using the command line:
249
<pre>
250 251 252 253
java org.h2.tools.Server
    -tcp -tcpPort 9102
    -baseDir server2
</pre>
254
</li><li>Use the CreateCluster tool to initialize clustering.
255 256
    This will automatically create a new, empty database if it does not exist.
    Run the tool on the command line:
257
<pre>
258 259 260 261 262 263
java org.h2.tools.CreateCluster
  -urlSource jdbc:h2:tcp://localhost:9101/test
  -urlTarget jdbc:h2:tcp://localhost:9102/test
  -user sa
  -serverlist localhost:9101,localhost:9102
</pre>
264
</li><li>You can now connect to the databases using
265 266
an application or the H2 Console using the JDBC URL
jdbc:h2:tcp://localhost:9101,localhost:9102/test
267
</li><li>If you stop a server (by killing the process),
268 269
you will notice that the other machine continues to work,
and therefore the database is still accessible.
270
</li><li>To restore the cluster, you first need to delete the
271 272
database that failed, then restart the server that was stopped,
and re-run the CreateCluster tool.
273
</li></ul>
274

275
<h3>Clustering Algorithm and Limitations</h3>
276
<p>
277 278 279 280 281 282 283
Read-only queries are only executed against the first cluster node, but all other statements are
executed against all nodes. There is currently no load balancing made to avoid problems with
transactions. The following functions may yield different results on different cluster nodes and must be
executed with care: RANDOM_UUID(), SECURE_RAND(), SESSION_ID(), MEMORY_FREE(), MEMORY_USED(),
CSVREAD(), CSVWRITE(), RAND() [when not using a seed]. Those functions should not be used
directly in modifying statements (for example INSERT, UPDATE, or MERGE). However, they can be used
in read-only statements and the result can then be used for modifying statements.
284
</p>
285

286
<br /><a name="two_phase_commit"></a>
287
<h2>Two Phase Commit</h2>
288
<p>
289
The two phase commit protocol is supported. 2-phase-commit works as follows:
290
</p>
291 292
<ul>
<li>Autocommit needs to be switched off
293 294
</li><li>A transaction is started, for example by inserting a row
</li><li>The transaction is marked 'prepared' by executing the SQL statement
295 296
    <code>PREPARE COMMIT transactionName</code>
</li><li>The transaction can now be committed or rolled back
297
</li><li>If a problem occurs before the transaction was successfully committed or rolled back
298
    (for example because a network problem occurred), the transaction is in the state 'in-doubt'
299
</li><li>When re-connecting to the database, the in-doubt transactions can be listed
300 301 302
    with <code>SELECT * FROM INFORMATION_SCHEMA.IN_DOUBT</code>
</li><li>Each transaction in this list must now be committed or rolled back by executing
    <code>COMMIT TRANSACTION transactionName</code> or
303
    <code>ROLLBACK TRANSACTION transactionName</code>
304
</li><li>The database needs to be closed and re-opened to apply the changes
305
</li></ul>
306

307
<br /><a name="compatibility"></a>
308
<h2>Compatibility</h2>
309
<p>
310 311
This database is (up to a certain point) compatible to other databases such as HSQLDB, MySQL and PostgreSQL.
There are certain areas where H2 is incompatible.
312
</p>
313 314

<h3>Transaction Commit when Autocommit is On</h3>
315
<p>
316 317 318
At this time, this database engine commits a transaction (if autocommit is switched on) just before returning the result.
For a query, this means the transaction is committed even before the application scans through the result set, and before the result set is closed.
Other database engines may commit the transaction in this case when the result set is closed.
319
</p>
320 321

<h3>Keywords / Reserved Words</h3>
322
<p>
323 324
There is a list of keywords that can't be used as identifiers (table names, column names and so on),
unless they are quoted (surrounded with double quotes). The list is currently:
325
</p><p>
326 327 328
CURRENT_TIMESTAMP, CURRENT_TIME, CURRENT_DATE, CROSS, DISTINCT, EXCEPT, EXISTS, FROM,
FOR, FALSE, FULL, GROUP, HAVING, INNER, INTERSECT, IS, JOIN, LIKE, MINUS, NATURAL, NOT, NULL,
ON, ORDER, PRIMARY, ROWNUM, SELECT, SYSDATE, SYSTIME, SYSTIMESTAMP, TODAY, TRUE, UNION, WHERE
329
</p><p>
330 331
Certain words of this list are keywords because they are functions that can be used without '()' for compatibility,
for example CURRENT_TIMESTAMP.
332
</p>
333

334
<br /><a name="windows_service"></a>
335
<h2>Run as Windows Service</h2>
336
<p>
337
Using a native wrapper / adapter, Java applications can be run as a Windows Service.
338 339 340 341 342
There are various tools available to do that. The Java Service Wrapper from Tanuki Software, Inc.
(<a href="http://wrapper.tanukisoftware.org">http://wrapper.tanukisoftware.org</a>)
is included in the installation. Batch files are provided to install, start, stop and uninstall the H2 Database Engine Service.
This service contains the TCP Server and the H2 Console web application.
The batch files are located in the directory H2/service.
343
</p>
344 345

<h3>Install the Service</h3>
346
<p>
347
The service needs to be registered as a Windows Service first.
348
To do that, double click on 1_install_service.bat.
349
If successful, a command prompt window will pop up and disappear immediately. If not, a message will appear.
350
</p>
351 352

<h3>Start the Service</h3>
353
<p>
354
You can start the H2 Database Engine Service using the service manager of Windows,
355
or by double clicking on 2_start_service.bat.
356
Please note that the batch file does not print an error message if the service is not installed.
357
</p>
358 359

<h3>Connect to the H2 Console</h3>
360
<p>
361 362 363
After installing and starting the service, you can connect to the H2 Console application using a browser.
Double clicking on 3_start_browser.bat to do that. The
default port (8082) is hard coded in the batch file.
364
</p>
365 366

<h3>Stop the Service</h3>
367
<p>
368 369
To stop the service, double click on 4_stop_service.bat.
Please note that the batch file does not print an error message if the service is not installed or started.
370
</p>
371 372

<h3>Uninstall the Service</h3>
373
<p>
374 375
To uninstall the service, double click on 5_uninstall_service.bat.
If successful, a command prompt window will pop up and disappear immediately. If not, a message will appear.
376
</p>
377

378
<br /><a name="odbc_driver"></a>
379
<h2>ODBC Driver</h2>
380
<p>
381
This database does not come with its own ODBC driver at this time,
382 383 384 385
but it supports the PostgreSQL network protocol.
Therefore, the PostgreSQL ODBC driver can be used.
Support for the PostgreSQL network protocol is quite new and should be viewed
as experimental. It should not be used for production applications.
386
</p>
387 388 389 390 391
<p>
At this time, the PostgreSQL ODBC driver does not work on 64 bit versions of Windows.
For more information, see:
<a href="http://svr5.postgresql.org/pgsql-odbc/2005-09/msg00127.php">ODBC Driver on Windows 64 bit</a>
</p>
392 393

<h3>ODBC Installation</h3>
394
<p>
395 396 397 398
First, the ODBC driver must be installed.
Any recent PostgreSQL ODBC driver should work, however version 8.2.4 or newer is recommended.
The Windows version of the PostgreSQL ODBC driver is available at
<a href="http://www.postgresql.org/ftp/odbc/versions/msi">http://www.postgresql.org/ftp/odbc/versions/msi</a>.
399
</p>
400

401 402 403 404 405 406 407 408 409 410 411
<h3>Starting the Server</h3>
<p>
After installing the ODBC driver, start the H2 Server using the command line:
<pre>
java -cp h2.jar org.h2.tools.Server
</pre>
The PG Server (PG for PostgreSQL protocol) is started as well.
By default, databases are stored in the current working directory where the server is started.
Use -baseDir to save databases in another directory, for example the user home directory:
<pre>
java -cp h2.jar org.h2.tools.Server -baseDir ~
412
</pre>
413 414 415 416 417
The PG server can be started and stopped from within a Java application as follows:
<pre>
Server server = Server.createPgServer(new String[]{"-baseDir", "~"});
server.start();
...
418
server.stop();
419 420 421 422 423 424 425
</pre>
By default, only connections from localhost are allowed. To allow remote connections, use
<code>-pgAllowOthers true</code> when starting the server.
</p>

<h3>ODBC Configuration</h3>
<p>
426
After installing the driver, a new Data Source must be added. In Windows,
427 428 429 430 431
run <code>odbcad32.exe</code> to open the Data Source Administrator. Then click on 'Add...'
and select the PostgreSQL Unicode driver. Then click 'Finish'.
You will be able to change the connection properties:
</p>
<table>
432
<tr><th>Property</th><th>Example</th><th>Remarks</th></tr>
433 434
<tr><td>Data Source</td><td>H2 Test</td><td>The name of the ODBC Data Source</td></tr>
<tr><td>Database</td><td>test</td>
435 436 437 438 439 440 441
    <td>
        The database name. Only simple names are supported at this time; <br />
        relative or absolute path are not supported in the database name. <br />
        By default, the database is stored in the current working directory <br />
        where the Server is started except when the -baseDir setting is used. <br />
        The name must be at least 3 characters.
    </td></tr>
442 443 444 445 446 447 448 449 450 451 452 453
<tr><td>Server</td><td>localhost</td><td>The server name or IP address.<br />By default, only remote connections are allowed</td></tr>
<tr><td>User Name</td><td>sa</td><td>The database user name.</td></tr>
<tr><td>SSL Mode</td><td>disabled</td><td>At this time, SSL is not supported.</td></tr>
<tr><td>Port</td><td>5435</td><td>The port where the PG Server is listening.</td></tr>
<tr><td>Password</td><td>sa</td><td>The database password.</td></tr>
</table>
<p>
Afterwards, you may use this data source.
</p>

<h3>PG Protocol Support Limitations</h3>
<p>
454
At this time, only a subset of the PostgreSQL network protocol is implemented.
455 456 457 458
Also, there may be compatibility problems on the SQL level, with the catalog, or with text encoding.
Problems are fixed as they are found.
Currently, statements can not be cancelled when using the PG protocol.
</p>
459 460

<h3>Security Considerations</h3>
461
<p>
462 463
Currently, the PG Server does not support challenge response or encrypt passwords.
This may be a problem if an attacker can listen to the data transferred between the ODBC driver
464 465 466
and the server, because the password is readable to the attacker.
Also, it is currently not possible to use encrypted SSL connections.
Therefore the ODBC driver should not be used where security is important.
467
</p>
468

469
<br /><a name="acid"></a>
470
<h2>ACID</h2>
471
<p>
472
In the database world, ACID stands for:
473
</p>
474
<ul>
475
<li>Atomicity: Transactions must be atomic, meaning either all tasks are performed or none.
476
</li><li>Consistency: All operations must comply with the defined constraints.
477
</li><li>Isolation: Transactions must be isolated from each other.
478
</li><li>Durability: Committed transaction will not be lost.
479
</li></ul>
480 481

<h3>Atomicity</h3>
482
<p>
483
Transactions in this database are always atomic.
484
</p>
485 486

<h3>Consistency</h3>
487
<p>
488 489
This database is always in a consistent state.
Referential integrity rules are always enforced.
490
</p>
491 492

<h3>Isolation</h3>
493
<p>
494
For H2, as with most other database systems, the default isolation level is 'read committed'.
495 496
This provides better performance, but also means that transactions are not completely isolated.
H2 supports the transaction isolation levels 'serializable', 'read committed', and 'read uncommitted'.
497
</p>
498 499

<h3>Durability</h3>
500
<p>
501
This database does not guarantee that all committed transactions survive a power failure.
502
Tests show that all databases sometimes lose transactions on power failure (for details, see below).
503 504
Where losing transactions is not acceptable, a laptop or UPS (uninterruptible power supply) should be used.
If durability is required for all possible cases of hardware failure, clustering should be used,
505
such as the H2 clustering mode.
506
</p>
507

508 509
<br /><a name="durability_problems"></a>
<h2>Durability Problems</h2>
510 511
<p>
Complete durability means all committed transaction survive a power failure.
512
Some databases claim they can guarantee durability, but such claims are wrong.
513
A durability test was run against H2, HSQLDB, PostgreSQL, and Derby.
514
All of those databases sometimes lose committed transactions.
515
The test is included in the H2 download, see org.h2.test.poweroff.Test.
516 517
</p>

518 519
<h3>Ways to (Not) Achieve Durability</h3>
<p>
520 521
Making sure that committed transaction are not lost is more complicated than it seems first.
To guarantee complete durability, a database must ensure that the log record is on the hard drive
522
before the commit call returns. To do that, databases use different methods. One
523 524
is to use the 'synchronous write' file access mode. In Java, RandomAccessFile
supports the modes "rws" and "rwd":
525 526
</p>
<ul>
527
<li>rwd: Every update to the file's content is written synchronously to the underlying storage device.
528
</li><li>rws: In addition to rwd, every update to the metadata is written synchronously.</li>
529 530
</ul>
<p>
531 532
This feature is used by Derby.
A test (org.h2.test.poweroff.TestWrite) with one of those modes achieves around 50 thousand write operations per second.
533
Even when the operating system write buffer is disabled, the write rate is around 50 thousand operations per second.
534
This feature does not force changes to disk because it does not flush all buffers.
535
The test updates the same byte in the file again and again. If the hard drive was able to write at this rate,
536
then the disk would need to make at least 50 thousand revolutions per second, or 3 million RPM
537
(revolutions per minute). There are no such hard drives. The hard drive used for the test is about 7200 RPM,
538
or about 120 revolutions per second. There is an overhead, so the maximum write rate must be lower than that.
539 540
</p>
<p>
541
Buffers can be flushed by calling the function fsync. There are two ways to do that in Java:
542
</p>
543
<ul>
544
<li>FileDescriptor.sync(). The documentation says that this forces all system buffers to synchronize with the underlying device.
545
Sync is supposed to return after all in-memory modified copies of buffers associated with this FileDescriptor
546
have been written to the physical medium.
547
</li><li>FileChannel.force() (since JDK 1.4). This method is supposed to force any updates to this channel's file
548
to be written to the storage device that contains it.
549
</li></ul>
550
<p>
551
By default, MySQL calls fsync for each commit. When using one of those methods, only around 60 write operations
552
per second can be achieved, which is consistent with the RPM rate of the hard drive used.
553
Unfortunately, even when calling FileDescriptor.sync() or FileChannel.force(),
554
data is not always persisted to the hard drive, because most hard drives do not obey
555 556 557 558 559 560 561 562 563 564 565
fsync(): see 'Your Hard Drive Lies to You' at http://hardware.slashdot.org/article.pl?sid=05/05/13/0529252.
In Mac OS X fsync does not flush hard drive buffers:
http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072.html.
So the situation is confusing, and tests prove there is a problem.
</p>
<p>
Trying to flush hard drive buffers hard, and if you do the performance is very bad.
First you need to make sure that the hard drive actually flushes all buffers.
Tests show that this can not be done in a reliable way.
Then the maximum number of transactions is around 60 per second.
Because of those reasons, the default behavior of H2 is to delay writing committed transactions.
566 567
</p>
<p>
568 569
In H2, after a power failure, a bit more than one second of committed transactions may be lost.
To change the behavior, use SET WRITE_DELAY and CHECKPOINT SYNC.
570
Most other databases support commit delay as well.
571
In the performance comparison, commit delay was used for all databases that support it.
572
</p>
573

574
<h3>Running the Durability Test</h3>
575
<p>
576 577
To test the durability / non-durability of this and other databases, you can use the test application
in the package org.h2.test.poweroff. Two computers with network connection are required to run this test.
578
One computer just listens, while the test application is run (and power is cut) on the other computer.
579 580 581 582
The computer with the listener application opens a TCP/IP port and listens for an incoming connection.
The second computer first connects to the listener, and then created the databases and starts inserting
records. The connection is set to 'autocommit', which means after each inserted record a commit is performed
automatically. Afterwards, the test computer notifies the listener that this record was inserted successfully.
583 584 585
The listener computer displays the last inserted record number every 10 seconds. Now, switch off the power
manually, then restart the computer, and run the application again. You will find out that in most cases,
none of the databases contains all the records that the listener computer knows about. For details, please
586
consult the source code of the listener and test application.
587
</p>
588

589
<br /><a name="using_recover_tool"></a>
590
<h2>Using the Recover Tool</h2>
591
<p>
592 593 594 595 596 597 598 599 600 601
The recover tool can be used to extract the contents of a data file, even if the database is corrupted.
At this time, it does not extract the content of the log file or large objects (CLOB or BLOB).
To run the tool, type on the command line:
<pre>
java org.h2.tools.Recover
</pre>
For each database in the current directory, a text file will be created.
This file contains raw insert statement (for the data) and data definition (DDL) statement to recreate
the schema of the database. This file cannot be executed directly, as the raw insert statements
don't have the correct table names, so the file needs to be pre-processed manually before executing.
602
</p>
603

604
<br /><a name="file_locking_protocols"></a>
605
<h2>File Locking Protocols</h2>
606
<p>
607 608 609
Whenever a database is opened, a lock file is created to signal other processes
that the database is in use. If database is closed, or if the process that opened
the database terminates, this lock file is deleted.
610
</p><p>
611 612 613 614 615 616 617 618
In special cases (if the process did not terminate normally, for example because
there was a blackout), the lock file is not deleted by the process that created it.
That means the existence of the lock file is not a safe protocol for file locking.
However, this software uses a challenge-response protocol to protect the database
files. There are two methods (algorithms) implemented to provide both security
(that is, the same database files cannot be opened by two processes at the same time)
and simplicity (that is, the lock file does not need to be deleted manually by the user).
The two methods are 'file method' and 'socket methods'.
619
</p>
620 621

<h3>File Locking Method 'File'</h3>
622
<p>
623
The default method for database file locking is the 'File Method'. The algorithm is:
624
</p>
625 626 627 628 629 630
<ul>
<li>When the lock file does not exist, it is created (using the atomic operation File.createNewFile).
Then, the process waits a little bit (20ms) and checks the file again. If the file was changed
during this time, the operation is aborted. This protects against a race condition
when a process deletes the lock file just after one create it, and a third process creates
the file again. It does not occur if there are only two writers.
631
</li><li>
632 633 634 635 636 637 638 639
If the file can be created, a random number is inserted together with the locking method
('file'). Afterwards, a watchdog thread is started that
checks regularly (every second once by default) if the file was deleted or modified by
another (challenger) thread / process. Whenever that occurs, the file is overwritten with the
old data. The watchdog thread runs with high priority so that a change to the lock file does
not get through undetected even if the system is very busy. However, the watchdog thread
does use very little resources (CPU time), because it waits most of the time. Also, the watchdog only reads from the hard disk
and does not write to it.
640
</li><li>
641 642 643 644 645 646 647 648 649
If the lock file exists, and it was modified in the 20 ms, the process waits for some time (up to 10 times).
If it was still changed, an exception is thrown (database is locked). This is done to eliminate race conditions with many concurrent
writers. Afterwards, the file is overwritten with a new version (challenge).
After that, the thread waits for 2 seconds.
If there is a watchdog thread protecting the file, he will overwrite the change
and this process will fail to lock the database.
However, if there is no watchdog thread, the lock file will still be as written by
this thread. In this case, the file is deleted and atomically created again.
The watchdog thread is started in this case and the file is locked.
650
</li></ul>
651 652 653 654 655 656 657
<p>
This algorithm is tested with over 100 concurrent threads. In some cases, when there are
many concurrent threads trying to lock the database, they block each other (meaning
the file cannot be locked by any of them) for some time. However, the file never gets
locked by two threads at the same time. However using that many concurrent threads
/ processes is not the common use case. Generally, an application should throw an error
to the user if it cannot open a database, and not try again in a (fast) loop.
658
</p>
659 660

<h3>File Locking Method 'Socket'</h3>
661
<p>
662 663
There is a second locking mechanism implemented, but disabled by default.
The algorithm is:
664
</p>
665 666 667 668 669
<ul>
<li>If the lock file does not exist, it is created.
Then a server socket is opened on a defined port, and kept open.
The port and IP address of the process that opened the database is written
into the lock file.
670
</li><li>If the lock file exists, and the lock method is 'file', then the software switches
671
to the 'file' method.
672
</li><li>If the lock file exists, and the lock method is 'socket', then the process
673 674 675 676
checks if the port is in use. If the original process is still running, the port is in use
and this process throws an exception (database is in use). If the original process
died (for example due to a blackout, or abnormal termination of the virtual machine),
then the port was released. The new process deletes the lock file and starts again.
677
</li></ul>
678
<p>
679 680 681 682
This method does not require a watchdog thread actively polling (reading) the same
file every second. The problem with this method is, if the file is stored on a network
share, two processes (running on different computers) could still open the same
database files, if they do not have a direct TCP/IP connection.
683
</p>
684

685
<br /><a name="sql_injection"></a>
686 687
<h2>Protection against SQL Injection</h2>
<h3>What is SQL Injection</h3>
688
<p>
689 690
This database engine provides a solution for the security vulnerability known as 'SQL Injection'.
Here is a short description of what SQL injection means.
691
Some applications build SQL statements with embedded user input such as:
692
</p>
693 694 695 696
<pre>
String sql = "SELECT * FROM USERS WHERE PASSWORD='"+pwd+"'";
ResultSet rs = conn.createStatement().executeQuery(sql);
</pre>
697
<p>
698
If this mechanism is used anywhere in the application, and user input is not correctly filtered or encoded,
699
it is possible for a user to inject SQL functionality or statements by using specially built input
700
such as (in this example) this password: ' OR ''='. In this case the statement becomes:
701
</p>
702 703 704
<pre>
SELECT * FROM USERS WHERE PASSWORD='' OR ''='';
</pre>
705
<p>
706
Which is always true no matter what the password stored in the database is.
707
For more information about SQL Injection, see Glossary and Links.
708
</p>
709 710

<h3>Disabling Literals</h3>
711
<p>
712 713
SQL Injection is not possible if user input is not directly embedded in SQL statements.
A simple solution for the problem above is to use a PreparedStatement:
714
</p>
715 716 717 718 719 720
<pre>
String sql = "SELECT * FROM USERS WHERE PASSWORD=?";
PreparedStatement prep = conn.prepareStatement(sql);
prep.setString(1, pwd);
ResultSet rs = prep.executeQuery();
</pre>
721
<p>
722 723 724
This database provides a way to enforce usage of parameters when passing user input
to the database. This is done by disabling embedded literals in SQL statements.
To do this, execute the statement:
725
</p>
726 727 728
<pre>
SET ALLOW_LITERALS NONE;
</pre>
729
<p>
730 731 732 733 734 735 736 737
Afterwards, SQL statements with text and number literals are not allowed any more.
That means, SQL statement of the form WHERE NAME='abc' or WHERE CustomerId=10 will fail.
It is still possible to use PreparedStatements and parameters as described above. Also, it is still possible to generate
SQL statements dynamically, and use the Statement API, as long as the SQL statements
do not include literals.
There is also a second mode where number literals are allowed: SET ALLOW_LITERALS NUMBERS.
To allow all literals, execute SET ALLOW_LITERALS ALL (this is the default setting).
Literals can only be enabled or disabled by an administrator.
738
</p>
739 740

<h3>Using Constants</h3>
741
<p>
742
Disabling literals also means disabling hard-coded 'constant' literals. This database supports
743
defining constants using the CREATE CONSTANT command. Constants can be defined only
744 745
when literals are enabled, but used even when literals are disabled. To avoid name clashes
with column names, constants can be defined in other schemas:
746
</p>
747 748 749 750 751 752
<pre>
CREATE SCHEMA CONST AUTHORIZATION SA;
CREATE CONSTANT CONST.ACTIVE VALUE 'Active';
CREATE CONSTANT CONST.INACTIVE VALUE 'Inactive';
SELECT * FROM USERS WHERE TYPE=CONST.ACTIVE;
</pre>
753
<p>
754 755 756
Even when literals are enabled, it is better to use constants instead
of hard-coded number or text literals in queries or views. With constants, typos are found at compile
time, the source code is easier to understand and change.
757
</p>
758 759

<h3>Using the ZERO() Function</h3>
760
<p>
761
It is not required to create a constant for the number 0 as there is already a built-in function ZERO():
762
</p>
763 764 765 766
<pre>
SELECT * FROM USERS WHERE LENGTH(PASSWORD)=ZERO();
</pre>

767 768 769 770 771 772 773 774 775 776 777 778 779 780 781 782 783 784 785 786 787 788 789
<br /><a name="restricting_classes"></a>
<h2>Restricting Class Loading and Usage</h2>
<p>
By default there is no restriction on loading classes and executing Java code for admins.
That means an admin may call system functions such as System.setProperty by executing:
<pre>
CREATE ALIAS SET_PROPERTY FOR "java.lang.System.setProperty";
CALL SET_PROPERTY('abc', '1');
CREATE ALIAS GET_PROPERTY FOR "java.lang.System.getProperty";
CALL GET_PROPERTY('abc');
</pre>
To restrict users (including admins) from loading classes and executing code,
the list of allowed classes can be set in the system property h2.allowedClasses
in the form of a comma separated list of classes or patterns (items ending with '*').
By default all classes are allowed. Example:
<pre>
java -Dh2.allowedClasses=java.lang.Math,com.acme.*
</pre>
This mechanism is used for all user classes, including database event listeners,
trigger classes, user defined functions, user defined aggregate functions, and JDBC
driver classes (with the exception of the H2 driver) when using the H2 Console.
</p>

790
<br /><a name="security_protocols"></a>
791
<h2>Security Protocols</h2>
792
<p>
793 794 795
The following paragraphs document the security protocols used in this database.
These descriptions are very technical and only intended for security experts that already know
the underlying security primitives.
796
</p>
797 798

<h3>User Password Encryption</h3>
799
<p>
800 801 802 803 804 805 806 807 808 809 810
When a user tries to connect to a database, the combination of
user name, @, and password hashed using SHA-256, and this hash value
is transmitted to the database.
This step does not try to an attacker from re-using the value if he is able to listen to the
(unencrypted) transmission between the client and the server.
But, the passwords are never transmitted as plain text,
even when using an unencrypted connection between client and server.
That means if a user reuses the same password for different things,
this password is still protected up to some point. See also
'RFC 2617 - HTTP Authentication: Basic and Digest Access Authentication'
for more information.
811
</p><p>
812 813 814 815
When a new database or user is created, a new cryptographically secure
random salt value is generated. The size of the salt is 64 bit.
Using the random salt reduces the risk of an attacker pre-calculating hash values
for many different (commonly used) passwords.
816
</p><p>
817 818 819 820 821 822 823 824 825 826 827
The combination of user-password hash value (see above) and salt is hashed
using SHA-256. The resulting value is stored in the database.
When a user tries to connect to the database, the database combines
user-password hash value with the stored salt value and calculated the
hash value. Other products use multiple iterations (hash the hash value again and again),
but this is not done in this product to reduce the risk of denial of service attacks
(where the attacker tries to connect with bogus passwords, and the server
spends a lot of time calculating the hash value for each password).
The reasoning is: if the attacker has access to the hashed passwords, he also has
access to the data in plain text, and therefore does not need the password any more.
If the data is protected by storing it on another computer and only remotely, then the iteration count is not required at all.
828
</p>
829 830

<h3>File Encryption</h3>
831
<p>
832 833 834 835
The database files can be encrypted using two different algorithms: AES-128 and
XTEA (using 32 rounds). The reasons for supporting XTEA is performance
(XTEA is about twice as fast as AES) and to have an alternative algorithm if
AES is suddenly broken.
836
</p><p>
837 838 839
When a user tries to connect to an encrypted database, the combination of the word
'file', @, and the file password is hashed using SHA-256. This hash value is
transmitted to the server.
840
</p><p>
841 842 843 844 845
When a new database file is created, a new cryptographically secure
random salt value is generated. The size of the salt is 64 bit.
The combination of the file password hash and the salt value is hashed 1024 times
using SHA-256. The reason for the iteration is to make it harder for an attacker to
calculate hash values for common passwords.
846
</p><p>
847 848 849 850 851
The resulting hash value is used as the key for the block cipher algorithm
(AES-128 or XTEA with 32 rounds). Then, an initialization vector (IV) key
is calculated by hashing the key again using SHA-256.
This is to make sure the IV is unknown to the attacker.
The reason for using a secret IV is to protect against watermark attacks.
852
</p><p>
853 854 855 856
Before saving a block of data (each block is 8 bytes long), the following operations are executed:
First, the IV is calculated by encrypting the block number with the IV key (using the same
block cipher algorithm). This IV is combined with the plain text using XOR. The resulting data is
encrypted using the AES-128 or XTEA algorithm.
857
</p><p>
858 859
When decrypting, the operation is done in reverse. First, the block is decrypted using the key,
and then the IV is calculated combined with the decrypted text using XOR.
860
</p><p>
861 862 863 864
Therefore, the block cipher modes of operation is CBC (Cipher-block chaining), but each chain
is only one block long. The advantage over the ECB (Electronic codebook) mode is that patterns
in the data are not revealed, and the advantage over multi block CBC is that flipped cipher text bits
are not propagated to flipped plaintext bits in the next block.
865 866
</p><p>
Database encryption is meant for securing the database while it is not in use (stolen laptop and so on).
867 868
It is not meant for cases where the attacker has access to files while the database is in use.
When he has write access, he can for example replace pieces of files with pieces of older versions
869 870
and manipulate data like this.
</p><p>
871
File encryption slows down the performance of the database engine. Compared to unencrypted mode,
872
database operations take about 2.2 times longer when using XTEA, and 2.5 times longer using AES (embedded mode).
873
</p>
874 875

<h3>SSL/TLS Connections</h3>
876
<p>
877 878 879
Remote SSL/TLS connections are supported using the Java Secure Socket Extension
(SSLServerSocket / SSLSocket). By default, anonymous SSL is enabled.
The default cipher suite is <code>SSL_DH_anon_WITH_RC4_128_MD5</code>.
880
</p>
881 882

<h3>HTTPS Connections</h3>
883
<p>
884 885 886
The web server supports HTTP and HTTPS connections using SSLServerSocket.
There is a default self-certified certificate to support an easy starting point, but
custom certificates are supported as well.
887
</p>
888

889
<br /><a name="uuid"></a>
890
<h2>Universally Unique Identifiers (UUID)</h2>
891
<p>
892
This database supports the UUIDs. Also supported is a function to create new UUIDs using
893 894 895
a cryptographically strong pseudo random number generator.
With random UUIDs, the chance of two having the same value can be calculated
using the probability theory. See also 'Birthday Paradox'.
896
Standardized randomly generated UUIDs have 122 random bits.
897 898
4 bits are used for the version (Randomly generated UUID), and 2 bits for the variant (Leach-Salz).
This database supports generating such UUIDs using the built-in function RANDOM_UUID().
899
Here is a small program to estimate the probability of having two identical UUIDs
900 901 902
after generating a number of values:
<pre>
double x = Math.pow(2, 122);
903
for(int i=35; i&lt;62; i++) {
904 905
    double n = Math.pow(2, i);
    double p = 1 - Math.exp(-(n*n)/(2*x));
906
    String ps = String.valueOf(1+p).substring(1);
907
    System.out.println("2^"+i+"="+(1L&lt;&lt;i)+" probability: 0"+ps);
908
}
909 910 911 912 913 914 915
</pre>
Some values are:
<pre>
2^36=68'719'476'736 probability: 0.000'000'000'000'000'4
2^41=2'199'023'255'552 probability: 0.000'000'000'000'4
2^46=70'368'744'177'664 probability: 0.000'000'000'4
</pre>
916 917
To help non-mathematicians understand what those numbers mean, here a comparison:
One's annual risk of being hit by a meteorite is estimated to be one chance in 17 billion,
918
that means the probability is about 0.000'000'000'06.
919
</p>
920

921
<br /><a name="system_properties"></a>
922
<h2>Settings Read from System Properties</h2>
923 924 925
<p>
Some settings of the database can be set on the command line using
-DpropertyName=value. It is usually not required to change those settings manually.
926
The settings are case sensitive.
927 928 929 930 931
Example:
<pre>
java -Dh2.serverCachedObjects=256 org.h2.tools.Server
</pre>
The current value of the settings can be read in the table
932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947
INFORMATION_SCHEMA.SETTINGS.
</p>
<p>
For a complete list of settings, see
<a href="../javadoc/org/h2/constant/SysProperties.html">SysProperties</a>.
</p>

<br /><a name="server_bind_address"></a>
<h2>Setting the Server Bind Address</h2>
<p>
Usually server sockets accept connections on any/all local addresses.
This may be a problem on multi-homed hosts.
To bind only to one address, use the system property h2.bindAddress.
This setting is used for both regular server sockets and for SSL server sockets.
IPv4 and IPv6 address formats are supported.
</p>
948

949
<br /><a name="glossary_links"></a>
950 951 952 953 954 955 956 957 958 959 960 961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977
<h2>Glossary and Links</h2>
<table><tr><th>Term</th><th>Description</th></tr>
<tr>
  <td>AES-128</td>
  <td>
    A block encryption algorithm. See also:
    <a href="http://en.wikipedia.org/wiki/Advanced_Encryption_Standard">Wikipedia: AES</a>
  </td>
</tr>
<tr>
  <td>Birthday Paradox</td>
  <td>
      Describes the higher than expected probability that two persons in a room have the same birthday.
    Also valid for randomly generated UUIDs. See also:
    <a href="http://en.wikipedia.org/wiki/Birthday_paradox">Wikipedia: Birthday Paradox</a>
  </td>
</tr>
<tr>
  <td>Digest</td>
  <td>
    Protocol to protect a password (but not to protect data). See also:
    <a href="http://www.faqs.org/rfcs/rfc2617.html">RFC 2617: HTTP Digest Access Authentication</a>
  </td>
</tr>
<tr>
  <td>GCJ</td>
  <td>
    GNU Compiler for Java.
978
    <a href="http://gcc.gnu.org/java/">http://gcc.gnu.org/java/</a> and
979
    <a href="http://nativej.mtsystems.ch">http://nativej.mtsystems.ch/ (not free any more)</a>
980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020 1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045
  </td>
</tr>
<tr>
  <td>HTTPS</td>
  <td>
    A protocol to provide security to HTTP connections. See also:
    <a href="http://www.ietf.org/rfc/rfc2818.txt">RFC 2818: HTTP Over TLS</a>
  </td>
</tr>
<tr>
  <td>Modes of Operation</td>
  <td>
  <a href="http://en.wikipedia.org/wiki/Block_cipher_modes_of_operation">Wikipedia: Block cipher modes of operation</a>
  </td>
</tr>
<tr>
  <td>Salt</td>
  <td>
      Random number to increase the security of passwords.
      See also:
    <a href="http://en.wikipedia.org/wiki/Key_derivation_function">Wikipedia: Key derivation function</a>
  </td>
</tr>
<tr>
  <td>SHA-256</td>
  <td>
      A cryptographic one-way hash function.
      See also:
    <a href="http://en.wikipedia.org/wiki/SHA_family">Wikipedia: SHA hash functions</a>
  </td>
</tr>
<tr>
  <td>SQL Injection</td>
  <td>
      A security vulnerability where an application generates SQL statements with embedded user input.
      See also:
    <a href="http://en.wikipedia.org/wiki/SQL_injection">Wikipedia: SQL Injection</a>
  </td>
</tr>
<tr>
  <td>Watermark Attack</td>
  <td>
    Security problem of certain encryption programs where the existence of certain
    data can be proven without decrypting.
    For more information, search in the internet for 'watermark attack cryptoloop'
  </td>
</tr>
<tr>
  <td>SSL/TLS</td>
  <td>
    Secure Sockets Layer / Transport Layer Security.
    See also:
      <a href="http://java.sun.com/products/jsse/">Java Secure Socket Extension (JSSE)</a>
  </td>
</tr>
<tr>
  <td>XTEA</td>
  <td>
    A block encryption algorithm.
      See also:
    <a href="http://en.wikipedia.org/wiki/XTEA">Wikipedia: XTEA</a>
  </td>
</tr>
</table>

</div></td></tr></table></body></html>