ccs.beetree
Class BeeTree

java.lang.Object
  extended by ccs.beetree.BeeTree

public final class BeeTree
extends java.lang.Object

A BeeTree provides storage, access and indexing to a collection of Binary Large OBjects (blobs) gathered together in a single disk file (which may have any name or extension). These blobs are indexed by a single, arbitrary-length key String. Everything in the file, excluding a small amount of blob-length information which must be stored in a more weakly obscured form, and 32 bytes of serial-ID information at the start of the file used for locking, is encrypted. A "password" (an arbitrary-length byte string) is supplied for this purpose. If the password is zero-length then a null cipher is used, otherwise a real cipher is used: currently AES-CBC-HMAC-SHA1 with 128- to 256-bit keys. Alternatively, each blob may supply its own Cipher object to handle its own encryption.

This is a multi-user / multithread-safe persister. The locking system is robust but can be defeated: in particular, access from a remote host (e.g. via mapped network drives under windows) will hose it, and your data as well. Many BeeTree objects in many threads in many VM's can share a disk file.

The file managed using a BeeTree consists of a set of BeeObjects. The BeeObject provides some basic services required to serialise the object, and some data fields which the BeeTree uses to keep track of things. Objects which you wish to store in the BeeTree must extend BeeObject.

A BeeTree which has lots of deletes on it will become fragmented. This will reduce performance and will make the file excessively large. When the app has time (time enough to read and write every blob in the store - could be a while!) it should compact it. This is achieved using a BeeTreeCompactor.

Keys must be valid UTF strings. However, this format covers a multitude of sins, so this restriction should not be problematic.

The BeeTree maintains a "current object" pointer, which is set by find and used by prev, next and match. Some other operations affect the pointer, as described in the per-method documentation. If an operation throws an exception, the current-pointer status becomes undefined.

The locking system requires a unique file ID for every BeeTree disk file. However, it can happen that two files end up with the same "unique" file ID, by chance. If this happens, the two lock each other, reducing efficiency but not risking data corruption.

Most applications will prefer to use CDBeeTree instead, which provides richer functionality. CDBeeTree uses BeeTree internally.

See Also:
BeeObject, CDBeeTree

Field Summary
static int CIPHERAES
          Cipher version constant.
static int CIPHERBLOWFISH
          Cipher version constant.
static int CIPHERFISHY
          Cipher version constant.
static int CIPHERSTILLFISHY
          Cipher version constant.
static long LIMIT
          The maximum size of a BeeObject; (2^62 - epsilon) bytes.
 
Constructor Summary
BeeTree(java.io.File f, byte[] password, boolean create)
          Create a new BeeTree indexing the supplied File.
BeeTree(java.io.File f, byte[] password, boolean check, boolean create)
          Deprecated. The extra consistency checks are now implemented as assertions. To enable these checks, enable assertions with the -ea VM parameter.
 
Method Summary
 void backup(java.io.File tgt)
          Backs up the beetree to the supplied file, in an safe way.
static void changeFileID(java.io.File btf)
          Change the file ID of a BeeTree.
 void changeKey(BeeObject bo, java.lang.String newkey)
          Changes the key of the supplied object without changing its data.
 void delete(BeeObject bo)
          Deletes the supplied object from the BeeTree.
 void eviscerate(boolean isWipe)
          Remove every single object from the BeeTree in one fell swoop.
 boolean findExact(BeeObject bo)
          Finds an object from an exact key.
 boolean findExactKey(BeeObject bo)
          As findExact, except that the object is not unmarshalled.
 boolean findFirst(BeeObject bo)
          Finds an object from a partial key.
 boolean findFirstKey(BeeObject bo)
          As findFirst, except that the object is not unmarshalled.
 Cipher getCipherFor(byte[] password)
          Returns a configured Cipher corresponding to the current cipher version and the supplied password.
 Cipher getCipherInstance()
          Returns a new, uninitialised instance of the Cipher corresponding to the current Cipher version.
 int getCipherVersion()
          Returns the current cipher version.
 java.io.File getContainingFile()
          Returns the File containing the beetree.
 long getFileID()
          Returns the file ID of the beetree.
 Cipher getNullCipherInstance()
          Returns a new, uninitialised instance of the Cipher the tree uses when it's not actually encrypted.
 byte[] getPassword()
          returns a copy of the current password, if any.
 long getTotalFree()
          Returns the total amount of free space inside the BeeTree.
 int getTotalSpaces()
          Returns the number of free spaces inside the BeeTree.
 void insert(BeeObject bo)
          Inserts an object into the BeeTree.
 void lock()
          Acquires a hold on the BeeTree's nestable (re-entrant) mutex.
 boolean next(BeeObject bo)
          Find the next object.
 boolean nextKey(BeeObject bo)
          As next, except the BeeObject is not unmarshalled.
 boolean nextMatch(BeeObject bo)
          Find the next object which matches the partial key from the most recent findFirst.
 boolean nextMatchKey(BeeObject bo)
          As nextMatch, except that the object is not unmarshalled.
 boolean prev(BeeObject bo)
          Find the previous object, as next.
 boolean prevKey(BeeObject bo)
          Find the previous object but don't unmarshal, as nextKey.
 void restore(java.io.File src)
          Deprecated. Considered dodgy! This has not been used in anger and so has not accrued the same level of confidence as everything else. Most applications won't need to do this except during disaster recovery; under these circumstances it's much safer to stop the application, restore manually and then restart the application.
static void setDeadlockTimeoutMillis(long millis)
          Sets a deadlock timeout, in milliseconds.
 void setNodeCacheSize(int size)
          Changes the size of the node cache.
 void setThreshold(int threshold)
          Changes the swapping threshold for temporary space when marshalling.
 void threadLock()
           Lock the beetree file so that only this thread in this VM can use it; all other threads in all other VMs are locked out.
 void threadLockRO()
          As threadLock, but only acquires an RO lock.
 void threadUnlock()
          Release the beetree from your previously-applied thread lock.
 void threadUnlockIfHolder()
          Release the beetree from a previously-applied thread lock, iff the current thread has one to release.
 void threadUnlockRO()
          Release the beetree from your previously-applied RO thread lock.
 void threadUnlockROIfHolder()
          Release the beetree from a previously-applied thread lock, iff the current thread has one to release.
 void unlock()
          Releases a hold obtained by lock.
 void update(BeeObject bo)
          Updates the object, which is already in the BeeTree.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CIPHERFISHY

public static final int CIPHERFISHY
Cipher version constant. Encryption using a pre-release implementation of Blowfish, with some undesirable featuresbugs. Withdrawn.

See Also:
Constant Field Values

CIPHERSTILLFISHY

public static final int CIPHERSTILLFISHY
Cipher version constant. Encryption using a nearly-secure implementation of Blowfish, with the above problems fixed, but the MAC slightly cocked up. BeeObject lengths limited to 2GB. Blob delimiters stored in clear. Withdrawn.

See Also:
Constant Field Values

CIPHERBLOWFISH

public static final int CIPHERBLOWFISH
Cipher version constant. Encryption using a believed-secure implementation of Blowfish, with the MAC not cocked up, with packed, whitened blob delimiters (these obscure the internal structure of the BeeTree file, making cryptananlysis more difficult) and with the 2GB BeeObject limit removed (the limit is now 2^62 - epsilon). Legacy.

See Also:
Constant Field Values

CIPHERAES

public static final int CIPHERAES
Cipher version constant. Encryption as CIPHERBLOWFISH, but using AES / SHA-1 rather than Blowfish / RIPEMD-160. Current.

See Also:
Constant Field Values

LIMIT

public static final long LIMIT
The maximum size of a BeeObject; (2^62 - epsilon) bytes. (ie. about 4 Exabytes).

See Also:
Constant Field Values
Constructor Detail

BeeTree

public BeeTree(java.io.File f,
               byte[] password,
               boolean check,
               boolean create)
        throws java.io.IOException
Deprecated. The extra consistency checks are now implemented as assertions. To enable these checks, enable assertions with the -ea VM parameter.

Create a new BeeTree indexing the supplied File. The password (passphrase, see opening discussion) encrypts the beetree; if zero-length a null cipher is applied. Applying strong encryption is little slower than applying null encryption - the Blowfish algorithm used is designed to be fast (order MB per second on a bog-standard modern PC). If check is true, extra consistency checks will be made, and ConsistencyException thrown if they fail (this would probably indicate a BeeTree internal fault or disk-file corruption).

Parameters:
f - The disk file to contain the BeeTree.
password - The password for the BeeTree, supplied as a set of bytes. If your app is using a string, think about character encoding carefully. Supply null to not use encryption.
check - Ignored.
create - If true, create a new db file. If the file exists, it will be truncated to length zero (thus preserving any file permissions). If false, read in details from an existing db file, throwing an exception if the file isn't valid.
Throws:
java.io.IOException

BeeTree

public BeeTree(java.io.File f,
               byte[] password,
               boolean create)
        throws java.io.IOException
Create a new BeeTree indexing the supplied File. The password (passphrase, see opening discussion) encrypts the beetree; if zero-length a null cipher is applied. Applying strong encryption is little slower than applying null encryption - the Blowfish algorithm used is designed to be fast (order MB per second on a bog-standard modern PC). Extra consistency checks are made available via assertions; to enable these checks, enable assertions by passing the -ea parameter to the VM.

Parameters:
f - The disk file to contain the BeeTree.
password - The password for the BeeTree, supplied as a set of bytes. If your app is using a string, think about character encoding carefully. Supply null to not use encryption.
create - If true, create a new db file. If the file exists, it will be truncated to length zero (thus preserving any file permissions). If false, read in details from an existing db file, throwing an exception if the file isn't valid.
Throws:
java.io.IOException
Method Detail

insert

public void insert(BeeObject bo)
            throws ConsistencyException,
                   java.io.IOException,
                   BeeException
Inserts an object into the BeeTree. The current-object pointer is not affected.

Parameters:
bo - The object to insert
Throws:
ConsistencyException
java.io.IOException
BeeException

update

public void update(BeeObject bo)
            throws java.io.IOException
Updates the object, which is already in the BeeTree. The current-object pointer for prev / next operations is set to the updated object. Note: to change the object's cipher, or the password of that cipher, you must delete the object, change the cipher / password and then re-insert it. Update will not work.

Parameters:
bo - The object to update.
Throws:
java.io.IOException

changeKey

public void changeKey(BeeObject bo,
                      java.lang.String newkey)
               throws java.io.IOException
Changes the key of the supplied object without changing its data. In a sense this is the inverse operation to update, which changes the data without changing the key. Before invoking the method, set the BeeObject to return its old (current) key. The method will invoke setKey on the BeeObject to change the key during the method. Since the object is not marshalled or unmarshalled during the process, niether preMarshal or preUnmarshal will be invoked on it by this method.

Parameters:
bo - The BeeObject whose key is to be changed. On entry, its key should be set to the old (current) value.
newkey - The key to change to.
Throws:
java.io.IOException

delete

public void delete(BeeObject bo)
            throws ConsistencyException,
                   BeeException,
                   java.io.IOException
Deletes the supplied object from the BeeTree. The space it occupied in the file is wiped clear and marked for re-use. The current-object pointer is set to the deleted object (resulting in a throw if used without a find).

Parameters:
bo - the object to delete.
Throws:
ConsistencyException
BeeException
java.io.IOException

findFirst

public boolean findFirst(BeeObject bo)
                  throws java.io.IOException
Finds an object from a partial key. Give the partial key in bo.getKey. The full object corresponding to the first key matching the partial key is returned, together with the full key. After a findFirst, next and prev will find the next and previous records (and can be applied iteratively). If the method returns false, the key will not have changed. (bo.setKey will not have been called.)

Parameters:
bo - a BeeObject containing the partial key.
Returns:
true if the search succeeded, false if it failed (no blob had a key consistent with that supplied.)
Throws:
java.io.IOException

findFirstKey

public boolean findFirstKey(BeeObject bo)
                     throws java.io.IOException
As findFirst, except that the object is not unmarshalled. For finding keys, when not concerned with the objects.

Parameters:
bo - a BeeObject containing the partial key.
Returns:
true if the search succeeded, false if it failed (no blob had a key consistent with that supplied.)
Throws:
java.io.IOException

findExact

public boolean findExact(BeeObject bo)
                  throws java.io.IOException
Finds an object from an exact key. Give the key in bo.getKey. This saves you from having to check that the key you get is not a superset of the key you wanted. It is also fractionally quicker than findFirst. After a findExact, next and prev will find the next and previous records. (and can be applied iteratively). If the method returns false, the key will not have changed. (bo.setKey will not have been called.)

Parameters:
bo - a BeeObject containing the key.
Returns:
true if the search succeeded, false if it failed (no blob had a key consistent with that supplied.)
Throws:
java.io.IOException

findExactKey

public boolean findExactKey(BeeObject bo)
                     throws java.io.IOException
As findExact, except that the object is not unmarshalled. For finding keys, when not concerned with the objects.

Parameters:
bo - a BeeObject containing the key.
Returns:
true if the search succeeded, false if it failed (no blob had that key).
Throws:
java.io.IOException

next

public boolean next(BeeObject bo)
             throws java.io.IOException
Find the next object. Returns as find. If the search fails (as in 'returns false', not as in 'throws an exception') the current-record pointer remains unchanged.

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed - this object was the last in the beetree.
Throws:
java.io.IOException

nextKey

public boolean nextKey(BeeObject bo)
                throws java.io.IOException
As next, except the BeeObject is not unmarshalled. For finding keys, when not concerned with the objects they index.

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed - this object was the last in the beetree.
Throws:
java.io.IOException

nextMatch

public boolean nextMatch(BeeObject bo)
                  throws java.io.IOException
Find the next object which matches the partial key from the most recent findFirst. Returns as next. If the operation returns false, the current-record pointer remains unchanged. There is no prevMatch because the process is asymmetrical - prevMatch would be twinned with findLast, and that is not implemented either. (It could be, were there enough demand).

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed - this object was the last in the beetree that matched.
Throws:
java.io.IOException

nextMatchKey

public boolean nextMatchKey(BeeObject bo)
                     throws java.io.IOException
As nextMatch, except that the object is not unmarshalled.

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed - this object was the last in the beetree that matched.
Throws:
java.io.IOException

prev

public boolean prev(BeeObject bo)
             throws java.io.IOException
Find the previous object, as next.

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed (this blob was the first).
Throws:
java.io.IOException

prevKey

public boolean prevKey(BeeObject bo)
                throws java.io.IOException
Find the previous object but don't unmarshal, as nextKey.

Parameters:
bo - a blank BeeObject.
Returns:
true if the search succeeded, false if it failed (this blob was the first).
Throws:
java.io.IOException

threadLock

public void threadLock()
                throws java.io.IOException

Lock the beetree file so that only this thread in this VM can use it; all other threads in all other VMs are locked out. This guarantees that a set of operations on the beetree are atomic; this is necessary (but not sufficient, although it'll do if you don't need rollback) to comprise a transaction. For the sake of efficiency, apps should try to complete such atomicised operations quickly. The disk file is open while the lock holds, increasing its vulnerability in the event of a system failure.

Apps must ensure that the lock is released; generally this is best done using a try...finally block.

As well as operations that must be atomic, it also improves efficiency to thread-lock the beetree where several operations will be carried out on the same beetree in quick succession.

This call will block until such a lock can be obtained.

Calls can safely be nested - one unlock for every lock.

Throws:
java.io.IOException

threadUnlock

public void threadUnlock()
                  throws java.io.IOException
Release the beetree from your previously-applied thread lock.

Throws:
java.io.IOException

threadLockRO

public void threadLockRO()
                  throws java.io.IOException
As threadLock, but only acquires an RO lock. Improves multithread concurrency, but you do need to make sure that your transaction doesn't do any write operations. Don't nest RO and RW thread locks.

Throws:
java.io.IOException

threadUnlockRO

public void threadUnlockRO()
                    throws java.io.IOException
Release the beetree from your previously-applied RO thread lock.

Throws:
java.io.IOException

threadUnlockIfHolder

public void threadUnlockIfHolder()
                          throws java.io.IOException
Release the beetree from a previously-applied thread lock, iff the current thread has one to release. Mostly for BeeLine; application code which does similarly to BeeLine for itself may find it useful as well.

Throws:
java.io.IOException

threadUnlockROIfHolder

public void threadUnlockROIfHolder()
                            throws java.io.IOException
Release the beetree from a previously-applied thread lock, iff the current thread has one to release. Mostly for BeeLine; application code which does similarly to BeeLine for itself may find it useful as well.

Throws:
java.io.IOException

eviscerate

public void eviscerate(boolean isWipe)
                throws java.io.IOException
Remove every single object from the BeeTree in one fell swoop. Clearly this function should be used with great care, hence the rather sanguine name.

Parameters:
isWipe - Whether to attempt to wipe the underlying file. The wipe is a rather limited single-pass wipe; for greater security you should use BeeTreeCompactor with a multipass FileKiller mode, instead of this method.
Throws:
java.io.IOException

getTotalFree

public long getTotalFree()
                  throws ConsistencyException
Returns the total amount of free space inside the BeeTree. This space is created when blobs or indexing data are deleted. If it becomes a significant fraction of the size of the file, you should compact the BeeTree.

Note:This is only indicative, and may be inaccurate in the presence of concurrent access and immediately after compaction. If you need to ensure an accurate result, thread-lock the beetree and do any operation on it (e.g. findExact) immediately before calling this method.

Returns:
the total free space.
Throws:
ConsistencyException

getTotalSpaces

public int getTotalSpaces()
                   throws ConsistencyException
Returns the number of free spaces inside the BeeTree. These are created when blobs or indexing data are deleted. If this becomes large, (say a few thousand) you should compact the BeeTree.

Note:This is only indicative, and may be inaccurate in the presence of concurrent access and immediately after compaction. If you need to ensure an accurate result, thread-lock the beetree and do any operation on it (e.g. findExact) immediately before calling this method.

Returns:
the total number of free spaces.
Throws:
ConsistencyException - If the BeeTree is invalid.

setThreshold

public void setThreshold(int threshold)
Changes the swapping threshold for temporary space when marshalling. Blobs shorter than this are handled entirely in memory. Longer blobs may be swapped out to temporary disk files (which are automatically wiped clear when no longer required).

Parameters:
threshold - The threshold.

getContainingFile

public java.io.File getContainingFile()
Returns the File containing the beetree. Do not update this file directly - the results will be horrible.

Returns:
The disk file.

getPassword

public byte[] getPassword()
returns a copy of the current password, if any.

Returns:
the password, or null if it doesn't exist.

getCipherVersion

public int getCipherVersion()
Returns the current cipher version.


getCipherInstance

public Cipher getCipherInstance()
Returns a new, uninitialised instance of the Cipher corresponding to the current Cipher version.


getNullCipherInstance

public Cipher getNullCipherInstance()
Returns a new, uninitialised instance of the Cipher the tree uses when it's not actually encrypted.


getCipherFor

public Cipher getCipherFor(byte[] password)
                    throws java.io.IOException
Returns a configured Cipher corresponding to the current cipher version and the supplied password.

Throws:
java.io.IOException

setNodeCacheSize

public void setNodeCacheSize(int size)
Changes the size of the node cache. The default size of 64 should be fine for most apps; scale down if you're short of memory or using a huge number of BeeTrees, or scale up if your beetree is enormous.

Parameters:
size - The node cache size.

getFileID

public long getFileID()
Returns the file ID of the beetree. This identifier is globally unique with high probability but this is not guaranteed.

Returns:
The file ID.

backup

public void backup(java.io.File tgt)
            throws java.io.IOException
Backs up the beetree to the supplied file, in an safe way. You can do this manually, but precautions are required to ensure MT-safety, so it's better to let us do it for you. This may take some time for a large database, especially over a network. NOTE: This will thread-lock, unless you do it externally. When this BeeTree is part of a BeeLine, either do not backup while the current thread holds other thread-locks, or apply the thread-lock externally.

Parameters:
tgt - The file to backup to.
Throws:
java.io.IOException - if there was a problem during the process. The source file (i.e. the "live" database) will not be affected. The state of the target file following an exception is undefined.

restore

public void restore(java.io.File src)
             throws java.io.IOException
Deprecated. Considered dodgy! This has not been used in anger and so has not accrued the same level of confidence as everything else. Most applications won't need to do this except during disaster recovery; under these circumstances it's much safer to stop the application, restore manually and then restart the application.

Restores a backup taken with backup, in a safe way. NOTE: it is unsafe to attempt this manually even in a single-threaded application. Such an attempt will cause data corruption. Use the tool! NOTE: mayh not be safe with CDBeeTrees, especially if the set structure of the version to be restored is different from that being overwritten.

Parameters:
src - The source to restore from.
Throws:
java.io.IOException - If the restore failed. If this happens, then the backup will be OK, but the beetree will be corrupt. Close all applications that might be using the beetree and then copy the backup over the "live" beetree file using the operating system facility.

changeFileID

public static void changeFileID(java.io.File btf)
                         throws java.io.IOException
Change the file ID of a BeeTree. WARNING: This operation is extremely unsafe! To survive, you must ensure that no BeeTree object exists (anywhere) which is accessing this file. This method is only intended to be used by installer programs which are installing some sort of "skeleton" BeeTree (i.e. a BeeTree which contains a suitable structure for some application, and which will be filled out with user data as time goes by). In this case, the skeleton will have whatever file ID the application developer gave it, and when multiple skeletons are installed on the same host as different files (with the intention of putting different user data in each) then these ex-skeletons will lock each other unnecessarily - a performance loss. This method will block for a few ms after the change. In this way, you can be sure that multiple consecutive calls will not all change the file ID to the same value (thus replicating the problem).

Parameters:
btf - The file containing a beetree, associated with no BeeTree object, whose file ID is to be changed.
Throws:
java.io.IOException - if the operation fails.

setDeadlockTimeoutMillis

public static void setDeadlockTimeoutMillis(long millis)
Sets a deadlock timeout, in milliseconds. This is the maximum length of time a BeeTree will wait to obtain a lock on the disk file. Notes and caveats:


lock

public void lock()
          throws DeadlockException
Acquires a hold on the BeeTree's nestable (re-entrant) mutex. This does nothing to lock the disk file and isn't useful for most applications; use threadLock instead. Don't call this unless you know what you're doing.

Throws:
DeadlockException

unlock

public void unlock()
Releases a hold obtained by lock. Same comments apply as for lock.