Tokyo Cabinet attains improvement in retrieval by loading RAM with the whole of a bucket array. If a bucket array is on RAM, it is possible to access a region of a target record by about one path of file operations. A bucket array saved in a file is not read into RAM with the
read' call but directly mapped to RAM with the mmap’ call. Therefore, preparation time on connecting to a database is very short, and two or more processes can share the same memory map.
If the number of elements of a bucket array is about half of records stored within a database, although it depends on characteristic of the input, the probability of collision of hash values is about 56.7% (36.8% if the same, 21.3% if twice, 11.5% if four times, 6.0% if eight times). In such case, it is possible to retrieve a record by two or less paths of file operations. If it is made into a performance index, in order to handle a database containing one million of records, a bucket array with half a million of elements is needed. The size of each element is 4 bytes. That is, if 2M bytes of RAM is available, a database containing one million records can be handled.
(1-0.567)+(1-0.567)*(0.567) = 67%