Oblivious data structure explained

In computer science, an oblivious data structure is a data structure that gives no information about the sequence or pattern of the operations that have been applied except for the final result of the operations.

In most settings, even if the data is encrypted, the access pattern can still be observed, and this pattern can leak sensitive information such as encryption keys. When data is outsourced to the cloud, this leakage of the access pattern remains a serious concern. An access pattern is a specification of an access mode for every attribute of a relation schema. For example, the sequence of reads and writes that a user performs on data in the cloud is an access pattern.

We say a machine is oblivious if the sequence of memory locations it accesses is equivalent for any two inputs with the same running time. The data access pattern is therefore independent of the input.
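The definition can be illustrated with a minimal sketch that models only the sequence of addresses touched. The function names and the `trace` list are hypothetical and introduced just for this example; the sketch ignores timing and other side channels.

```python
def direct_read(memory, index, trace):
    """Non-oblivious read: the single address touched reveals the index."""
    trace.append(index)
    return memory[index]


def scanning_read(memory, index, trace):
    """Oblivious read: touch every address in a fixed order and keep one value."""
    result = None
    for address in range(len(memory)):
        trace.append(address)          # the same addresses are recorded for every index
        value = memory[address]
        if address == index:
            result = value
    return result
```

With `scanning_read`, two reads of different indices leave identical traces, at the price of touching all of memory on every access. The constructions in the following sections aim for the same independence at a much lower cost.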

Applications:

Oblivious data structures

Oblivious RAM

Goldreich and Ostrovsky introduced this term in their work on software protection.

The memory accesses of an oblivious RAM are probabilistic, and their probability distribution is independent of the input. Goldreich and Ostrovsky prove the following theorem about oblivious RAM: Let $RAM(m)$ denote a RAM with $m$ memory locations and access to a random oracle machine. Then $t$ steps of an arbitrary $RAM(m)$ program can be simulated by less than $O(t(\log_2 t)^3)$ steps of an oblivious $RAM(m(\log_2 m)^2)$. Every oblivious simulation of $RAM(m)$ must make at least $\max\{m, (t-1)\log_2 m\}$ accesses in order to simulate $t$ steps.

The square-root algorithm is one way to simulate an oblivious RAM:

  1. After every $\sqrt{m}$ accesses, randomly permute the first $m+\sqrt{m}$ memory words.
  2. To access a word, check the shelter words first.
  3. If the word is in the shelter, access one of the dummy words. If it is not, find its permuted location and access it there.

To simulate $t$ steps of the original RAM, the oblivious RAM needs $t+\sqrt{m}$ steps. Each access costs $O(\sqrt{m}\log m)$.
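The three steps above can be sketched as follows. This is an illustrative toy, not the authors' construction: the class name `SquareRootORAM`, its fields, and the `trace` list are hypothetical, and it assumes that the permuted region holds the $m$ real words followed by $\sqrt{m}$ dummy words and that the shelter holds up to $\sqrt{m}$ words. The permutation is kept as a plain Python list, whereas a real scheme would have to hide it (for example with a random oracle and oblivious sorting).

```python
import math
import random


class SquareRootORAM:
    """Toy square-root simulation that records which permuted slots are touched."""

    def __init__(self, values, rng=None):
        self.m = len(values)
        self.s = max(1, math.isqrt(self.m))   # epoch length = number of dummies
        self.rng = rng or random.Random()
        self.values = list(values)            # logical contents, used when reshuffling
        self.trace = []                       # physical slots touched in the permuted region
        self._reshuffle()

    def _reshuffle(self):
        # Step 1: draw a fresh random permutation of the m + sqrt(m) words.
        n = self.m + self.s
        self.perm = list(range(n))
        self.rng.shuffle(self.perm)
        self.store = [None] * n               # dummy slots simply stay None
        for virtual in range(self.m):
            self.store[self.perm[virtual]] = self.values[virtual]
        self.shelter = []                     # (address, value) pairs, emptied each epoch
        self.count = 0                        # accesses made in the current epoch

    def access(self, addr, new_value=None):
        """Read addr, or write new_value to it if new_value is not None."""
        # Step 2: scan the whole shelter, whether or not the word is there.
        hit, value = False, None
        for a, v in self.shelter:
            if a == addr:
                hit, value = True, v
        # Step 3: touch an unused dummy on a hit, the permuted location otherwise.
        virtual = self.m + self.count if hit else addr
        physical = self.perm[virtual]
        self.trace.append(physical)
        if not hit:
            value = self.store[physical]
        if new_value is not None:
            value = new_value
        self.shelter = [(a, v) for a, v in self.shelter if a != addr]
        self.shelter.append((addr, value))
        self.values[addr] = value
        self.count += 1
        if self.count == self.s:              # epoch over: write back and permute again
            self._reshuffle()
        return value
```

Within one epoch every real word is fetched from the permuted region at most once and every dummy is used at most once, so the slots in `trace` are distinct positions of a fresh random permutation regardless of which addresses the program asked for.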

Another way to simulate an oblivious RAM is the hierarchical algorithm. The basic idea is to treat the shelter memory as a buffer and extend it to multiple levels of buffers. Level $I$ has $2^I$ buckets, and each bucket holds $\log t$ items. Each level has its own randomly selected hash function.

The algorithm operates as follows. Initially, load the program into the last level, which can be taken to have $2^N$ buckets. To read, check at each level the bucket that the level's hash function assigns to V. If (V,X) has already been found at an earlier level, access a randomly chosen bucket instead; if it has not been found yet, check the bucket for V, in which there is only one real match and the remaining entries are dummies. To write, put (V,X) into the first level; if the first I levels are full, move their contents to level I+1 and empty the first I levels.

Scanning a bucket at each level costs $O(\log t)$; the cost of every access is $O((\log t)^2)$; the cost of hashing is $O(t(\log t)^3)$.
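A short worked step may help relate the per-level cost to the per-access cost. It rests on an assumption not stated above, namely that the number of levels $N$ grows like $\log_2 t$, and it counts only the bucket scans, not the rehashing:

```latex
\[
\underbrace{N}_{\text{levels}} \cdot \underbrace{O(\log t)}_{\text{one bucket scanned per level}}
  = O(\log t) \cdot O(\log t)
  = O\big((\log t)^2\big)
  \quad \text{per access, assuming } N = O(\log t).
\]
```

Under the same assumption, $t$ accesses spend $O(t(\log t)^2)$ on scanning in total, so it is the rehashing that accounts for the extra logarithmic factor in the overall simulation cost.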

Oblivious tree

An oblivious tree is a rooted tree similar to a 2–3 tree, but with the additional property of being oblivious. Nodes on the rightmost path may have degree one, which helps in describing the update algorithms. The oblivious tree requires randomization to achieve an $O(\log n)$ running time for the update operations. For two sequences of operations M and N that leave the tree with the same contents, the resulting trees have the same probability distribution. The tree supports three operations:

  • CREATE (L): build a new tree storing the sequence of values L at its leaves.
  • INSERT (b, i, T): insert a new leaf node storing the value b as the ith leaf of the tree T.
  • DELETE (i, T): remove the ith leaf from T.

Steps of CREATE: the list of nodes at level i is obtained by traversing the list of nodes at level i+1 from left to right and repeatedly doing the following:
    1. Choose d ∈ {2, 3} uniformly at random.
    2. If there are fewer than d nodes left at level i+1, set d equal to the number of nodes left.
    3. Create a new node n at level i with the next d nodes at level i+1 as children, and compute the size of n as the sum of the sizes of its children.

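For example, if the coin tosses for d come out as 2, 3, 2, 2, 2, 2, 3, CREATE stores the string “OBLIVION” as follows: the leaves are grouped under parents as (O, B), (L, I, V), (I, O) and (N); these four nodes are grouped into two nodes of sizes 5 and 3; and those two nodes become the children of the root. The node above N has degree one because only one leaf is left when it is created, and the final toss of 3 is reduced to 2 because only two nodes remain.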
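The level-by-level construction can be sketched in a few lines. The names `Node`, `create`, `leaves` and the `next_d` parameter are hypothetical and chosen for this illustration; `next_d` lets the coin tosses be supplied explicitly so that the example is reproducible.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    size: int                                  # number of leaves below this node
    children: list = field(default_factory=list)
    value: object = None                       # set only on leaves


def create(values, next_d=None):
    """Build an oblivious tree bottom-up; returns its levels from leaves to root."""
    if next_d is None:
        next_d = lambda: random.choice((2, 3))  # step 1: d is 2 or 3, uniformly
    level = [Node(size=1, value=v) for v in values]
    levels = [level]
    while len(level) > 1:
        parents, pos = [], 0
        while pos < len(level):
            d = min(next_d(), len(level) - pos)               # step 2: cap d at the nodes left
            kids = level[pos:pos + d]
            parents.append(Node(sum(k.size for k in kids), kids))  # step 3
            pos += d
        level = parents
        levels.append(level)
    return levels


def leaves(node):
    """Return the leaf values below node, from left to right."""
    if not node.children:
        return [node.value]
    return [v for child in node.children for v in leaves(child)]


if __name__ == "__main__":
    tosses = iter([2, 3, 2, 2, 2, 2, 3])
    for lvl in create("OBLIVION", next_d=lambda: next(tosses)):
        print(["".join(leaves(n)) for n in lvl])
```

Run with the tosses 2, 3, 2, 2, 2, 2, 3, the level above the leaves groups them as OB, LIV, IO and N, the next level as OBLIV and ION, and the root covers all eight leaves, consuming exactly the seven tosses.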

Both INSERT and DELETE have O(log n) expected running time. For INSERT and DELETE we have:

INSERT (b, i, CREATE (L)) = CREATE (L[1], …, L[i], b, L[i+1], …)
DELETE (i, CREATE (L)) = CREATE (L[1], …, L[i-1], L[i+1], …)

For example, running CREATE on a list directly, or running CREATE on that list with one element left out and then INSERT to add the element at its position, yields the same probability distribution over the resulting trees.
