In structural complexity theory, the Berman–Hartmanis conjecture is an unsolved conjecture named after Leonard C. Berman and Juris Hartmanis.[1] Informally, it states that all NP-complete languages look alike, in the sense that they can be related to each other by polynomial time isomorphisms.[2] [3] [4]
An isomorphism between formal languages L1 and L2 is a bijective map f from strings in the alphabet of L1 to strings in the alphabet of L2, with the property that a string x belongs to L1 if and only if f(x) belongs to L2.
A polynomial-time isomorphism, or p-isomorphism for short, is an isomorphism f where both f and its inverse function can be computed in an amount of time polynomial in the lengths of their arguments.
Berman and Hartmanis conjectured that all NP-complete languages are p-isomorphic to each other.
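The definitions above can be illustrated on a toy pair of languages. The following sketch is not taken from Berman and Hartmanis; it assumes two simple languages over the binary alphabet (strings with an even number of 1s, and strings that do not end in 1), both of which are in P rather than NP-complete. The names `f`, `f_inverse`, and `check_isomorphism` are hypothetical. The map `f` replaces each bit by the running XOR of the bits read so far, so the last bit of `f(x)` is the parity of `x`; both `f` and its inverse run in linear time, which makes `f` a p-isomorphism between these two languages.

```python
from itertools import product


def in_even_parity(x: str) -> bool:
    """Toy language L1 (an assumption): binary strings with an even number of 1s."""
    return x.count("1") % 2 == 0


def in_not_ending_in_one(x: str) -> bool:
    """Toy language L2 (an assumption): binary strings that do not end in 1."""
    return not x.endswith("1")


def f(x: str) -> str:
    """Running XOR: output bit i is the XOR of input bits 1..i. Linear time."""
    out, acc = [], 0
    for ch in x:
        acc ^= int(ch)
        out.append(str(acc))
    return "".join(out)


def f_inverse(y: str) -> str:
    """Undo the running XOR: input bit i is y[i] XOR y[i-1]. Linear time."""
    out, prev = [], 0
    for ch in y:
        bit = int(ch)
        out.append(str(bit ^ prev))
        prev = bit
    return "".join(out)


def all_strings(max_len: int):
    """Yield every binary string of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def check_isomorphism(max_len: int) -> bool:
    """Exhaustively check bijectivity and membership preservation up to max_len."""
    for x in all_strings(max_len):
        if f_inverse(f(x)) != x or f(f_inverse(x)) != x:
            return False
        if in_even_parity(x) != in_not_ending_in_one(f(x)):
            return False
    return True


if __name__ == "__main__":
    print(check_isomorphism(10))
```

The exhaustive check only covers strings up to a fixed length, so it is a sanity test rather than a proof; the general argument is that `f` preserves length, is undone by `f_inverse`, and writes the parity of `x` into the last position.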
A formal language L is paddable if there is a polynomial time function f(x,y), with a polynomial time inverse, such that for every x and every y, the string x belongs to L if and only if f(x,y) belongs to L. That is, it is possible to pad the input x with irrelevant information y, in an invertible way, without changing its membership in the language. Berman and Hartmanis proved that all pairs of paddable NP-complete languages are p-isomorphic.
Since p-isomorphism preserves paddability, and there exist paddable NP-complete languages, an equivalent way of stating the Berman–Hartmanis conjecture is that all NP-complete languages are paddable.
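One way to see why a language such as Boolean satisfiability is paddable is to write down a padding function explicitly. The sketch below is an illustration under stated assumptions, not the construction used by Berman and Hartmanis: a CNF formula is represented as a pair `(num_vars, clauses)` with DIMACS-style signed integers as literals, and the hypothetical functions `pad` and `unpad` play the roles of f(x,y) and its inverse. Padding appends a tautological delimiter clause on a fresh variable, followed by one unit clause per bit of y on further fresh variables. Because every added clause can be satisfied independently of the original formula, satisfiability is unchanged.

```python
from itertools import product


def pad(formula, y):
    """Encode the bit string y into extra clauses without changing satisfiability."""
    num_vars, clauses = formula
    d = num_vars + 1                      # fresh delimiter variable
    padded = [list(c) for c in clauses]
    padded.append([d, -d])                # tautology marking where the padding starts
    for i, bit in enumerate(y):
        v = d + 1 + i                     # one fresh variable per padding bit
        padded.append([v] if bit == "1" else [-v])
    return (d + len(y), padded)


def unpad(padded_formula):
    """Recover (formula, y) from pad(formula, y); return None for other inputs."""
    num_vars, clauses = padded_formula
    i = len(clauses)
    while i > 0 and len(clauses[i - 1]) == 1:   # skip the trailing unit clauses
        i -= 1
    if i == 0:
        return None
    delim = clauses[i - 1]                # last non-unit clause must be the delimiter
    if len(delim) != 2 or delim[0] <= 0 or delim[1] != -delim[0]:
        return None
    d = delim[0]
    bits = []
    for k, clause in enumerate(clauses[i:]):
        lit = clause[0]
        if abs(lit) != d + 1 + k:
            return None
        bits.append("1" if lit > 0 else "0")
    if num_vars != d + len(bits):
        return None
    return (d - 1, [list(c) for c in clauses[: i - 1]]), "".join(bits)


def satisfiable(formula):
    """Brute-force satisfiability test, exponential time, for small checks only."""
    num_vars, clauses = formula
    for assignment in product([False, True], repeat=num_vars):
        if all(any(assignment[abs(l) - 1] == (l > 0) for l in c) for c in clauses):
            return True
    return False


if __name__ == "__main__":
    sat = (2, [[1, 2], [-1]])
    unsat = (1, [[1], [-1]])
    for phi in (sat, unsat):
        padded = pad(phi, "101")
        print(satisfiable(phi) == satisfiable(padded), unpad(padded) == (phi, "101"))
```

Both `pad` and `unpad` run in time linear in the size of their inputs. The delimiter is what makes the padding invertible: all clauses after it are unit clauses, so scanning backwards from the end finds it unambiguously even when the original formula itself ends in unit clauses.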
Polynomial time isomorphism is an equivalence relation, and it can be used to partition the formal languages into equivalence classes, so another way of stating the Berman–Hartmanis conjecture is that the NP-complete languages form a single equivalence class for this relation.
A formal language is called sparse if the number of yes-instances of length n grows only polynomially as a function of n. The known NP-complete languages have a number of yes-instances that grows exponentially, and if L is a language with exponentially many yes-instances then it cannot be p-isomorphic to a sparse language, because its yes-instances would have to be mapped to strings that are more than polynomially long in order for the mapping to be one-to-one. Therefore, if the Berman–Hartmanis conjecture is true, an immediate consequence would be the nonexistence of sparse NP-complete languages.
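The counting argument above can be made concrete with a small calculation. The sketch below is only an illustration: it assumes the unary language of all-zero binary strings as the sparse language, and uses the language of strings with an even number of 1s as a stand-in for a language with exponentially many yes-instances (it is in P, not NP-complete). The helper names `census` and `min_image_length` are hypothetical. For each length n, the code counts the yes-instances of length n in the dense language and computes the smallest m such that the sparse language has that many yes-instances of length at most m; any one-to-one membership-preserving map must send some yes-instance of length n to a string at least that long.

```python
from itertools import product


def in_unary(x: str) -> bool:
    """Sparse toy language (an assumption): strings consisting only of zeros."""
    return "1" not in x


def in_even_parity(x: str) -> bool:
    """Dense toy language (an assumption): strings with an even number of 1s."""
    return x.count("1") % 2 == 0


def census(member, n: int) -> int:
    """Brute-force count of the yes-instances of length exactly n."""
    return sum(1 for bits in product("01", repeat=n) if member("".join(bits)))


def unary_census(n: int) -> int:
    """Closed form for the unary language: exactly one yes-instance per length."""
    return 1


def min_image_length(num_sources: int, target_census) -> int:
    """Smallest m with at least num_sources target yes-instances of length <= m."""
    total, m = 0, 0
    while True:
        total += target_census(m)
        if total >= num_sources:
            return m
        m += 1


if __name__ == "__main__":
    for n in range(1, 11):
        sources = census(in_even_parity, n)
        print(n, sources, min_image_length(sources, unary_census))
```

For these two languages the required image length is 2^(n-1) - 1, which grows exponentially in n, so no function computable in polynomial time (whose outputs are at most polynomially long) can serve as the isomorphism.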
The nonexistence of sparse NP-complete languages in turn implies that P ≠ NP, because if P = NP then every nontrivial language in P (including some sparse ones, such as the language of binary strings all of whose bits are zero) would be NP-complete. In 1982, Steve Mahaney published his proof that the nonexistence of sparse NP-complete languages (with NP-completeness defined in the standard way using many-one reductions) is in fact equivalent to the statement that P ≠ NP; this is Mahaney's theorem. Even for a relaxed definition of NP-completeness using Turing reductions, the existence of a sparse NP-complete language would imply an unexpected collapse of the polynomial hierarchy.[5]
As evidence towards the conjecture, it has been shown that an analogous conjecture with a restricted type of reduction is true: every two languages that are complete for NP under AC0 many-one reductions have an AC0 isomorphism.[6] It has also been shown that, if there exist one-way functions that cannot be inverted in polynomial time on all inputs, but every such function has a small but dense subset of inputs on which it can be inverted in P/poly (as is true for known functions of this type), then every two NP-complete languages have a P/poly isomorphism.[7] In addition, an oracle machine model has been found in which the analogue of the isomorphism conjecture is true.[8]
Evidence against the conjecture was provided by Joseph and Young and by Kurtz et al. Joseph and Young introduced a class of NP-complete problems, the k-creative sets, for which no p-isomorphism to the standard NP-complete problems is known.[9] Kurtz et al. showed that in oracle machine models given access to a random oracle, the analogue of the conjecture is not true: if A is a random oracle, then not all sets complete for NP^A have isomorphisms in P^A.[10] Random oracles are commonly used in the theory of cryptography to model cryptographic hash functions that are computationally indistinguishable from random, and the construction of Kurtz et al. can be carried out with such a function in place of the oracle. For this reason, among others, the Berman–Hartmanis isomorphism conjecture is believed false by many complexity theorists.[11]