TCS: Randomized Computation

Randomized Computation

Randomized Algorithm

A randomized algorithm outputs the correct value with good probability on every possible input.

Matrix multiplication

Input: matrices $A, B, C$; decide whether $C = AB$.

There is an obvious deterministic polynomial-time algorithm: compute $AB$ directly (in $O(n^{3})$ time, or faster with fast matrix multiplication) and compare it with $C$.

A randomized algorithm: Freivalds' algorithm

  1. Repeat the following $k$ times.
    1. Choose $v\in \{0, 1\}^{n}$ uniformly at random
    2. Compute $d = A(Bv) - Cv$
    3. Reject if $d\neq 0$
  2. Accept

This algorithm runs in $O(kn^{2})$ time, since each iteration needs only three matrix-vector products, and its probability of failure is $\leq 2^{-k}$.
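A minimal sketch of Freivalds' algorithm in Python, representing matrices as plain lists of rows (the function name and representation are illustrative, not from the source):

```python
import random

def freivalds(A, B, C, k=20):
    """Decide whether C == AB with failure probability <= 2^-k.

    A, B, C are n x n matrices given as lists of rows.
    Each iteration costs three O(n^2) matrix-vector products.
    """
    n = len(A)

    def matvec(M, v):
        # O(n^2) matrix-vector product
        return [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]

    for _ in range(k):
        v = [random.randint(0, 1) for _ in range(n)]  # v uniform in {0,1}^n
        d = [a - c for a, c in zip(matvec(A, matvec(B, v)), matvec(C, v))]
        if any(x != 0 for x in d):  # d = A(Bv) - Cv
            return False  # a witness was found: certainly C != AB
    return True  # C == AB with probability >= 1 - 2^-k
```

Note that a `False` answer is always correct; only `True` can be wrong, with probability at most $2^{-k}$.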

Proof:

If $AB \neq C$, we prove that $P(d = 0) \leq \frac{1}{2}$ in each iteration.
Since $D = AB - C \neq 0$, pick an entry $D_{ij} \neq 0$.
The $i$-th entry of $d$ satisfies:

d_{i} = \sum\limits_{k} D_{ik}v_{k} = D_{ij}v_{j} + \sum\limits_{k\neq j}D_{ik}v_{k}

Let $s = \sum\limits_{k\neq j}D_{ik}v_{k}$, so:

\begin{align*}P(d_{i} = 0) &= P(d_{i} = 0 \,|\, s = 0)P(s = 0) \\&\quad +P(d_{i} = 0 \,|\, s\neq 0)P(s\neq 0) \\&\leq P(v_{j} = 0)P(s = 0) + P(v_{j} = 1)P(s\neq 0)\\&= \frac{1}{2}\bigl(P(s = 0) + P(s\neq 0)\bigr) = \frac{1}{2}\end{align*}

Here we use that $v_{j}$ is independent of $s$: given $s = 0$, $d_{i} = 0$ forces $D_{ij}v_{j} = 0$, i.e. $v_{j} = 0$; given $s \neq 0$, $d_{i} = 0$ forces $v_{j} = 1$; and each of these events has probability $\frac{1}{2}$.

So $P(d = 0^{n}) \leq P(d_{i} = 0) \leq \frac{1}{2}$.

MAX-CUT Approximation

The decision version of MAX-CUT is NP-complete, so instead we look for a cut $C$ whose size is not far from that of an optimal cut $C^{*}$.
If $\text{size}_C \ge \alpha\,\text{size}_{C^*}$, we call $C$ an $\alpha$-approximation. There is an easy randomized $\frac{1}{2}$-approximation: assign each vertex independently and uniformly at random to side $0$ or side $1$.

\begin{equation*} \mathbb{E}(\text{size}_C) = \mathbb{E} \sum_{\{u, v\} \in E} 1_{x_u \ne x_v} = \frac{1}{2} |E| \ge \frac{1}{2} \text{size}_{C^*}. \end{equation*}

This only bounds the cut size in expectation, but the method of conditional expectations yields a cut that is always at least this large, provided we can compute the following quantity efficiently:

\begin{equation*} \mathbb{E}(\text{size}_C(x_1, \ldots, x_i, X_{i+1}, \ldots, X_{n})), \end{equation*}

Having fixed $x_1, \ldots, x_i$, we choose $x_{i+1}$ to maximize this conditional expectation, and repeat until all vertices are assigned.
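The method of conditional expectations can be sketched as follows. When deciding vertex $v$, edges whose other endpoint is still unassigned contribute $\frac{1}{2}$ to the conditional expectation under either choice, so only already-assigned neighbours matter (names here are illustrative):

```python
def greedy_maxcut(n, edges):
    """Derandomized 1/2-approximation for MAX-CUT via conditional
    expectations. Vertices are 0..n-1; edges is a list of pairs."""
    side = [None] * n
    for v in range(n):
        # For each possible side of v, count the edges to already
        # placed neighbours that would be cut; unplaced neighbours
        # contribute 1/2 either way, so they cancel in the comparison.
        gain = [0, 0]
        for u, w in edges:
            other = w if u == v else (u if w == v else None)
            if other is not None and side[other] is not None:
                gain[1 - side[other]] += 1
        side[v] = 0 if gain[0] >= gain[1] else 1
    cut = sum(1 for u, w in edges if side[u] != side[w])
    return side, cut  # cut >= |edges| / 2 is guaranteed
```

On a triangle, for example, this greedy assignment cuts 2 of the 3 edges, meeting the $\frac{|E|}{2}$ bound.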

Derandomize

The algorithm above uses $n$ random bits, covering $2^{n}$ possibilities. If we can reduce the randomness to polynomially many possibilities, we can derandomize the algorithm by enumerating all of them. Universal hash functions provide such a reduction:

Consider a family of hash functions $\mathcal{H} = \{ h : U \to R \}$. A universal family has random-like properties while being small, so a short seed suffices to choose a function from it.

Pairwise independent hash functions.

A family $\mathcal{H} = \{h : U \to R\}$ is called pairwise independent if for any distinct $x_{1}, x_{2}\in U$ and any $y_{1}, y_{2}\in R$, we have:

\begin{equation*}P_{h \in \mathcal{H}} \bigl( h(x_1) = y_1 \text{ and } h(x_2) = y_2 \bigr) = \frac{1}{|R|^2}.\end{equation*}

A pairwise independent family of hash functions mapping $\{0, 1\}^{k}$ to $\{0, 1\}$:

$\mathcal{H} = \{ h_{a,b}(x) = (\langle a, x\rangle + b) \bmod 2 \,|\, a \in \{0, 1\}^{k},\ b\in\{0, 1\} \}$, where $\langle a, x\rangle = \sum_{k} a_{k}x_{k}$ is the inner product.

This family has size $|\mathcal{H}| = 2^{k+1}$. Taking $k = \lceil \log n\rceil$, each vertex of $G$ can be encoded as an element of $U = \{0,1\}^{k}$, and $|\mathcal{H}| \leq 4n$. Since the expectation argument above only involves pairwise events $1_{x_u \neq x_v}$, pairwise independence suffices, so we can try every hash function in $\mathcal{H}$ and output the largest cut found.
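A sketch of this derandomization: each vertex index is viewed as a $k$-bit string and every $h_{a,b}$ in the family is tried (function names are illustrative):

```python
from itertools import product

def maxcut_via_hash_family(n, edges):
    """1/2-approximation for MAX-CUT by enumerating the pairwise
    independent family h_{a,b}(x) = (<a, x> + b) mod 2 over
    {0,1}^k with k = ceil(log2 n); only O(n) functions to try."""
    k = max(1, (n - 1).bit_length())  # k = ceil(log2 n) for n >= 2

    def bits(v):
        # encode vertex index v as a k-bit vector
        return [(v >> i) & 1 for i in range(k)]

    best_cut, best_side = -1, None
    for a in product([0, 1], repeat=k):  # 2^k choices of a
        for b in (0, 1):                 # 2 choices of b
            side = [(sum(ai * xi for ai, xi in zip(a, bits(v))) + b) % 2
                    for v in range(n)]
            cut = sum(1 for u, w in edges if side[u] != side[w])
            if cut > best_cut:
                best_cut, best_side = cut, side
    # Pairwise independence makes the average cut over the family
    # exactly |edges|/2, so the maximum is at least that.
    return best_side, best_cut
```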

BPP

Define a probabilistic Turing machine as follows:

A probabilistic Turing machine is a type of NTM in which each nondeterministic step is called a coin-flip step and has two legal next moves. We assign a probability $2^{-k}$ to each branch of the machine's computation, where $k$ is the number of coin-flip steps occurring on that branch.

The probability of the machine accepting the input is defined as

\begin{equation*}P(M \text{ accepts } w) = \sum_{b:b \text{ is accepting}} P(b).\end{equation*}

Equivalently, at each coin-flip step the machine takes each of the two legal next moves with probability $\frac{1}{2}$.
Define the error probability ε\varepsilon:

  1. If $w \in A$, then $P(M(w) = 1) \geq 1 - \varepsilon$
  2. If $w\notin A$, then $P(M(w) = 1) \leq \varepsilon$

Then we can define $\text{BPP}$ with error probability:

$\text{BPP}$ is the class of languages decided by probabilistic polynomial-time Turing machines with an error probability of $\frac{1}{3}$.
Actually, the $\frac{1}{3}$ can be replaced by any constant strictly less than $\frac{1}{2}$.

$\text{BPP}$ can also be defined with a verifier:

A decision problem $A$ is in $\text{BPP}$ if and only if there is a polynomial-time verifier $V$ such that for all $x$:

\begin{equation*}x \in A \implies P_{r} \bigl(V(x, r) = 1 \bigr) \ge \frac{2}{3}, \qquad x \notin A \implies P_{r} \bigl(V(x, r) = 1 \bigr) \le \frac{1}{3}.\end{equation*}

Error Reduction

Any decision problem $A\in\text{BPP}$ has a polynomial-time randomized algorithm whose error probability is $2^{-p(n)}$, where $p$ is a polynomial and $n$ is the input size.

This can be proved by running the algorithm polynomially many times, taking the majority vote, and applying a Chernoff bound (or the sampling theorem).
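The amplification step can be sketched as follows: run the algorithm $t$ times independently and output the majority answer, so that by a Chernoff bound the error drops exponentially in $t$ (the decider passed in is a hypothetical callable):

```python
import random
from collections import Counter

def amplify(decider, x, t):
    """Majority vote over t independent runs of a two-sided-error
    decider; the error probability drops as exp(-Omega(t))."""
    votes = Counter(decider(x) for _ in range(t))
    return votes.most_common(1)[0][0]

# A toy decider that answers correctly with probability 2/3
# (hypothetical, for illustration only):
def noisy_is_even(x):
    correct = (x % 2 == 0)
    return correct if random.random() < 2 / 3 else not correct
```

With $t$ odd there is never a tie, and the majority answer is wrong only if more than half of the runs err, an event of probability $2^{-\Omega(t)}$.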

Circuits vs. BPP

Define $\text{SIZE}_{n}(s)$:

For a finite function $g: \{0, 1\}^{n}\rightarrow\{0, 1\}$, $g \in \text{SIZE}_{n}(s)$ if there is a circuit of at most $s$ NAND gates computing $g$.

And we define the restricted function:

\begin{equation*} F_{\restriction n} (x) = F(x) \text{ for } x\in \{0,1\}^n. \end{equation*}

Then $F$ is non-uniformly computable in size $T(n)$, written $F\in\text{SIZE}(T)$, if there is a sequence $C_{0}, C_{1}, \dots$ of NAND circuits such that:

  1. $C_{n}$ computes $F_{\restriction n}$
  2. $C_{n}$ has at most $T(n)$ gates when $n$ is sufficiently large

So the non-uniform analog of $\text{P}$ is:

$\text{P}/\text{poly} = \bigcup\limits_{c\in\mathbb{N}}\text{SIZE}(n^{c})$

Obviously $\text{P}\subsetneq\text{P}/\text{poly}$, and $\text{BPP}\subseteq\text{P}/\text{poly}$ can be proved as follows:

Due to error reduction, any $A\in \text{BPP}$ has a polynomial-time randomized algorithm whose error probability is less than $2^{-n}$, which means there is a verifier $V$ such that

$\forall x \,\, P_{y}(V(x, y) \neq A(x)) < \frac{1}{2^{n}}$

So by the union bound, over all $2^{n}$ inputs $x$ of length $n$:

$P_{y}(\exists x\,\, V(x, y)\neq A(x)) \leq \sum\limits_{x}P_{y}(V(x, y)\neq A(x)) < 1$

Since this probability is less than $1$, there must exist some $y^{*}$ for which $\forall x\,\, V(x, y^{*}) = A(x)$.
Thus there exists a circuit with $\text{poly}(n)$ gates computing $A$ on inputs of length $n$: $y^{*}$ has polynomial length and $V$ runs in polynomial time, so $V(\cdot, y^{*})$ can be hardwired into a polynomial-size circuit.

P = NP implies P = BPP

Sipser–Gács theorem: $\text{BPP} \subseteq \Sigma^{P}_{2} \cap \Pi_{2}^{P}$, where $\Sigma_{i}^{P}$ and $\Pi_{i}^{P}$ are defined as:

\begin{align*} \Sigma_{i}^{P} &= \underbrace{\exists\forall\exists\cdots}_{i \text{ quantifiers}}\, \text{P} \\ \Pi_{i}^{P} &= \underbrace{\forall\exists\forall\cdots}_{i \text{ quantifiers}}\, \text{P} \end{align*}

And we have the following theorem:

$\text{P} = \text{NP}$ implies $\text{P} = \text{BPP}$

The proof is difficult and uses the probabilistic method.

There are also results relating $\text{BPP}$ to $\text{P}$, $\text{NP}$, and $\text{EXP}$:

Relations with P NP EXP

We know $\text{P} \subsetneq \text{EXP}$ and $\text{BPP} \subseteq \text{EXP}$.

  • Expected: $\text{P} = \text{BPP} \subsetneq \text{NP} \subseteq \text{EXP}$
  • One extreme: $\text{P} \subsetneq \text{NP} \subseteq \text{BPP} = \text{EXP}$
  • Another extreme: $\text{P} = \text{BPP} = \text{NP} \subsetneq \text{EXP}$