% Author: Samuel Kutin
% Quantum Lower Bound for the Collision Problem with Small Range

\documentclass{toc}

\tocdetails{%
  volume=1, number=2, year=2005, firstpage=29,
  received = {December 8, 2004},
  published = {April 14, 2005}}

\newcommand{\calA}{{\mathcal A}}
\newcommand{\calF}{{\mathcal F}}
\newcommand{\ceil}[1]{{\left\lceil #1 \right\rceil}}
\newcommand{\floor}[1]{{\left\lfloor #1 \right\rfloor}}
\newcommand{\E}{{\mathbf E}}
\newcommand{\qu}[1]{\left| #1 \right\rangle}

\runningauthor{Samuel Kutin}
\runningtitle{Quantum Lower Bound for the Collision Problem with Small Range}
\copyrightauthor{Samuel Kutin}

\begin{document}

\begin{frontmatter}[classification=text] 
  \title{Quantum Lower Bound for the Collision Problem with Small Range}
  \tocpdftitle{Quantum Lower Bound for the Collision Problem with Small Range}
  \tocpdfauthor{Samuel Kutin}
  \author[kutin]{Samuel Kutin}
  \begin{abstract}
    We extend Aaronson and Shi's quantum lower bound for the
    $r$-to-one collision problem.  An $r$-to-one function is one where
    every element of the image has exactly $r$ preimages.  The
    $r$-to-one collision problem is to distinguish between one-to-one
    functions and $r$-to-one functions over an $n$-element domain.
    
    Recently, Aaronson and Shi proved a lower bound of
    $\Omega((n/r)^{1/3})$ quantum queries for the $r$-to-one collision
    problem.  Their bound is tight, but their
    proof applies only when the range has size at least $3n/2$.  We
    give a modified version of their argument that removes this restriction.
  \end{abstract}
  \tockeywords{quantum, query, oracle, collision, polynomial method, lower bound, small range}
  \tocams{81P68, 68Q17}
  \tocacm{F.1.2} 
\end{frontmatter}

\section{Introduction}
How many quantum queries does it take to find a collision?
A \emph{collision} in a function is a pair of inputs that map to the
same value.  We consider the problem of finding a collision in an
$r$-to-one function; i.e., a function where every element of the image
has exactly $r$ preimages.  (We require that $r$ be a divisor of
$n$, the size of the input space.)  The difficulty of this problem
for a quantum computer has attracted much interest~\cite{Aar02, AS04,
Amb03-lower, Amb03-upper, BHT97, Shi01}.

In some cases, explicit information about a function may make it easier
to find collisions.  For example, if we know a function is periodic, we
can find a collision using Shor's algorithm~\cite{Shor}.  Rather than
use such explicit information, we focus on a \emph{black-box} model: our
only access to the function is as a quantum oracle.  Brassard, H{\o}yer, and
Tapp~\cite{BHT97} use Grover's search~\cite{G96} to find a collision in
an $r$-to-one function in $O((n/r)^{1/3})$ quantum queries,
an improvement over the $\Theta((n/r)^{1/2})$ classical queries needed.
In this note, we are concerned with the matching lower bound.

For a lower bound, it is easier to consider a decision problem:
the input function is guaranteed to be either one-to-one or
$r$-to-one, and our task is to distinguish between these two cases.
Aaronson \cite{Aar02} proved the first significant lower bound:
$\Omega((n/r)^{1/5})$ queries.

More recently, Shi~\cite{Shi01} proved a lower bound of
$\Omega((n/r)^{1/3})$, given the additional condition that the size
of the range of the function is at least $3n/2$.  (In the case where
the range is only $n$, Shi provides a lower bound
of $\Omega((n/r)^{1/4})$).
The proof is a novel application of the methods of
Nisan and Szegedy \cite{NS92} and Paturi~\cite{Pat92}
to the case where one cannot fully
symmetrize the multivariate polynomials.

Our main result is a new version of this theorem, but without the
additional constraint on the size of the range:

\begin{theorem}
\label{main-thm}
Let $n > 0$ and $r \geq 2$ be integers with $r \mid n$, and let a function
from $[n]$ to $[n]$ be given as an oracle with the promise that it is
either one-to-one or $r$-to-one.  Then any quantum algorithm for
distinguishing these two cases must evaluate the function
$\Omega\left((n/r)^{1/3}\right)$ times.
\end{theorem}

The argument is very similar to that of Aaronson and Shi.  (See~\cite{AS04}
for a combined version of~\cite{Aar02} and~\cite{Shi01}.)
As stated above, we
remove the requirement that the range be at least $3n/2$.  Our
proof is conceptually simpler for other reasons:
\begin{enumerate}
\item The natural automorphism group on the set of functions
from $[n]$ to $[N]$ is $S_n \times S_N$.  Our argument symmetrizes with
respect to the entire group.
\item For technical reasons, Shi introduces an additional decision problem
 called Half-$r$-to-one, where one must distinguish between $r$-to-one
functions and functions that are $r$-to-one on half the domain and
one-to-one on the other half.  We avoid using this Half-$r$-to-one problem.
\end{enumerate}

\subsection*{An independent approach}

Independent of this work, Ambainis~\cite{Amb03-lower} gave an alternate proof
of \thmref{main-thm}.  His approach is more general:  he shows that, given any
lower bound for a symmetric function property with a restriction on the size of
the range, we can remove that restriction.

Ambainis's work, together with Shi's paper, implies \thmref{main-thm}.
It is worth noting another consequence of those two papers: Aaronson
and Shi prove that, given a black-box function $f$ on $n$ inputs whose
range has size $\Omega(n^2)$, it takes $\Omega(n^{2/3})$ queries to determine
if $f$ is one-to-one.  \thmref{main-thm} implies a similar result; the
constant hidden in the $\Omega(n^2)$ term improves, but the dependence on
$n$ does not.  Neither Aaronson and Shi~\cite{AS04} nor this paper
gives a lower bound for element distinctness with small range.

However, Ambainis's work gives a lower bound of $\Omega(n^{2/3})$ without
any range restriction.  Ambainis has also given a matching upper
bound~\cite{Amb03-upper}.

\section{Preliminaries}

\subsection{Functions as quantum oracles}

Let $n, N > 0$ be integers.  Let $\calF(n,N)$ be the set of functions
from $[n]$ to $[N]$.

A function is given to us as a quantum oracle.  We can perform a
transformation $O_f$, which applies $f$ to the contents of some of the
quantum state:
$$
O_f \qu{i,j,z} = \qu{i, f(i) + j \pmod N, z}\enspace.
$$
Here $z$ is a placeholder for the unaffected portion of the quantum
state.

The query complexity of a quantum algorithm is the number of times it
calls $O_f$.  We think of our algorithm as alternating between $T+1$
unitary operators and $T$ applications of $O_f$.

Let $\delta_{i,j}(f)$ be $1$ when $f(i) = j$ and $0$ otherwise.  Then,
after $T$ queries, the amplitude of each quantum base state is a
degree-$T$ polynomial in these $\delta_{i,j}(f)$.  Hence, the acceptance
probability $P(f)$ is a polynomial over $\delta_{i,j}$ of degree at most
$2T$.  The connection between quantum complexity and polynomial degree
is due to Beals, et al.\ \cite{BBCMdW}; the application to functions
using variables $\delta_{i,j}$ is due to Aaronson~\cite{Aar02}.

Note that this polynomial $P(f)$ is constrained to be in the interval
$[0,1]$ whenever the $\delta_{i,j}$ correspond to a valid input; i.e.,
\begin{align}
\notag
\forall i, j, \qquad & \delta_{i,j} \in \{0, 1\} \enspace,\\
\label{delta-constraint} \forall i, \qquad & \sum_j \delta_{i,j} = 1\enspace.
\end{align}

The connection between polynomial degree and query complexity
was first made by Nisan and Szegedy \cite{NS92}.  In their applications,
they symmetrize over all permutations of the variables, reducing the
multivariate polynomial to a univariate polynomial.  They then apply
results from approximation theory to prove a lower bound on the
degree of the polynomial.  Beals, et al.\ \cite{BBCMdW} follow the
same approach.

A nice, general version of the approximation theory results was
shown by Paturi \cite{Pat92}.  Following Shi \cite{Shi01}, we use
a slight modification of Paturi's theorem:

\begin{theorem}[Paturi]
\label{Paturi-thm}
Let $q(\alpha) \in \R[\alpha]$ be a polynomial of degree $d$.
Let $a$ and $b$ be integers, $a < b$, and let $\xi \in [a,b]$ be a
real number.
If
\begin{enumerate}
\item $|q(i)| \leq c_1$ for all integers $i \in [a,b]$, and
\item $|q(\ceil{\xi}) - q(\xi)| \geq c_2$ for some constant $c_2 > 0$,
\end{enumerate}
then
$$
d = \Omega(\sqrt{(\xi - a + 1)(b - \xi + 1)})\enspace,
$$
where the hidden constant depends on $c_1$ and $c_2$.
\end{theorem}

Note that, if the conditions of the theorem are met for any $\xi$,
we have $d = \Omega(\sqrt{b - a})$.  If they are met for some
$\xi \approx (a+b)/2$, then $d = \Omega(b - a)$.

In our setting, the automorphism group for the variables $\delta_{i,j}$
is $S_n \times S_N$.  If we symmetrize with respect to this group, we
do not immediately obtain a univariate polynomial.  Hence, we will have
to work harder to apply \thmref{Paturi-thm}.

For $\sigma \in S_n$, $\tau \in S_N$, we define
$\Gamma^\sigma_\tau \colon \calF(n,N) \to \calF(n,N)$ by
$$
\Gamma^\sigma_\tau(f) = \tau \circ f \circ \sigma\enspace.
$$

Let $P(f)$ be an acceptance polynomial as above.  We can write
$P$ as a sum $\sum_S C_S I_S(f)$, where $S$ ranges over subsets
of $[n] \times [N]$ and
$$
I_S = \prod_{(i,j) \in S} \delta_{i,j}\enspace.
$$
By \eqref{delta-constraint}, we may assume that each pair
$(i,j) \in S$ has a distinct value of $i$; we thus write
\begin{equation}
\label{monomial-form}
I_S = \prod_{k=1}^t \prod_{i \in S_k} \delta_{i,j_k}\enspace,
\end{equation}
where the sets $S_k$ are disjoint.  The degree of the monomial
is $\sum_k |S_k|$.

\subsection{Some special functions}

We now define a collection of functions which are $a$-to-one on
part of the domain, and $b$-to-one on the rest of the domain.
(These will enable us to interpolate between one-to-one and
$r$-to-one functions.)

Fix $N \geq n > 0$.  We say that a triple $(m,a,b)$ of integers
is \emph{valid} if $0 \leq m \leq n$,
$a \mid m$, and $b \mid (n - m)$.  For any such valid triple, we
have a function $f_{m,a,b} \in \calF(n,N)$, given by
$$
f_{m,a,b} = \begin{cases}
\ceil{i / a} & 1 \leq i \leq m\enspace, \\
N - \floor{(n - i)/b} & m < i \leq n\enspace.
\end{cases}
$$
So $f_{m,a,b}$ is $a$-to-one on $m$ points, and
$b$-to-one on the remaining $n - m$ points.  (Since $N \geq n$,
the two parts of the range do not overlap.)

Note that our $f_{m,a,b}$ plays the same role as Aaronson and Shi's
$f_{m,g}$, with $a = g$ and $b = 2$.

We now examine the behavior of $f_{m,a,b}$ after we symmetrize by
all of $S_n \times S_N$.

\begin{lemma}
\label{Q-lemma}
Let $P(f)$ be a degree-$d$ polynomial in $\delta_{i,j}$.
For a valid triple $(m,a,b)$, define $Q(m,a,b)$ by
$$
Q(m,a,b) = \E_{\sigma, \tau}
\left[ P\left(\Gamma^\sigma_\tau(f_{m,a,b})\right)\right]\enspace.
$$
Then $Q$ is a degree-$d$ polynomial in $m,a,b$.
\end{lemma}

The key new step in this paper lies in the proof of \lemref{Q-lemma}.
To show that the expected value $Q(m,a,b)$ is a polynomial,
we break down $S_N$ into a union of disjoint events
$A_U$.  We then write $Q(m,a,b)$ as a sum over all $U$, and we
show that each term in the sum is a polynomial in $m$, $a$, and $b$.

\begin{definition}
For integers $k, \ell$, let $\ell^{\underline k}$ denote the falling
power $\ell(\ell - 1) \cdots (\ell - k + 1)$.
\end{definition}

\begin{proof}[Proof of \lemref{Q-lemma}]
It suffices to prove the lemma in the case where $P$ is a monomial
$I_S$.  We write $I_S$ in the form~\eqref{monomial-form}; then
$d = |S|$.  We write $s_k = |S_k|$.

For each subset $U \subseteq [t]$, let $A_U$ be the following
event:  for each $k \in U$, $\tau^{-1}(j_k) \leq m/a$;
for each $k \notin U$, $\tau^{-1}(j_k) \geq N - (n-m)/b + 1$.

Clearly the events $A_U$ are disjoint.  If $I_S\left(\Gamma^\sigma_\tau
(f_{m,a,b})\right)$ is nonzero, then every $\tau^{-1}(j_k)$ must lie
in the range of $f_{m,a,b}$, so some event $A_U$ must occur.  Hence,
we write

\begin{align*}
Q(m,a,b) & = \sum_{U \subseteq [t]} \Pr(A_U) Q_U(m,a,b)\enspace,
\intertext{where}
Q_U(m,a,b) & = \E_{\sigma, \tau}\left[
I_S\left(\Gamma^\sigma_\tau(f_{m,a,b})\right) \mid A_U \right]\enspace.
\end{align*}

Choose some $U$, and let $u = |U|$.  Then $\Pr(A_U)$ is given
by
$$
\Pr(A_U) = \frac{\left(\frac{m}{a}\right)^{\underline u}
\left(\frac{n - m}{b}\right)^{\underline{t - u}}}{N^{\underline t}}\enspace,
$$
which is a rational function in $m, a, b$.  The numerator has
degree $t$, and the denominator is $a^u b^{t-u}$.

Also,
$$
Q_U(m,a,b) = \frac{1}{n^{\underline d}}
\prod_{k \in U} a^{\underline {s_k}} \prod_{k \notin U} b^{\underline {s_k}}\enspace.
$$
This is a polynomial in $a,b$ of degree $d$; furthermore
$Q_U$ is divisible by $a^u b^{t-u}$.

Hence, for each $U$, $\Pr(A_U) Q_U$ is a degree-$d$ polynomial in
$m,a,b$.  Therefore $Q(m,a,b)$ is itself a degree-$d$ polynomial.
This concludes the lemma.
\end{proof}

\section{Main Proof}

We are now ready to prove \thmref{main-thm}.

\begin{proof}[Proof of \thmref{main-thm}]
Let $\calA$ be an algorithm which distinguishes one-to-one from
$r$-to-one in $T$ queries, and let $P(f)$ be the corresponding
acceptance probability.  $P(f)$ is a polynomial in $\delta_{i,j}$
of degree at most $2T$.  Let $Q(m,a,b)$ be formed from $P$ as in
\lemref{Q-lemma}, and let $d = \deg Q$; we have $d \leq 2T$.

For any $\sigma, \tau$, we know that $\Gamma^\sigma_\tau(f_{m,a,b})$
is a valid function.  If $a = b$, this function is $a$-to-one.
We conclude the following:

\begin{enumerate}
\item $0 \leq Q(m,a,b) \leq 1$ whenever $(m, a, b)$ is a valid triple.
\item $0 \leq Q(m,1,1) \leq 1/3$ for any $m$.
\item $2/3 \leq Q(m,r,r) \leq 1$ for any $m$ such that $r \mid m$.
\end{enumerate}

The remainder of the proof consists of proving that
$\deg Q = \Omega\left((n/r)^{1/3}\right)$.  We will take $M \approx m/2$,
and we will examine either the univariate polynomial $Q(M,1,rx)$ or
$Q(M,rx,r)$ (depending on the value of $Q(M,1,r)$).
If this polynomial remains bounded for large values of $x$, we can apply
\thmref{Paturi-thm}.  Otherwise, we can use \thmref{Paturi-thm}
on the first argument to $Q$.  Either way, we get a lower bound on $d$.

For simplicity of exposition, we begin with the case $r = 2$.
Let $M = 2\floor{n/4}$.  We ask:  is $Q(M,1,2) \geq 1/2$?
In other words:  does our algorithm accept (at least half the time)
an input which is one-to-one on half the domain, and
two-to-one on the other half?

Case I:  $Q(M,1,2) \geq 1/2$.  Let $g(x) = Q(M,1,2x)$, and let $k$ be
the least positive integer for which $|g(k)| \geq 2$.  Then we have $g(x)$ between
$-2$ and $2$ for all positive integers $x < k$, and $g(1) - g(1/2) \geq 1/6$
by assumption.  Let $c = 2k$.  By \thmref{Paturi-thm}, we have
\begin{equation}
\label{d-bound-1}
d = \Omega(\sqrt{k}) = \Omega(\sqrt{c})\enspace.
\end{equation}

Now, we consider the polynomial $h(i) = Q(n-ci, 1, c)$.  For any integer
$i$ in the range $0 \leq i \leq \floor{n/c}$, the triple $(n-ci, 1, c)$
is valid, so $0 \leq h(i) \leq 1$.
But
$$\left|h\left(\frac{n-M}{c}\right)\right| = |Q(M, 1, c)| = |g(k)| \geq 2\enspace.$$
We conclude, by \thmref{Paturi-thm}, that
\begin{equation}
\label{d-bound-2}
d = \Omega(n/c)\enspace.
\end{equation}

Case II:  $Q(M, 1, 2) < 1/2$.  Now, let $g(x) = Q(M,2x,2)$.  Let $k$ be
the least positive integer for which $|g(k)| \geq 2$, and let $c = 2k$.
We have $g(1) - g(1/2) \geq 1/6$; as in Case I, we obtain~\eqref{d-bound-1}
using \thmref{Paturi-thm}.

Next, we consider $h(i) = Q(ci, c, 2)$.  For any integer $i$ in the range
$0 \leq i \leq \floor{{n / c}}$, the triple $(ci, c, 2)$ is valid (both
$n$ and $c$ are even), so $0 \leq h(i) \leq 1$.
But $\left|h\left(M / c\right)\right| = |g(k)| \geq 2$.  Again,
as in Case I, we obtain~\eqref{d-bound-2} using \thmref{Paturi-thm}.

In either case, we use~\eqref{d-bound-1} and~\eqref{d-bound-2} to
obtain $d = \Omega(n^{1/3})$.  We could divide into cases (depending
on whether $c \geq n^{2/3}$), or we could simply
square~\eqref{d-bound-1} and multiply by~\eqref{d-bound-2} to obtain
$d^3 = \Omega(n)$.

For general $r$, the setup is almost identical: we let $M =
r\floor{\frac{n}{2r}}$ and split into cases based on whether $Q(M,1,r) \geq
1/2$.

Case I:  $Q(M,1,r) \geq 1/2$.  Let $g(x) = Q(M,1,rx)$,
let $k$ be the least positive integer for which $|g(k)| \geq 2$, and
let $c = rk$.  We have $g(1) - g(1/r) \geq 1/6$, so
\thmref{Paturi-thm} yields
\begin{equation}
\label{d-bound-1r}
d = \Omega(\sqrt{k}) = \Omega(\sqrt{c/r})\enspace.
\end{equation}

Next, we let $h(i) = Q(n-ci, 1, c)$.  As in the $r=2$ analysis above,
we conclude~\eqref{d-bound-2}.

Case II: $Q(M,1,r) < 1/2$.  Now, let $g(x) = Q(M, rx, r)$, let
$k$ be the least integer for which $|g(k)| \geq 2$, and let $c = rk$.
We have $g(1) - g(1/r) \geq 1/6$; as in Case I, we obtain~\eqref{d-bound-1r}
using \thmref{Paturi-thm}.

Next, we take $h(i) = Q(ci, c, r)$.  For any integer $i$ in the range
$0 \leq i \leq \floor{{n/ c}}$, the triple $(ci, c, r)$ is valid; note
that $n-ci$ must be a multiple of $r$.
But $|h(M/c)| = |g(k)| \geq 2$.  So, as in the $r=2$ analysis,
we get~\eqref{d-bound-2}.

In either case, we square~\eqref{d-bound-1r} and multiply by~\eqref{d-bound-2}
to obtain $d^3 = \Omega(n/r)$ as desired.
\end{proof}

\section*{Acknowledgments}
The author thanks L{\'a}szl{\'o} Babai, Vincent Nesme,
Natacha Portier, and the referees for helpful comments.

\bibliography{a002} 
\bibliographystyle{tocplain}

\begin{tocauthors}
\begin{tocinfo}[kutin]
Samuel Kutin \\ 
Center for Communications Research \\
805 Bunn Drive \\
Princeton, NJ 08540 \\
kutin\tocat{}idaccr\tocdot{}org \\
\url{http://www.kutin.com}
\end{tocinfo}
\end{tocauthors}
\begin{tocaboutauthors}
\begin{tocabout}[kutin]
Samuel (Sandy) Kutin received his Ph.D. from the University of Chicago
in 2002 under \href{http://www.cs.uchicago.edu/~niyogi}{Partha Niyogi}
and \href{http://www.cs.uchicago.edu/~laci}{L{\'a}szl{\'o} Babai}.  His
research interests include Boolean function complexity, computational
learning theory, and quantum computing.  Sandy and his family live
in Princeton, where he spends his free time petting his cats, playing
games, and participating in the \href{http://www.puzzlers.org}{National
Puzzlers' League}.  He is proud to support {\sf Theory of Computing}.
\end{tocabout}
\end{tocaboutauthors}

\end{document}