Quantum algorithms for attacking hardness assumptions in classical and post-quantum cryptography

In this survey, the authors review the main quantum algorithms for solving the computational problems that serve as hardness assumptions for cryptosystems. To this end, the authors consider both the currently most widely used classically secure cryptosystems and the most promising candidates for post-quantum secure cryptosystems. The authors provide details on the cost of the quantum algorithms presented in this survey. The authors furthermore discuss ongoing research directions that can impact quantum cryptanalysis in the future.


| INTRODUCTION
Quantum computers are computers that leverage quantum-mechanical phenomena to perform computations, unlike today's classical computers, which leverage classical physical phenomena.
Sufficiently capable large-scale quantum computers, that are either not prone to errors or error-corrected, would pose a threat to most currently widely deployed asymmetric cryptosystems. This is because Shor [1] introduced polynomial-time quantum algorithms for solving the Integer Factoring Problem (IFP) and the Discrete Logarithm Problem (DLP) in cyclic groups.
A quantum computer capable of executing Shor's algorithms for sufficiently large problem instances would, for example, be able to break RSA [2], which is based on the IFP, and DSA [3] and Diffie-Hellman (DH) [4], which are based on the DLP, mainly in multiplicative groups of finite fields or, in the case of elliptic curve cryptography (ECC) [5,6], in groups of points on elliptic curves.
The aforementioned cryptosystems are currently used to secure most transactions that take place over the Internet.
They are used not only to provide confidentiality but also for authentication and to issue binding signatures for nonrepudiation purposes. Examples of use cases for these cryptosystems include, but are not limited to, a plethora of services dealing with sensitive personal, corporate or government data.
Needless to say, the aforementioned cryptosystems will need to be replaced with post-quantum secure alternatives prior to the advent of large-scale quantum computers. Once such computers exist, these cryptosystems will become susceptible to attacks by adversaries with access to them. At first, the set of adversaries with this capability may admittedly be fairly limited, but over time it will likely grow larger.
Whenever a cryptosystem is used to protect confidentiality, it is critical to have an idea of the length of time for which the protected information has to remain confidential. Say that the information needs to remain confidential for Δ years. Then the cryptosystem used to protect it must be replaced at the very latest Δ years before it can be broken by any relevant adversary, quantum or classical. This is because we must assume the adversary to be able to record encrypted traffic sent today for decryption in the future.
For cryptosystems that are used for authentication, or for issuing binding signatures for non-repudiation, the situation is different: Such cryptosystems may be replaced much closer in time to the point when they are broken, even if mitigating actions may then have to be taken to extend the lifetime of signatures that need to remain binding in the long term. For instance, mitigation may be accomplished by a trusted party attesting to having seen the signature at a given point in time prior to when the cryptosystem originally used to issue the signature is broken. Similar mitigation is not possible in the context of providing confidentiality in the long term.
The threat posed by the possible future advent of large-scale quantum computers is sufficiently concerning to already have prompted standardisation processes for post-quantum secure cryptography to be initiated. For example, the National Institute of Standards and Technology (NIST) in the United States (US) has started a standardisation process for post-quantum secure cryptosystems [7]. This process was started after the National Security Agency (NSA) in the US announced back in 2015 that it would transition to post-quantum primitives in its Suite B of cryptosystems [8].
Some standards are already available [9], and over the coming years more are projected to follow [10]. Once available, these standards will need to be integrated into other standards (for example, for protocols) prior to being adopted in products and finally being widely deployed on the Internet. This whole process will take time.
At the same time, it is prudent to begin the process of transitioning to post-quantum secure cryptosystems as soon as possible. In particular, use cases where confidentiality needs to be provided in the long term should be prioritised, for the reasons explained earlier.
Early adopters are wise to deploy post-quantum secure cryptosystems alongside existing classically secure cryptosystems, in such a way that both systems have to be broken simultaneously for the combined hybrid system to be broken. Furthermore, early adopters are wise to beware of the risk of introducing vulnerabilities, for instance in the form of side channels, when implementing new post-quantum secure cryptosystems for which no well-established industry best practices yet exist.
In order to inform decisions on what post-quantum cryptography to standardise and the pace at which the transition must be executed, research into quantum cryptanalysis is needed. Such research allows the community to better understand the costs of attacking classically and post-quantum secure cryptosystems quantumly.
It is against this backdrop that we decided to jointly write this survey paper: In it, we, a subset of the participants at the 2021 Schloss Dagstuhl seminar on quantum cryptanalysis, come together to review the current state of quantum cryptanalysis, with each expert focussing on her or his own area of expertise.
In particular, we survey quantum algorithms for solving the hard problems that underpin the currently most widely deployed classically secure cryptosystems, alongside the post-quantum secure cryptosystems currently primarily considered for standardisation.
The scope of this survey does not include discussing the impact of quantum computing on security notions. In particular, security proofs are impacted in a way that goes beyond the mere ability of a quantum adversary to solve some hard problems more efficiently than a classical adversary, yet such aspects of quantum cryptanalysis are not covered in this survey.
We provide asymptotic cost estimates for some of the quantum algorithms that we discuss in this survey. It is worth noting that such estimates can be used to derive security parameters for cryptosystems that are based on the hard problems that these algorithms solve. This is, however, not something that we would recommend without first carrying out additional analyses, including a more fine-grained analysis of the quantum circuit for the algorithm in question and a careful review of the specific design of the cryptosystem in question, all of which is beyond the scope of this survey.

| Overview
This paper is organised as follows: In Section 2, we first review some background information on quantum computing and quantum algorithms. We furthermore discuss different methods and models for costing quantum algorithms.
Then, in Sections 3 and 4, we review the two main overarching families of quantum algorithms: search algorithms, which derive from the seminal work of Grover [11], and algorithms for finding a hidden subgroup inside of a control group, which derive from the seminal work of Shor [1].
Next, we survey concrete quantum algorithms for breaking currently widely deployed symmetric and asymmetric cryptosystems: In Section 5, we first survey quantum algorithms for breaking currently widely deployed asymmetric cryptosystems that are based on the IFP or DLP in some form. In particular, we survey Shor's algorithms for the IFP and DLP and their various derivatives. In Section 6, we briefly digress by discussing generalisations of Shor's algorithms, before surveying algorithms for breaking symmetric cryptosystems, such as hash functions and block ciphers, in Section 7.
Finally, in Sections 8−10, we survey the main quantum algorithms for attacking the hardness assumptions that underpin future lattice-based, code-based and isogeny-based cryptosystems considered for standardisation. For hardness assumptions that underpin hash-based cryptosystems, see instead Section 7.
Note that this survey does not cover all hardness assumptions that have been proposed as foundations for post-quantum cryptography. Instead, we chose to focus on the computational problems that underpin some of the most promising candidates for future post-quantum secure cryptosystems. Additional problems relevant to the cryptanalysis of post-quantum cryptography that are not covered in this survey include the resolution of systems of quadratic equations, the conjugacy problem over certain algebraic groups, and the search for fixed-weight linear coefficients of a relation modulo a Mersenne prime.

| BACKGROUND
In this section, we briefly recall basic notions pertaining to quantum computing and quantum algorithms. We furthermore briefly discuss different methods and models for costing quantum algorithms.
For a thorough introduction to the subject, we recommend introductory books such as Nielsen and Chuang [12] or the more concise Kaye, Laflamme and Mosca [13].

| Qubits
The smallest unit of quantum information is the qubit, the name 'qubit' being a contraction of 'quantum bit'. The qubit may be perceived as the quantum analogue of the classical bit, the smallest unit of classical information. But whereas the classical bit is in either one of two states, denoted 0 and 1, a qubit is in one of two computational basis states, denoted $|0\rangle$ and $|1\rangle$, or in some superposition thereof:

Definition 1 (Qubit). A qubit is a two-level quantum-mechanical system. It is in a state $|\psi\rangle$ given by a normalised sum
$$|\psi\rangle = c_0 |0\rangle + c_1 |1\rangle, \quad c_0, c_1 \in \mathbb{C}, \quad |c_0|^2 + |c_1|^2 = 1,$$
where the two computational basis states $|0\rangle$ and $|1\rangle$ form an orthonormal basis for $\mathbb{C}^2$ given by
$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
A classical bit may be read without affecting its state. Such a read operation yields either 0 or 1, depending on the state of the bit.
A qubit may be observed in a measurement. The probability of observing $j \in \{0, 1\}$ in such a measurement is $|c_j|^2$. If the qubit is in a superposition when it is measured, the superposition collapses to the basis state $|j\rangle$ conditioned on the measurement. The state of a qubit is defined up to a global phase $e^{i\phi}$, which is not observable. Thus, in the definition above, $c_0$ can be made a real number without loss of generality.
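The measurement rule and the global-phase remark can be checked numerically. The following is a minimal sketch in plain Python; the helper names (`normalise`, `probabilities`, `remove_global_phase`, `measure`) are illustrative and not taken from any quantum computing library.

```python
import cmath
import math
import random

def normalise(c0, c1):
    """Rescale the amplitudes so that |c0|^2 + |c1|^2 = 1."""
    norm = math.sqrt(abs(c0) ** 2 + abs(c1) ** 2)
    return c0 / norm, c1 / norm

def probabilities(c0, c1):
    """Probabilities of observing 0 and 1 in the computational basis."""
    return abs(c0) ** 2, abs(c1) ** 2

def remove_global_phase(c0, c1):
    """Multiply by a global phase so that c0 becomes real and non-negative."""
    phase = cmath.exp(-1j * cmath.phase(c0))
    return c0 * phase, c1 * phase

def measure(c0, c1, rng=random):
    """Sample an outcome j; the state collapses to the basis state |j>."""
    p0, _ = probabilities(c0, c1)
    outcome = 0 if rng.random() < p0 else 1
    collapsed = (1.0, 0.0) if outcome == 0 else (0.0, 1.0)
    return outcome, collapsed

c0, c1 = normalise(1 + 1j, 2)
p0, p1 = probabilities(c0, c1)          # 1/3 and 2/3
r0, r1 = remove_global_phase(c0, c1)    # same probabilities, r0 is real
```

Removing the global phase leaves both measurement probabilities unchanged, which is why the first amplitude can be taken to be real.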
The notion of the qubit is abstract. As is the case for the classical bit, there are various possible physical realisations of the qubit. For example, the linear polarisation of a photon could be used to form a qubit (the two levels being the so-called 'up-down polarisation' ↕ and 'left-right polarisation' ↔). Another example is the spin of a spin-1/2 particle (the two levels being ↑ (spin up) and ↓ (spin down)).

| Systems of qubits
A set of $n$ qubits may be combined to form an $n$-qubit system. Such a system can be in one of $2^n$ computational basis states, denoted $|j\rangle$ for $j \in [0, 2^n)$, or in some superposition thereof:

Definition 2 (Quantum system). An $n$-qubit system is in a state $|\psi\rangle$ given by a normalised sum
$$|\psi\rangle = \sum_{j=0}^{2^n-1} c_j |j\rangle, \quad c_j \in \mathbb{C}, \quad \sum_{j=0}^{2^n-1} |c_j|^2 = 1,$$
where the computational basis states $|j\rangle$ for $j \in [0, 2^n)$ form an orthonormal basis for $\mathbb{C}^{2^n}$ given by
$$|0\rangle = (1, 0, \ldots, 0)^T, \quad |1\rangle = (0, 1, 0, \ldots, 0)^T, \quad \ldots, \quad |2^n - 1\rangle = (0, \ldots, 0, 1)^T.$$

Two independent quantum systems, of $n_1$ and $n_2$ qubits, that are in the states $|\psi_1\rangle$ and $|\psi_2\rangle$, respectively, may be perceived as a single system. The resulting $(n_1 + n_2)$-qubit system is then in the product state given by $|\psi_1\rangle \otimes |\psi_2\rangle$, where $\otimes$ denotes the tensor product. For compactness, we write $|\psi_1\rangle |\psi_2\rangle$ or $|\psi_1, \psi_2\rangle$ for $|\psi_1\rangle \otimes |\psi_2\rangle$.

If a state is not a product state, then it is said to be entangled. As for an individual qubit, a global phase is irrelevant. An $n$-qubit system may be split into one or more $n_i$-qubit sub-systems or registers. The computational basis states of such an $n_i$-qubit sub-system may be indexed using any set, as long as there is an injective map from the set to the $2^{n_i}$ basis states.
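To make the distinction between product states and entangled states concrete, here is a small sketch in plain Python. The function names are illustrative, and the two-qubit test used (a state is a product state exactly when its 2 × 2 amplitude matrix has zero determinant) is a standard fact that is not stated in the text above.

```python
import math

def tensor(a, b):
    """Tensor (Kronecker) product of two state vectors given as amplitude lists."""
    return [x * y for x in a for y in b]

def is_product_state_2q(state, tol=1e-12):
    """A two-qubit state (c00, c01, c10, c11) is a product state
    exactly when c00 * c11 - c01 * c10 = 0."""
    c00, c01, c10, c11 = state
    return abs(c00 * c11 - c01 * c10) < tol

s = 1 / math.sqrt(2)
plus = [s, s]                  # (|0> + |1>) / sqrt(2)
zero = [1.0, 0.0]              # |0>
product = tensor(plus, zero)   # amplitudes ordered |00>, |01>, |10>, |11>
bell = [s, 0.0, 0.0, s]        # (|00> + |11>) / sqrt(2), an entangled state
```

Here `product` factors into two single-qubit states by construction, whereas `bell` cannot be written as a tensor product of two single-qubit states.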

| Measurements
An arbitrary $n$-qubit subsystem of an $(n + m)$-qubit system may be measured, in any orthonormal basis, by applying the below measurement postulate (up to re-ordering the qubits):

Definition 3 (Measurement postulate). Let $(|\psi_i\rangle)_{i \le 2^n}$ be an orthonormal basis corresponding to an observable of an $n$-qubit system. Assume that the state $|\psi\rangle \in \mathbb{C}^{2^{n+m}}$ of an $(n + m)$-qubit system decomposes as
$$|\psi\rangle = \sum_{i} \alpha_i |\psi_i\rangle |\gamma_i\rangle,$$
where $|\gamma_i\rangle$ is a state and hence of norm one. Then the measurement of the first $n$ qubits yields the outcome $i$ with probability $|\alpha_i|^2$ and collapses the system to the state $|\psi_i\rangle |\gamma_i\rangle$.
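As an illustration of the measurement postulate, the sketch below computes the outcome probabilities and the collapsed states when the first n qubits of an (n + m)-qubit state are measured. It assumes the computational basis and a particular ordering of amplitudes (first register in the high-order bits); both are choices made for this example only.

```python
import math

def measure_first_qubits(state, n, m):
    """Measure the first n qubits of an (n + m)-qubit state in the computational basis.
    state: list of 2**(n + m) amplitudes, index = i * 2**m + (index of last m qubits).
    Returns {outcome i: (probability, normalised state of the remaining m qubits)}."""
    block = 2 ** m
    results = {}
    for i in range(2 ** n):
        part = state[i * block:(i + 1) * block]   # alpha_i times |gamma_i>
        prob = sum(abs(a) ** 2 for a in part)      # |alpha_i|^2
        if prob > 0:
            norm = math.sqrt(prob)
            results[i] = (prob, [a / norm for a in part])
    return results

# Two-qubit example: (|0>|+> + |1>|->) / sqrt(2)
outcomes = measure_first_qubits([0.5, 0.5, 0.5, -0.5], n=1, m=1)
```

Each outcome occurs with probability 1/2 in this example, and the second qubit is left in a different state depending on the outcome observed.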

| Quantum computation
In essence, the goal of a quantum algorithm is to evolve the state of a quantum system to a point where a measurement yields classical information on the solution to a computational problem of interest. Quantum theory postulates that the evolution over time of the state of a closed quantum system is described by a unitary operator:

Definition 4 (Evolution postulate). The time evolution of the state of a closed quantum system is described by a unitary operator. For any evolution $|\psi_1\rangle \to |\psi_2\rangle$ of the closed system, there is a unitary operator $U$ such that $|\psi_2\rangle = U |\psi_1\rangle$.

A consequence of this postulate is that quantum computing is always reversible, in contrast to classical computing. The reverse of a computation $U$ is also called its uncomputation and denoted $U^\dagger$.
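Quantum algorithms are compiled to quantum circuits. A circuit consists of a sequence of gates and measurements. A gate applies a unitary operator to one or more qubits in a quantum system. Some common single-qubit operators are the Pauli operators
$$X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix},$$
the Hadamard ($H$) operator, and the $T$ and $S$ phase-shift operators
$$H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad T = \begin{pmatrix} 1 & 0 \\ 0 & e^{i\pi/4} \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}.$$
The $X$ operator is also known as the 'NOT' operator since it maps $|0\rangle$ onto $|1\rangle$, and vice versa.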
Operators may be controlled, that is, applied to some qubit conditioned on other qubits being in either the $|1\rangle$ or $|0\rangle$ state. A common such operator is the two-qubit controlled-NOT or CNOT operator
$$\mathrm{CNOT} = |0\rangle\langle 0| \otimes I + |1\rangle\langle 1| \otimes X \qquad (1)$$
that maps $|a, b\rangle \mapsto |a, a \oplus b\rangle$. It may be seen to perform a reversible XOR operation (denoted $\oplus$). Similarly, there is a doubly controlled-NOT or Toffoli operator that maps $|a, b, c\rangle \mapsto |a, b, c \oplus (a \wedge b)\rangle$. It may be seen to perform a reversible AND operation (denoted $\wedge$). Any single-qubit operator $U$ may be controlled by replacing $X$ with $U$ in the expression (1) for the CNOT operator.
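The gates named in this section can be checked directly. The sketch below uses plain Python lists as matrices, with the standard matrix forms of X, H, S and T, and represents CNOT and Toffoli by their action on classical bit values of basis states. All helper names are illustrative.

```python
import math

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def is_unitary(u, tol=1e-12):
    """Check that the conjugate transpose of u times u is the identity."""
    prod = matmul(dagger(u), u)
    n = len(u)
    return all(abs(prod[i][j] - (1 if i == j else 0)) < tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
X = [[0, 1], [1, 0]]
H = [[s, s], [s, -s]]
S = [[1, 0], [0, 1j]]
T = [[1, 0], [0, complex(s, s)]]   # lower-right entry is e^{i pi / 4}

def cnot(a, b):
    """Action of CNOT on a basis state |a, b>."""
    return a, a ^ b

def toffoli(a, b, c):
    """Action of the Toffoli gate on a basis state |a, b, c>."""
    return a, b, c ^ (a & b)
```

Applying `cnot` or `toffoli` twice returns the original bits, which reflects the reversibility discussed above; likewise, `matmul(T, T)` agrees with `S` up to rounding.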
There are methods for constructing more advanced controlled operators.
For $U_1$ an $n_1$-qubit operator and $U_2$ an $n_2$-qubit operator, the $(n_1 + n_2)$-qubit operator $U = U_1 \otimes U_2$ applies $U_1$ to the first $n_1$ qubits of an $(n_1 + n_2)$-qubit system and $U_2$ to the trailing $n_2$ qubits. Similarly, for $U_1$ and $U_2$ two $n$-qubit operators, the product operator $U = U_2 U_1$ applies $U_1$ and then $U_2$ to an $n$-qubit system. This provides the necessary theoretical framework for building operators corresponding to large circuits from few-qubit operators.

| Universal quantum computation
Any $n$-qubit operator $U$ can be reduced to an $n$-qubit circuit $C$ containing only single-qubit gates and CNOT gates, although this reduction is not necessarily efficient. This gives rise to one notion of universal quantum computation.
Furthermore, $C$ can be efficiently approximated to any given degree of precision by a circuit $C'$ that consists exclusively of $H$ and $S$ gates (which generate the Clifford group for single qubits) along with the CNOT operator and the non-Clifford $T$ operator. This is a consequence of the Solovay-Kitaev theorem, if one glosses over some details, such as adjustments to the global phase and to determinants, and the need to include inverses in the gate set. The interested reader is referred to Ref. [12, Appendix 3]. Note that there are many universal gate sets, but Clifford + CNOT + T (for Clifford = {H, S}) is arguably one of the most popular.
For $f$ any efficient classical function, we can use operators from a universal gate set to construct an efficient quantum circuit that takes $|a, b\rangle \to |a, b \oplus f(a)\rangle$. Indeed, generic techniques due to Bennett [14] convert any classical algorithm taking time $T$ and space $S$ into a reversible algorithm taking time $T^{1+\epsilon}$ and space $O(S \log T)$. Note that this implies that an ancillary register, or ancilla, may be required for this to work, depending on $f$ and the size of the two registers. If $f$ has an efficient classical inverse, then the above implies that we can construct a circuit that takes $|a\rangle \to |f(a)\rangle$.
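The reason for writing the function value into a second register with XOR is that the resulting map is a bijection on basis states even when f itself is not injective. A minimal classical sketch, with an arbitrarily chosen example function and illustrative names:

```python
def reversible_embed(f, n_out):
    """Return the map (a, b) -> (a, b XOR f(a)), where b is an n_out-bit register."""
    mask = (1 << n_out) - 1
    def u_f(a, b):
        return a, b ^ (f(a) & mask)
    return u_f

def square_mod_8(a):
    """Example function; it is not injective on 3-bit inputs."""
    return (a * a) % 8

u_f = reversible_embed(square_mod_8, 3)
pairs = [(a, b) for a in range(8) for b in range(8)]
images = [u_f(a, b) for a, b in pairs]
```

The 64 images are pairwise distinct, and applying `u_f` twice restores the input, so the map is a permutation of the basis states and hence corresponds to a unitary operator.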

| Architectural constraints
Up to this point, we have essentially assumed that single- and two-qubit quantum gates may be freely applied to any of the qubits that make up our system. The same goes for measurements.
In practice, there are, however, often architectural constraints. For instance, the connectivity between qubits may be limited to some form of nearest-neighbour connectivity. There may, furthermore, be restrictions on which gates can be applied to which qubits, or to which pairs of qubits, and so forth. Such constraints can be overcome, for instance by routing, but routing comes at a cost.
When costing circuits for quantum algorithms in this survey, we do not account for architectural constraints, since we do not specify a specific architecture.

| Quantum error correction
Most quantum computers as currently envisaged are prone to errors arising, for instance, when gates are applied or when measurements are performed. This necessitates quantum error correction.
The basic idea in quantum error correction is to use multiple physical qubits to construct a logical qubit. The redundancy thus induced allows for errors to be detected and corrected.
The number of physical qubits required to construct a logical qubit depends on the number of operations that the logical qubit is required to be able to undergo in sequence without the risk of uncorrectable errors arising growing too large. It furthermore depends on the physical characteristics of the quantum computer, such as the physical error rates of operators and measurements, and of course on the choice of quantum error-correcting code to employ.
There are a number of proposals for quantum error-correcting codes. Key contenders are stabiliser codes, such as the surface code. For a good introduction to the surface code, see Ref. [15].
It is important to understand that the overhead induced by quantum error correction may be substantial. A distinction must therefore be made between physical and logical qubit counts and between physical and logical cost estimates for circuits in general.
When costing circuits for quantum algorithms in this survey, we do so without accounting for the need for error correction since we do not specify a specific architecture.

| Logical cost estimates
To estimate the post-quantum security level of a cryptosystem, we need to determine not only which quantum algorithm is currently the most efficient for breaking the system but also which metric to use to best cost the algorithm depending on the context at hand.
Metrics commonly used to coarsely quantify the cost of quantum algorithms at a logical level of abstraction include
- the number of gates in some logical circuit for the algorithm,
- the (maximum) depth in gates of said circuit,
- the (maximum) width in logical qubits of said circuit,
- the product of the (maximum) depth and width of said circuit, and
- similar metrics considering only non-Clifford gates, since they are typically assumed to be harder to implement.
When using metrics such as the above, the gate set employed to lay out the circuit, and in particular the level to which the circuit is optimised when it is laid out, may impact the cost estimate, as may any architectural limitations that are taken into account.
The choice of a certain metric may be influenced by a hypothesis on the behaviour of quantum hardware. For example, one might focus on the execution time without considering costs relative to memory. This is done by using only the depth of the circuit. On the other hand, the quantification of the impact of memory requirements is influenced by assumptions on error correction. Indeed, as pointed out in Ref. [16], if we assume active error correction, then the depth-width product (hereinafter DW-cost) is a better measure of the cost than the gate count. To see why, note that the former metric captures the cost of an idling qubit, while the latter ignores it completely.
To complicate matters further, qubits may be measured, left idle and later re-initialised when a circuit is executed. This is the case in particular for ancillary registers. This implies that there is not necessarily a completely unambiguous definition of the width and depth of the circuit. One option is to use the maximum depth and the maximum width at any one point during execution.
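The difference between the gate count and the depth-width product can be seen on a toy circuit. The sketch below assumes a circuit given as a list of gates, each gate being the tuple of qubit indices it acts on, and schedules every gate as early as its qubits allow. This representation and the function name are assumptions made for illustration only.

```python
def cost_metrics(gates, width):
    """gates: sequence of tuples of qubit indices, in program order.
    Returns (gate_count, depth, width, depth * width), where the depth is obtained
    by scheduling each gate in the earliest layer in which all its qubits are free."""
    busy_until = [0] * width          # last layer in which each qubit is used
    for qubits in gates:
        layer = max(busy_until[q] for q in qubits) + 1
        for q in qubits:
            busy_until[q] = layer
    depth = max(busy_until, default=0)
    return len(gates), depth, width, depth * width

# Three qubits: qubit 2 idles while four gates act in sequence on qubits 0 and 1.
gate_count, depth, width, dw_cost = cost_metrics([(0,), (0, 1), (1,), (0, 1)], width=3)
```

In this example the gate count is 4 while the DW-cost is 12: the idle qubit contributes to the latter for the whole duration of the circuit but not to the former.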
Alternative high-level metrics include, but are not limited to, counting the number of high-level operations that must be performed or the maximum depth of the circuit in such operations.
For instance, the number of group operations or oracle invocations may be counted. Such metrics are useful for comparing algorithms that use essentially the same basic building blocks, or for proving lower bounds on the cost of algorithms. They are fairly simple to understand and use in a meaningful way.

| Full-stack physical cost estimates
The derivation of full-stack physical cost estimates is beyond the scope of this survey, as it requires assumptions to be made on the future architecture and performance characteristics of large-scale quantum computers. It furthermore requires careful analysis and optimisation of all layers in the quantum stack, including the algorithmic layer, the logical circuit layer, the error correction layer, and any required classical pre- and post-processing.
This being said, we do reference recent full-stack cost estimates for Shor's algorithms and derivatives thereof where available. Such estimates may prove useful when seeking to quantify the feasibility of breaking currently widely deployed asymmetric cryptosystems quantumly in, for example, 10, 15, 20, 30, … years' time.

| Memory cost estimates
The number of computational qubits required is often taken as the amount of quantum memory required to execute a circuit. This being said, there are cost estimates such as Ref. [17] where quantum information is swapped out to non-computational quantum memory.
In general, a quantum memory access can be defined as the following unitary, which takes an index register $i$, an output register $x$ and $M$ memory registers $y_0, \ldots, y_{M-1}$, and writes in the output register the contents of memory register $y_i$:
$$|i\rangle |x\rangle |y_0, \ldots, y_{M-1}\rangle \mapsto |i\rangle |x \oplus y_i\rangle |y_0, \ldots, y_{M-1}\rangle.$$
Such an operation can be implemented using $\tilde{O}(M)$ gates of a universal gate set, width $O(M)$ and depth $O(\log M)$. It should be noted that using only bounded-arity gates, one cannot do asymptotically better. The qRAM model assumes that this operation can be implemented at low cost, typically $O(\log M)$ or $O(1)$. Following the naming conventions in Ref. [18]:
- quantumly addressable classical memory (QRACM) concerns the case of a classical memory $(y_0, \ldots, y_{M-1})$;
- quantumly addressable quantum memory (QRAQM) is the generic case where the memory can be in a superposition state as well.
It should be noted that if the index register is in a classical state, then it can be measured and the operation can be performed efficiently in the standard quantum circuit model (because we can apply quantum gates between any qubits at the same cost). Likewise, if all the registers are classical, then this model becomes classical random access memory (CRACM), which is a standard assumption in classical cryptanalysis.
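On computational basis states, a memory access of this kind is a simple classical map. The sketch below assumes the common convention in which the selected memory cell is XORed into the output register; that convention and the function name are assumptions of this example.

```python
def memory_access(i, x, memory):
    """Action on a basis state: |i>|x>|y_0, ..., y_{M-1}> -> |i>|x XOR y_i>|y_0, ..., y_{M-1}>.
    The index and the memory contents are left unchanged."""
    return i, x ^ memory[i], tuple(memory)

memory = (5, 7, 9, 11)
after_one = memory_access(2, 0, memory)      # output register now holds y_2 = 9
after_two = memory_access(*after_one)        # a second access restores the output register
```

Because the output is combined by XOR, the map is its own inverse, which is what makes it a valid (unitary) operation on superpositions of indices.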
Note that the QRACM and QRAQM assumptions are debatable. It is entirely possible that quantum memory in a physical realisation of a quantum computer could come at a significant cost. When costing quantum algorithms, it is therefore important to state which memory model is used and what the memory cost is in this model.

| Grover's algorithm
The most versatile tool from quantum computing in the scope of quantum cryptanalysis is probably Grover's search algorithm. In a nutshell, it allows one to search for preimages of an unstructured Boolean function $f : [1, N] \to \{0, 1\}$, assuming that there exists an efficient quantum algorithm that implements $f$. When a single preimage of 1 exists (a marked element), Grover's algorithm finds it at the cost of $O(\sqrt{N})$ applications of the inspection function $f$, whereas a classical circuit would need to inspect $\Omega(N)$ elements to succeed with constant probability. Therefore, many techniques use Grover either as the main routine or as a sub-routine. In particular, Grover's search in theory enhances all brute-force search procedures.
More generally, let $S \subseteq [1, N]$ be the set of preimages of 1, and assume $|S| = M$. To start the search, we create the state $|\psi\rangle := \frac{1}{\sqrt{N}} \sum_{x \in [1, N]} |x\rangle$. In the case where $N = 2^n$, this is done by applying $H^{\otimes n}$ to the input state $|0\rangle^{\otimes n}$. At this point, the measurement of the state of the system would yield a marked element with probability $M/N$. Grover's algorithm will repeatedly act on the state to increase those odds. Note that Grover's search is often described as searching a database, which fits in the definition above if we think of $f$ as performing a memory access at a given index. However, in order to implement this efficiently, one may need the quantum random-access model presented in Section 2.10.
The first building block of Grover's algorithm is an oracle $O_f$ that implements the inspection function $f$ (see Figure 1). Intuitively, this operator acts on a superposition of basis states by multiplying the phase of all the components $|x\rangle$ corresponding to the index $x$ of a marked element by $-1$, that is, $O_f |x\rangle = (-1)^{f(x)} |x\rangle$. The iteration of the Grover algorithm uses $O_f$ alongside a similar operator $O_\phi$ which satisfies
$$O_\phi |0\rangle^{\otimes n} = |0\rangle^{\otimes n}, \quad O_\phi |x\rangle = -|x\rangle \text{ for every other basis state } |x\rangle.$$
Note that $O_f$ can be efficiently implemented from a circuit that implements $f$, while $O_\phi$ is efficiently implementable with Clifford + T + CNOT gates. An iteration of Grover's algorithm is shown in Figure 2.
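The initial state $|\psi\rangle$ is in the span of the vectors $|\alpha\rangle, |\beta\rangle$ defined as
$$|\alpha\rangle = \frac{1}{\sqrt{N - M}} \sum_{x \notin S} |x\rangle, \quad |\beta\rangle = \frac{1}{\sqrt{M}} \sum_{x \in S} |x\rangle.$$
In the basis $(|\alpha\rangle, |\beta\rangle)$, the Grover iterate $G$ acts as the rotation
$$G = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}, \quad \text{where } \sin(\theta/2) = \sqrt{M/N}.$$
Proof: We first show that the Grover iterate operator is equal to $G = (2|\psi\rangle\langle\psi| - I) O_f$. Then we apply this operator on the two basis vectors $|\alpha\rangle$ and $|\beta\rangle$, and we observe that the matrix of $G$ has the claimed shape. □
Starting with $|\psi\rangle = \cos(\theta/2)|\alpha\rangle + \sin(\theta/2)|\beta\rangle$, the state we reach after $k$ iterations of $G$ is
$$G^k |\psi\rangle = \cos\left(\frac{2k+1}{2}\theta\right)|\alpha\rangle + \sin\left(\frac{2k+1}{2}\theta\right)|\beta\rangle.$$
We need to select $k$ so that this state is as close as possible to $|\beta\rangle$ (whose measurement would yield a marked element with probability 1). This means aiming for $\frac{2k+1}{2}\theta \approx \frac{\pi}{2}$.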
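The rotation picture can be checked with a small state-vector simulation. The sketch below is plain Python and makes two simplifying assumptions: amplitudes stay real, and the diffusion step is implemented as an inversion about the mean, which is the action of 2|ψ⟩⟨ψ| − I on a real amplitude vector. Elements are indexed from 0 rather than 1, and the function names are illustrative.

```python
import math

def grover_success_probability(n_items, marked, iterations):
    """Simulate Grover iterations on n_items amplitudes and return the
    probability of measuring a marked element afterwards."""
    amp = [1 / math.sqrt(n_items)] * n_items       # uniform superposition
    for _ in range(iterations):
        for x in marked:                            # oracle: flip the sign of marked amplitudes
            amp[x] = -amp[x]
        mean = sum(amp) / n_items                   # diffusion: inversion about the mean
        amp = [2 * mean - a for a in amp]
    return sum(amp[x] ** 2 for x in marked)

def optimal_iterations(n_items, n_marked):
    """Integer k for which (2k + 1) * theta / 2 is closest to pi / 2."""
    theta = 2 * math.asin(math.sqrt(n_marked / n_items))
    return max(0, round(math.pi / (2 * theta) - 0.5))

N, marked = 64, {5}
k = optimal_iterations(N, len(marked))
p = grover_success_probability(N, marked, k)
theta = 2 * math.asin(math.sqrt(len(marked) / N))
predicted = math.sin((2 * k + 1) * theta / 2) ** 2
```

For 64 elements and one marked element this gives k = 6, and the simulated success probability agrees with the closed-form value obtained from the rotation angle.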

| Amplitude amplification
Grover's search algorithm can be generalised to include a subroutine to replace the creation of the uniform superposition over all elements in $[1, N]$ with another algorithm with better odds of leading to the measurement of a marked element. This could typically be used to implement nested searches. For example, the inner search might identify marked elements in a set $S'$ such that $S \subset S'$. In the 'oracles with costs' framework of Kimmel et al. [19], this strategy is used to take advantage of the situation where the oracle $O_f$ used to mark elements in $S'$ is significantly less expensive to implement than the oracle $O_g$ that marks elements in $S$. Formally, in amplitude amplification, we assume the knowledge of an algorithm $A$ that produces a superposition over all possible outcomes with certain weights
$$|\psi\rangle = A|0\rangle = \sum_x a_x |x\rangle |\mathrm{junk}(x)\rangle.$$
Here $\mathrm{junk}(x)$ is a state resulting from the computation of $A$ (i.e. a collection of intermediate values that are kept due to reversibility). In the case of Grover's algorithm, $A = H^{\otimes n}$, but in general, the measurement of $|\psi\rangle$ yields $x \in S$ with a probability that is not necessarily $M/N$. The amplitude amplification circuit is almost identical to that of Grover's search except that calls to $H^{\otimes n}$ are replaced by $A$. It consists in the repetition of the iterate $Q$ shown in Figure 3. Similar to the analysis of Grover's algorithm, we define the states $|\alpha\rangle$ and $|\beta\rangle$ as the normalised components of $|\psi\rangle$ supported on $x \notin S$ and on $x \in S$, respectively, and the angle $\theta$ by $\sin^2(\theta/2) = \sum_{x \in S} |a_x|^2$, the probability that the measurement of $|\psi\rangle$ yields a marked element. In the basis $(|\alpha\rangle, |\beta\rangle)$, $Q$ then acts as a rotation of angle $\theta$.
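We start the procedure with the state $|\psi\rangle = \cos(\theta/2)|\alpha\rangle + \sin(\theta/2)|\beta\rangle$. The state we reach after $k$ iterations is
$$Q^k |\psi\rangle = \cos\left(\frac{2k+1}{2}\theta\right)|\alpha\rangle + \sin\left(\frac{2k+1}{2}\theta\right)|\beta\rangle.$$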
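The effect of amplitude amplification on the success probability follows directly from the rotation angle. A short sketch, assuming only that the initial algorithm succeeds with probability p = sin²(θ/2); the function names are illustrative.

```python
import math

def success_probability(p, k):
    """Probability of measuring a marked element after k iterations, when the
    initial algorithm succeeds with probability p = sin^2(theta / 2)."""
    theta = 2 * math.asin(math.sqrt(p))
    return math.sin((2 * k + 1) * theta / 2) ** 2

def iterations_needed(p):
    """Integer k for which (2k + 1) * theta / 2 is closest to pi / 2."""
    theta = 2 * math.asin(math.sqrt(p))
    return max(0, round(math.pi / (2 * theta) - 0.5))

p = 0.01
k = iterations_needed(p)
amplified = success_probability(p, k)
```

With p = 0.01, seven iterations bring the success probability above 0.99, compared with the roughly 1/p = 100 repetitions of the initial algorithm that would be expected classically.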

| Random walks
Grover's search algorithm is generalised by the notion of random walk on a graph. We assume that a graph $G$ is given by a set of vertices $V$ and edges $E$, and we assume that we are looking for a marked element in $M = f^{-1}(\{1\})$ for some $f : V \to \{0, 1\}$. The general strategy of a random walk is to start from a vertex $x \in V$, check if $f(x) = 1$, and if not, then walk in the graph by sampling neighbours uniformly at random long enough to ensure the new vertex $x'$ attained is distributed almost uniformly at random in $V$ and then test if $f(x') = 1$. This is repeated until a marked element is found. In addition to running $O_f$, there are two main steps in a quantum walk that contribute to the overall cost:
- Setup: sampling the first vertex and initialising the data structure.
- Update: sampling a neighbour and updating the data structure (we need to update the current node and its neighbours).
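The classical strategy just described can be sketched in a few lines. The example graph (a 6-dimensional hypercube with self-loops, so that the walk converges to the uniform distribution), the number of steps between checks and all names are illustrative choices, not taken from the text.

```python
import random

def random_walk_search(neighbours, is_marked, start, steps_between_checks, rng,
                       max_checks=10000):
    """Check the current vertex; if it is not marked, walk for a fixed number of
    uniformly random steps and check again. Returns (vertex or None, checks used)."""
    x = start
    for checks in range(1, max_checks + 1):
        if is_marked(x):
            return x, checks
        for _ in range(steps_between_checks):
            x = rng.choice(neighbours(x))
    return None, max_checks

DIM = 6

def hypercube_neighbours(x):
    """Neighbours of x in the hypercube on DIM bits, plus a self-loop."""
    return [x] + [x ^ (1 << i) for i in range(DIM)]

target = 0b101010
found, checks = random_walk_search(
    hypercube_neighbours, lambda x: x == target, start=0,
    steps_between_checks=20, rng=random.Random(1))
```

The number of checks corresponds to the number of calls to the inspection function, and the number of steps between checks plays the role of the update cost per check.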
Each of the aforementioned steps has a cost that depends on the data structure that is chosen to navigate the graph (note that depending on the model of computation chosen, memory-intensive data structures can penalise the cost). Moreover, the cost is impacted by the shape of the transition matrix $M$. In the case of a $d$-regular graph (which is relevant to many computational problems), $M = \frac{1}{d} A$ where $A$ is the adjacency matrix of the graph. The number of update steps required to reach a node almost uniformly distributed is $\tilde{O}(1/\delta)$, where $\delta$ is the spectral gap of $M$, that is, $\delta = 1 - \max_\lambda |\lambda|$ with $\lambda$ ranging over the eigenvalues of $M$ not equal to 1. A similar approach can be used quantumly [20]. The cost becomes […].
Many computational problems relevant to cryptanalysis reduce to a walk in the Johnson graph of a set. In general, a Johnson graph $J(n, r)$ is an undirected graph whose vertices are the subsets of size $r$ of a given set $U$ of size $n$. There is an edge between vertices $S \subseteq U$ and $S' \subseteq U$ if and only if $|S \cap S'| = r - 1$ (i.e. they differ by only 1 element). The Johnson graph $J(n, r)$ has $|V| = \binom{n}{r}$ vertices, is $r(n - r)$-regular and its spectral gap is $\delta = \frac{n}{r(n - r)}$. The product $J^m(n, r)$ of $m$ copies of $J(n, r)$ is the graph whose vertices are of the form $(v_1, \ldots, v_m)$ where each $v_i$ is a vertex of $J(n, r)$, and there is an edge between $(v_1, \ldots, v_m)$ and

| Quantum backtracking
Relevant to cryptanalysis of algorithms for lattices is an extension of Grover's search where instead of searching for a marked element in a list, the task is to find a marked leaf in a (large) tree. Let us describe the setup. We assume the following query access to a tree $T$: for a given node $v$, we have an oracle that returns the number of its children; we also have an oracle that, for a node $v$ and index $i$, returns the $i$th child of $v$. The maximal degree of $T$ is the largest number of children among all nodes. Backtracking is a classical method to solve problems where we can partially enumerate solutions and check whether the current sub-solution can be extended to an actual solution. Hence, a backtracking algorithm constructs a tree in a depth-first manner where leaves represent solutions. An example of such a problem is SAT. Backtracking requires an oracle $P_T$ that operates on nodes such that, given a leaf $v$, it tells whether $v$ is a solution ('marked'), and given any other node, it returns 'intermediate'. Classically, finding a 'marked' leaf can be done in time $O(\#\mathrm{nodes})$. Montanaro in Ref. [21] gives a Grover-like speed-up for this task:

Theorem (Montanaro [21]). There is a quantum algorithm that, given query access to a tree $T$ as described above with maximal degree $O(1)$, an oracle $P_T$, an upper bound $n$ on the depth of $T$ and $\varepsilon > 0$, outputs either a marked leaf $x$ or $\perp$ if no marked $x$ exists, by making $O\left(\sqrt{\#\mathrm{nodes}} \cdot \mathrm{poly}(n) \log(1/\varepsilon)\right)$ queries to $T$ and to $P_T$. The algorithm uses $\mathrm{poly}(n)$ qubits and is correct with probability larger than $1 - \varepsilon$.
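For reference, the classical procedure that the quantum algorithm speeds up can be sketched as follows for SAT. This is a minimal depth-first backtracking search that counts visited nodes; the clause encoding (signed integers, as in the DIMACS format) and the fixed variable order are choices made for this example. It stops at the first solution, so the count it returns can be smaller than the size of the full tree.

```python
def backtrack_sat(clauses, n_vars):
    """Depth-first backtracking over partial assignments x_1, x_2, ...
    clauses: list of clauses, each a list of non-zero ints (literal k means x_k,
    literal -k means NOT x_k). Returns (assignment or None, nodes visited)."""
    nodes = 0

    def violated(assignment):
        # A clause is violated once all of its literals are assigned and false.
        return any(
            all(abs(l) <= len(assignment) and assignment[abs(l) - 1] != (l > 0)
                for l in clause)
            for clause in clauses)

    def visit(assignment):
        nonlocal nodes
        nodes += 1
        if violated(assignment):
            return None                   # prune this subtree
        if len(assignment) == n_vars:
            return assignment             # marked leaf
        for value in (False, True):
            found = visit(assignment + [value])
            if found is not None:
                return found
        return None

    return visit([]), nodes

clauses = [[1, 2], [-1, 3], [-2, -3], [1, -3]]
solution, visited = backtrack_sat(clauses, 3)
```

Pruning is what keeps the tree much smaller than the full set of assignments on structured instances; the quantum algorithm then improves the dependence on the tree size from linear to a square root, up to the factors stated above.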

| HIDDEN SUBGROUP PROBLEMS
Hidden Subgroup Problems (HSP) consist in the search for a secret subgroup H inside of a control group G, given access to an oracle function f: G → X for some set X of quantum states that satisfies f(x) = f(y) if and only if x ∈ y + H. Many interesting computational problems reduce to an HSP instance, including factoring and the discrete logarithm problem, as shown in Section 5.

| The abelian quantum Fourier transform
Solving the HSP in an abelian group in quantum polynomial time is usually done by using the Quantum Fourier Transform (QFT). The QFT depends on the control group G we are working with.
To simplify this introduction, we start with the QFT used in Shor's original work [1], which applies to $G = \mathbb{Z}_{2^n}$.
Definition 10 (QFT over $G = \mathbb{Z}_{2^n}$). The QFT circuit over $G = \mathbb{Z}_{2^n}$ performs the following operation on the basis states:

Such a circuit solves the phase estimation problem which, given an input state $|\psi\rangle = \frac{1}{\sqrt{2^n}} \sum_{y=0}^{2^n-1} e^{2i\pi\omega y} |y\rangle$, asks to find an approximation of $\omega \in \mathbb{R}$. One can show that the measurement of the state $\mathrm{QFT}_{2^n}|\psi\rangle$ yields $y$ such that

A circuit realising the exact QFT over $\mathbb{Z}_{2^n}$ cannot be implemented with Clifford + T + CNOT gates. Instead, it should be approximated. The essential components of this QFT that do not belong to our gate set are the controlled rotations. Given $k \leq n$, these gates realise the operation $|0\rangle|a\rangle \mapsto |0\rangle|a\rangle$ and $|1\rangle|a\rangle \mapsto |1\rangle R_k|a\rangle$ where

The $\mathrm{QFT}_{2^n}$ circuit is realised with a combination of Hadamard gates and controlled rotations shown in Figure 4. We use the notation 0.
The QFT can be generalised to any finite abelian group $G$. This uses the notion of the character group $\widehat{G}$ of $G$, the group of morphisms $f : G \to \mathbb{C}^*$. Elements $y \in G$ are in correspondence with characters $\chi_y \in \widehat{G}$. In a cyclic group $\mathbb{Z}_d$, $\chi_y : x \mapsto e^{2i\pi xy/d}$, and if $G = G_1 \times G_2$ and $g = (g_1, g_2) \in G$, then $\chi_g = \chi_{g_1} \chi_{g_2}$, which allows us to define $\chi_g$ by induction for any finite abelian group. For example, when $G = \mathbb{Z}_{2^n}$, we have $\chi_y : x \mapsto e^{2i\pi xy/2^n}$. Similar to the case $G = \mathbb{Z}_{2^n}$, the QFT over an arbitrary finite abelian group can be efficiently approximated by a polynomial-size circuit of Clifford + T + CNOT gates.
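The phase estimation property of the QFT over $\mathbb{Z}_{2^n}$ can be checked numerically on small instances. The following is a minimal classical sketch, not a circuit: it computes the outcome distribution directly from the amplitudes. The sign of the exponent in the transform and the function name are assumptions of this sketch.

```python
import cmath
import math


def phase_estimation_distribution(omega, n):
    """Outcome probabilities when (1/sqrt(2^n)) * sum_y e^{2 i pi omega y} |y>
    is Fourier transformed over Z_{2^n} and measured.

    Assumption of this sketch: the transform uses the kernel e^{-2 i pi x y / 2^n}.
    """
    size = 2 ** n
    probs = []
    for x in range(size):
        amp = sum(cmath.exp(2j * math.pi * y * (omega - x / size))
                  for y in range(size)) / size
        probs.append(abs(amp) ** 2)
    return probs


probs = phase_estimation_distribution(omega=0.3, n=5)
best = max(range(len(probs)), key=probs.__getitem__)
estimate = best / 2 ** 5   # most likely outcome, rescaled to approximate omega
```

With $n = 5$ and $\omega = 0.3$, the most likely outcome is the integer closest to $2^n \omega = 9.6$, so the estimate is accurate to within $2^{-(n+1)}$.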

| Solving the HSP in a finite abelian group
In this section, we assume that $G$ is finite and abelian and that we have an implementation of a function $f$ that satisfies $f(x) = f(y)$ if and only if $x - y \in H \subseteq G$, where $H$ is a secret subgroup. The goal of the HSP algorithm we present is to recover $H$ by measuring elements of $\widehat{H}$. When a generating set for $\widehat{H}$ is known, we use classical methods to recover $H$. We initiate this process by creating a uniform superposition $\frac{1}{\sqrt{|G|}} \sum_{x \in G} |x\rangle|0\rangle$ over the elements of $G$: for $G = \mathbb{Z}_{2^n}$ this is done by applying Hadamard gates $H^{\otimes n}$, while efficient methods exist for other groups as well. We then use the circuit implementing $f$ to obtain the state $\frac{1}{\sqrt{|G|}} \sum_{x \in G} |x\rangle|f(x)\rangle$. Now, we measure the second register, which yields a value $f(x)$ in the range of $f$ and collapses the state into a superposition of all preimages of that value. By assumption on the periodicity of $f$, we know that $f(x') = f(x)$ if and only if $x' \in x + H$. This means that the system is left in the state $|\psi_2\rangle = \frac{1}{\sqrt{|H|}} \sum_{h \in H} |x + h\rangle$. The next stage consists in applying the QFT to the state $|\psi_2\rangle$. This yields the following state: $\frac{1}{\sqrt{|G||H|}} \sum_{y \in G} \chi_y(x) \left( \sum_{h \in H} \chi_y(h) \right) |y\rangle$. Since $|\chi_y(x)| = 1$ and the inner sum vanishes unless $\chi_y$ is trivial on $H$, we see that the only possible measurement outcomes are $y$ such that $\chi_y(H) = 1$, that is, $H \subseteq \ker(\chi_y)$. The strategy to compute $H$ consists in running the above procedure several times and computing $\bigcap_{y \text{ measured}} \ker(\chi_y)$. With high probability, this yields $H$ in $O(\log(|G|))$ steps.
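The classical post-processing can be illustrated in the cyclic case $G = \mathbb{Z}_N$ with $H = d\mathbb{Z}_N$ for a divisor $d$ of $N$. The sketch below replaces the quantum part with a classical sampler, which is an assumption made for illustration, and then intersects the kernels $\ker(\chi_y) = \{x : xy \equiv 0 \bmod N\}$ using gcd computations. All names and the toy parameters are hypothetical.

```python
import math
import random


def sample_annihilator(N, d, rng):
    """Classical stand-in for one quantum run: return y uniform among the
    y in Z_N with chi_y(H) = 1 for H = d*Z_N, i.e. the multiples of N // d."""
    return (N // d) * rng.randrange(d)


def recover_generator(N, samples):
    """Intersect the kernels ker(chi_y) = {x : x*y = 0 mod N}.

    The intersection is the subgroup generated by N // gcd(N, y_1, ..., y_k)."""
    g = N
    for y in samples:
        g = math.gcd(g, y)
    return N // g


rng = random.Random(1)
N, d = 1 << 10, 1 << 4          # hypothetical toy instance
samples = [sample_annihilator(N, d, rng) for _ in range(40)]
generator = recover_generator(N, samples)
```

The recovered generator always divides $d$, and it equals $d$ as soon as the sampled multipliers have no common factor with $d$, which happens after a logarithmic number of samples with high probability.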

| HSP in D N
In general, the HSP in non-abelian groups can be a hard problem. Still, subexponential algorithms for the dihedral group $D_N$ have been proposed [18,22,23]. These algorithms have a time complexity in $2^{\tilde{O}(\sqrt{\log N})}$, and Ref. [23] has polynomial memory. In this section, we introduce Kuperberg's method [22] to solve the HSP in $D_N$, which is a non-abelian group. Given a positive integer $N$, we can define the dihedral group $D_N$ by $D_N = (\mathbb{Z}/N\mathbb{Z}) \rtimes_\phi (\mathbb{Z}/2\mathbb{Z})$, where $\phi(0)$ is the identity and $\phi(1)$ is the inversion. Concretely, this means that the elements of $D_N$ are those of $(\mathbb{Z}/N\mathbb{Z}) \times (\mathbb{Z}/2\mathbb{Z})$ and that the group law is given by $(a_1, b_1) \cdot (a_2, b_2) = (a_1 + (-1)^{b_1} a_2, b_1 + b_2)$. Assume $H$ is a subgroup of $D_N$. Let us show how the HSP instance defined by $H$ can be reduced to a simpler instance of the HSP where the secret subgroup has the shape $H' = \{(0, 0), (k, 1)\}$. We can define $H_1 := H \cap (\mathbb{Z}/N\mathbb{Z} \times \{0\}) = \{(a, b) \in H \text{ with } b = 0\}$. This subgroup $H_1$ is isomorphic to a subgroup of $\mathbb{Z}/N\mathbb{Z}$, which is of the form $M(\mathbb{Z}/N\mathbb{Z})$ for $M \mid N$. The subgroup $H_1$ is normal in $D_N$, and we have the following identity: $D_N / H_1 \cong D_M$. So, given a function $f : D_N \to X$ that hides $H$, our strategy is to first find $H_1$, which is hidden by the function $f' : D_N \cap (\mathbb{Z}/N\mathbb{Z} \times \{0\}) \to X$, the restriction of $f$. This is done by recasting this as an instance of the HSP in the abelian group $G = \mathbb{Z}_N$. Once this is done, we learn the integer $M$ and we turn our attention to the resolution of the HSP instance defined by the hidden subgroup $H_2$ of $D_M$.
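The group structure can be made concrete with a few lines of code. This is a small sketch assuming the usual semidirect-product law $(a_1, b_1) \cdot (a_2, b_2) = (a_1 + (-1)^{b_1} a_2, b_1 + b_2)$ on pairs in $(\mathbb{Z}/N\mathbb{Z}) \times (\mathbb{Z}/2\mathbb{Z})$. The function names are hypothetical.

```python
def dihedral_mul(x, y, N):
    """Product in D_N with elements written as pairs (a, b), a mod N, b mod 2."""
    (a1, b1), (a2, b2) = x, y
    return ((a1 + (-1) ** b1 * a2) % N, (b1 + b2) % 2)


def is_subgroup(H, N):
    """A finite subset containing the identity and closed under the law is a subgroup."""
    return (0, 0) in H and all(dihedral_mul(x, y, N) in H for x in H for y in H)


N, s = 8, 3
H = {(0, 0), (s, 1)}                     # subgroup generated by the reflection (s, 1)
closed = is_subgroup(H, N)
left = dihedral_mul((1, 0), (0, 1), N)   # rotation followed by reflection
right = dihedral_mul((0, 1), (1, 0), N)  # reflection followed by rotation
```

The two products `left` and `right` differ, which shows that $D_N$ is non-abelian, while every set $\{(0, 0), (s, 1)\}$ is a subgroup of order 2.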

Proposition 14 With the notations above, the hidden subgroup of D M is either H
Therefore, we look for a secret $s \in \mathbb{Z}_N$, which defines a hidden subgroup $H = \{(0, 0), (s, 1)\} \subseteq D_N$. We describe Kuperberg's sieve algorithm from a high-level perspective by restricting ourselves to the case of $N = 2^n$. There are two main ingredients to this method: first, the creation of so-called coset states, and then the sieve itself, which recombines the coset states to learn information about the secret $s$. To create a coset state, we compute a uniform superposition of all elements of $D_N$: $|\phi_0\rangle = \frac{1}{\sqrt{2N}} \sum_{a \in \mathbb{Z}_N} \sum_{b \in \{0,1\}} |a\rangle|b\rangle$. Then, we use a circuit $U_f : |a, b\rangle|c\rangle \mapsto |a, b\rangle|c + f(a, b)\rangle$ for the function $f$ that hides $H$ on the input state $|\phi_0\rangle|0\rangle$, thus creating the state $\frac{1}{\sqrt{2N}} \sum_{a, b} |a, b\rangle|f(a, b)\rangle$. We measure the second register and learn $f(a, b)$ for some $a, b$. This leaves the state in the superposition of all elements $(x, y)$ such that $f(x, y) = f(a, b)$. As $f$ hides the subgroup $H = \{(0, 0), (s, 1)\}$, the sum has two elements, one with $b = 0$ and one with $b = 1$. Let $t = a$ for the pair $(a, b)$ with $b = 0$; then the other pair is $(t, 0) \cdot (s, 1) = (t + s, 1)$, so that the resulting state is $|\phi_1\rangle = \frac{1}{\sqrt{2}} \left( |t, 0\rangle + |t + s, 1\rangle \right)$. We apply $\mathrm{QFT}_N \otimes I_2$ to $|\phi_1\rangle$ (i.e. we apply the QFT to the first register and leave the second one alone), thus producing the state $\frac{1}{\sqrt{2N}} \sum_{k=0}^{N-1} e^{2i\pi kt/N} |k\rangle \left( |0\rangle + e^{2i\pi ks/N} |1\rangle \right)$. We measure the first register, thus obtaining a value $k$ drawn uniformly at random in $[0, N - 1]$ and leaving the second (single-qubit) register in the state $|\psi_k\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + e^{2i\pi ks/N} |1\rangle \right)$, which is one of the $N$ possible coset states.
We now assume that we have a circuit that can produce coset states $|\psi_k\rangle$. Note that the classical information of the label $k$ is known as well. Our goal is to create the coset state $|\psi_{2^{n-1}}\rangle = \frac{1}{\sqrt{2}} \left( |0\rangle + (-1)^s |1\rangle \right)$, which reveals the parity of $s$. To combine two coset states, we perform the circuit described in Figure 5. The input state is the product state $|\psi_k, \psi_l\rangle$, and after the CNOT gate, the system is left in the state $\frac{1}{2} \left( |0, 0\rangle + e^{2i\pi sl/N} |0, 1\rangle + e^{2i\pi sk/N} |1, 1\rangle + e^{2i\pi s(k+l)/N} |1, 0\rangle \right)$. Hence, with probability 1/2, the measurement of the second qubit is 1 and the system is left in a state that is proportional (up to a global phase) to $|\psi_{k-l}\rangle$. We look for indices $k, l$ whose first $m := \lceil \sqrt{n - 1} \rceil$ binary digits are the same. Then, the first $m$ binary digits of $k - l$ will be 0. We need to create a large enough initial list $L_0$ of coset states so that enough of the corresponding indices have their first $m$ binary digits in common. The resulting coset states (which have indices with $m$ initial zeros in their binary decomposition) will be in a list $L_1$. This process is then repeated with the next $m$ bits in the binary decomposition of the indices. The last coset state should be $|\psi_{2^{n-1}}\rangle$. We begin with $|L_0| = 2^{n_0}$ random coset states, and at each stage $|L_{i+1}| \geq |L_i| / 8$, which means that $|L_m| \geq 2^{n_0 - 3m}$. Starting with $n_0 \geq 4m$ guarantees enough coset states after the $m$th stage of the sieve to obtain $|\psi_{2^{n-1}}\rangle$.
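The combination step can be verified numerically on two single-qubit states. The sketch below assumes coset states of the form $(|0\rangle + e^{2i\pi ks/N}|1\rangle)/\sqrt{2}$ and a CNOT controlled by the first qubit. It tracks amplitudes classically, and all function names are hypothetical.

```python
import cmath
import math


def coset_state(k, s, N):
    """Amplitudes of (|0> + e^{2 i pi k s / N} |1>) / sqrt(2)."""
    return [1 / math.sqrt(2), cmath.exp(2j * math.pi * k * s / N) / math.sqrt(2)]


def combine(psi_k, psi_l, outcome):
    """Apply CNOT (control: first qubit, target: second), then project the second
    qubit onto `outcome`; return (probability, normalised state of the first qubit)."""
    # After the CNOT, basis state |a, b> has moved to |a, a XOR b>, so the
    # amplitude of |a, outcome> comes from the input component b = a XOR outcome.
    kept = [psi_k[a] * psi_l[a ^ outcome] for a in (0, 1)]
    prob = sum(abs(x) ** 2 for x in kept)
    return prob, [x / math.sqrt(prob) for x in kept]


def same_up_to_phase(u, v, tol=1e-9):
    """True if two normalised single-qubit states differ only by a global phase."""
    overlap = sum(x.conjugate() * y for x, y in zip(u, v))
    return abs(abs(overlap) - 1) < tol


N, s, k, l = 16, 5, 11, 3
prob, state = combine(coset_state(k, s, N), coset_state(l, s, N), outcome=1)
```

Here $k - l = 8 = 2^{n-1}$ for $n = 4$, so the resulting state carries the parity of $s$. Measuring 0 instead would leave the state with label $k + l$.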
Once we know $b \in \{0, 1\}$ such that $s = 2s' + b$, we need to repeat the sieve in $D_{N/2}$ to learn the bit of $s'$ of lowest order. We can immediately see that the function $f'(x, y) = f(2x + by, y)$ hides the subgroup $\{(0, 0), (s', 1)\}$ of $D_{N/2}$. So, the above procedure can be repeated to learn the first bit $b'$ of $s'$. Eventually, after $\log(s)/\log(2) \leq \log(N)/\log(2)$ steps, the process allows us to learn all the bits of the binary decomposition of the secret $s$. This can be generalised to arbitrary $N$, and even to the HSP in $G \rtimes_\phi \mathbb{Z}/2\mathbb{Z}$ for an arbitrary finite abelian group $G$.

| FACTORING AND DISCRETE LOGARITHM PROBLEMS
Shor's algorithm [1,24] for the IFP splits any integer N-that is odd and not a perfect prime power-into two non-trivial factors. To completely factor N, the algorithm may be applied recursively. There exist efficient classical algorithms for testing primality [25,26] and for reducing perfect powers. The restrictions imposed by Shor, therefore, do not imply a loss of generality.

| Shor's original factoring algorithm
Shor's algorithm works by first classically reducing the IFP to an order-finding problem (OFP) in a cyclic subgroup of $\mathbb{Z}_N^*$. This OFP is then solved quantumly using an order-finding algorithm.
It should be noted that there exist other potential applications for efficient order-finding algorithms, besides factoring integers: Shor's order-finding algorithm computes orders in any cyclic group for which the group arithmetic may be efficiently implemented.

| Reducing the IFP to an OFP
First, an element $g$ is selected uniformly at random from $\mathbb{Z}_N^*$. In practice, this may be accomplished by selecting an integer $g$ uniformly at random from the interval $(1, N)$ and testing if $g$ is coprime to $N$. If it is not, we will have found a non-trivial factor of $N$. Otherwise, we will have successfully selected $g$.
The order $r$ of $g$ is then computed quantumly. The order is the least positive integer such that $g^r \equiv 1 \pmod N$. If $r$ is even, then $(g^{r/2} - 1)(g^{r/2} + 1) = g^r - 1 \equiv 0 \pmod N$, so $N \mid (g^r - 1)$, whilst $N \nmid (g^{r/2} - 1)$ by the definition of $r$. If, furthermore, $N \nmid (g^{r/2} + 1)$, it must be that $\gcd(N, g^{r/2} \pm 1)$ are non-trivial factors of $N$. Specifically, it must be that $N = ab$, for some $a, b \neq 1$ where $a \mid (g^{r/2} + 1)$ and $b \mid (g^{r/2} - 1)$. Shor cites Miller [25] for this randomised reduction. See also Long [27].
Shor proves [Ref. [1], p. 1498] that the probability of $r$ being even, and of the above condition being met, is at least 1/2. If either condition is not met, the whole algorithm may simply be re-run. After $n$ runs, the failure probability is then at most $2^{-n}$.
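The classical reduction can be sketched in a few lines, with the quantum order-finding step replaced by a brute-force loop. That substitution is an assumption made purely so that the example runs on a classical machine, and the function names are hypothetical.

```python
import math


def multiplicative_order(g, N):
    """Brute-force classical stand-in for the quantum order-finding step."""
    r, x = 1, g % N
    while x != 1:
        x = (x * g) % N
        r += 1
    return r


def split_with_order(N, g):
    """Return a non-trivial factor pair of N, or None if this choice of g fails."""
    d = math.gcd(g, N)
    if d != 1:
        return d, N // d              # g already shares a factor with N
    r = multiplicative_order(g, N)
    if r % 2 == 1:
        return None                   # odd order: re-run with another g
    h = pow(g, r // 2, N)
    if h == N - 1:
        return None                   # N divides g^(r/2) + 1: re-run
    a = math.gcd(N, h + 1)
    return a, N // a


result = split_with_order(15, 7)      # the order of 7 modulo 15 is 4
```

For $N = 15$, the choice $g = 14$ fails because $g^{r/2} \equiv -1 \pmod N$, which illustrates why the reduction is randomised.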

| Improved reductions
Ekerå [28] has recently shown how the complete factorisation of N may be computed efficiently classically, with very high probability, given the order r of g, and in a follow-up work [29], he gave probability estimates that account for the possibility of failing to recover r.
In the worst case, for N an m-bit integer with n distinct prime factors, the failure probability when using this approach is at most by [Ref. [28], Th. 1], where c, k ≥ 1 are parameters that may be freely selected. Increasing c, k increases the success probability at the expense of having to perform more classical postprocessing work.
In practice, c, k may be selected so as to guarantee a very low failure probability without compromising efficiency. Furthermore, for nearly all integers N, the failure probability is much smaller than the bound indicates: In general, it is expected to be insignificant.
Note that the failure probability tends to zero asymptotically as $N \to \infty$, if $c = 1$ and we, for example, let $k$ depend on $m$ such as $k \leq 3 \log_2 m$.

| Quantum order finding
Shor's order-finding algorithm (if slightly generalised, for reasons that will soon become clear, and with notation from Ref. [30]) uses two registers: a control register of $\nu := m + \ell$ qubits, for $m$ the bit length of the order and $\ell \sim m$, and a work register of some $t$ qubits, for $t$ sufficiently large to allow group elements to be represented and group operations to be performed.
The work register may be initialised to any value. For simplicity, we initialise it to $|1\rangle$, to have it represent the identity in $\mathbb{Z}_N^*$. The control register is initialised to $|0\rangle$, after which H gates are independently applied to the $\nu$ qubits in the register, yielding $\frac{1}{\sqrt{2^\nu}} \sum_{a=0}^{2^\nu - 1} |a, 1\rangle$. As may be seen above, the effect of applying the H gates is to induce a uniform superposition over all possible states in the control register. Next, we compute $g^a$ to the work register, yielding $\frac{1}{\sqrt{2^\nu}} \sum_{a=0}^{2^\nu - 1} |a, g^a\rangle$. Above, and in what follows, we perceive $g$ as an element of $\mathbb{Z}_N^*$ and forego writing out mod $N$. In practice, the exponentiation would typically be performed by classically precomputing $g^{2^i}$ and multiplying it into the work register if the $i$th control qubit $a_i = 1$, in what amounts to the square-and-multiply algorithm: $g^a = \prod_{i=0}^{\nu-1} \left(g^{2^i}\right)^{a_i}$ for $a = \sum_{i=0}^{\nu-1} 2^i a_i$. Note that multiplication by powers of $g$ mod $N$ is an invertible operation. This is what allows us to multiply precomputed powers of $g$ directly into the work register. Note, furthermore, that the set of powers of $g$ may be efficiently computed by repeated squaring.
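The exponentiation strategy can be mirrored classically, which may help when checking the arithmetic of a circuit design. This is a sketch on plain integers rather than a reversible circuit, and the function names are hypothetical.

```python
def precompute_powers(g, N, nu):
    """Classically precompute g^(2^i) mod N for i = 0..nu-1 by repeated squaring."""
    powers, x = [], g % N
    for _ in range(nu):
        powers.append(x)
        x = (x * x) % N
    return powers


def controlled_exponentiation(a, powers, N):
    """Multiply g^(2^i) into the work register whenever control bit a_i is 1."""
    work = 1                          # work register initialised to the identity
    for i, p in enumerate(powers):
        if (a >> i) & 1:
            work = (work * p) % N
    return work


powers = precompute_powers(7, 15, 4)
values = [controlled_exponentiation(a, powers, 15) for a in range(16)]
```

Each entry of `values` agrees with the built-in `pow(7, a, 15)`, since the control bits of $a$ select exactly the precomputed factors.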
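We now have $\frac{1}{\sqrt{2^\nu}} \sum_{a=0}^{2^\nu-1} |a, g^a\rangle = \frac{1}{\sqrt{2^\nu}} \sum_{e=0}^{r-1} \sum_{b=0}^{m_e-1} |rb + e, g^e\rangle$, where $m_e := \lfloor (2^\nu - e - 1)/r \rfloor + 1$ is the number of $a \in [0, 2^\nu)$ with $a \equiv e \pmod r$, and $r$ is the order of $g$. If, at this point, we measure the work register and obtain $g^e$ for some $e \in [0, r)$, this leaves the system in the state $\frac{1}{\sqrt{m_e}} \sum_{b=0}^{m_e-1} |rb + e, g^e\rangle$. As may be seen, the control register is now periodic in $r$, with an unknown offset $e$. To eliminate $e$, we apply $\mathrm{QFT}_{2^\nu}$ to the control register, leaving the system in the state $\frac{1}{\sqrt{2^\nu m_e}} \sum_{j=0}^{2^\nu-1} \sum_{b=0}^{m_e-1} e^{2\pi i (rb+e)j/2^\nu} |j, g^e\rangle = \frac{1}{\sqrt{2^\nu m_e}} \sum_{j=0}^{2^\nu-1} e^{2\pi i ej/2^\nu} \sum_{b=0}^{m_e-1} e^{i\theta_r b} |j, g^e\rangle$, where $\theta_r := 2\pi\alpha_r/2^\nu$ for $\alpha_r := \{rj\}_{2^\nu}$, and where $\{u\}_n$ is $u$ reduced mod $n$ constrained to $[-n/2, n/2)$. The probability of measuring a given $j$ (that yields $\theta_r \neq 0$) is then $\frac{1}{2^\nu m_e} \left| \sum_{b=0}^{m_e-1} e^{i\theta_r b} \right|^2 = \frac{1}{2^\nu m_e} \frac{\sin^2(m_e\theta_r/2)}{\sin^2(\theta_r/2)}$. The measured value $j$ implicitly defines $\alpha_r$, which is unknown to the user. If $\alpha_r \in [-r/2, r/2]$, Shor uses that $\left| \frac{j}{2^\nu} - \frac{z}{r} \right| = \frac{|\alpha_r|}{2^\nu r} \leq \frac{1}{2^{\nu+1}}$ for some unknown integer $z$, so that the convergent $z/r$ will be found in the continued fraction expansion of $j/2^\nu$ (see Hardy and Wright [31]), thus enabling us to find the convergent $z/r$ when $\ell \sim m$. We may increase the success probability by increasing $\ell$.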
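The continued-fraction step can be illustrated with a short routine that expands $j/2^\nu$ and tests each convergent denominator as a candidate order. The interface and the toy instance ($N = 21$, $g = 2$, whose order is 6) are assumptions chosen for illustration.

```python
def convergents(num, den):
    """Yield the convergents (p, q) of the continued fraction expansion of num/den."""
    p0, q0, p1, q1 = 0, 1, 1, 0
    while den:
        a = num // den
        num, den = den, num - a * den
        p0, q0, p1, q1 = p1, q1, a * p1 + p0, a * q1 + q0
        yield p1, q1


def recover_order(j, nu, m, g, N):
    """Return the first convergent denominator q < 2^m with g^q = 1 mod N, else None."""
    for _, q in convergents(j, 2 ** nu):
        if 0 < q < 2 ** m and pow(g, q, N) == 1:
            return q
    return None


# Toy instance: m = 3, nu = 6. The outcome j = 11 is the closest integer to 2^nu * 1/6.
r_found = recover_order(j=11, nu=6, m=3, g=2, N=21)
# The outcome j = 21 is close to 2^nu * 2/6 = 2^nu * 1/3: a factor cancels.
r_missed = recover_order(j=21, nu=6, m=3, g=2, N=21)
```

The second call returns `None` because the convergent $1/3$ only exposes the denominator 3, so the run yields a proper divisor of the order rather than the order itself.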
It may be shown that we expect to observe $j$ that produces $\alpha_r$ such that $|\alpha_r| \sim 2^m$, efficiently yielding $\ell$ bits of information on $r$.
In [Ref. [32], App. A], the probability of observing $\alpha_r$ such that $|\alpha_r|$ is of length $\tau$ bits is shown to be exponentially suppressed in $\tau$ as $\tau$ grows larger than $m$. Furthermore, the probability distribution in $\log(|\alpha_r|)$ is plotted in [Ref. [33], Fig. A1], for maximal $r = 2^m - 1$ and for specific choices of $m$ and $\ell$. This figure is, however, also representative for other choices of parameters, for as long as $r$ is not divisible by a very large power of two: Picking a smaller $r \in [2^{m-1}, 2^m)$ shifts the distribution slightly. Varying $m$ and $\ell$ has no visible effect for as long as $m$ and $\ell$ remain sufficiently large.
In practice, it is more efficient (see [Ref. [1], p. 1501]) to try to solve not only $j$ but also $j \pm 1, j \pm 2, \ldots$ for the convergent $z/r$ instead of increasing $\ell$. (This is when assuming classical computation to be cheap compared to quantum computation.)
Since we expect to observe $j$ yielding $\alpha_r$ such that $|\alpha_r| \sim 2^m$ as stated above, we expect $\alpha_r = \{rj\}_{2^\nu}$ to increase or decrease by $r \sim 2^m$ if we increase or decrease $j$. Solving for small offsets in $j$, therefore, yields the convergent $z/r$ with high probability.
A further concern when performing order finding is that factors may cancel between $r$ and $z$. Shor [1] points out that the probability of $r$ and $z$ being coprime is $\phi(r)/r = \Omega(1/\log\log r)$. Hence, the expected number of runs is $O(\log\log r)$. Odlyzko furthermore states in private communication to Shor [1] that this may be reduced to $O(1)$ runs by searching for the missing factor $d = \gcd(z, r)$.
In fact, assuming we also solve for offsets in $j$, it is shown in Ref. [32] that $z \bmod r$ is selected essentially uniformly at random from $[0, r)$ via $j$. The probability of $d$ not being $cm$-smooth, for $c$ some small constant, may then be upper-bounded and shown to be small; see the main result in Ref. [32]. And if $d$ is $cm$-smooth, $d$ may be found efficiently classically, as is explained in, for example, Ref. [32] and [Ref. [33], Section 6.2.4].

| Practical implementation and control qubit recycling
In practice, the circuit is typically not implemented as described in the previous section: It is not necessary to actually measure the work register. If one foregoes this measurement and interleaves the QFT with the modular multiplications, then the QFT may be performed semi-classically [34]. A single qubit, that is recycled [35,36], then suffices to implement the control register in practice. In total, only t + 1 qubits are then required to implement the circuit. Seifert [37] has proposed to pick ℓ ≈ m/s for some s ≥ 1. The quantum circuit then still leaks~ℓ bits of information on r, so a total of n ≥ s runs are then required to solve for r.

| Tradeoffs and lattice-based postprocessing
Seifert states that his aim is to save control qubits, but in a more modern interpretation in which control qubits are recycled, Seifert reduces the circuit depth from~2m multiplications in Shor's algorithm to m + m/s multiplications. The reduction comes at the expense of having to perform n ≥ s runs, so Seifert effectively makes a tradeoff between the circuit depth and the number of runs.
As for post-processing, Seifert generalises continued fractions to higher dimensions using simultaneous Diophantine approximation techniques. Ekerå [Ref. [33], App. A and Section 6.2] describes a similar lattice-based post-processing technique: Specifically, let $L$ be the integer lattice spanned by the rows of

where $\{j_1, \ldots, j_n\}$ are the measurement results in the $n$ runs. Then, the short vector $u \in L$ may be found by enumerating short vectors in $L$, as is shown in Ref. [33]. For $s \geq 1$, we require about $s$ runs to solve; see, for example, [Ref. [33], Table A2] for estimates of the number of runs $n$ required to solve for $r = 2^m - 1$ as a function of $m$ and $s$. (Solving for smaller $r$, that is not partially very smooth, is in general easier.)

| Specialised algorithms for RSA and DH
If the goal is to factor RSA integers $N = pq$, for $p, q$ two random primes of equal bit length, the algorithm of Ekerå and Håstad [30,38] outperforms factoring via Shor's original order-finding algorithm and via Seifert's algorithm when making tradeoffs [30].
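Ekerå and Håstad use that for $g \in \mathbb{Z}_N^*$, it holds that $x := g^{N+1} \equiv g^{p+q} \pmod N$, since $N + 1 = (p - 1)(q - 1) + (p + q)$ and the order of $g$ divides $(p - 1)(q - 1)$. If $g$ is selected uniformly at random from $\mathbb{Z}_N^*$, it holds with overwhelming probability that $p + q < r$ for $r$ the order of $g$. The idea is, hence, to compute the logarithm $d = \log_g x = p + q$ quantumly. Given $N = pq$ and $d = p + q$, it is then trivial to solve for $p, q$.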
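The final classical step, recovering $p$ and $q$ from $N = pq$ and $d = p + q$, amounts to solving the quadratic $z^2 - dz + N = 0$. A minimal sketch follows; the function name and the toy primes are assumptions for illustration.

```python
import math


def factor_from_sum(N, d):
    """Solve z^2 - d*z + N = 0 for the factors p <= q, given N = p*q and d = p + q."""
    disc = d * d - 4 * N
    root = math.isqrt(disc)           # raises ValueError if disc is negative
    if root * root != disc:
        raise ValueError("d is not the sum of two factors of N")
    return (d - root) // 2, (d + root) // 2


p, q = 11, 13                          # toy primes, far too small for real use
N, d = p * q, p + q
factors = factor_from_sum(N, d)

# Sanity check of the congruence x = g^(N+1) = g^(p+q) (mod N) for units g.
congruence_holds = all(pow(g, N + 1, N) == pow(g, d, N)
                       for g in range(2, N) if math.gcd(g, N) == 1)
```

The congruence check passes for every unit $g$ because $N + 1 - (p + q) = (p - 1)(q - 1)$ is a multiple of the order of $g$.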
To compute d, Ekerå and Håstad introduce a quantum algorithm that is derived from Shor's algorithms [1] and that efficiently computes short discrete logarithms in groups of unknown order. A logarithm d is said to be short if it is smaller than the order r of g by some order of magnitude.
For $m$ the bit length of $d$ and $\ell \sim m/s$ for $s \geq 1$ the tradeoff factor, the algorithm first induces a state over two control registers and a work register, and then applies $\mathrm{QFT}_{2^\nu}$ and $\mathrm{QFT}_{2^\ell}$ to the first and second control registers, respectively. The probability of observing $(j, k)$ and $g^e$ for $e = a - bd$ is then given by an expression [Ref. [30], Section 3], for $b_0(e)$ and $b_1(e)$ as given in [Ref. [30], Section 3.1], that is exact and in closed form. It may be shown [Ref. [30], Lemma 2] that we expect to observe $(j, k)$ yielding $|\alpha_d| \sim 2^m$, leaking $\sim \ell$ bits of information on $d$. To recover $d$, we use that a known vector $v$, formed from the measurement results, is close to the unknown vector $u = (dj_1, \ldots, dj_n, d) \in L$ for $L$ the same lattice as in Section 5.5 spanned by the rows of (2). By the analysis in Ref. [30], we expect to recover $u$, and hence $d$, by enumerating all vectors in $L$ within a ball centred on $v$.
For RSA, Ekerå-Håstad performs between 3/4 and 1/4 as many modular multiplications as Shor's order-finding algorithm, as a function of s. For DH, the constant factor advantage is greater when comparing to Shor's algorithm for the DLP. For details, see [Ref. [30], Tables 2-4].

| Generalisations and extensions
If, in the algorithm as presented in the previous section, we take m equal to the bit length of r and ignore the requirement that d must be short, we obtain an algorithm [33] that computes both the order r and the logarithm d. It has potential cryptanalytic applications in the computation of discrete logarithms in random Schnorr groups of unknown order, since Shor's DLP algorithm requires r to be known.
The analysis in Ref. [33] shows that this generalised algorithm induces a two-dimensional probability distribution in $(\theta_r, \theta_d)$: The probability distributions induced by Shor's order-finding algorithm [Ref. [33], Fig. A], and by Ekerå-Håstad's algorithm for short discrete logarithms [Ref. [33], Figure 5], re-appear as marginal distributions of this two-dimensional distribution [Ref. [33], Figure 4], indicating that the requirement that $d$ is short may be relaxed without impacting the distribution.

| General discrete logarithms
Shor's algorithm for the general DLP [1] in groups of known order is a good option for solving the elliptic curve DLP (EC-DLP), to, for example, break EC-DSA and EC-DH. Proos and Zalka [42] provide an early implementation of the group arithmetic. Shor's DLP algorithm is also a good option for solving the DLP in safe-prime groups with full-length exponents and in Schnorr groups of known order. If Shor's DLP algorithm is modified as in Ref. [43] to induce the state $\frac{1}{2^{2(m+\varsigma)}} \sum_{a, j = 0}^{2^{m+\varsigma}-1} \cdots$ for $m$ the bit length of the order $r$ and $\varsigma \geq 0$ some small constant, when solving $g$ and $x = g^d$ for $d \in [0, r)$, then the qubit recycling optimisations described in Section 5.4 are directly applicable. In Ref. [43], it is shown heuristically that this modified algorithm achieves a very high success probability [Ref. [43], Table 1], if $\varsigma$ is selected to slightly increase the lengths of the control registers compared to what Shor originally proposed and if a small search is performed in Shor's classical post-processing when solving $(j, k)$ for $d$ given $r$.
It is furthermore explained in [Ref. [43], Section 5.2] how tradeoffs may be achieved by instead using lattice-based post-processing. The idea is very similar to that in Section 5.6, but it requires $r$ to be known.
Kaliski [44] has proposed a different kind of tradeoff, the idea being to compute the logarithm $d$ one half-bit at a time via the Blum-Micali [45] reduction. This achieves a maximal tradeoff, at the expense of performing very many runs of the quantum algorithm.

| Simulations
As previously stated, the probability distributions induced by Shor's algorithms, Seifert's algorithm, and the various algorithms by Ekerå and Håstad may be captured exactly, as error-bounded approximations or, for Shor's algorithm for the DLP, heuristically.
This in turn implies that the algorithms may be simulated classically [30,33,43], when the solution to the problem instance is known, for example, enabling the characteristics of various post-processing strategies to be evaluated in practice.

| Cost estimates
The hard part when implementing the aforementioned algorithms is to implement the group arithmetic in a fault-tolerant manner. There are a number of studies in the literature that propose efficient circuits for the group arithmetic and that cost them in various models.
For recent full-stack physical cost estimates for the aforementioned algorithms when working in $\mathbb{Z}_N^*$, see Refs. [17,46]. For recent optimised logical circuits and cost estimates for the EC-DLP, see Refs. [47,48]. For a recent physical cost estimate, see Ref. [49].
For physical cost estimates that cover both RSA and the EC-DLP, alongside some widely used symmetric cryptosystems, see Ref. [50].
For a recent survey that compiles expert opinions on the impact of some of these algorithms and cost estimates on the migration timeline, see Ref. [51]. For broader studies, see for example, Refs. [52,53].

| GENERALISING SHOR's ALGORITHM
Shor's factoring algorithm can be viewed as solving an HSP on $\mathbb{Z}$: given $f : \mathbb{Z} \to \mathbb{Z}_N$ defined by $f(x) = a^x \pmod N$, which has period $r$ and is injective on $\mathbb{Z}/r\mathbb{Z}$, find $r$. Hallgren [54] generalised Shor's algorithm by considering the HSP on $\mathbb{R}$. Since quantum computers are digital, additional conditions are introduced on the oracle function to facilitate finding the hidden subgroup once the function is discretised. Efficient quantum algorithms for several basic number-theoretical problems, including solving Pell's equation, were then devised by reducing them to the HSP on $\mathbb{R}$, whereas the best classical algorithms need superpolynomial time. This was later extended to $\mathbb{R}^c$ with constant dimension $c$, which allowed the design of efficient quantum algorithms for computing the unit group and the ideal class group of a number field and for finding a generator of a principal ideal (a.k.a. the principal ideal problem, PIP) in number fields [55,56]. These algorithms were proven to run in quantum polynomial time in infinite classes of number fields of fixed degree.
However, this approach is inefficient in families of number fields of arbitrary degree. Indeed, it is not clear how to solve an analogous HSP on $\mathbb{R}^n$ efficiently, given that the error due to discretisation grows exponentially with the dimension. This is a consequence of the fact that a unique representation of the HSP oracle function's output is essential. However, when reducing the number-theoretical problems to the HSP, the output of the oracle is typically a real-valued lattice, lacking a canonical representation. The methods of Refs. [55,56] designed for families of fields of constant degree do not satisfy these requirements, and computing a suitable oracle function in polynomial time poses a roadblock.
These difficulties were overcome by Eisenträger, Hallgren, Kitaev and Song [57]. They introduced a restricted HSP in $\mathbb{R}^n$, which imposes a Lipschitz continuity condition on the oracle function. The methods for solving this HSP introduced in Ref. [57] successfully take advantage of this property of the oracle and achieve a polynomial run time. In Ref. [57], the authors also gave an efficient reduction from computing the unit group in high-degree number fields to the continuous HSP they introduced. A key ingredient is a quantum encoding of a real-valued lattice, addressing the issue of unique representation. Later, Biasse and Song [58] followed this framework and constructed reductions that enable the computation of any S-unit group in arbitrary-degree number fields, which as a consequence gave efficient quantum algorithms for the PIP and for computing the class group as well. Their quantum algorithm for the PIP turned out to be useful to break several cryptosystems based on a specialised lattice problem [59,60].
Here, we give a brief overview of the continuous HSP and the quantum algorithm for solving it. The exposition is adapted from Ref. [57].
Problem 1 (The continuous HSP over $\mathbb{R}^m$). Assume that the unknown subgroup $L \subseteq \mathbb{R}^m$ is a full-rank lattice satisfying the following promise: there are positive parameters $(a, r, \varepsilon)$ and a function $f : \mathbb{R}^m \to S$, where $S$ is the set of unit vectors in some Hilbert space, such that

Under these conditions, the continuous HSP is to compute a basis of $L$.
The resolution of Problem 1 relies on the ability to make (efficient) oracle calls $|x, 0\rangle \mapsto |x\rangle \otimes |f(x)\rangle$. The quantum algorithm presented in Ref. [57] first creates a superposition of points in $\mathbb{R}^m$ of the form $\sum_{x \in \mathbb{R}^m} w(x)|x, 0\rangle$ (normalisation factor omitted) with a sufficiently broad wave function $w$. This is similar to the first stage of Shor's algorithm. Here, $w$ effectively truncates the function to a finite domain. Then, the oracle $f$ is applied in superposition. In fact, since quantum computers are digital, we can only prepare the superposition over a fine discrete grid that approximates $w(\mathbb{R}^m)$. Then, we would like to measure the state in the Fourier basis. Shor's algorithm would perform the Quantum Fourier transform over a finite group $\mathbb{Z}_N$ for a large enough $N$ and measure in the standard basis. However, the resulting approximation errors would be difficult to analyse. In Ref. [57], the authors develop a variation on the phase estimation technique, which may be viewed as approximately performing Fourier sampling over Z. To see how this helps reveal the hidden lattice, note that, loosely speaking, in the Fourier domain, the HSP function would be peaked at points in the dual lattice $L^* := \{x \in \mathbb{R}^m \mid \forall y \in L, \langle x, y \rangle \in \mathbb{Z}\}$. In the course of the algorithm, the truncation only disturbs the Fourier domain lightly for the window function $w$ we choose. The disturbance due to discretisation can then also be controlled, thanks to the Lipschitz condition on the function.
In slightly more technical terms, the subsequent measurement yields u ∈ L* with the probability distribution Repeating the procedure sufficiently many times, we obtain an approximate set of generators for L*, from which we can compute an approximate basis for L efficiently.

| SYMMETRIC CRYPTOGRAPHY
While not entirely immune to the quantum computing threat, symmetric cryptography is expected to retain a significant level of security in face of the threat. Indeed, the computational assumptions used in symmetric cryptography usually do not exhibit a structure that could be exploited by a quantum adversary, as is the case for the IFP and DLP hardness assumptions and Shor's algorithm.

| Block ciphers
A block cipher $E_k$ with block size $n$ and key size $\kappa$ is a family of permutations of $\{0,1\}^n$ indexed by $k \in \{0,1\}^\kappa$. The first and most important security feature expected from a block cipher design is security in the secret-key setting:

Problem 2 (Secret key recovery). Given access to encryption and decryption oracles for $E_k$, with a key $k$ chosen uniformly at random, recover $k$.
In practice, an adversary can also try to recover one key among multiple targets, classically as well as quantumly. We focus here on the simplest case of a single key. It is expected that the best algorithm for recovering $k$ is generic exhaustive search. This attack is applicable to any block cipher. It performs $\lfloor \kappa/n \rfloor + 1$ queries with arbitrary plaintexts $p_i$, stores the obtained plaintext-ciphertext pairs $(p_i, c_i)$, and searches for a candidate key $k$ such that $E_k(p_i) = c_i$ for all the pairs. Note that false positives (wrong keys satisfying this condition) may occur. But if we assume that all permutations $E_k$ are drawn uniformly at random, the choice of $\lfloor \kappa/n \rfloor + 1$ pairs ensures that the expected number of false positives is small, since each additional pair filters out wrong keys with probability $1 - 2^{-n}$. Thus, we can safely assume that there is a single solution to the exhaustive search problem. However, this cannot be proven from the specification of $E$.
Usually, $\lfloor \kappa/n \rfloor$ is constant, and this algorithm requires $O(2^\kappa)$ evaluations of the cipher. Any algorithm performing better is considered a valid break, even if it is computationally infeasible to mount the attack in practice.
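The generic attack can be demonstrated on a deliberately tiny cipher. The cipher below is a hypothetical 8-bit toy with a 16-bit key, invented only for this sketch and with no security whatsoever. The number of known pairs is fixed to three for the example.

```python
def toy_encrypt(key, block):
    """Hypothetical 8-bit toy cipher with a 16-bit key (not a real design)."""
    k0, k1 = key >> 8, key & 0xFF
    x = (5 * (block ^ k0) + 1) % 256      # invertible on bytes since 5 is odd
    return (5 * (x ^ k1) + 1) % 256


def exhaustive_search(pairs, kappa=16):
    """Return every key consistent with all plaintext-ciphertext pairs."""
    return [k for k in range(2 ** kappa)
            if all(toy_encrypt(k, p) == c for p, c in pairs)]


secret = 0xBEEF
num_pairs = 3                              # a few known pairs, chosen for this toy
pairs = [(p, toy_encrypt(secret, p)) for p in range(num_pairs)]
candidates = exhaustive_search(pairs)
```

The list `candidates` always contains the secret key. For a toy design such as this one it may also contain false positives, which is exactly why the attack uses more than one pair.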
Since being broken under this definition is a single bit of information, cryptanalysts usually consider reduced versions as attack targets. Let us take as an example the well-studied, 20-year-old block cipher standard AES [63]. AES comes in three versions, AES-128, AES-192, and AES-256, with a block size $n = 128$ and a key size $\kappa \in \{128, 192, 256\}$. It is a Substitution-Permutation Network (SPN) with, respectively, $r = 10, 12, 14$ rounds for the three versions. To date, the full AES has withstood cryptanalysis in the secret-key model: Only 7, 8 and 9 rounds, respectively, have been successfully attacked [64].
Key search with Grover's algorithm: The problem of recovering the key, given a few plaintext-ciphertext pairs (p i , c i ), is an instance of unstructured search, to which Grover's algorithm can be applied.
The search space is the set of all keys $\{0,1\}^\kappa$, and as noticed above, we can ensure that there is only one marked element.
The oracle $O_f$ evaluates the function $f(k) = 1$ if $E_k(p_i) = c_i$ for all $i$, and $f(k) = 0$ otherwise. In general, the cost is reduced from $O(2^\kappa)$ to $O(2^{\kappa/2})$ evaluations. This is why it is often recommended to double the bit length of keys when aiming for long-term security, when the design permits it, which is the case for AES-128. For other block ciphers, doubling the key length would require proposing a new design, whose security would need to be studied. Note that, as mentioned in Section 3.2, Grover's search parallelises inefficiently (contrary to classical search). This means that doubling the key size is actually a very conservative measure, especially if one considers a realistic adversary limited in time.
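The quadratic speed-up can be quantified with the standard amplitude formula for Grover's search with a single marked element: after $t$ iterations the success probability is $\sin^2((2t + 1)\theta)$ with $\sin\theta = 2^{-\kappa/2}$. The following sketch evaluates it; the function names and the iteration count $\lfloor \pi/(4\theta) \rfloor$ are one common convention and are assumptions of this example.

```python
import math


def grover_success_probability(kappa, iterations):
    """Success probability after `iterations` Grover iterations when exactly
    one of the 2^kappa keys is marked."""
    theta = math.asin(2 ** (-kappa / 2))
    return math.sin((2 * iterations + 1) * theta) ** 2


def grover_iterations(kappa):
    """About (pi/4) * 2^(kappa/2) iterations bring the success probability close to 1."""
    theta = math.asin(2 ** (-kappa / 2))
    return math.floor(math.pi / (4 * theta))


iterations = grover_iterations(16)              # toy 16-bit key
success = grover_success_probability(16, iterations)
```

For a 16-bit key this gives roughly $2^{8}$ iterations instead of up to $2^{16}$ classical trials, and running much longer than the optimum lowers the success probability again.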
Several authors have studied the exact cost of the quantum circuits involved in Grover search, starting with Ref. [65]. As with the classical exhaustive search, one encryption costs more than one gate; instead of $2^{64}$ basic operations, the Grover search on AES-128 has been optimised so far to about $2^{80}$ quantum gates [66].
Quantum break of block ciphers: The secret-key cryptanalysis of block ciphers can be naturally extended to the quantum setting. Instead of a key-recovery algorithm faster than the classical exhaustive search, the goal becomes to find a quantum key-recovery algorithm faster than Grover's search. These procedures usually combine classical cryptanalysis techniques with amplitude amplification and quantum walks [67]. For example, the best (but only) results obtained with AES so far [68] attack, respectively, 6, 7 and 8 rounds instead of 7, 8 and 9.
Since attacks are always studied with respect to a generic algorithm, the availability of Grover search, which is a powerful generic algorithm, makes many of them relatively less powerful. However, it is still necessary to study them, as classical attacks do not give meaningful information for the expected quantum security (see the example of SPHINCS-Simpira below).
Finally, note that block ciphers do not exist in a vacuum: They are used as building blocks in operation modes, whose security is discussed in Section 7.3.

| Hash functions
A hash function H is a one-way function that transforms a message of any size into a digest of fixed bit size n. The following problems are expected to be difficult for a secure hash function:

In the second step, the probability that a random y collides with one of the precomputed values is $2^{n/3} \cdot \frac{1}{2^n} = 2^{-2n/3}$, which leads to a time complexity $O(2^{n/3})$. This algorithm is optimal for a random function [70]. However, not only does it reach a less than quadratic speedup, but it also consumes a considerable amount of memory: the

By reducing the number of elements below $2^{n/3}$, one obtains the time-memory trade-off curve $T^2 \cdot M = 2^n$, which contains all quantum collision search algorithms known to date. When $M \le 2^{n/5}$, the QRACM requirement can be dropped [71]. The memory storage becomes classical memory with sequential access, which is easy to implement.
For comparison, classical collision search can be parallelised with the time-space trade-off $T \cdot S = 2^{n/2}$ [72]. Here 'space' includes both memory and parallel processors. No quantum collision algorithm known to date reaches below this curve, so they remain only applicable if a large memory is considered cheaper than a comparable amount of processing units.
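The two trade-off curves can be compared numerically. The helper names below are hypothetical; the functions simply solve $T^2 \cdot M = 2^n$ and $T \cdot S = 2^{n/2}$ for $\log_2 T$, ignoring all polynomial factors.

```python
from fractions import Fraction


def quantum_collision_log2_time(n: int, log2_memory) -> Fraction:
    """log2(T) on the quantum curve T^2 * M = 2^n, stated for M <= 2^(n/3)."""
    if log2_memory > Fraction(n, 3):
        raise ValueError("the trade-off is only stated for M <= 2^(n/3)")
    return Fraction(n - log2_memory) / 2


def classical_collision_log2_time(n: int, log2_space) -> Fraction:
    """log2(T) on the classical curve T * S = 2^(n/2)."""
    return Fraction(n, 2) - log2_space


n = 256
for log2_resource in (Fraction(0), Fraction(n, 5), Fraction(n, 3)):
    q = quantum_collision_log2_time(n, log2_resource)
    c = classical_collision_log2_time(n, log2_resource)
    print(float(log2_resource), float(q), float(c))
```

With the same exponent of resources (memory on the quantum side, memory plus processors on the classical side), the classical time exponent is never larger than the quantum one except at zero resources, which matches the remark that no known quantum algorithm goes below the classical curve.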
Quantum break of hash functions: Although generic quantum collision search suffers from a smaller speedup and a large memory usage, this actually makes non-generic quantum collision search algorithms more appealing. Hosoyamada and Sasaki [73] demonstrated that some collision attacks on hash functions could benefit from quadratic speedups, yielding new and improved attacks in the quantum setting. Indeed, suppose that a classical algorithm of time complexity $2^{n/2} < T < 2^{2n/3}$ finds a collision of H. Its time complexity is too large to qualify as a classical attack. However, if it benefits from a quadratic acceleration, then we have $\sqrt{T} < 2^{n/3}$, meaning that it yields a quantum attack.
There are two important conclusions to draw from this: First, quantum collision attacks on hash functions can be stronger (in terms of rounds attacked) than the classical ones. Second, one should not assume a post-quantum security level for collision search only based on the generic BHT algorithm.
For example, in the proposal Gravity-SPHINCS [74], a modified version of the post-quantum hash-based signature scheme SPHINCS, the hash functions used in the scheme need collision resistance (instead of only preimage resistance for SPHINCS). The authors considered a generic level of security $2^{n/2}$ for quantum collision search, equal to the quantum preimage security, due to the time-memory tradeoff of collision algorithms detailed above. But there could exist hash functions which, although considered classically secure, would invalidate these security claims.

| Structured constructions and superposition attacks
Contrary to what is often perceived, symmetric cryptographic designs are not devoid of structure. Most symmetric cryptography algorithms exhibit, in fact, a strong algebraic structure, as they need to build high-level functionalities (encryption, authenticated encryption etc.) from very small components (e.g. block ciphers of fixed block size). However, this structure is not necessarily exploitable by a quantum adversary. In this section, we look at structural attacks without classical equivalent (contrary to the examples presented above).
The first structural attacks on (classically secure) symmetric designs were published by Kuwakado and Morii in Refs. [75,76]. Notably in Ref. [76], they remarked that key recovery in the Even-Mansour block cipher could be solved as an instance of Boolean period-finding. The Even-Mansour cipher is a generic construction of a block cipher $E_{k_1,k_2}$ from an n-bit public permutation Π and two n-bit keys $k_1$, $k_2$ (Figure 6):
$$E_{k_1,k_2}(x) = \Pi(x \oplus k_1) \oplus k_2.$$
Kuwakado and Morii's attack defines the following function:
$$f(x) = E_{k_1,k_2}(x) \oplus \Pi(x),$$
which is such that $f(x \oplus k_1) = f(x)$. Finding the secret $k_1$ becomes an instance of the following problem:

Problem 3 (Boolean period-finding). Given access to a two-to-one function f such that $\forall x, y,\ f(x) = f(y) \Leftrightarrow y \in \{x, x \oplus s\}$ for some value s (the period), find s.

This problem is a special case of the Hidden Subgroup Problem in the group $G = (\mathbb{Z}_2)^n$ with the subgroup H = {0, s}. It is solved by Simon's algorithm [77], which was an inspiration for Shor's, in about O(n) queries to f (thus, to $E_{k_1,k_2}$), breaking the cipher. However, it requires superposition queries to the cipher, that is, the ability to create a state of the form $\frac{1}{2^{n/2}} \sum_{x \in \{0,1\}^n} |x, E_{k_1,k_2}(x)\rangle$. Therefore, the implementation of the attack must use a quantum embedding of $E_{k_1,k_2}$, which means that a black-box that contains the secret keys must be available.
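The attack can be emulated classically for a very small block size. The sketch below builds an Even-Mansour cipher from a random 6-bit permutation, forms the function table of f, and samples the outcomes that Simon's circuit would produce. The sampling step reads the full table of f, which is exactly what a quantum computer avoids, so this only illustrates the structure of the algorithm; all names and parameters are assumptions made for this example.

```python
import random

N_BITS = 6
SIZE = 1 << N_BITS


def dot(x: int, y: int) -> int:
    """Inner product of two bit vectors over GF(2)."""
    return bin(x & y).count("1") & 1


def simon_sample(f_table, rng):
    """Classically emulate one run of Simon's circuit on the function table."""
    w = f_table[rng.randrange(SIZE)]              # outcome of measuring f(x)
    preimages = [x for x in range(SIZE) if f_table[x] == w]
    # After the Hadamard layer, y has amplitude proportional to sum_x (-1)^(x.y).
    weights = [sum(1 - 2 * dot(x, y) for x in preimages) ** 2 for y in range(SIZE)]
    return rng.choices(range(SIZE), weights=weights)[0]


def candidate_periods(samples):
    """All non-zero s orthogonal to every sample (brute force instead of
    Gaussian elimination, which is fine for 6 bits)."""
    return [s for s in range(1, SIZE) if all(dot(y, s) == 0 for y in samples)]


rng = random.Random(2024)
perm = list(range(SIZE))
rng.shuffle(perm)                                   # public permutation Pi
k1, k2 = 0b101101, 0b011010                         # secret keys
encrypt = [perm[x ^ k1] ^ k2 for x in range(SIZE)]  # Even-Mansour cipher
f_table = [encrypt[x] ^ perm[x] for x in range(SIZE)]
samples = [simon_sample(f_table, rng) for _ in range(4 * N_BITS)]
recovered = candidate_periods(samples)
print(recovered)                                    # k1 is always among the candidates
```

Every sampled y satisfies $y \cdot k_1 = 0$, because preimage sets of f are closed under XOR with $k_1$. After O(n) samples the candidate list is expected to shrink to $k_1$ alone.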
Later on, it was shown that such superposition attacks can target many constructions that are known to be classically secure [67], using Simon's algorithm, but also Kuperberg's algorithm [78] and even Shor's algorithm itself [79]. The practical implications of these attacks remain debated, since without superposition queries, they are inapplicable.
Structured attacks with classical queries: It is now known that the structure exploited by some superposition attacks can also be exploited by attacks making only classical queries, that is, by a standard quantum attacker listening to today's classical communications. Though these attacks do not lead to polynomial-time breaks, they allow one to obtain significantly better time-memory tradeoffs [80] and a more-than-quadratic quantum time speedup on a key-recovery attack [81]. We will now review the principle of the offline-Simon algorithm, on which they are based.

A typical target example is the FX construction (Figure 7), which increases the key length of a cipher $E_k$ by XORing two additional n-bit keys $k_1$ and $k_2$. If k is known, the FX cipher becomes an Even-Mansour cipher. Using Kuwakado and Morii's attack on Even-Mansour, k is found by looking for a value z such that f(z) = 1, where $f(z) = 1$ if the function $x \mapsto FX(x) \oplus E_z(x)$ is periodic, and $f(z) = 0$ otherwise. The oracle function $O_f$ is computed in superposition over z by calling Simon's algorithm. By using a Grover search on the space $\{0,1\}^{2n}$, if k is of 2n bits, this requires $O(2^n)$ iterations. This is the Grover-meet-Simon approach of Ref. [82]; it is a superposition attack so far, since each iteration needs to call FX in superposition.

In Ref. [80], the authors introduced the offline-Simon algorithm, which performs the same attack, but makes only O(n) superposition queries to the FX quantum oracle at the beginning of the algorithm. Intuitively, the queries done at each iteration in Grover-meet-Simon are actually redundant and can be removed, leaving us only with a single layer of offline queries. These queries are stored using poly(n) qubits only, kept in a database and reused at each iteration.

FIGURE 6 The Even-Mansour cipher [85]
FIGURE 7 The FX cipher [123]

Since only O(n) superposition queries are now made, it becomes possible to replace them entirely with classical queries. Indeed, one can construct a superposition $\frac{1}{2^{n/2}} \sum_{x \in \{0,1\}^n} |x, FX(x)\rangle$ using $2^n$ classical queries to the black-box FX (the whole codebook) and $\tilde{O}(2^n)$ quantum time. One starts with a state $\frac{1}{2^{n/2}} \sum_{x \in \{0,1\}^n} |x, 0\rangle$, then for each i, XORs FX(i) into the second register if the first one is equal to i. Note that the memory used is still poly(n) qubits, since the classical queries to FX are consumed online and do not need to be stored. There is also no memory access here, since the database is not accessed in a classical sense; it is only reused to perform instances of Simon's algorithm.
Then, the resulting algorithm runs in quantum time $\tilde{O}(2^n)$, with a first step of $\tilde{O}(2^n)$ computations to prepare the database, and a Grover search of time $\tilde{O}(2^n)$. It does not have a classical equivalent, though a classical attack in time $O(2^{2n})$ does exist.
In Ref. [81], the offline-Simon algorithm was extended to target more constructions and in particular, constructions of the form:

with two independent block cipher calls keyed by k. Any classical key-recovery attack requires time $\Omega(2^{5n/2})$, and the best attack also requires $\Omega(2^{n/2})$ memory. But the offline-Simon attack uses a quantum time $O(2^n)$, still with poly(n) qubits. It shows that in this case, doubling the key length would not restore the initial level of security. These recent results show that there exist inherently quantum key-recovery attacks, even in the classical query scenario. Although only polynomial gains are expected in this case, the scope of application of these attacks is not fully understood yet, as they have been discovered recently.

| Open problems
The most important open questions here are related to the quantum attacks without classical equivalents, which so far have only been found on structured constructions.

Question 1
Can we extend the scope of attacks on structured constructions, in order to target dedicated designs such as AES?

Question 2
Concerning superposition queries, could there exist security arguments of ciphers against superposition attacks, the same way there exist security arguments against standard classical cryptanalysis techniques (differential, linear, algebraic etc.)?

Question 3
Finally, concerning classical queries, what is the limit of quantum speedups? For example, is there a fixed exponent α such that if there is no classical attack of complexity T, then there will be no quantum attack of complexity $T^\alpha$?

| Definitions
For m ≥ 1, a lattice L is a discrete finitely generated additive subgroup of $\mathbb{R}^m$. Equivalently, for a linearly independent set of vectors (called a basis) $\{b_1, \ldots, b_n\} \subset \mathbb{R}^m$, a lattice L is the set of all integer linear combinations of the $b_i$'s, that is, $L(B) = \{ \sum_{i=1}^{n} x_i b_i : x_i \in \mathbb{Z} \} = \{ Bx : x \in \mathbb{Z}^n \}$, where the matrix $B \in \mathbb{R}^{m \times n}$ has the $b_i$'s as columns. The number of basis vectors, n, is the rank of L. In the rest, we shall be concerned with the case m = n (such lattices are called full-rank), as all the algorithms we describe here can be easily adapted to the general case.
Associated to a lattice are its invariants: the determinant and the first successive minimum.
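The determinant of L (the volume of its fundamental parallelepiped, equal to $|\det B|$ in the full-rank case) is independent of the basis B. All lattice bases are related by unimodular transformations U (i.e. integer matrices with det U = ±1), that is, B' = BU is another basis of L(B).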
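The invariance of the determinant under a change of basis can be checked on a small example. The matrices B and U below are arbitrary choices made for this illustration; columns of B are the basis vectors, as in the text.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


B = [[2, 1, 0], [0, 3, 1], [0, 0, 4]]     # columns are basis vectors
U = [[1, 2, 0], [0, 1, -1], [0, 0, 1]]    # unimodular: integer entries, det = 1
B_prime = matmul(B, U)                    # another basis of the same lattice

print(det(B), det(U), det(B_prime))
```

Since det(BU) = det(B) det(U) and det(U) = ±1, the absolute value of the determinant is the same for every basis of the lattice.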

Definition 16 (Minimum distance of a lattice). The first successive minimum (or the minimum distance) $\lambda_1(L)$ of a lattice L is the Euclidean length of its shortest non-zero vector: $\lambda_1(L) = \min_{v \in L \setminus \{0\}} \|v\|$.
In general, the ith successive minimum of a lattice, $\lambda_i(L)$, is the smallest r s.t. L contains i linearly independent vectors of norms at most r. Minkowski's inequality states that for a rank-n lattice L, $\lambda_1(L) \le \sqrt{n} \det(L)^{1/n}$. It is tight up to a constant and is usually treated as an equality to approximate the length of the shortest vector.

The Gram-Schmidt orthogonalisation (GSO) $B^\star = (b^\star_1, \ldots, b^\star_n)$ of a basis B is obtained iteratively by setting $b^\star_1 = b_1$, and $b^\star_i$ as the orthogonal projection of $b_i$ on $(b_1, \ldots, b_{i-1})^\perp$ for i = 2, …, n. This orthogonalisation process can be described via the matrix decomposition $B = B^\star \mu^t$, where μ is a lower-triangular matrix with $\mu_{i,j} = \langle b_i, b^\star_j \rangle / \|b^\star_j\|^2$ for i ≥ j.

Hard problems on lattices: There are several fundamental problems related to lattices. Some of them are used in the security proofs of some of the most promising proposals for quantum-safe cryptography.

Problem 4 (Closest Vector Problem (CVP)). Given a (target) point t ∈ R n and a basis for a lattice L, find v ∈ L closest to t.
In the promise variant of CVP, the Bounded Distance Decoding (BDD) problem, we know in addition that ‖t − v‖ < R where R ≪ λ 1 (L). In this case, the solution v is unique.
Note that a solution to the Shortest Vector Problem (SVP), which asks for a non-zero lattice vector v of length $\lambda_1(L)$, is not unique: $\|v\| = \|-v\|$. We can relax the above and ask for a vector v s.t. $\|v\| \le \gamma \lambda_1(L)$. This problem is called the approximate Shortest Vector Problem (γ-appSVP). The approximation factor γ can be a function of n; in particular, γ = poly(n) is relevant for the security of lattice-based cryptographic constructions. We should also emphasise that many known SVP algorithms actually solve a variant of SVP, called Hermite SVP (or Hermite appSVP): the problem that asks to find v s.t. $\|v\| \le \gamma \det(L)^{1/n}$. This variant of SVP is stated relative to the determinant of the lattice, not to $\lambda_1(L)$. Again, this version of SVP is more relevant in cryptanalysis. Also, finding the exact value of $\lambda_1(L)$ is by itself a hard problem (as opposed to computing det(L)), so we are interested in algorithms that solve Hermite SVP.
Solving appSVP and BDD: lattice basis reduction and embedding techniques: Lattice basis reduction aims at improving the quality of the input basis, where the 'quality' of a basis is measured by the orthogonality of its vectors (it also translates into a slow decay of the $\|b^\star_i\|$'s). There are several notions of reducedness of a basis, ranging from the fast but weak (in terms of quality of the output) LLL reduction due to A. Lenstra, H. Lenstra, and L. Lovász [83] to the strong but computationally inefficient Hermite-Korkine-Zolotarev reduction [84]. The trade-off between the output quality and the runtime is achieved by the so-called BKZ reduction (short for Block Korkine-Zolotarev [85]). Together with a lattice basis, it receives as input an integer parameter β and produces a basis with the first (i.e. the shortest) vector satisfying

In other words, β-BKZ solves $\tilde{O}\left(2^{\frac{n}{2\beta} \log \beta}\right)$-appSVP. BKZ works by calling an SVP solver on certain (projected) sublattices of L of dimension β. In Ref. [86] it was shown that after a poly(n) number of SVP calls, the guarantee defined in Equation (3) is achieved. Hence, if the running time of an SVP solver for dimension n is $T_{SVP}(n)$, the running time of BKZ is $T_{BKZ}(\beta) = \mathrm{poly}(n) \cdot T_{SVP}(\beta)$.
An SVP solver can also be used to solve CVP (and hence, BDD). In Ref. [87], Kannan shows that given a BDD instance for a lattice L in dimension n, one can construct an (n + 1)-dimensional lattice L' such that a solution to (a slightly modified) SVP problem in L' gives a solution to the original BDD problem in L. Therefore, the most important algorithmic task is SVP, and it will be our focus in the rest of this section.
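One common way to realise such an embedding is sketched below. The exact construction and the choice of the embedding coefficient m = 1 are assumptions made for this illustration (basis vectors are stored as rows here), and the brute-force solver is a toy stand-in for a real SVP algorithm.

```python
from itertools import product


def kannan_embedding(basis_rows, target, m=1):
    """Rows (b_i, 0) for each basis vector plus the extra row (t, m)."""
    return [list(b) + [0] for b in basis_rows] + [list(target) + [m]]


def shortest_vector_bruteforce(rows, bound=3):
    """Toy SVP solver: try all coefficient vectors with entries in [-bound, bound]."""
    best = None
    for coeffs in product(range(-bound, bound + 1), repeat=len(rows)):
        v = [sum(c * r[j] for c, r in zip(coeffs, rows)) for j in range(len(rows[0]))]
        norm2 = sum(x * x for x in v)
        if norm2 > 0 and (best is None or norm2 < best[0]):
            best = (norm2, v)
    return best[1]


basis = [[10, 0], [0, 10]]
target = [11, 19]                        # close to the lattice point (10, 20)
short = shortest_vector_bruteforce(kannan_embedding(basis, target))
if short[-1] < 0:                        # fix the sign so the last entry is +m
    short = [-x for x in short]
closest = [t - e for t, e in zip(target, short[:-1])]
print(short, closest)
```

The short vector of the embedded lattice has the form (t − v, m), so subtracting its first n coordinates from the target recovers the lattice point v.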

| Algorithms for SVP
Assume that we are given as input a lattice represented by a basis $B \in \mathbb{Q}^{n \times n}$ (we choose to work with rational bases in order not to deal with approximation issues, see Ref. [88] for real-valued input bases) with entries of bit-sizes poly(n). Our task is to find a poly(n) approximation to the shortest vector in L(B), that is, solve poly(n)-appSVP. This is the most relevant setting in lattice-based constructions of signatures and encryption schemes [89,90]. More exotic constructions such as FHE schemes [91] rely on the hardness of $\Omega(2^{\log^c n})$-appSVP, which is even easier for known algorithms. Furthermore, we shall be focussing mostly on heuristic algorithms, which may not be able to find exactly the shortest vector, but a poly(n) approximation to it with a small-degree polynomial. In practice [92], only heuristic versions of SVP algorithms are currently competitive for solving SVP in high dimensions.
State-of-the-art SVP solvers are presented in Table 1. Let us take a closer look at it. Runtimes T are given on the lg-scale relative to the dimension n with smaller order terms omitted, that is, the best known runtime for provable sieving is $2^{2.465n + o(n)}$. The same holds for the memory complexities M. Quantum algorithms may require classical random access memory (CRACM), quantumly addressable classical memory (QRACM), or quantumly addressable quantum memory (QRAQM). Most of the quantum algorithms mentioned in this section require QRACM.
Based on asymptotic time and memory complexities (in terms of the lattice dimension n), SVP solvers can be classified into two groups: (1) algorithms requiring super-exponential time $2^{\omega(n)}$ and poly(n) space and (2) algorithms requiring both exponential time and space $2^{\Theta(n)}$. The former includes the family of so-called enumeration algorithms [93,94] that run in time $2^{\Theta(n \lg n)}$. A recent result of Albrecht et al. [95] shows a runtime exponent of $\frac{1}{8} n \lg n + o(n \lg n)$, beating the long-standing exponent $\frac{1}{2e} n \lg n + o(n \lg n)$ from Ref. [86]. The latter has been quantumly sped up in Ref. [96] using quantum backtracking techniques (see Theorem 9), achieving a square-root improvement. It is not obvious that this backtracking technique immediately applies to the result of Albrecht et al.; hence, we do not put the potential exponent of $\frac{1}{16} n \lg n$ in Table 1. We give details on enumeration algorithms later in this section.
Single-exponential time and space algorithms vary in their underlying ideas. Sieving algorithms have arguably received most of the attention. Since their introduction in 2001 [97], a series of studies [98][99][100][101][102] have proposed various improvements, culminating in the currently best known heuristic runtime and memory exponents of $2^{0.292n + o(n)}$ and $2^{0.2075n + o(n)}$, respectively [103]. There is a significant gap between provable and heuristic sieving algorithms, and closing this gap is an interesting open problem. A very good introduction to provable sieving algorithms can be found in Ref. [104]. In this survey, we shall concentrate on heuristic sieving and make the assumption explicit when we describe the algorithms.
Another approach to solve SVP are algorithms that either use specific properties of discrete Gaussian distribution over a lattice (such as algorithms from Ref. [105] (called BDD-based in Table 1) and Discrete Gaussian Sampling (DGS) algorithms from Ref. [106]), or rely on the properties of the Voronoi cell of a lattice [107]. We are not aware of any quantum speed-ups reported on DGS or Voronoi cell SVP algorithms (hence, we put '-' in Table 1).

Classical enumeration: We now give a high-level description of enumeration-based SVP. The algorithms rely on a process that, given a radius R, searches for all points in $L \cap B_n(R)$, where $B_n(R)$ denotes the n-dimensional ball of radius R centred at 0. Given a lattice basis B together with its GSO $B^\star$ (recall that $B = B^\star \mu^t$), some $t \in \mathrm{Span}(L(B))$ and R > 0, the algorithm enumerates candidates by enumerating their coefficients $x_n, \ldots, x_1$ as described below:
• Take all $x_n \in \mathbb{Z}$ s.t. $|x_n| \le R / \|b^\star_n\|$. These $x_n$'s are the candidates for the last coefficients of the x's from the ball. Note that the projection of x orthogonally to $\mathrm{Span}(b_1, \ldots, b_{n-1})$ is $x_n b^\star_n$, and it is contained in the ball $B_1(R)$.
• For every $x_n$ from the previous step, take all $x_{n-1} \in \mathbb{Z}$ s.t. $|x_{n-1} + x_n \mu_{n,n-1}| \le \sqrt{R^2 - x_n^2 \|b^\star_n\|^2} \,/\, \|b^\star_{n-1}\|$.
• Proceed analogously for $x_{n-2}, \ldots, x_1$, each time bounding $x_i$ in terms of the already chosen $x_n, \ldots, x_{i+1}$.

Enumeration thus constructs a tree whose nodes of level i = n, …, 1 are labelled by the possible values of $x_i$, and where an edge between a node of level i + 1 and a node of level i exists if the corresponding value for $x_i$ is a possible choice given the values $x_n, \ldots, x_{i+1}$. A path from level n down to level 1 gives a lattice vector x in $B_n(R)$. Certain paths do not lead to a leaf of level 1, because Ineq. (4) might not be satisfied for certain integers. The runtime of the enumeration is determined by the size of the tree. Traversing the tree can be done in a depth-first manner requiring only poly(n) space: we do not keep all the leaves and paths to them, but only the leaf that gives the current shortest solution.
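The depth-first traversal can be sketched in a few lines. The code below is a minimal, unpruned enumeration under the assumptions that basis vectors are given as rows with integer entries and that the squared radius is supplied by the caller; the function names are chosen for this illustration. Exact rational arithmetic is used for the Gram-Schmidt data, and floating point only serves to find the integer range at each level.

```python
import math
from fractions import Fraction


def gram_schmidt(basis):
    """Return (squared norms of the b*_i, mu) for a basis given as a list of rows."""
    n = len(basis)
    ortho, norms2 = [], []
    mu = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        v = [Fraction(x) for x in basis[i]]
        for j in range(i):
            mu[i][j] = sum(Fraction(a) * b for a, b in zip(basis[i], ortho[j])) / norms2[j]
            v = [a - mu[i][j] * b for a, b in zip(v, ortho[j])]
        mu[i][i] = Fraction(1)
        ortho.append(v)
        norms2.append(sum(a * a for a in v))
    return norms2, mu


def enumerate_shortest(basis, radius2):
    """Depth-first search for the shortest non-zero vector of squared norm <= radius2."""
    n = len(basis)
    norms2, mu = gram_schmidt(basis)
    best = {"norm2": radius2 + 1, "coeffs": None}
    x = [0] * n

    def descend(level, partial):
        if level < 0:                                   # reached a leaf of the tree
            if 0 < partial < best["norm2"]:
                best["norm2"], best["coeffs"] = partial, list(x)
            return
        centre = -sum(mu[i][level] * x[i] for i in range(level + 1, n))
        half_width = math.sqrt((radius2 - partial) / norms2[level]) + 1e-9
        low, high = math.ceil(centre - half_width), math.floor(centre + half_width)
        for value in range(low, high + 1):
            x[level] = value
            step = (value - centre) ** 2 * norms2[level]
            if partial + step <= radius2:               # exact check of the bound
                descend(level - 1, partial + step)
        x[level] = 0

    descend(n - 1, Fraction(0))
    return best["coeffs"], best["norm2"]


basis = [[2, 3, 5], [1, 2, 3], [1, 1, 3]]               # determinant 1: the lattice is Z^3
coeffs, norm2 = enumerate_shortest(basis, radius2=3)
print(coeffs, norm2)
```

Only the current path and the best leaf found so far are stored, which reflects the poly(n) space requirement mentioned above.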
Let us estimate the size of the tree. Let $N_k$ be the expected number of nodes on some level k of the tree that maximises $N_k$. By construction, $N_k$ is the number of points in the ball $B_{n-k+1}(R)$ that belong to L projected to the orthogonal complement of $\mathrm{Span}(b_1, \ldots, b_k)$. According to the Gaussian heuristic, $N_k \approx \frac{\mathrm{Vol}\, B_{n-k+1}(R)}{\|b^\star_n\| \cdots \|b^\star_{n-k}\|}$, where the denominator is the determinant of the projected lattice. Computations show [108] that if the input basis B is BKZ-preprocessed (i.e. the shape of the $\|b^\star_i\|$'s can be controlled), then $N_k = 2^{\frac{n}{2e} \lg n + o(n \lg n)}$. Using a more involved preprocessing and tricks [95], one can further improve the runtime exponent to the currently best known $\frac{1}{8} n \lg n + o(n \lg n)$.

In order to set up the radius R, one can, for instance, use Minkowski's upper bound on $\lambda_1$. Such enumeration is rather costly, and one can hope that the projection of the shortest vector might be shorter than $\lambda_1$. This idea lies behind the pruned enumeration [93] strategy, where instead of a ball, the enumeration chooses a different shape (e.g. a cylinder), thus pruning some branches of the tree. There is a trade-off between the success probability and the size of the tree for pruned enumeration that offers practical improvements. Asymptotically, pruning strategies affect the o(n lg n) term.

Classical sieving: Sieving algorithms take as input a lattice basis B and start by sampling an exponentially large list A of (long) lattice vectors from L(B). Sampling relatively long lattice vectors can be done in poly(n) time [109]. The elements of A are then iteratively combined to form shorter lattice vectors, $x' = x_1 \pm x_2 \pm \ldots \pm x_k$ such that $\|x'\| < \max_{i \le k} \|x_i\|$, for some k ≥ 2. Newly obtained vectors x' are stored in a new list, and the process is repeated with this new list of shorter vectors. It can be shown [101] that after poly(n) such iterations we obtain a list that (heuristically) contains a shortest vector. Here we shall focus on the case k = 2, also known as 2-Sieve, and refer the reader to Refs. [110,111] for k ≥ 2 classical sieving and to Ref. [112] for k ≥ 2 quantum sieving. We only note here that varying k gives time-memory trade-offs: for larger k's, the runtime increases but the algorithm requires less memory.
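The combination step for k = 2 can be illustrated as follows. This is a naive quadratic sketch: vectors are integer tuples, and truncating the output to the input list size is an assumption added here to keep the toy example small, not part of the description above.

```python
import random
from itertools import combinations


def norm2(v):
    """Squared Euclidean norm."""
    return sum(x * x for x in v)


def sieve_step(vectors):
    """One naive 2-Sieve pass: keep x1 +/- x2 when it is shorter than the longer input."""
    shorter = set()
    for x1, x2 in combinations(vectors, 2):
        for sign in (1, -1):
            cand = tuple(a + sign * b for a, b in zip(x1, x2))
            if 0 < norm2(cand) < max(norm2(x1), norm2(x2)):
                shorter.add(cand)
    # Toy-example assumption: keep at most as many vectors as were given.
    return sorted(shorter, key=norm2)[:len(vectors)]


rng = random.Random(1)
basis = [(7, 1, 0), (2, 9, 1), (1, 3, 8)]


def sample_lattice_vector():
    """Random small integer combination of the basis vectors."""
    coeffs = [rng.randint(-4, 4) for _ in basis]
    return tuple(sum(c * b[j] for c, b in zip(coeffs, basis)) for j in range(3))


current = [v for v in {sample_lattice_vector() for _ in range(80)} if norm2(v) > 0]
for _ in range(3):
    reduced = sieve_step(current)
    if not reduced:
        break
    current = reduced
print(min(norm2(v) for v in current))
```

Each pass inspects all pairs, which is the quadratic cost that near neighbour techniques aim to reduce.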
Following the analysis of 2-Sieve from Ref. [101], we can state the central heuristic assumption: all the lists of lattice points appearing during the sieve can be thought of as independently chosen uniform vectors on the unit sphere S n−1 . In reality, however, we deal with lattice vectors of Euclidean norm larger than 1 and, furthermore, a lattice may not even (and it likely does not) have exponentially many vectors of the same norm. Yet, we imagine that if we normalised all the vectors in the list A, then they would behave like independently chosen uniform vectors on S n−1 . The heuristic analysis proceeds as if we dealt with such normalised identically distributed uniform vectors. The main purpose of introducing this assumption is to aid the complexity analysis, although we cannot show the correctness of heuristic sieve. In particular, we cannot prove that it does not output 0. This is in contrast to provable algorithms [113,114] that make a lot of analysis effort to show that the output is not 0.
Under this heuristic, we estimate the size of the list A. Two vectors $x_1, x_2 \in A \subset S^{n-1}$ satisfy $\|x_1 \pm x_2\| < 1 - \varepsilon$ for some small $\varepsilon = 1/\mathrm{poly}(n)$, if the angle between them is less than π/3. This means that if we place a vector $x_1$ on $S^{n-1}$, any other vector $x_2$ with angular distance from $x_1$ less than π/3 will produce a pair $(x_1, x_2)$ whose sum/difference is a shorter vector. In other words, $x_2$ belongs to the surface of the spherical cap 'centred' at $x_1$; this area contains all unit vectors s.t. the angle between them and $x_1$ is ≤ π/3, see Figure 8. The area of this surface relative to the area of the unit sphere is $\sin^n(\pi/3)$.
Thus, in order to cover the unit sphere with such spherical caps, we need in expectation $|A| \approx 1/\sin^n(\pi/3) = (2/\sqrt{3})^n = 2^{0.2075n + o(n)}$ vectors (as we assume that the list vectors are independently uniform on $S^{n-1}$). By increasing this value by a poly(n) factor, we can heuristically guarantee that almost all spherical caps will contain at least a constant number of list elements, outputting new shorter vectors. We expect the output to be of size $2^{0.2075n + o(n)}$. This allows us to repeat the whole process with these new shorter vectors being the new list A. After poly(n) repetitions of sieving, we end up with a non-empty list of short(est) lattice vectors.
We have just established the memory requirement for heuristic 2-Sieve. If we perform the search for all pairs $x_i, x_j$ that sum to a shorter vector, the runtime will be $|A|^2 = 2^{0.415n + o(n)}$.
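The two exponents can be verified with a short computation; the variable names are chosen for this illustration only.

```python
import math

cap_base = math.sin(math.pi / 3)             # relative cap area is sin^n(pi / 3)
list_exponent = -math.log2(cap_base)         # |A| = 2^(list_exponent * n)
pairs_exponent = 2 * list_exponent           # naive search over all pairs of A

print(round(list_exponent, 4), round(pairs_exponent, 4))
```

This reproduces the constants 0.2075 and 0.415 used in the text, up to rounding.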
A faster approach is built on the observation that the problem of searching for $x_i, x_j$ is an instance of the near neighbour problem on the unit sphere. The near neighbour search technique of Becker-Ducas-Gama-Laarhoven [103] leads to the fastest 2-Sieve known today with $2^{0.292n + o(n)}$ runtime and $2^{0.2075n + o(n)}$ memory complexities. The algorithm starts by creating a set of 'buckets' indexed by vectors $v \in S^{n-1}$ not from the given lattice. In the sequel, we make use of the following definition.
Definition 17 (Angular distance). We say that two vectors v, w ∈ S n−1 are α-close for some α ∈ [0, 1], when 〈v, w〉 ≥ α. The definition is driven by the distance measure on the unit sphere: The closer the angular distance between v and w to 0, the more orthogonal the vectors are. For the 2-Sieve, 1/2-close vectors produce a short sum/difference. Identical vectors are at distance 1. We shall drop the word 'angular' in the rest of the section.
For a given set of buckets, a list A, and closeness parameters 0 ≤ α, β ≤ 1, the search for close pairs proceeds in two steps:
1. for each x ∈ A, search for all buckets v that are α-close to it and put x into the v-bucket;
2. for each x ∈ A, search for all buckets v that are β-close to it and then check if these buckets contain a vector close to x.
The idea behind this bucketing is that vectors that end up in the same bucket are more likely to give a shorter sum than just two random unit vectors, so we do not have to search for a close pair through the whole list A but rather through a much shorter bucket. The relation between the parameters α and β is as follows: The closer α to 1 is, the more restrictive the condition on being in the bucket becomes; hence, the buckets contain fewer vectors, so the first step of the above procedure is fast. However, since we intend to find almost all close pairs, we are required to search through more buckets, or equivalently, β should be smaller.
To solve the task of finding relevant buckets for a given x, vectors v are chosen with some structure. For example, Ref. [103] proposes to choose v to be a concatenation of codewords from a specially crafted spherical code. The advantage of choosing such v is that it enables us to find relevant buckets for a given x in time proportional to the number of such buckets, which is (up to lower order terms) optimal. For further details on this technique, we refer the reader to Ref. [103]; here we simply assume that we can find all relevant buckets for a given vector in time equal to the size of the output.
Let us now analyse the algorithm. We use the following theorem proven in Ref. [115]. We give here its simplified version, where the '≈' sign hides polyðnÞ factors.
Theorem 18 (adapted from Th. 1 of Ref. [115]). If, for some constant k < n, $x_1, \ldots, x_k$ are independently uniformly distributed on $S^{n-1}$, then the probability that their pairwise inner products satisfy $\langle x_i, x_j \rangle \approx C_{i,j}$ for all i, j is $\approx \det(C)^{n/2}$, where $C \in \mathbb{R}^{k \times k}$ is a symmetric positive-semidefinite matrix that stores the pairwise inner products of the $x_i$'s.
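Theorem 18 can be evaluated numerically for the small configurations that matter in the sieve analysis. The sketch below assumes the probability has the form $\det(C)^{n/2}$ with ones on the diagonal of C, and it estimates the number of buckets as the ratio between the probability of the pair configuration and that of the triple configuration; this ratio is an assumption made for the illustration. All exponents are reported per dimension n, on the lg-scale.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def probability_exponent(gram):
    """log2 of det(C)^(n/2) divided by n, i.e. (1/2) * log2(det C)."""
    return 0.5 * math.log2(det(gram))


alpha = beta = 0.5
# A list vector x falls into a fixed bucket v: <x, v> = alpha.
bucket = probability_exponent([[1, alpha], [alpha, 1]])
# A reducing pair: <x, x'> = 1/2.
pair = probability_exponent([[1, 0.5], [0.5, 1]])
# The pair together with a bucket centre v that is alpha-close to x and beta-close to x'.
triple = probability_exponent([[1, 0.5, alpha], [0.5, 1, beta], [alpha, beta, 1]])
buckets_needed = pair - triple               # log2 |V| / n

print(round(bucket, 4), round(buckets_needed, 4))
```

For α = β = 1/2 the bucket probability is $(3/4)^{n/2}$ and the estimated number of buckets is $(3/2)^{n/2}$, that is, an exponent of about 0.292.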
• From Theorem 18, the probability that x ∈ A belongs to a fixed bucket is $\det \begin{pmatrix} 1 & \alpha \\ \alpha & 1 \end{pmatrix}^{n/2} = (1 - \alpha^2)^{n/2}$. If we let |V| be the number of buckets, then for each x ∈ A, finding all its relevant α-close buckets takes time $|V| (1 - \alpha^2)^{n/2}$, which is the expected number of buckets a point will be put into.
• Analogously, for each x ∈ A, inspecting all its β-close buckets takes time $|V| (1 - \beta^2)^{n/2} \cdot |A| (1 - \alpha^2)^{n/2}$.
• Finally, the number of needed buckets |V| can be computed from the probability of the event that a triple (x, x', v) satisfies ⟨x, v⟩ ≥ α, ⟨x', v⟩ ≥ β provided that ⟨x, x'⟩ ≥ cos(π/3) = 1/2. The inequalities can be treated as equalities [111, Theorem 3], leading to (up to poly(n) factors) $|V| = \left(1 - \frac{4}{3}(\alpha^2 - \alpha\beta + \beta^2)\right)^{-n/2}$.

The optimal values that balance the costs are α = β = 1/2, resulting in $|V| = (3/2)^{n/2} \approx 2^{0.292n + o(n)}$. This value determines the runtime. There are techniques [103] that allow one not to store all the buckets at once at the price of a slight increase in runtime, which does not affect the leading order term. We conclude on the time and space complexities of classical 2-Sieve: time $2^{0.292n + o(n)}$ and space $2^{0.2075n + o(n)}$.

Quantum sieving: One can apply amplitude amplification techniques to the naive 2-Sieve to speed up the $2^{0.415n + o(n)}$ algorithm to $2^{0.312n + o(n)}$ [116]. Assuming that we can store the (classical) list A in QRACM enables us, for each x ∈ A, to find a y ∈ A s.t. $\|x \pm y\| < \max\{\|x\|, \|y\|\}$ in time roughly $\sqrt{|A|}$ (here we also assume that there is one such y per x in expectation). Following the notations from Section 3.2, we can efficiently implement an algorithm $\mathcal{A}$ that produces a superposition over the pairs from A, that is, (up to normalisation)

We then apply the amplitude amplification procedure to $|\psi\rangle$ with the checking function $f(x_1, x_2)$ that evaluates to 1 if $\|x_1 \pm x_2\| < \max\{\|x_1\|, \|x_2\|\}$. We only need poly(n)-size registers to perform the algorithm in addition to a QRACM of size O(|A|).

A quantum version of the near neighbour-based sieve is proposed by Laarhoven in Ref. [117]. The idea is to modify the second step of the 2-Sieve algorithm with near neighbour search, namely
1. classically store the x's in their α-close buckets. As in Section 7.2, the memory model used is QRACM: classical memory with quantum random access.
2. for each x ∈ A, find classically all its β-close buckets, create a superposition over all these relevant buckets, and apply amplitude amplification to only those buckets that contain vectors close to x.
As in the classical 2-Sieve with near neighbour search, Laarhoven's algorithm profits from the fact that the buckets we run Grover on are much smaller than the whole list A, and yet, due to the way we construct these buckets, we do not miss the solutions (a rigorous proof of this statement can be found in [Ref. 111,Theorem 3]).
In more detail, for each classically known x ∈ A, we can design an algorithm $\mathcal{A}_x$ that uses the buckets stored in QRACM and satisfies $\mathcal{A}_x |0\rangle^{\otimes n} = |\psi\rangle$ with (up to normalisation) $|\psi\rangle = \sum_{v} \sum_{y \in \mathrm{Bucket}_v} |v\rangle |y\rangle$, where the outer sum ranges over all buckets (indexed by v) that are β-close to x, and the inner sum ranges over all elements y from each bucket $\mathrm{Bucket}_v$. We apply amplitude amplification with $\mathcal{A}_x$ and the checking function $f_x(y)$ that outputs 1 if $\|x \pm y\| < \max\{\|x\|, \|y\|\}$. Using the above analysis of the classical 2-Sieve, it is not difficult to see that Laarhoven's quantum 2-Sieve algorithm is optimised for α = β =

Here again, we only need poly(n)-sized quantum registers. Note that we decreased α and β in comparison to the classical near neighbour 2-Sieve, since now we can allow for larger buckets as the search inside the buckets is improved. In turn, it means that in total we need fewer buckets. Recall that the total number of buckets determines the runtime. However, contrary to the classical sieve, we must store all the buckets because we run Grover search over them. Hence, the space complexity $S^{Q}_{2Sieve}(n)$ is asymptotically the same as the time complexity.
Recent results of Refs. [118,119] further improve the quantum 2-Sieve, either by exploiting quantum random walks as in Ref. [118] or, as in Ref. [119], by running amplitude amplification not only over the $\beta$-close buckets found classically but over all buckets, using as the 'checking' oracle a circuit that samples a $\beta$-close bucket with a potential close vector from its bucket. The latter algorithm sets $\alpha = 1/2$ and $\beta \approx 0.44$. The memory requirements of the result from Ref. [118] are more subtle to describe: as it is based on quantum walks, it requires QRAQ, QRAC, and classical memories, see Table 1.
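The leading exponents quoted in this subsection can be rechecked with a few lines of arithmetic. The sketch below is only illustrative and rests on one assumption that is not stated in this section: the sieve list has the usual heuristic size $|A| = (4/3)^{n/2}$. The variable names are hypothetical.

```python
import math

# Assumption (not stated in this section): heuristic list size |A| = (4/3)^(n/2).
LIST_EXP = 0.5 * math.log2(4 / 3)           # |A| = 2^(LIST_EXP * n)

naive_classical = 2 * LIST_EXP              # all pairs of A: |A|^2
naive_quantum = 1.5 * LIST_EXP              # |A| searches, each of cost sqrt(|A|)
bucket_classical = 0.5 * math.log2(3 / 2)   # number of buckets |V| = (3/2)^(n/2)

for name, c in [("naive classical 2-sieve", naive_classical),
                ("naive quantum 2-sieve", naive_quantum),
                ("classical near neighbour 2-sieve", bucket_classical)]:
    print(f"{name}: 2^({c:.4f} n + o(n))")
```

Under this assumption the three constants come out at roughly 0.415, 0.311 and 0.292, in line with the values quoted above up to rounding.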

| Average-case problems: LWE and SIS
So far, we have been considering the so-called worst-case lattice problems, that is, the problems where the input lattice may be arbitrary. Cryptographically relevant problems are average-case problems, where an instance is generated using some random (known) process. The purpose of this subsection is to describe the two main average-case problems on lattices: the Learning with Errors (LWE) problem and the Short Integer Solution (SIS) problem. We refer the reader to the comprehensive survey [120] about the hardness of these two problems and their use in cryptography. We do not consider here the NTRU problem; instead, we refer the reader to the recent survey on NTRU by Albrecht and Ducas [121].
The Learning with Errors problem: Introduced by Regev in Ref. [122], LWE is an average-case instance of BDD on the lattice $L_q(A) = \{y \in \mathbb{Z}^m : y = As \bmod q \text{ for some } s \in \mathbb{Z}_q^n\}$, where $A \in \mathbb{Z}_q^{m \times n}$ is a uniform random matrix, $q > 1$ is a modulus, and $m \ge n \ge 1$ are integers. With overwhelming probability, $L_q(A)$ has rank $m$ and $\det(L_q(A)) = q^{m-n}$. A BDD instance requires a target vector, and in the case of LWE it is $b = As + e \bmod q$ for the secret vector $s \in \mathbb{Z}_q^n$ and the 'noise' vector $e \in \mathbb{Z}_q^m$, where $\|e\| \le \sqrt{m}\,\alpha q$ for some $\alpha \in (0, 1)$. We can think of $n$ as the main security parameter, and of the other parameters $m, q, \alpha$ as functions of $n$, e.g., $n = \Theta(\text{bit security})$, $m = \Theta(n \log q)$, $q = n^{\Theta(1)}$, and $\alpha = \sqrt{n}/q$. According to Minkowski's bound, $\lambda_1(L_q(A)) \le \sqrt{m}\, q^{1-n/m}$, which is much larger than $\|e\|$ in the LWE setting; thus, we have a valid BDD instance $(L_q(A), b)$.
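To make the shape of an LWE instance concrete, here is a minimal sketch that samples $(A, b = As + e \bmod q)$ for toy parameters. It is an illustration only: the function names are hypothetical, the parameters are far too small for any security, and a rounded continuous Gaussian is used as a stand-in for the discrete Gaussian error distribution.

```python
import random

def lwe_instance(n, m, q, sigma, rng):
    """Return (A, b, s, e) with b = A s + e mod q (toy sampler)."""
    A = [[rng.randrange(q) for _ in range(n)] for _ in range(m)]
    s = [rng.randrange(q) for _ in range(n)]
    # Assumption: rounded Gaussian as a stand-in for the discrete Gaussian.
    e = [round(rng.gauss(0, sigma)) for _ in range(m)]
    b = [(sum(a_ij * s_j for a_ij, s_j in zip(row, s)) + e_i) % q
         for row, e_i in zip(A, e)]
    return A, b, s, e

def centred(x, q):
    """Representative of x mod q in the interval (-q/2, q/2]."""
    x %= q
    return x - q if x > q // 2 else x

rng = random.Random(1)
n, m, q, sigma = 8, 16, 97, 1.0
A, b, s, e = lwe_instance(n, m, q, sigma, rng)

# Knowing s, the target b lies close to the lattice point A s: the residual is e.
residual = [centred(b_i - sum(a * t for a, t in zip(row, s)), q)
            for row, b_i in zip(A, b)]
```

The residual computed at the end equals the error vector, which is the sense in which $b$ is a BDD target for the lattice generated by $A$.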
The beauty of LWE lies in its versatile use in cryptography: the constructions based on the hardness of LWE range from 'standard' cryptographic primitives such as encryption schemes [122,123] to fully homomorphic encryption [124] and attribute-based encryption [125]. From the complexity perspective, LWE is particularly attractive due to what is called a worst-case guarantee: Regev [122] shows a quantum reduction from the worst-case problem called the Shortest Independent Vectors Problem (SIVP) to LWE.
Problem 6 (Approximate SIVP). Given a lattice $L$ and $\gamma \ge 1$, the approximate Shortest Independent Vectors Problem asks to find $n$ linearly independent vectors from $L$ s.t. their norms are at most $\gamma\lambda_n(L)$.
Regev gives a polynomial time quantum reduction from SIVP with parameter $\gamma$ on an arbitrary $n$-dimensional lattice to LWE with $m = \mathrm{poly}(n)$, $q < 2^{\mathrm{poly}(n)}$ and $\alpha q > 2\sqrt{n}$ for the LWE error $e$ following the 'discrete Gaussian' distribution. We shall not define this distribution formally here; it is enough to think of it as a discrete analogue of the Gaussian distribution over $\mathbb{R}$ restricted to a discrete set, say $\mathbb{Z}$, and extended to $\mathbb{Z}^n$ by sampling each coefficient independently.
Regev's reduction tells us that any algorithm for LWE, be it classical or quantum, efficiently transfers to a quantum algorithm for SIVP, which we believe to be a hard problem. Therefore, a reasonable question is, how hard is LWE?
Since LWE is a BDD instance, a natural approach is to use Kannan's embedding and apply BKZ reduction. Under Kannan's embedding, an LWE instance $(A, b)$ translates into an $O\!\left(\frac{q^{1-n/m}}{\alpha q}\right)$-appSVP instance. In order to solve this appSVP, we run BKZ reduction with parameter $\beta = \beta_{\mathrm{LWE}}$. The value of $\beta_{\mathrm{LWE}}$ is obtained by solving for $\beta$ the equation $2^{\frac{m}{2\beta}\log\beta} = \frac{q^{1-n/m}}{\alpha q}$ and minimising it over $m$. We can plug $\beta_{\mathrm{LWE}}$ into the complexity of any SVP solver (classical or quantum) for a $\beta_{\mathrm{LWE}}$-dimensional lattice, thus obtaining an estimate of the asymptotic hardness of LWE.
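The equation for the BKZ parameter can be solved numerically. The following sketch finds $\beta$ by bisection for a fixed $m$; it does not minimise over $m$, and the parameter values ($n = 256$, $q = 3329$, $\alpha q = 2$, $m = 2n$) are hypothetical choices made only to have something to run. It is a rough illustration of the asymptotic rule, not a substitute for the estimators cited below.

```python
import math

def bkz_block_size(n, m, q, alpha):
    """Solve (m / (2*beta)) * log2(beta) = log2(q^(1 - n/m) / (alpha*q)) for beta."""
    target = (1 - n / m) * math.log2(q) - math.log2(alpha * q)

    def lhs(beta):
        return m / (2 * beta) * math.log2(beta)

    lo, hi = 3.0, float(m)             # lhs is decreasing for beta > e
    if not (lhs(hi) <= target <= lhs(lo)):
        return None                    # no root inside the bracket [3, m]
    for _ in range(100):               # plain bisection
        mid = (lo + hi) / 2
        if lhs(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical toy parameters, chosen for illustration only.
n, q = 256, 3329
alpha = 2 / q
m = 2 * n
beta = bkz_block_size(n, m, q, alpha)
print(f"block size estimate for m = {m}: {beta:.1f}")
```

Repeating the computation for several values of $m$ and keeping the smallest $\beta$ mimics the minimisation step mentioned in the text.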
More elaborate analyses [126] and LWE estimators [127,128] refine the behaviour of BKZ reduction and consider the evolution of the norms of the Gram-Schmidt vectors during BKZ, leading to more accurate estimates that help to set the security levels of concrete LWE parameters. We do not detail this technique here, but refer the reader to the above references.
Do there exist specific non-lattice-based attacks on LWE? First, there are classical combinatorial approaches to solve LWE [129][130][131]. Second, there is a method (also classical) due to Arora-Ge [132], see also Ref. [133], that uses Gröbner-basis solvers to attack LWE. These methods are currently inferior to lattice-based attacks for all practically relevant parameter sets, partially due to the fact that they perform poorly (or even do not work at all) when $m$ is as small as $\Theta(n)$, or even $\mathrm{poly}(n)$. We are not aware of any reported quantum speed-ups specific to these attacks.
There are also interesting results when 'LWE is given as quantum states'. One can define at least two versions of what this means. First, following Ref. [134], one can ask whether, for fixed $s$ and $e$, given a superposition over all possible rows of $A$ as $\frac{1}{q^{n/2}} \sum_{a \in \mathbb{F}_q^n} |a\rangle\,|\langle a, s\rangle + e\rangle$, one can efficiently find $s$ and $e$. The authors of Ref. [134] give an affirmative answer, noticing that the QFT, applied to the above state, reveals the solution.
In another formulation of "quantum LWE", stated in Ref. [135], one is given a uniform random $A \in \mathbb{Z}_q^{m \times n}$ with $a_i$'s being the rows of $A$, and the states $\frac{1}{\sqrt{\sum_{j \in \mathbb{Z}} |f(j)|^2}} \sum_{e_i \in \mathbb{Z}} f(e_i)\,|\langle a_i, s\rangle + e_i \bmod q\rangle$, where $f$ can be any function over $\mathbb{Z}$ s.t. $\sum_{j \in \mathbb{Z}} j \cdot |f(j)|^2 < +\infty$ (for example, $f$ can be the pdf of a discrete Gaussian distribution or the indicator function of a finite set). Finding $s, e$ in this variant of quantum LWE appears to be hard, as we do not know any efficient algorithm for it [135], apart from some specific cases of $f$ and of the LWE parameters [136].
The Short Integer Solution problem: for $n > 0$, $m \ge n$, $q > 1$, and $b > 0$, asks, given a uniform matrix $A \in \mathbb{Z}_q^{m \times n}$, to find $x \in \mathbb{Z}^m$ s.t. (1) $x^t A = 0 \bmod q$, and (2) $0 < \|x\| \le b$.
As in LWE, one can think of $n$ as the main security parameter, with $m = \Theta(n \log n)$ and $q = \mathrm{poly}(n)$.
The problem was introduced by Ajtai in Ref. [137], where he showed a classical worst-case reduction from SIVP with approximation factor $\gamma = b \cdot \mathrm{poly}(n)$ on an arbitrary $n$-dimensional lattice to SIS with $m = \mathrm{poly}(n)$, $q > b \cdot \mathrm{poly}(n)$.
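The two SIS conditions are easy to check for a candidate vector, which is what makes SIS a search problem with efficiently verifiable solutions. A minimal verifier sketch follows; the function name and the tiny example matrix are hypothetical, and $A$ is taken as an $m \times n$ matrix so that a solution has $m$ coordinates.

```python
import math

def is_sis_solution(A, x, q, b):
    """Check x^t A = 0 mod q and 0 < ||x||_2 <= b for an m x n matrix A."""
    m, n = len(A), len(A[0])
    if len(x) != m:
        return False
    norm = math.sqrt(sum(xi * xi for xi in x))
    if not (0 < norm <= b):
        return False                      # rejects the trivial solution x = 0
    return all(sum(x[i] * A[i][j] for i in range(m)) % q == 0
               for j in range(n))

# Toy example with m = 3, n = 2, q = 5 (illustrative values only).
A = [[1, 2],
     [4, 3],
     [0, 1]]
ok = is_sis_solution(A, [1, 1, 0], 5, 2)
zero_rejected = not is_sis_solution(A, [0, 0, 0], 5, 2)
too_long_rejected = not is_sis_solution(A, [5, 0, 0], 5, 2)
```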
The relation to lattices is immediate once we look at the so-called orthogonal lattice $L_q^{\perp}(A) = \{x \in \mathbb{Z}^m : x^t A = 0 \bmod q\}$. This lattice is of dimension $m$, and with overwhelming probability its determinant is $q^n$, so that Minkowski's bound gives $\lambda_1(L_q^{\perp}(A)) \le \sqrt{m}\, q^{n/m}$, which asymptotically belongs to $\Theta(\sqrt{n \log n})$ when $m = \Theta(n \log n)$, $q = \mathrm{poly}(n)$. Analogous to LWE, SIS is an appSVP problem, but now on the lattice $L_q^{\perp}(A)$ with approximation factor $b / \Theta(\sqrt{n \log n})$. Exactly the same arguments as for LWE lead to an estimate $\beta_{\mathrm{SIS}}$ of the BKZ parameter $\beta$ needed to solve SIS. Again, as for LWE, one can run BKZ with either a classical SVP oracle, thus having $2^{0.292\beta_{\mathrm{SIS}} + o(\beta_{\mathrm{SIS}})}$ as the currently best achievable runtime, or with a quantum SVP oracle, improving the constant 0.292 to 0.2571.
There are far fewer non-lattice-based approaches to solve SIS than for LWE. A recent result from Ref. [136] gives a quantum algorithm for SIS in the infinity norm, that is, when the SIS solution $x$ is required to be bounded in $\ell_\infty$-norm. The algorithm achieves polynomial time when $b = (q - c)/2$ and $m = \Omega(q^4 n^{c+1} \log q)$ for some constant $c$. Note that SIS (in $\ell_\infty$-norm) is easy when $b = q/2$; thus, this result gives an efficient algorithm for a non-trivial range, yet it is far from what is used in cryptographic applications. In particular, certain signature schemes, such as Ref. [90], rely on the hardness of SIS in $\ell_\infty$ for $b \approx q/8$. Apart from this quantum algorithm, Ref. [136] gives a classical algorithm that solves SIS in time $O(n^{\log q})$ when $m = O(n^{\log q})$.

| Open problems
The results presented in this section are purely asymptotic, and it is far from an easy task to estimate the lower-order terms. Currently, to get a 'feeling' for how the algorithms perform, we actually implement and run them in practice. While this is possible for classical algorithms [138], at this stage it seems very unlikely that we shall be able to run quantum sieve (or even low-memory enumeration) in the near future. The result from Ref. [139] suggests that speed-ups for memory-intense SVP approaches are rather dubious under our current understanding of quantum gate complexity. Hence,
Question 1. Are known quantum speed-ups for SVP algorithms plausible, or do the hidden terms and (quantum) memory overheads diminish them?
Switching to a more optimistic side and assuming that we shall be able to run SVP algorithms quantumly, the question of designing efficient quantum circuits for polynomial-time algorithms like LLL is open. Therefore,
Question 2. What are potential quantum speed-ups for 'easy' lattice reductions like LLL or BKZ with small block-size?
Turning to average-case problems such as LWE and SIS, the currently most efficient approach to solve these problems for cryptographically relevant parameters is to run the appSVP solvers, that is, BKZ reduction. So,
Question 3. Are there any non-lattice-based approaches (classical or quantum) to solve LWE/SIS that are faster, maybe even only for some limited parameter ranges, than the approach via lattice reduction?
The last question is not limited to the $\ell_2$-norm: LWE is also interesting when the error vector comes from a uniform distribution. Likewise, SIS is of importance [90] when the solution is required to be bounded in $\ell_\infty$-norm.

| Definitions
Throughout this section F denotes a finite field.

Definition 19 (Hamming weight). For $x \in F^n$, the Hamming weight, denoted by $\omega(x)$, is the number of non-zero coordinates of $x$.
Definition 20 (Linear code). For integers $1 \le k \le n$ and $d < n$, a linear $[n, k, d]$-code $C$ is a subspace of $F^n$ of dimension $k$, where $d := \min_{c \ne 0,\, c \in C} \omega(c)$ is called the minimal distance of the code.
Associated to a linear code $C$ are its generator matrix $G \in F^{k \times n}$ and its parity-check matrix $H \in F^{(n-k) \times n}$. A code $C$ can be equivalently defined as the row space of $G$ or as the kernel of $H$.

Definition 21 (Information Set Decoding (ISD)). The Information Set Decoding problem asks to find the error-vector $e \in F^n$ given a parity-check matrix $H \in F^{(n-k) \times n}$ of a linear $[n, k, d]$-code $C$, a vector $s = He^t$ called the syndrome, and $w$, the Hamming weight of $e$.

If $w \le \lfloor \frac{d-1}{2} \rfloor$, then the solution of ISD is unique. The weight $w$ can be unknown, in which case one can simply guess it, as it lies in a known small range. Let us make some specific instantiations of the ISD parameters relevant to (post-quantum) cryptography:
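As a small illustration of these definitions over $\mathbb{F}_2$, the sketch below computes a Hamming weight and a syndrome $s = He^t$. The helper names are hypothetical; the matrix is the parity-check matrix of the $[7, 4, 3]$ Hamming code, written in the form $[Q \mid I_3]$.

```python
def hamming_weight(x):
    """Number of non-zero coordinates of x."""
    return sum(1 for xi in x if xi != 0)

def syndrome(H, e):
    """s = H e^t over F_2, with H given as a list of rows."""
    return [sum(h * ei for h, ei in zip(row, e)) % 2 for row in H]

# Parity-check matrix of the [7, 4, 3] Hamming code in the form [Q | I_3].
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]

e = [0, 0, 1, 0, 0, 0, 0]      # error of weight w = 1 <= (d - 1) / 2
s = syndrome(H, e)             # equals the third column of H
```

Since $w = 1 \le \lfloor (d-1)/2 \rfloor$ here, the ISD instance $(H, s, w)$ has $e$ as its unique solution.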

• Classic McEliece KEM. The security of the Classic McEliece Key Encapsulation Mechanism (KEM) [140] is based on the ISD problem over $\mathbb{F}_2$, where $H$ is the systematic form of the parity-check matrix of a Goppa code, and $w = (n - k)/\lceil \lg n \rceil$. Among the existing post-quantum cryptographic schemes, Classic McEliece is considered to be the most conservative from the security perspective.
• BIKE [141] is another code-based key encapsulation mechanism over $\mathbb{F}_2$, where $n = 2k$, the parity-check matrix $H$ is of the form $[\mathrm{rot}(h) \mid I_k]$, where $\mathrm{rot}(h)$ is a circulant matrix consisting of the coefficients of cyclic rotations of the public polynomial $h$, and $w \approx \sqrt{n}$. In essence, BIKE is a binary version of the NTRU cryptosystem [142].
• WAVE signature [143], contrary to the above examples, is based on a version of the ISD problem that asks for a dense error solution, not a sparse one. This is partially driven by the fact that the problem is formulated over $\mathbb{F}_3$. Relevant to WAVE is the setting $w \approx 0.95n$ with multiple solutions (intuitively, there usually exist many signatures for a message, hence the non-uniqueness of ISD solutions).
From the above examples, the hardness of ISD directly impacts the security of various cryptographic schemes. In the rest of the section, we specialise to the case $F = \mathbb{F}_2$. The methods we describe generalise to the $\mathbb{F}_q$ setting, see Ref. [144], and, for cryptanalysis of the WAVE setting, see Ref. [145].
In order to simplify the analysis of ISD algorithms, it is usually assumed that the matrix $H$ is drawn uniformly at random from $\mathbb{F}_2^{(n-k) \times n}$. Despite the fact that some cryptographic schemes use structured $H$, such as quasi-cyclic $H$ in BIKE, we are not aware of significant speed-ups that use this structure (the speed-ups we know are at most linear in $n$, see [141, Sec. B.2.1]). With this assumption on $H$, the hardness of ISD rests on two parameters: $n$ and $\omega(e)$. The dimension $k$ is usually related to $n$ as $k = \Theta(n)$, often $k \approx n/2$.
All known classical ISD algorithms [146][147][148][149] try to find $e$ by enumerating its search space, which is of size $\binom{n}{w}$. The difference between various ISD algorithms lies in the way this enumeration is performed. All known quantum algorithms [150][151][152] for ISD are quantum versions of the existing classical algorithms, in which some routines are sped up by quantum methods such as amplitude amplification or quantum random walks. Below, we describe known ISD algorithms and their quantum speed-ups.
We should warn the reader that, for simplicity of the exposition, we omit $O$ or $\tilde{O}$ terms. We do not present the complexities in the form $2^{cn + o(n)}$ for some constant $c$, as is often done in the cited studies. Instead, we leave it to the reader to optimise the complexity formulas for the concrete parameters they are interested in.

| ISD algorithms
Prange's algorithm: In the 60s, Prange showed [148] how a simple linear algebra trick can improve the enumeration of e. Notice first that permuting the columns of H is equivalent to permuting the positions of 1's in e. Prange's algorithm consists in finding a permutation π such that π(e) has exactly zero 1's on the first k coordinates and all the weight w is distributed over the last n − k coordinates.
To check whether a candidate $\pi$ is good, we transform $\pi(H)$ into systematic form $[Q \mid I_{n-k}]$ (provided the last $n - k$ columns of $\pi(H)$ form an invertible matrix, which happens with constant probability) for $Q \in \mathbb{F}_2^{(n-k) \times k}$. The same transformation is applied to the syndrome $s$, giving a new syndrome $\bar{s}$ and a new decoding equation $Qe_1 + e_2 = \bar{s}$, where $e = [e_1 \mid e_2]$ and $\omega(e_1) = 0$, $\omega(e_2) = w$. It follows that for such $\pi$, $e_2 = \bar{s}$, and one checks whether $\omega(\bar{s}) = w$.
We expect to find a 'good' permutation $\pi$ (i.e. a $\pi$ that gives the correct weight distribution and makes the last $n - k$ columns of $\pi(H)$ invertible) after $\binom{n}{w} / \binom{n-k}{w}$ trials (up to a constant factor), which is the inverse of the probability of finding a 'good' $\pi$.
To speed up Prange's algorithm quantumly, Bernstein in Ref. [150] uses Grover's search over the space of permutations, where Grover's function $f$ evaluates to 1 on a 'good' permutation $\pi$; this yields a square-root speed-up in the number of trials. The classical and quantum memory complexities of Prange's algorithm are $\mathrm{poly}(n)$, as we only store one matrix $H$ and one vector $s$. Such a low memory requirement is a significant advantage of this algorithm over the other, memory-demanding ISD solvers we discuss next.
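The classical procedure described above can be sketched in a few lines. The code below is a toy illustration over $\mathbb{F}_2$, not an optimised implementation: the function name, the `max_trials` cut-off and the small Hamming-code example are assumptions made for the sake of a runnable example.

```python
import random

def prange(H, s, w, rng, max_trials=10_000):
    """Return e with H e^t = s and weight w, or None. H is (n-k) x n over F_2."""
    r, n = len(H), len(H[0])
    k = n - r
    for _ in range(max_trials):
        perm = list(range(n))
        rng.shuffle(perm)
        # Augmented matrix [pi(H) | s]; column j of pi(H) is column perm[j] of H.
        M = [[row[perm[j]] for j in range(n)] + [s[i]] for i, row in enumerate(H)]
        invertible = True
        # Gauss-Jordan elimination on the last n-k columns, aiming for [Q | I].
        for i in range(r):
            col = k + i
            pivot = next((t for t in range(i, r) if M[t][col]), None)
            if pivot is None:
                invertible = False          # bad permutation: try another one
                break
            M[i], M[pivot] = M[pivot], M[i]
            for t in range(r):
                if t != i and M[t][col]:
                    M[t] = [a ^ b for a, b in zip(M[t], M[i])]
        if not invertible:
            continue
        s_bar = [M[i][n] for i in range(r)]
        if sum(s_bar) == w:                 # good permutation: e_1 = 0, e_2 = s_bar
            e = [0] * n
            for i in range(r):
                e[perm[k + i]] = s_bar[i]   # undo the permutation
            return e
    return None

# Toy instance: [7, 4, 3] Hamming code, error of weight 1.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
e_true = [0, 1, 0, 0, 0, 0, 0]
s = [sum(h * x for h, x in zip(row, e_true)) % 2 for row in H]
found = prange(H, s, 1, random.Random(0))
```

Only the permutation, one transformed copy of $H$ and the syndrome are kept at any time, which reflects the polynomial memory footprint mentioned above.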
Stern's algorithm: It is convenient to work with the ISD equation in the systematic form obtained above. Rather than restricting the weight of $e$ to be 0 on the first $k$ coordinates, Stern in Ref. [149] proposed to allow $p > 0$ non-zero coordinates in the first $k$ indices at the price of a more expensive check for $\pi$. It was later improved in Ref. [153] and also in Ref. [154].
Stern's idea can be viewed as a meet-in-the-middle technique: Assume a good permutation $\pi$ gives us $e$ with $(w - p)$ 1's on the $n - k$ coordinates that correspond to $I_{n-k}$, and with $p$ 1's on the $k$ coordinates that correspond to $Q$. Then represent $e$ as $e = (e_1 \| 0) + (e_2 \| 0) + (0 \| e_3)$, where $e_1 \in \mathbb{F}_2^{k/2} \times 0^{k/2}$, $e_2 \in 0^{k/2} \times \mathbb{F}_2^{k/2}$, $e_3 \in \mathbb{F}_2^{n-k}$. Hence, the ISD equation rewrites as
$Qe_1 \approx s + Qe_2$, (5)
where the $\approx$ sign indicates that the right-hand side of Equation (5) differs from the left-hand side only on $\omega(e_3) = w - p$ coordinates. Stern's algorithm enumerates two lists, $L_1 = \{Qe_1\}$ and $L_2 = \{s + Qe_2\}$, and searches for two elements $v_1 \in L_1$, $v_2 \in L_2$ that satisfy Equation (5). The corresponding error vectors $e_1, e_2$ are implicitly stored alongside in the lists.
This is an instance of the so-called near neighbour problem for the Hamming distance: Given two lists $L_1, L_2$, the problem asks to find (almost) all pairs $(v_1, v_2) \in L_1 \times L_2$ that are close under the Hamming distance. For instance, it can be solved by testing whether, on certain fixed $\ell$ coordinates, the vectors from $L_1$ and $L_2$ match exactly, that is, are equal to each other. If so, they are then tested for an approximate match on the remaining coordinates. In the near neighbour literature, this method was put forward by Indyk-Motwani [155].
Finiasz and Sendrier in Ref. [154] proposed an improvement to Stern's algorithm: They introduced an $\ell$-length 0-window into the systematic form of $H$, so now it is of the form $H = \left[\, Q' \;\middle|\; \begin{smallmatrix} I_{n-k-\ell} \\ 0 \end{smallmatrix} \right]$ with $Q' \in \mathbb{F}_2^{(n-k) \times (k+\ell)}$. This shape can be reached by applying a partial Gaussian elimination on the right-hand upper square $(n - k - \ell) \times (n - k - \ell)$ submatrix of $H$, giving us $I_{n-k-\ell}$. Then we force the bottom $\ell$ rows to be 0 by adding the appropriate rows of $I_{n-k-\ell}$. From now on, we will be working with the ISD equation of the form $Q'(e_1 + e_2) + (e_3 \| 0^{\ell}) = s$. If $e = (e_1 \| 0) + (e_2 \| 0) + (0 \| e_3) \in F^n$ is a solution, then $Q'e_1 + Q'e_2 + s = (e_3 \| 0^{\ell})$. This means that $Q'e_1$ and $s + Q'e_2$ are necessarily equal on their last $\ell$ coordinates. We search for two vectors, $v_1 \in L_1 = \{Q'e_1\}$ and $v_2 \in L_2 = \{s + Q'e_2\}$, that are equal on this $\ell$-window. We expect to find $\frac{|L_1| \cdot |L_2|}{2^{\ell}}$ such pairs, one of them being our solution.
It turns out that the additional parameter $\ell$ gives a better handle for balancing the time for finding a good permutation and enumerating the lists. In particular, this approach leads to the classical time complexity $T^{C}_{\mathrm{Stern}} = P_{\mathrm{Stern}} \cdot \max\left\{ |L_1|, \frac{|L_1| \cdot |L_2|}{2^{\ell}} \right\}$ with $P_{\mathrm{Stern}} = \binom{n}{w} \Big/ \left( \binom{(k+\ell)/2}{p/2}^{2} \binom{n-k-\ell}{w-p} \right)$, where the first factor is the expected number of permutations we need to choose before we hit the needed weight distribution, and the arguments inside $\max\{\}$ are the complexity of creating $L_1, L_2$ and sorting $L_2$, and the expected number of pairs from $L_1 \times L_2$ that are equal on the $\ell$-window, which we check for a solution. Note that by construction $|L_1| = |L_2| = \binom{(k+\ell)/2}{p/2}$. The space complexity of this algorithm is $S^{C}_{\mathrm{Stern}} = |L_1|$. The parameters $p, \ell$ are subject to optimisation for concrete $n, k, w$.
It might be instructive to view Prange's algorithm as a one for-loop procedure searching for a good permutation. Then, Stern's algorithm is a two for-loop procedure, with the outer loop searching for a permutation and the inner loop checking whether the permutation is correct. The purpose of the parameters $p, \ell$ is to move workload from the outer loop to the inner loop, thus rebalancing the overall cost.
It turns out that in the ISD setting when w = Θ(n), we have p, ℓ = Θ(n), hence improving the decoding asymptotically.
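To see the rebalancing effect of $p$ and $\ell$ on concrete numbers, the following sketch compares the expected number of Prange trials with a Stern-type cost of the shape "number of permutations times $\max\{|L_1|, |L_1||L_2|/2^{\ell}\}$". The cost formula is the usual textbook form and is an assumption here, as are the function names and the sample parameters $(n, k, w) = (1024, 524, 50)$; polynomial factors are ignored.

```python
from math import comb, log2

def prange_log2_trials(n, k, w):
    """log2 of binom(n, w) / binom(n - k, w)."""
    return log2(comb(n, w)) - log2(comb(n - k, w))

def stern_log2_cost(n, k, w, p, ell):
    """log2 of P * max(|L1|, |L1|*|L2| / 2^ell), with p/2 ones in each half."""
    if p > w or ell > n - k:
        return float("inf")
    half = (k + ell) // 2
    good = comb(half, p // 2) ** 2 * comb(n - k - ell, w - p)
    if good == 0:
        return float("inf")
    log_perms = log2(comb(n, w)) - log2(good)
    log_list = log2(comb(half, p // 2))        # |L1| = |L2|
    return log_perms + max(log_list, 2 * log_list - ell)

def best_stern(n, k, w, max_p=12, max_ell=40):
    """Brute-force search over even p and ell; returns (log2 cost, p, ell)."""
    options = ((stern_log2_cost(n, k, w, p, ell), p, ell)
               for p in range(0, max_p + 1, 2)
               for ell in range(0, max_ell + 1, 2))
    return min(options)

n, k, w = 1024, 524, 50
prange_cost = prange_log2_trials(n, k, w)
stern_cost, p_opt, ell_opt = best_stern(n, k, w)
print(f"Prange: 2^{prange_cost:.1f}, Stern: 2^{stern_cost:.1f} "
      f"(p = {p_opt}, ell = {ell_opt})")
```

For $p = \ell = 0$ the Stern-type cost collapses to the Prange count, so the optimised value can never be worse in this model.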
We describe quantum speed-ups for Stern's algorithm (and its modifications) after we explain another useful technique to solve ISD: the representation-based algorithms.
Representation-based techniques: The ideas from Refs. [146,156] enable us to obtain the lists $L_1, L_2$ faster by first enlarging the enumeration space for $e$, thus creating many solutions, and then only looking for specific ones. Concretely, $e_1$ and $e_2$ now range over all vectors of weight $p/2$ in $\mathbb{F}_2^{k+\ell}$ instead of being supported on disjoint halves. The key observation is that now there are $R_{\mathrm{MMT}} := \binom{p}{p/2}$ ways to represent the target $e$ as $e = e_1 + e_2$. Hence, on average, it is enough to construct only a $1/R_{\mathrm{MMT}}$ fraction of $L_1, L_2$. We do so by restricting the elements in $L_1, L_2$ to be 0 on $\lfloor \log_2(R_{\mathrm{MMT}}) \rfloor$ coordinates (these coordinates are a subset of the $\ell$ coordinates we aim to match on at the end). We construct such $L_1$ by merging, in the meet-in-the-middle way, yet another two smaller lists $L_{1,1}$ and $L_{1,2}$. Henceforth, we use the term 'merging' for the process of creating one list out of two given lists (of vectors) by summing only those elements from the given lists that satisfy a certain relation. In ISD algorithms, this relation is that the sum is equal to 0 on some coordinates. In the MMT algorithm, due to May-Meurer-Thomae [146], $L_{1,1}, L_{1,2}$ are of the form $L_{1,1} = \{Q'[e_{1,1} \| 0^{(k+\ell)/2}]\}$ and $L_{1,2} = \{Q'[0^{(k+\ell)/2} \| e_{1,2}]\}$ with $e_{1,1}, e_{1,2} \in \mathbb{F}_2^{(k+\ell)/2}$ of weight $p/4$. We require that during the merge the sum $Q'[e_{1,1} \| 0^{(k+\ell)/2}] + Q'[0^{(k+\ell)/2} \| e_{1,2}]$ is zero on the last $\lfloor \log_2(R_{\mathrm{MMT}}) \rfloor$ coordinates.
The list L 2 is constructed analogously except that, similar to Stern, we include the syndrome s to L 2,2 . Pictorially the algorithm has a tree-structure with each node being a list, and such a view is provided in Figure 9.
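The correctness of this algorithm can be found in Ref. [146]. Let us analyse its complexity. The number of necessary permutations we need to try before we have the correct weight distribution on $e$ is
$P_{\mathrm{MMT}} = \binom{n}{w} \Big/ \left( \binom{k+\ell}{p} \binom{n-k-\ell}{w-p} \right)$. (6)
To check whether a permutation $\pi$ is good, we attempt to find $e$ by constructing the lists from Figure 9. Provided the lists on the same level are of the same expected size, the complexity of this routine is given by $\max\left\{ |L_{1,1}|,\; \frac{|L_{1,1}|^2}{2^{\lfloor \log_2 R_{\mathrm{MMT}} \rfloor}},\; \frac{|L_1|^2}{2^{\ell - \lfloor \log_2 R_{\mathrm{MMT}} \rfloor}} \right\}$. This quantity is the maximum between (I) the size of the top 4 lists, (II) the size of the output after the first merge on $\lfloor \log_2 R_{\mathrm{MMT}} \rfloor$ coordinates, and (III) the size of the output after merging on the remaining $\ell - \lfloor \log_2 R_{\mathrm{MMT}} \rfloor$ coordinates. Multiplying by the number of permutations, we obtain the running time $T^{C}_{\mathrm{MMT}} = P_{\mathrm{MMT}} \cdot \max\{\text{(I)}, \text{(II)}, \text{(III)}\}$.
Figure 9: Representation-based ISD algorithm due to Ref. [146].
The memory complexity of this algorithm is the maximum of (I) and (II) (we omit the size of the output list since we do not have to store it).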
Becker-Joux-May-Meurer in Ref. [156] notice that we can increase the number of representations by splitting zero-coordinates of $e$ not only as $0 + 0$ but also as $1 + 1$. It turns out that constructing longer top-level lists $L_{i,j}$ with $e_{i,j}$ of weights $\omega(e_{i,j}) = p/2 + \varepsilon$ for some integer $\varepsilon > 0$ improves the algorithm, as it significantly increases the number of representations to $R_{\mathrm{BJMM}} := \binom{p}{p/2} \binom{k+\ell-p}{\varepsilon}$, where the second factor is the number of ways we can choose $\varepsilon$ 1's out of $k + \ell - p$ coordinates. Intuitively, the strategy allows for a better balance between the two merges: the first merge on $\lfloor \log_2 R_{\mathrm{BJMM}} \rfloor$ coordinates and the second on $\ell - \lfloor \log_2 R_{\mathrm{BJMM}} \rfloor$ coordinates. The expected running time of the BJMM algorithm has the same shape as that of MMT, with $R_{\mathrm{MMT}}$ replaced by $R_{\mathrm{BJMM}}$ and the top-level lists enlarged accordingly. For $\varepsilon = 0$, we recover the MMT algorithm. In fact, the authors of Ref. [156] propose to construct trees of depth higher than 2, merging on each level to 0 on the appropriate number of coordinates, thus removing the representations. We shall not describe this extension here, but note that this depth is yet another parameter to be optimised and the optimal value differs for concrete ISD parameters.
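The representation counts can be checked by brute force on a small example. The sketch below fixes a target $e$ of weight $p$ in $\mathbb{F}_2^{N}$ (with $N$ playing the role of $k + \ell$) and counts the vectors $e_1$ of weight $p/2 + \varepsilon$ for which $e_2 = e + e_1$ has the same weight; the function name and the toy sizes are hypothetical.

```python
from itertools import combinations
from math import comb

def count_representations(N, p, eps):
    """Count e1 of weight p/2 + eps such that e2 = e + e1 has the same weight,
    for the fixed target e = 1^p 0^(N-p) over F_2 (vectors given by supports)."""
    target = set(range(p))
    wt = p // 2 + eps
    return sum(1 for supp in combinations(range(N), wt)
               if len(target ^ set(supp)) == wt)   # support of e + e1

N, p = 6, 4
mmt = count_representations(N, p, 0)     # only 0 = 0 + 0 on zero coordinates
bjmm = count_representations(N, p, 1)    # additionally one 0 = 1 + 1 split
print(mmt, comb(p, p // 2))
print(bjmm, comb(p, p // 2) * comb(N - p, 1))
```

On this toy instance the counts agree with $\binom{p}{p/2}$ and $\binom{p}{p/2}\binom{N-p}{\varepsilon}$, respectively.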
Quantum walks for the list matching problem: At the heart of the above ISD algorithms (except Prange's) is the search for tuples of vectors from given lists, where a good tuple should satisfy a certain relation. This task can be generalised to the list matching problem (also known as the k-list problem [157], but we decided to remove k so as not to confuse it with the code dimension): given $m$ lists $L_1, \ldots, L_m$ and a function $g$, find the tuples $(v_1, \ldots, v_m) \in L_1 \times \cdots \times L_m$ on which $g$ evaluates to 1, that is, decides for a 'match'. We have already met some instances of this problem: Stern's algorithm is an example for $m = 2$, where $g$ decides for a 'match' whenever a pair $(v_1, v_2) \in L_1 \times L_2$ is equal on certain fixed $\ell$ coordinates. In representation-based algorithms such as Refs. [146,156] there are (at least) 4 lists $L_1, \ldots, L_4$, and the function $g$ decides for a match if $v_1 + v_2$ and $v_3 + v_4$ are equal on certain coordinates (first merge) and, in addition, if $v_1 + v_2 + v_3 + v_4$ is 0 on the designated $\ell$ coordinates (second merge). In decoding, $g$ additionally checks if the weight of the sum is correct.
A special version of the list matching problem, where $g = 1 \Leftrightarrow v_1 = \cdots = v_m$, called the Element distinctness problem, has a long history in quantum computing [158], since it serves as an illustrative application of quantum random walk techniques. Let us specialise this technique to the ISD setting.
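For $m = 2$, the classical matching step is a hash-based merge on the chosen window. The sketch below is one simple way to realise it (the text mentions sorting; hashing is swapped in here for brevity), with a hypothetical function name and vectors represented as tuples of bits.

```python
from collections import defaultdict

def merge_on_window(L1, L2, ell):
    """Return all pairs (v1, v2) in L1 x L2 that agree on their last ell
    coordinates (ell >= 1). Expected output size: |L1| * |L2| / 2^ell."""
    buckets = defaultdict(list)
    for v2 in L2:
        buckets[tuple(v2[-ell:])].append(v2)       # index L2 by its window
    return [(v1, v2)
            for v1 in L1
            for v2 in buckets.get(tuple(v1[-ell:]), [])]

L1 = [(0, 1, 1, 0), (1, 1, 0, 1)]
L2 = [(1, 0, 1, 0), (0, 0, 0, 1), (1, 1, 1, 1)]
pairs = merge_on_window(L1, L2, 2)
```

In an ISD algorithm each surviving pair would then be tested on the remaining coordinates, for example by checking the weight of $v_1 + v_2$.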
The Setup phase of the walk consists in preparing a uniform superposition over all $r$-size subsets $S_i \subset L_i$ (the optimal value for $r$ will be discussed later) together with an auxiliary register $|\mathrm{Aux}\rangle$ (normalisation omitted): $\sum_{S_1, \ldots, S_m} |S_1\rangle \otimes \cdots \otimes |S_m\rangle \otimes |\mathrm{Aux}\rangle$. The auxiliary register $|\mathrm{Aux}\rangle$ contains the information needed to decide whether $S_1, \ldots, S_m$ contains a match. In the ISD setting, $|\mathrm{Aux}\rangle$ stores the intermediate and output lists of the matching process. For example, in Stern's algorithm ($m = 2$), $|\mathrm{Aux}\rangle$ contains all pairs $(v_1, v_2) \in S_1 \times S_2$ that match on $\ell$ coordinates. In case the merge is done in several steps, such as in MMT ($m = 4$), the intermediate lists are also stored in $|\mathrm{Aux}\rangle$. Hence, when we talk about the quantum memory of an ISD algorithm in this section, we mean the size of the $|\mathrm{Aux}\rangle$ register.
The running time and the space complexities of the Setup phase are essentially the running time and the space complexities of the corresponding classical ISD algorithm with the input lists of size |S i | = r instead of |L i |.
The Setup phase finishes with a superposition over all $r$-sublists $S_1, \ldots, S_m$ of $L_1, \ldots, L_m$, where each $(S_1, \ldots, S_m)$ is entangled with the register $|\mathrm{Aux}\rangle$ that contains the result of merging $(S_1, \ldots, S_m)$ into the final output list $S_{\mathrm{output}}$.
The Update phase consists in choosing a sublist $S_i$ and replacing one of its elements $v_i \in S_i$ by $v'_i \notin S_i$. This is one step of a walk on the Johnson graph, see Section 3.3 for the relevant definitions. We update the data stored in $|\mathrm{Aux}\rangle$: Remove all the pairs in the merged lists that involve $v_i$ and create possibly new matches with $v'_i$. We assume that throughout the walk the sublists $S_i$ are kept sorted and stored in a data structure that allows fast insertions/removals (e.g. radix trees as proposed in Ref. [159]). We also assume that elements in $S_1, \ldots, S_m$ that result in a match store pointers to their match, to be able to quickly update the output of the checking function.
After we have performed $\Theta(1/\sqrt{\delta})$ updates, where $\delta$ is the eigenvalue gap of the Johnson graph $J(N, r)$, we check if the updated register $|S_1\rangle \otimes \cdots \otimes |S_m\rangle \otimes |\mathrm{Aux}\rangle$ gives a match. We give a lower bound on $\delta$ below. This is the Checking phase of the walk.
Thanks to the quantum walk framework described in Section 3.3, once we know the costs of the Setup phase $T_S$, the Update phase $T_U$, and the Checking phase $T_C$, we know that after $T_{QW}$ many steps, we measure a register $|S_1\rangle \otimes \cdots \otimes |S_m\rangle \otimes |\mathrm{Aux}\rangle$ that contains the correct error-vector with overwhelming probability, where
$T_{QW} = T_S + \frac{1}{\sqrt{\varepsilon}} \left( \frac{1}{\sqrt{\delta}} T_U + T_C \right)$, (7)
where $\varepsilon$ is the fraction of vertices in $J(N, r)$ that contain the correct error vector. For a fixed $m$, we have $\varepsilon \approx r^m / N^m$, where $N = |L_1|$. Strictly speaking, the walk we have just described is a walk on an $m$-fold Cartesian product of Johnson graphs, one for each sublist $S_i$, so the value $\delta$ in Equation (7) is the spectral gap of this product graph; for fixed $m$ it is $\Omega(1/r)$.
Quantum walk speed-ups for high-memory ISD algorithms: The quantum walk search algorithm described above solves the ISD problem provided we have found a permutation $\pi$ that gives the desired distribution of 1's in the error-vector. Kachigar and Tillich in Ref. [151] suggest to run Grover's algorithm for $\pi$, with the 'checking' function for Grover's search being the quantum walk routine for the list matching problem. In particular, we operate on the quantum state (normalisation omitted): $\sum_{\pi} |\pi\rangle \otimes \underbrace{\sum_{S_1, \ldots, S_m} |S_1\rangle \otimes \cdots \otimes |S_m\rangle \otimes |\mathrm{Aux}\rangle}_{\text{Quantum Walk = Check for the outer Grover}} \otimes\, |\text{Is } \pi \text{ good?}\rangle$. The outer summation is performed over the set of all permutations $\pi \in S_n$. Let $N$ denote the total number of permutations and $M$ the number of marked permutations, that is, those that give the desired weight distribution of the error. If we can check that $\pi$ is a marked permutation, then after $O(\sqrt{N/M})$ Grover iterations we find a marked one, where $N/M$ is the expected number of permutations as given, for example, in Eq. (6). The check whether a permutation $\pi$ is good is realised via a quantum walk search for $m$ vectors $(v_1, \ldots, v_m) \in S_1 \times \cdots \times S_m$ that match on certain coordinates and lead to the correct error vector.
Note an important difference between the classical and quantum settings: during the quantum walk we search over size-$r$ sublists $S_i \subset L_i$, which are exponentially shorter than $L_i$. After $T_{QW}$ steps, the register $|\mathrm{Aux}\rangle$ contains an $m$-tuple $(v_1, \ldots, v_m)$ that leads to the correct error vector, provided the permutation $\pi$ is good. Hence, after $\sqrt{\#\text{permutations}} \cdot T_{QW}$ steps, the measurement of the first register gives a good $\pi$ with constant success probability. The resulting state will be entangled with the registers that store $S_1, \ldots, S_m$ together with the pointers to the matching elements. Once we measure $S_1, \ldots, S_m$, we retrieve these pointers and, finally, reconstruct the error vector as in the classical case.
With all the above, we are ready to analyse quantum versions of ISD algorithms that use the list matching problem as a subroutine. For Stern's algorithm, the quantum random walk has the following complexities:
• $T_S = \max\{r, r^2/2^{\ell}\}$, where the $\max\{\}$ is taken between the time to construct the $r$-size sublists $S_1, S_2$ and the expected number of pairs from these sublists that match on $\ell$ coordinates;
• $T_U = r/2^{\ell}$, which is the expected number of elements we need to update if we change one element in $S_1$;
• $T_C = \log r$.
Assuming $r^2/2^{\ell} \ge r$, the value for $T_{QW}$ from Eq. (7) is minimised for $r = |L_1|^{2/3}$. During the quantum walk we store the $r$-size sublists in quantum registers, so the quantum space complexity of Stern's algorithm is $S^{Q}_{\mathrm{Stern}} = |L_1|^{2/3}$. Similar arguments apply to the ISD algorithms that use the representation technique. Let us consider the quantum walk complexities of the BJMM algorithm, since MMT can be viewed as a special case of BJMM. Similar to Stern, $T_S$, $T_U$ and $T_C$ follow from the classical costs with the top-level lists replaced by their $r$-size sublists. Assuming the second level dominates the list construction, that is, the max in $T_S$ is achieved by $r^2/2^{\lfloor \log_2 R_{\mathrm{BJMM}} \rfloor}$ and by $r/2^{\lfloor \log_2 R_{\mathrm{BJMM}} \rfloor}$ in $T_U$, $T_{QW}$ from Eq. (7) is minimised when $r = |L_{1,1}|^{4/5}$, with $L_{1,1}$ as in the classical BJMM.
Taking into account the square-root speed-up for the number of permutations $P_{\mathrm{MMT}}$ given in Eq. (6), we obtain a total quantum running time of $\sqrt{P_{\mathrm{MMT}}} \cdot T_{QW}$. The quantum space complexity of BJMM is $S^{Q}_{\mathrm{BJMM}} = |L_{1,1}|^{4/5}$.
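The balancing that leads to $r = |L_1|^{2/3}$ for Stern's algorithm can be observed numerically. The sketch below evaluates a walk cost of the generic form $T_S + \frac{1}{\sqrt{\varepsilon}}\left(\frac{T_U}{\sqrt{\delta}} + T_C\right)$ under explicit assumptions that go beyond the text: $m = 2$ lists of size $N$, a window chosen so that $2^{\ell} = r$, spectral gap $\delta = 1/r$, and $\varepsilon = (r/N)^2$. All names are hypothetical and polynomial factors are treated loosely.

```python
import math

def walk_cost(T_S, T_U, T_C, eps, delta):
    """Generic quantum walk cost T_S + (1/sqrt(eps)) * (T_U/sqrt(delta) + T_C)."""
    return T_S + (T_U / math.sqrt(delta) + T_C) / math.sqrt(eps)

def stern_walk_cost(log_N, log_r):
    N, r = 2.0 ** log_N, 2.0 ** log_r
    # Assumptions: window 2^ell = r, gap delta = 1/r, eps = (r/N)^2 for m = 2.
    T_S = r                # max(r, r^2 / 2^ell) with 2^ell = r
    T_U = 1.0              # r / 2^ell
    T_C = log_r            # log r
    return walk_cost(T_S, T_U, T_C, (r / N) ** 2, 1.0 / r)

log_N = 60
best_log_r = min(range(1, log_N + 1),
                 key=lambda a: stern_walk_cost(log_N, a))
print(f"best log2(r) = {best_log_r}, (2/3) log2(N) = {2 * log_N / 3:.1f}")
```

Under these assumptions the minimiser sits within one bit of $(2/3)\log_2 N$, which is the balance between the setup cost $r$ and the walk cost of roughly $N/\sqrt{r}$.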

| Open problems
Despite a lot of effort that has been put into lowering the complexity of ISD over the last 50 years, the picture is far from satisfactory. Essentially all the speed-ups we know and have described here are asymptotically applicable to the dense error regime $w = \Theta(n)$, which is less cryptographically relevant. It was shown in Ref. [160] that the improvements starting from Stern until the very recent ones [147,161] vanish when $w = o(n)$ for $n$ going to infinity. So, for sparse errors, we are in the strange situation where asymptotically the best known attack remains Prange's algorithm. Hence,
Question 1. Can we improve over Prange's algorithm for sparse errors, both classically and quantumly?
One might argue that perhaps asymptotic complexity is not the only metric one should care about when evaluating security levels for specific ISD settings. And this is certainly true: Hidden low-order terms may have a serious impact on concrete hardness. The fact that the fastest known algorithms are memory intensive makes the actual costs even less predictable from the asymptotics. Hence,
Question 2. What are the precise (classical and quantum) costs of solving ISD for concrete parameters?
First steps in this direction in the classical setting have been taken in Ref. [162], where the authors provide an estimator for concrete ISD parameters. In the quantum setting, it seems to be a much harder task to give any meaningful statement about the concrete costs, as one would need to analyse in detail the complexity of implementing quantum walks.
Instead of analysing the general ISD problem, one can turn one's attention to actual cryptographic schemes. The assumption that all known code-based constructions use is that the structure hidden in the parity-check matrix $H$ does not impact security. While we are not aware of any attack that exploits either the structure of the Goppa code in McEliece KEM or the cyclic structure in BIKE KEM, it is still natural to ask
Question 3. Can we speed up ISD routines using the knowledge that the parity-check matrix $H$ is not chosen uniformly at random?
Continuing with the analysis of concrete schemes, the hardness of ISD over $\mathbb{F}_3$ has only recently started drawing attention. To gain confidence in the security of the code-based signature WAVE, the decoding problem over $\mathbb{F}_3$ for dense errors requires more investigation.

| ISOGENIES
An elliptic curve E defined over a finite field $\mathbb{F}_q$ of characteristic $p \neq 2, 3$ is a projective algebraic curve with an affine plane model given by an equation of the form $y^2 = x^3 + ax + b$, where $a, b \in \mathbb{F}_q$ and $4a^3 + 27b^2 \neq 0$. The set of points of an elliptic curve is equipped with an additive group law. Details about the arithmetic of elliptic curves can be found in many references, such as [Ref. [163], Chap. 3]. Isogenies between elliptic curves are non-zero maps that are given by rational functions and that are group homomorphisms. The main problem we are interested in is:
Problem 7 (Main isogeny problem). Given elliptic curves $E, E'$ defined over a finite field $\mathbb{F}_q$, find an isogeny between $E$ and $E'$.
Problem 7 is deliberately phrased in an ambiguous way. First, we do not specify the required representation of the solution $\varphi : E \to E'$. Indeed, the natural representation of an isogeny is via polynomials $f, g, h \in \mathbb{F}_q[x]$ such that $\varphi(x, y) = \left( \frac{f(x)}{h(x)^2},\; y\,\frac{g(x)}{h(x)^3} \right)$. The degree of $\varphi$ is its degree as a rational function. Therefore, such a representation cannot be efficient when the degree is large. On the other hand, we can decide to accept a representation as a composition of small-degree isogenies. Indeed, the degree is multiplicative: $\deg(\varphi \circ \psi) = \deg(\varphi)\deg(\psi)$; therefore, it might be possible to represent a degree-$2^n$ isogeny as a composition of $n$ degree-2 isogenies, thus trading an exponential-sized representation for a polynomial-sized one. Another aspect of Problem 7 that could be further specified is whether the existence of an isogeny of small degree between $E$ and $E'$ is known. Indeed, depending on the properties of the curves, generic bounds are known on the existence of an isogeny of bounded degree between the input curves. However, certain popular cryptosystems deliberately choose curves with an unusually low-degree isogeny between them. This naturally influences the performance of the computational methods used to find the secret isogeny.
In this section, we avoid presenting extensive background on isogenies in order to focus on the properties that are relevant to the state-of-the-art quantum algorithms. We do, however, need to introduce the following important fact. The search for an isogeny between two curves can be reduced to the search for a possible kernel of a map. Indeed, an isogeny $\varphi : E \to E'$ is always the composition $\varphi = \alpha \circ \varphi'$ of a purely inseparable isogeny $\alpha$ and a separable isogeny $\varphi'$. Purely inseparable isogenies are compositions of a power of the Frobenius map $\pi : (x, y) \mapsto (x^p, y^p)$, where $p = \mathrm{char}(\mathbb{F}_q)$, with an isomorphism. In particular, $\ker(\alpha) = \{\infty\}$ is trivial, and the kernel of $\varphi'$ determines $\varphi'$ uniquely up to isomorphism. There is an explicit way to construct a separable isogeny from the points of its kernel using Vélu's formulas [164]. All the algorithms we present in this section focus on computing separable isogenies.
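To make the kernel-to-isogeny step concrete, the following is a minimal sketch of Vélu's formulas for the codomain curve, restricted to a cyclic kernel of odd prime order on a short Weierstrass curve. The prime p = 101, the curve y^2 = x^3 + x + 1 used in the usage note, and all function names are assumptions chosen for illustration; the brute-force helpers are only viable for tiny fields.

```python
# Toy sketch of Velu's codomain formulas over a small prime field.
# P_MOD and all helper names are hypothetical, chosen for illustration only.
P_MOD = 101  # small prime, characteristic different from 2 and 3


def ec_add(P, Q, a, p=P_MOD):
    """Add two points of y^2 = x^3 + a*x + b over F_p (None = point at infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P), including doubling a 2-torsion point
    if P == Q:
        lam = (3 * x1 * x1 + a) * pow((2 * y1) % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)


def ec_mul(n, P, a, p=P_MOD):
    """Double-and-add scalar multiplication."""
    R = None
    while n > 0:
        if n & 1:
            R = ec_add(R, P, a, p)
        P = ec_add(P, P, a, p)
        n >>= 1
    return R


def affine_points(a, b, p=P_MOD):
    """All affine points, by exhaustive search (toy sizes only)."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x ** 3 + a * x + b)) % p == 0]


def point_order(P, a, p=P_MOD):
    """Order of an affine point, by repeated addition."""
    n, Q = 1, P
    while Q is not None:
        Q = ec_add(Q, P, a, p)
        n += 1
    return n


def point_of_odd_prime_order(a, b, p=P_MOD):
    """Return (R, ell) with R of odd prime order ell, or None if there is none."""
    for P in affine_points(a, b, p):
        n = point_order(P, a, p)
        # the smallest odd divisor >= 3 of n is necessarily prime
        ell = next((d for d in range(3, n + 1, 2) if n % d == 0), None)
        if ell is not None:
            return ec_mul(n // ell, P, a, p), ell
    return None


def velu_codomain(a, b, kernel_gen, ell, p=P_MOD):
    """Coefficients (a', b') of E/<kernel_gen> for a kernel of odd prime order ell."""
    v = w = 0
    Q = kernel_gen
    for _ in range((ell - 1) // 2):  # one representative per pair {Q, -Q}
        xq, yq = Q
        vq = (6 * xq * xq + 2 * a) % p
        uq = (4 * yq * yq) % p
        v = (v + vq) % p
        w = (w + uq + xq * vq) % p
        Q = ec_add(Q, kernel_gen, a, p)
    return (a - 5 * v) % p, (b - 7 * w) % p
```

As a sanity check, one can call `point_of_odd_prime_order(1, 1)` and pass the result to `velu_codomain`: the resulting curve should be non-singular and have the same number of points over the base field as the starting curve, as expected for two curves related by an isogeny defined over that field.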

| Isogenies arising as group actions
We begin our overview of quantum algorithms for solving the isogeny problem with the most structured case, namely when we know the action of a finite abelian group $G$ on isomorphism classes of elliptic curves. In this context, we assume that given $g \in G$ such that $g * \overline{E} = \overline{E'}$ (where $\overline{E}$ denotes the isomorphism class of $E$, i.e. all curves isomorphic to $E$), we can efficiently derive a (separable) isogeny $\varphi : E \to E'$. This is the framework of hard homogeneous spaces described by Couveignes [165] and of cryptographic group actions [166]. This motivates the formulation of the problem of inverting a group action on isomorphism classes of elliptic curves:
Problem 8 (Group action on curves). Let $G$ be an abelian group acting faithfully and transitively on a set $X$ of isomorphism classes of elliptic curves defined over $\mathbb{F}_q$, and let $\overline{E}, \overline{E'} \in X$. Find $g \in G$ such that $g * \overline{E} = \overline{E'}$.
The hardness of this problem is the security assumption of multiple isogeny-based schemes [165,167,168], including the key exchange mechanism CSIDH [169].
To understand the relevance of Problem 8 to the computation of isogenies, we need to introduce the distinction between ordinary and supersingular curves. An elliptic curve defined over $\mathbb{F}_q$ of characteristic $p$ is said to be supersingular if $p \mid \#E(\mathbb{F}_q) - q - 1$, where $E(\mathbb{F})$ denotes the group of points of $E$ defined over the field $\mathbb{F}$. If an elliptic curve is not supersingular, then it is ordinary. The ring of endomorphisms $\mathrm{End}(E)$ of an ordinary elliptic curve is an order $\mathcal{O}$ within the quadratic field $\mathbb{Q}(\pi)$, where $\pi$ is the Frobenius endomorphism. The embedding of $\pi$ into a quadratic number field is done by noticing that it satisfies the equation $\pi^2 - t\pi + q = 0$, where $t = q + 1 - \#E(\mathbb{F}_q)$ is the trace of Frobenius. From a high-level standpoint, the takeaway is that the group $G = \mathrm{Cl}(\mathcal{O})$, of size $|G| \in O(\sqrt{q})$, acts faithfully and transitively on isomorphism classes of elliptic curves with endomorphism ring $\mathcal{O}$, and that there is a natural correspondence between an element $g \in G$ such that $g * \overline{E} = \overline{E'}$ and the kernel of an isogeny $\varphi : E \to E'$. Therefore, for ordinary elliptic curves, Problem 7 reduces to Problem 8.
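The supersingularity criterion above can be checked directly on toy curves over a prime field by counting points. The sketch below is only an illustration under stated assumptions: it restricts to q = p prime, uses brute-force counting (impractical at cryptographic sizes), and the function names are hypothetical.

```python
def count_points(a, b, p):
    """#E(F_p) for y^2 = x^3 + a*x + b, p an odd prime, via Euler's criterion."""
    total = 1  # the point at infinity
    for x in range(p):
        rhs = (x ** 3 + a * x + b) % p
        if rhs == 0:
            total += 1          # single point (x, 0)
        elif pow(rhs, (p - 1) // 2, p) == 1:
            total += 2          # rhs is a non-zero square: two points (x, +-y)
    return total


def frobenius_trace(a, b, p):
    """Trace t of Frobenius, so that pi^2 - t*pi + p = 0."""
    return p + 1 - count_points(a, b, p)


def is_supersingular(a, b, p):
    """Criterion from the text with q = p: p divides #E(F_p) - p - 1, i.e. p | t."""
    return frobenius_trace(a, b, p) % p == 0
```

For instance, over F_7 the curve y^2 = x^3 + x has 8 = p + 1 points and is reported as supersingular, whereas y^2 = x^3 + x + 1 has 5 points (trace 3) and is ordinary.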
The group action framework also applies to certain supersingular elliptic curves. Typically, by $\overline{E}$, we mean the isomorphism class of $E$ for isomorphisms defined over $\overline{\mathbb{F}}_q$. In this case, there is always a curve $E' \in \overline{E}$ defined over $\mathbb{F}_{p^2}$. It might also be possible that $\overline{E}$ admits a representative $E''$ that is defined over $\mathbb{F}_p$. In any case, $\mathrm{End}(E)$ is isomorphic to an order in a quaternion algebra, which is a 4-dimensional noncommutative ring. When $E$ is defined over $\mathbb{F}_p$, the ring $\mathrm{End}_{\mathbb{F}_p}(E)$ of endomorphisms of $E$ that are defined over $\mathbb{F}_p$ is isomorphic to an order $\mathcal{O}$ in the quadratic number field $\mathbb{Q}(\sqrt{-p})$. In this case, $\mathrm{Cl}(\mathcal{O})$ acts faithfully and transitively on classes of $\mathbb{F}_p$-isomorphisms of curves defined over $\mathbb{F}_p$ (we denote the class of $\mathbb{F}_p$-isomorphisms of $E$ by $\overline{E}_p$). As above, the action of (the class of) an ideal $\mathfrak{a} \subseteq \mathcal{O}$ is through a separable isogeny of degree $N(\mathfrak{a})$. Therefore, when $E, E'$ are defined over $\mathbb{F}_p$, Problem 7 also reduces to Problem 8. In this case, $G = \mathrm{Cl}(\mathcal{O})$ has size $|G| \in O(\sqrt{p})$, and as before, the knowledge of $g$ such that $g * \overline{E}_p = \overline{E'}_p$ yields an isogeny $\varphi : E \to E'$. This framework can be partially extended to supersingular curves defined over $\mathbb{F}_{p^2}$ when an orientation is known, that is, an injective morphism $\iota : \mathcal{O} \to \mathrm{End}(E)$ where $\mathcal{O}$ is an imaginary quadratic order [Ref. [171], Def. 2].
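Problem 8 subsequently reduces to the dihedral hidden subgroup problem [172], for which we presented an algorithm in Section 4.3. Assume we are looking for $\mathfrak{a}$ such that $[\mathfrak{a}] * \overline{E}_1 = \overline{E}_2$. Let $A = \mathbb{Z}/d_1\mathbb{Z} \times \cdots \times \mathbb{Z}/d_k\mathbb{Z} \simeq \mathrm{Cl}(\mathcal{O})$ be the elementary decomposition of $\mathrm{Cl}(\mathcal{O})$. Then we define a quantum oracle $f : \mathbb{Z}/2\mathbb{Z} \ltimes A \to \{\text{quantum states}\}$ by
$$f(x, y) = \begin{cases} \left| [\mathfrak{a}_y] * \overline{E}_1 \right\rangle & \text{if } x = 0, \\ \left| [\mathfrak{a}_y]^{-1} * \overline{E}_2 \right\rangle & \text{if } x = 1, \end{cases}$$
where $[\mathfrak{a}_y]$ is the element of $\mathrm{Cl}(\mathcal{O})$ corresponding to $y \in A$ via the isomorphism $\mathrm{Cl}(\mathcal{O}) \simeq A$. Let $H$ be the subgroup of $\mathbb{Z}/2\mathbb{Z} \ltimes A$ generated by $(1, s)$, where $s \in A$ is such that $[\mathfrak{a}_s] * \overline{E}_1 = \overline{E}_2$. The oracle $f$ hides $H$, so solving the hidden subgroup problem for $f$ reveals $s$. This yields solutions to the isogeny problem in time $2^{\tilde{O}(\sqrt{\log q})}$ between ordinary elliptic curves with the same endomorphism ring, and in time $2^{\tilde{O}(\sqrt{\log p})}$ between two supersingular curves defined over $\mathbb{F}_p$. A non-asymptotic analysis of these algorithms and of the cost of the attack against the initial parameters of the scheme CSIDH can be found in Refs. [173,174]. In another recent work [175], safer parameter sizes were proposed.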
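The way a group action instance turns into a dihedral hidden subgroup instance can be illustrated classically on a toy action. The sketch below assumes a stand-in for the class group action: the cyclic group Z/N acting on itself by translation, with an oracle of the usual hidden-shift form f(0, y) = y * E1 and f(1, y) = (-y) * E2. The values N, E1 and the secret shift are hypothetical, and nothing here is quantum; the code only checks that the oracle is constant on the cosets of the order-2 subgroup generated by (1, s) and takes distinct values on distinct cosets.

```python
# Toy stand-in for Cl(O) acting on curves: Z/N acting on itself by translation.
N = 12          # hypothetical group order
E1 = 3          # hypothetical starting element of the set X = Z/N
SECRET_S = 7    # secret group element with SECRET_S * E1 = E2
E2 = (E1 + SECRET_S) % N


def act(y, E):
    """Free and transitive toy action of y in Z/N on E in X."""
    return (E + y) % N


def oracle(x, y):
    """f(0, y) = y * E1 and f(1, y) = (-y) * E2."""
    return act(y, E1) if x == 0 else act(-y, E2)


def dihedral_mul(g, h):
    """Product in the semidirect product Z/2 x| Z/N:
    (x1, y1)(x2, y2) = (x1 + x2, y1 + (-1)^x1 * y2)."""
    (x1, y1), (x2, y2) = g, h
    sign = -1 if x1 else 1
    return ((x1 + x2) % 2, (y1 + sign * y2) % N)


HIDDEN = (1, SECRET_S)  # generator of the hidden subgroup {(0, 0), (1, s)}


def oracle_hides_subgroup():
    """Check that f(h * g) = f(g) for h in H, and that f separates the cosets."""
    elements = [(x, y) for x in (0, 1) for y in range(N)]
    constant = all(oracle(*dihedral_mul(HIDDEN, g)) == oracle(*g) for g in elements)
    distinct = len({oracle(0, y) for y in range(N)}) == N
    return constant and distinct
```

In the actual reduction, the translation action would be replaced by the evaluation of the class group action on curves, and the subgroup would be recovered by a quantum algorithm for the dihedral hidden subgroup problem rather than by exhaustive checks.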

| Memoryless algorithms for small-degree isogenies
We now assume that no group action is known (otherwise, the best known algorithms for solving the isogeny problem are always those of the previous section). This means that we are considering the case of supersingular curves defined over $\mathbb{F}_{p^2}$. In this section, we focus on the sub-case where we know that a bounded-degree isogeny exists between the two input curves. This is the case for the prominent isogeny-based cryptographic scheme SIDH [176], which resulted in the SIKE submission [177] to the NIST standardisation of post-quantum KEM protocols. This leads us to the formulation of the following problem:
Problem 9 (Small-degree isogeny). Let $E, E'$ be two elliptic curves over $\mathbb{F}_q$ and $\ell$ be a prime. Suppose that there is a separable isogeny $\varphi : E \to E'$ of degree $\ell^k$ for some $k$. Find $\varphi$.
In the SIKE system, $\ell = 2$ or $3$, and $k = \frac{1}{2}\log(p)$. These are the typical parameters of interest. The secret isogeny can be viewed as a walk in the $\ell$-isogeny graph, which is an $(\ell + 1)$-regular graph where nodes are isomorphism classes of elliptic curves, and there is an edge between $\overline{E}_1$ and $\overline{E}_2$ if there is a degree-$\ell$ isogeny $\varphi : E'_1 \to E'_2$ for some $E'_1 \in \overline{E}_1$ and $E'_2 \in \overline{E}_2$. The choice of a non-backtracking walk of length $k$ originating from $\overline{E}$ in the $\ell$-isogeny graph can be mapped to a bit string in $\{0, 1\}^{\lceil k \log_2 \ell \rceil}$. Such a choice represents the choice of one of the possible kernels. To efficiently compute the map corresponding to a given bit string, we follow the approach of Ref. [176, Sec. 4.2.2]. In a nutshell, it consists in identifying a cyclic kernel $\langle R \rangle \subseteq E[\ell^k] = \{P \in E \mid [\ell^k]P = 0\}$ and computing the corresponding isogeny with Vélu's formulas. Instead of directly applying the formulas using all the points of $\langle R \rangle$, we only compute $\ell$-isogenies. By interleaving $\ell$-isogeny computations/evaluations and multiplications by $\ell$ of the points defining the kernel, we can compute an $\ell^k$-isogenous curve $E''$ in $O(k \log k \log q)$ operations. Then we need to decide whether $E'' \in \overline{E'}$. If that is the case, then we have solved our problem.
To decide if two curves are isomorphic, we compute their so-called j-invariant, which is an element of $\mathbb{F}_q$ given by a closed formula in the coefficients of the curve. If these match, the curves are isomorphic (i.e. the j-invariants identify isomorphism classes of curves).
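For a curve in short Weierstrass form, the closed formula is j = 1728 · 4a^3 / (4a^3 + 27b^2). The following minimal sketch evaluates it over a prime field and checks that it is unchanged under the coordinate change (x, y) ↦ (u^2 x, u^3 y). Working over F_p rather than a general F_q is a simplifying assumption, and the function names are hypothetical.

```python
def j_invariant(a, b, p):
    """j-invariant of y^2 = x^3 + a*x + b over F_p (p prime, p > 3)."""
    four_a3 = 4 * pow(a, 3, p) % p
    den = (four_a3 + 27 * b * b) % p
    if den == 0:
        raise ValueError("singular curve: 4a^3 + 27b^2 = 0")
    return 1728 * four_a3 * pow(den, -1, p) % p


def rescale(u, a, b, p):
    """Coefficients of the isomorphic curve obtained via (x, y) -> (u^2 x, u^3 y)."""
    if u % p == 0:
        raise ValueError("u must be invertible")
    return pow(u, 4, p) * a % p, pow(u, 6, p) * b % p


def same_isomorphism_class(curve1, curve2, p):
    """Compare two curves, given as (a, b) pairs, through their j-invariants."""
    return j_invariant(*curve1, p) == j_invariant(*curve2, p)
```

For example, `same_isomorphism_class((2, 3), rescale(5, 2, 3, 101), 101)` should hold, while y^2 = x^3 + x (j = 1728) and y^2 = x^3 + 1 (j = 0) are told apart over F_101.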
The natural strategy is therefore to perform a quantum search over all possible strings in $\{0, 1\}^{\lceil k \log_2 \ell \rceil}$ that encode a length-$k$ walk in the $\ell$-isogeny graph, until one of them yields a curve whose j-invariant matches that of $E'$. This can be done by using Grover's search with the oracle $f : \{0, 1\}^{\lceil k \log_2 \ell \rceil} \to \{0, 1\}$ defined by $f(s) = 1$ if $j(\varphi_s(E)) = j(E')$ and $f(s) = 0$ otherwise, where $\varphi_s$ is the degree-$\ell^k$ isogeny obtained from $s$ by the procedure described above.
Proposition 24. Using Grover's search algorithm, there is an algorithm that solves Problem 9 in quantum time $O(\ell^{k/2}\, k \log k \log q)$ with a polynomial amount of memory.
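As a rough numerical companion to Proposition 24, the sketch below computes the length of the bit strings being searched and the usual estimate of about (π/4)·√N Grover iterations for a search space of size N with a single marked element. It ignores the per-iteration oracle cost of O(k log k log q) and is meant for toy parameters; the function name is hypothetical.

```python
import math


def grover_search_parameters(ell, k):
    """Return (bits, search_space_size, grover_iterations) for walks of length k.

    bits is the string length ceil(k * log2(ell)); the iteration count is the
    standard floor(pi/4 * sqrt(N)) estimate for one marked item among N.
    """
    bits = math.ceil(k * math.log2(ell))
    space = 2 ** bits
    iterations = math.floor(math.pi / 4 * math.sqrt(space))
    return bits, space, iterations
```

For instance, `grover_search_parameters(2, 10)` describes a search over 10-bit strings, that is 1024 candidates, for which about 25 Grover iterations suffice.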
Interestingly, the above approach is the best quantum attack against SIDH/SIKE with low memory. Marginal improvements are known (to improve on the polynomial factors). For example, it is possible to reduce the length of the walk by pre-computing all paths of length $k'$ from $E'$ and storing all the corresponding j-invariants in the circuit of the oracle, which then compares the j-invariant obtained after a walk of length $k - k'$ originating from $E$ with all the precomputed ones [178].

| Large-memory algorithms for small-degree isogenies
The search for a path in the $\ell$-isogeny graph between $\overline{E}$ and $\overline{E'}$ can be done with a meet-in-the-middle strategy. Indeed, finding a path of length $k$ from $\overline{E}$ to $\overline{E'}$ can be done by fixing $1 < k' < k$ and finding two paths, one of length $k'$ originating from $\overline{E}$ and the other of length $k - k'$ originating from $\overline{E'}$, that land in the same isomorphism class, that is, on curves that share the same j-invariant. Typically, $k' = k - k' = k/2$ (i.e. the meeting point is really in the middle), but this is not fundamentally required. Once the two paths are found, we have a path from $\overline{E}$ to $\overline{E'}$ (note that we know how to backtrack the length-$(k - k')$ path all the way to $\overline{E'}$ using dual isogenies). This process fits the framework of claw finding which, given $f : X = \{0, 1\}^{k_1} \to Z$ and $g : Y = \{0, 1\}^{k_2} \to Z$ for some set $Z$, consists in looking for $x, y$ such that $f(x) = g(y)$. Such a pair is called a claw. In the context of isogenies of small degree (parameterised by $k$), we assume that the claw is unique.
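The claw-finding formulation can be illustrated by its classical meet-in-the-middle baseline: tabulate f on all of X, then scan Y for a value of g already in the table. This is a minimal classical sketch, not the quantum algorithm; in the isogeny setting f and g would return j-invariants of walk end points, whereas here they are arbitrary Python callables and the helper names are hypothetical.

```python
from itertools import product


def find_claw(f, g, k1, k2):
    """Return (x, y) with f(x) == g(y), for x in {0,1}^k1 and y in {0,1}^k2.

    Uses memory proportional to 2^k1 (the table of f) and time about
    2^k1 + 2^k2 evaluations. Returns None if no claw exists.
    """
    table = {}
    for x in product((0, 1), repeat=k1):
        table.setdefault(f(x), x)   # keep the first preimage of each value
    for y in product((0, 1), repeat=k2):
        x = table.get(g(y))
        if x is not None:
            return x, y
    return None


def bits_to_int(bits):
    """Interpret a tuple of bits as a big-endian integer (helper for toy examples)."""
    value = 0
    for bit in bits:
        value = 2 * value + bit
    return value
```

A toy usage: with f(x) = 2·int(x) and g(y) = 2·int(y) + 1, except that g maps the string encoding 5 to f's value at the string encoding 11, the unique claw returned is the pair of bit strings encoding 11 and 5.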
To solve the claw finding problem with optimal circuit depth (at the cost of increased quantum memory), we can use Tani's algorithm [179]. It consists in a quantum random walk on the product of two Johnson graphs, $J_f := J(|X|, (|X||Y|)^{1/3})$ and $J_g := J(|Y|, (|X||Y|)^{1/3})$ (provided that $|Y| \le |X|^2$, which we will assume since our target case is $|X| = |Y|$). A vertex $(F, G) \in J_f \times J_g$ is marked if there is a pair $(x, y) \in F \times G$ such that $f(x) = g(y)$ (i.e. $F \times G$ contains a claw). To check if $(F, G)$ is a marked vertex, we sort all elements of $F \cup G$ with respect to their function value (for us the function values are j-invariants in $Z = \mathbb{F}_{p^2}$). Therefore, Cost(Setup) is the cost of evaluating $f$ on $F$ and $g$ on $G$ and then sorting all the elements. However, Cost(Update) only consists in the deletion of one element and the insertion of a new element in an already sorted list, which is efficient. Note that the memory used by this algorithm is QRAQM (due to the quantum walk framework).
Proposition 25 (Tani). The claw finding problem can be solved by a quantum algorithm using a circuit of depth and width $O((|X||Y|)^{1/3})$.
When searching for a degree-$\ell^k$ isogeny between representatives of $\overline{E}, \overline{E'}$, we can choose $|X| = |Y| = \ell^{k/2}$, which yields a circuit with depth and width in $\tilde{O}(\ell^{k/3})$ (to simplify notations, we incorporate the cost of the comparison oracle in $\tilde{O}$). In the case of SIDH/SIKE, $\ell = 2$ and $k = \frac{1}{2}\log(p)$, which yields depth and width in $\tilde{O}(p^{1/6})$, but a Depth × Width cost of $\tilde{O}(p^{1/3})$, which is worse than a Grover search that achieves $\tilde{O}(p^{1/4})$. There exists a natural time-memory tradeoff, which interpolates between Tani's algorithm and Grover search: one simply sets the vertex size of the Johnson graphs to a given parameter $R \le (|X||Y|)^{1/3}$, so that only $R$ elements of $X$ and $R$ elements of $Y$ are stored. This reduces the probability of a vertex being marked to $R^2/(|X||Y|)$. More tradeoffs between time, depth and width are possible, and optimisations of the resulting Depth × Width and Gate costs (thus without QRAM) have been studied in Refs. [16,180]. In all cases, due to the techniques being used, the quantum time complexity achievable for a given amount of quantum hardware (e.g. memory or processors) achieves at most a square-root improvement on the classical time complexity achievable with the corresponding amount of classical hardware.
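To compare the asymptotic figures quoted above on a concrete scale, the sketch below converts them to base-2 exponents for a given bit length of p, dropping all polylogarithmic factors hidden in the soft-O notation. The 434-bit prime used in the usage note is an illustrative assumption, and the resulting numbers are not concrete security estimates.

```python
from fractions import Fraction


def attack_exponents(log2_p):
    """Base-2 exponents of the costs p^(1/4), p^(1/6) and p^(1/3) for a prime
    p of about log2_p bits (polylog factors ignored)."""
    lp = Fraction(log2_p)
    return {
        "grover_time": lp / 4,              # memoryless Grover search
        "tani_depth": lp / 6,               # Tani's algorithm, circuit depth
        "tani_width": lp / 6,               # Tani's algorithm, circuit width
        "tani_depth_times_width": lp / 3,   # product of the two
    }
```

For a hypothetical 434-bit prime, `attack_exponents(434)` gives exponents of 108.5 for Grover, about 72.3 for the depth (and width) of Tani's algorithm, and about 144.7 for its Depth × Width product, which reproduces the ordering described in the text.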

| Algorithm for generic isogenies
Assuming we want to find an isogeny between two supersingular curves $E, E'$ defined over $\mathbb{F}_{p^2}$ without any guarantee that a small-degree map exists between them, there is a nontrivial method to achieve a speed-up over Grover's search without incurring the memory costs of Tani's algorithm. The generic isogeny problem is less relevant for cryptography than Problems 8 and 9, which gave rise to the most popular isogeny-based cryptosystems, but it enables the search for collisions for the Charles-Goren-Lauter hash function [181]. Aside from this, the generic isogeny problem is a fundamental problem for which quantum computers provide a non-trivial speed-up.
We combine techniques from Sections 10.1 and 10.2 to compute an isogeny between two given supersingular curves defined over $\mathbb{F}_{p^2}$ without any particular property. The high-level strategy we follow was first described by Delfs and Galbraith [182] and then adapted to the quantum setting by Biasse, Jao and Sankar [183]. It consists in searching for an isogeny path between $E$ and a curve $E_1$ defined over $\mathbb{F}_p$, and for an isogeny path between $E'$ and a curve $E_2$ defined over $\mathbb{F}_p$. Then, in a second stage, we find an isogeny between $E_1$ and $E_2$ using the methods of Section 10.1. Altogether, this yields an isogeny between $E$ and $E'$. This approach is illustrated by Figure 10.
The cost of computing an isogeny from $E_1$ to $E_2$ is negligible and immediately derives from the analysis presented in Section 10.1. Now, we turn to the computation of the isogenies $E \to E_1$ and $E' \to E_2$. There are $O(p)$ isomorphism classes of supersingular curves containing a representative defined over $\mathbb{F}_{p^2}$, among which $\tilde{\Omega}(\sqrt{p})$ have a representative defined over $\mathbb{F}_p$. This means that a fraction $\tilde{\Omega}(1/\sqrt{p})$ of the isomorphism classes contain a representative defined over $\mathbb{F}_p$. The $\ell$-isogeny graph for a prime $\ell \nmid p$ is a Ramanujan graph [176, Sec. 2]. This property allows us to evaluate the probability that an $\ell$-isogeny path of a given length reaches a certain subset of the vertices. If the path is long enough, then the distribution of the end point of the walk is close to uniform.
Proposition 26 (Prop. 2.1 of Ref. [176]). Let $G$ be a $k$-regular graph such that the eigenvalues $\lambda$ of the non-constant eigenvectors of its adjacency matrix satisfy $|\lambda| \le c$ for some $c < k$. Let $S \subseteq G$ be a subset of vertices and $x \in G$. Then a random walk of length at least $\frac{\log\left(2|G| / |S|^{1/2}\right)}{\log(k/c)}$ starting from $x$ lands in $S$ with probability at least $\frac{|S|}{2|G|}$.
The $\ell$-isogeny graph is $k = (\ell + 1)$-regular, and the nontrivial eigenvalues satisfy $|\lambda| \le c = 2\sqrt{\ell}$. So if we choose $\ell = 3$, $G$ the 3-isogeny graph, and $S$ the set of isomorphism classes with a representative over $\mathbb{F}_p$, we obtain that a random walk of length in $O(\log(p))$ hits a supersingular curve defined over $\mathbb{F}_p$ with probability at least $\tilde{\Omega}(1/\sqrt{p})$. Using a Grover search over all 3-isogeny paths of the length given in Proposition 26, we obtain a cost of $\tilde{O}(p^{1/4})$. As mentioned before, this cost is the bottleneck of the computation of an isogeny between $E$ and $E'$, since it is asymptotically larger than the cost of the computation of $E_1 \to E_2$.
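The orders of magnitude in this argument can be reproduced numerically. The sketch below assumes that the walk-length bound of Proposition 26 reads log(2|G|/|S|^{1/2}) / log(k/c), and approximates |G| by p/12 (the approximate number of supersingular j-invariants) and |S| by √p with logarithmic factors dropped; both approximations and the function names are assumptions made for illustration.

```python
import math


def walk_length_bound(log2_G, log2_S, k, c):
    """Bound log(2|G| / |S|^(1/2)) / log(k / c), with sizes given as base-2 logs."""
    return (1 + log2_G - 0.5 * log2_S) * math.log(2) / math.log(k / c)


def supersingular_walk_estimate(log2_p, ell=3):
    """Return (walk length, log2 of success probability, log2 of Grover iterations)."""
    log2_G = log2_p - math.log2(12)      # about p / 12 isomorphism classes
    log2_S = log2_p / 2                  # about sqrt(p) classes defined over F_p
    k, c = ell + 1, 2 * math.sqrt(ell)   # regularity and Ramanujan eigenvalue bound
    length = math.ceil(walk_length_bound(log2_G, log2_S, k, c))
    log2_success = log2_S - 1 - log2_G   # lower bound |S| / (2|G|)
    log2_grover = -log2_success / 2      # about 1 / sqrt(success probability)
    return length, log2_success, log2_grover
```

For a hypothetical 256-bit p and ℓ = 3, this gives a walk length of roughly 900 steps, which grows linearly in log p, and a Grover iteration count whose base-2 logarithm is close to 256/4 = 64, in line with the $\tilde{O}(p^{1/4})$ cost stated above.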

| Open problems
In the case of SIDH, there is additional information available to the adversary that the algorithms presented so far in this section do not exploit. Indeed, the security of SIDH relies not only on the difficulty of finding isogenies but also on finding them given some additional information (the evaluation of the isogeny on a torsion subgroup) that is necessary for the key exchange.
Problem 10 (SIDH). Let $E$ be an elliptic curve. Let $N_1, N_2$ be two smooth coprime integers (powers of 2 and 3 in the case of SIKE). Let $K$ be a cyclic subgroup of order $N_1$ of $E$ chosen uniformly at random. Let $\phi : E \to E/K$ be the isogeny with kernel $K$. Given the supersingular elliptic curves $E$ and $E/K$, and given the restriction of $\phi$ to $E[N_2]$, compute $K$.

Question 1
Can we design methods that take advantage of the information of the restriction of $\phi$ to $E[N_2]$ to solve Problem 10 more efficiently than the best methods to solve Problem 9?
In SIKE, one has $N_1 \simeq N_2$. But the torsion points allow for some attacks in the case of unbalanced or overstretched $(N_1, N_2)$, as seen, for example, in Ref. [184], which includes quantum attacks based on claw-finding. In Ref. [185], the authors showed the following.
Theorem 27 (Theorem 4.27 in [185]). If $N_2 > p N_1^4$, then the SIDH problem can be reduced (with additional heuristics) to the abelian hidden shift problem, and solved in quantum subexponential time.
This is, to the best of our knowledge, the first application of quantum hidden shift algorithms in dedicated cryptanalysis of SIDH, outside the setting where a group action is already given. Another natural angle to extend the scope of quantum hidden shift algorithms would be to take advantage of orientations, which are injective morphisms $\iota : \mathcal{O} \to \mathrm{End}(E)$ where $\mathcal{O}$ is an imaginary quadratic order [Ref. [171], Def. 2]. It seems that in the case of two curves sharing the same orientation, the isogeny problem reduces to Problem 8.

Question 2
Can the quantum hidden shift framework be applied to more general classes of supersingular curves defined over $\mathbb{F}_{p^2}$ by using the concept of orientation?