Papers

Static cost analysis

Denotational semantics as a foundation for cost recurrence extraction for functional languages (with Daniel R. Licata). Manuscript.
A standard method for analyzing the asymptotic complexity of a program is to extract a recurrence that describes its cost in terms of the size of its input, and then to compute a closed-form upper bound on that recurrence. In practice there is rarely a formal argument that the recurrence is in fact an upper bound; indeed, there is usually no formal connection between the program and the recurrence at all. Here we develop a method for extracting recurrences from functional programs in a higher-order language with let-polymorphism that provably bound their operational cost. The method consists of two phases. In the first phase, a monadic translation is performed to extract a cost-annotated version of the original program. In the second phase, the extracted program is interpreted in a model. The key feature of this second phase is that different models describe different notions of size. This plays out specifically for values of inductive type, where different notions of size may be appropriate depending on the analysis, and for polymorphic functions, where we show that the notion of size for a polymorphic function can be described formally as the data that is common to the notions of size of its instances. We give several examples of different models that formally justify various informal extract-a-recurrence-and-solve cost analyses to show the applicability of our approach.

Availability: arXiv.

Denotational recurrence extraction for amortized analysis (with Joseph W. Cutler and Daniel R. Licata). To appear in ICFP 2020.
A typical way of analyzing the time complexity of functional programs is to extract a recurrence expressing the running time of the program in terms of the size of its input, and then to solve the recurrence to obtain a big-O bound. For recurrence extraction to be compositional, it is also necessary to extract recurrences for the size of outputs of helper functions. Previous work has developed techniques for using logical relations to state a formal correctness theorem for a general recurrence extraction translation: a program is bounded by a recurrence when the operational cost is bounded by the extracted cost, and the output value is bounded, according to a value bounding relation defined by induction on types, by the extracted size. This previous work supports higher-order functions by viewing recurrences as programs in a lambda-calculus, or as mathematical entities in a denotational semantics thereof. In this paper, we extend these techniques to support amortized analysis, where costs are rearranged from one portion of a program to another to achieve more precise bounds. We give an intermediate language in which programs can be annotated according to the banker’s method of amortized analysis; this language has an affine type system to ensure credits are not spent more than once. We give a recurrence extraction translation of this language into a recurrence language, a simply-typed lambda-calculus with a cost type, and state and prove a bounding logical relation expressing the correctness of this translation. The recurrence language has a denotational semantics in preorders, and we use this semantics to solve recurrences, e.g., analyzing binary counters and splay trees.

Availability: coming soon, once we make a few revisions.

Recurrence extraction for functional programs through call-by-push-value (with G. A. Kavvos, Edward Morehouse, and Daniel R. Licata). Proceedings of the ACM on Programming Languages 4(POPL), Article 15, 2019. DOI: 10.1145/3371083.
The main way of analyzing the complexity of a program is that of extracting and solving a recurrence that expresses its running time in terms of the size of its input. We develop a method that automatically extracts such recurrences from the syntax of higher-order recursive functional programs. The resulting recurrences, which are programs in a call-by-name language with recursion, explicitly compute the running time in terms of the size of the input. In order to achieve this in a uniform way that covers both call-by-name and call-by-value evaluation strategies, we use Call-by-Push-Value (CBPV) as an intermediate language. Finally, we use domain theory to develop a denotational cost semantics for the resulting recurrences.

Availability: POPL 2020 proceedingsarXiv.

Denotational cost semantics for functional languages with inductive types (with D. Licata and R. Ramyaa). In Fisher, K. and Reppy, J. (eds.), Proceedings of the 20th ACM SIGPLAN International Conference on Functional Programming (ICFP 2015), pp. 140-151, 2015.  DOI: 10.1145/2784731.2784749.
A central method for analyzing the asymptotic complexity of a functional program is to extract and then solve a recurrence that expresses evaluation cost in terms of input size. The relevant notion of input size is often specific to a datatype, with measures including the length of a list, the maximum element in a list, and the height of a tree. In this work, we give a formal account of the extraction of cost and size recurrences from higher-order functional programs over inductive datatypes. Our approach allows a wide range of programmer-specified notions of size, and ensures that the extracted recurrences correctly predict evaluation cost. To extract a recurrence from a program, we first make costs explicit by applying a monadic translation from the source language to a complexity language, and then abstract datatype values as sizes. Size abstraction can be done semantically, working in models of the complexity language, or syntactically, by adding rules to a preorder judgement. We give several different models of the complexity language, which support different notions of size. Additionally, we prove by a logical relations argument that recurrences extracted by this process are upper bounds for evaluation cost; the proof is entirely syntactic and therefore applies to all of the models we consider.

Availability: ICFP 2015 proceedings – arXiv.

A static cost analysis for a higher-order language (with J. Paykin and J.S. Royer). In Might, M. and Van Horn, D. (eds.), Proceedings of the 7th Workshop on Programming Languages Meets Program Verification, pages 25-34. ACM Press, 2013.
We develop a static complexity analysis for a higher-order functional language with structural list recursion. The complexity of an expression is a pair consisting of a cost and a potential. The former is defined to be the size of the expression’s evaluation derivation in a standard big-step operational semantics. The latter is a measure of the “future” cost of using the value of that expression. A translation function \Vert\cdot\Vert maps target expressions to complexities. Our main result is the following Soundness Theorem: If t is a term in the target language, then the cost component of \Vert t\Vert is an upper bound on the cost of evaluating t. The proof of the Soundness Theorem is formalized in Coq, providing certified upper bounds on the cost of any expression in the target language.

Availability: PLPV 2013 proceedingsarXiv.

Implicit computational complexity

Two algorithms in search of a type system (with J. Royer). Theory of Computing Systems 45(4):787-821, 2009. This is the full version of our CIE’07 paper.
The authors’ \mathsf{ATR} programming formalism is a version of call-by-value \mathsf{PCF} under a complexity-theoretically motivated type system. \mathsf{ATR} programs run in type-2 polynomial-time and all standard type 2 basic feasible functionals are \mathsf{ATR}-definable (\mathsf{ATR} types are confined to levels 0, 1, and~ 2). A limitation of the original version of \mathsf{ATR} is that the only directly expressible recursions are tail-recursions. Here we extend \mathsf{ATR} so that a broad range of affine recursions are directly expressible. In particular, the revised \mathsf{ATR} can fairly naturally express the classic insertion- and selection-sort algorithms, thus overcoming a sticking point of most prior implicit-complexity-based formalisms. The paper’s main work is in refining the original time-complexity semantics for \mathsf{ATR} to show that these new recursion schemes do not lead out of the realm of feasibility.

Availability: Theory of Computing SystemsarXiv.

Time-complexity semantics for feasible affine recursions (with J. Royer). In Cooper, S.B., Lowe, B.L., and Sorbi, A. (eds.), Computation in the Real World (Proceedings of Computability in Europe 2007, Siena), vol. 4497 of Lecture Notes in Computer Science, Springer-Verlag, Berlin, 2007. This is the conference version of our Theory of Computing Systems paper.
The authors’ \mathsf{ATR} programming formalism is a version of call-by-value \mathsf{PCF} under a complexity-theoretically motivated type system. \mathsf{ATR} programs characterize the type-level \leq 2 basic feasible functions (\mathsf{ATR}-types are confined to levels 0, 1, and 2). A limitation of the original version of \mathsf{ATR} is that the only directly expressible recursions are tail-recursions. Here we extend \mathsf{ATR} so that a broad range of affine recursions are directly expressible. In particular, the revised \mathsf{ATR} can fairly naturally express the classic insertion- and selection-sort algorithms, thus overcoming a sticking point of most prior implicit-complexity-based formalisms. The paper’s main work is in extending and simplifying the original time-complexity semantics for \mathsf{ATR} to develop a set of tools for extracting and solving the higher-type recurrences arising from feasible affine recursions.

Availability: CIE 2007 proceedingsarXiv (extended version).

Adventures in time and space (with J. Royer). Logical Methods in Computer Science 3(1):1-53, 2007. This is the full veresion of our POPL ’06 paper.
This paper investigates what is essentially a call-by-value version of \mathsf{PCF} under a complexity-theoretically motivated type system. The programming formalism, \mathsf{ATR}, has its first-order programs characterize the polynomial-time computable functions, and its second-order programs characterize the type-2 basic feasible functionals of Mehlhorn and of Cook and Urquhart. (The \mathsf{ATR}-types are confined to levels 0, 1, and 2.) The type system comes in two parts, one that primarily restricts the sizes of values of expressions and a second that primarily restricts the time required to evaluate expressions. The size-restricted part is motivated by Bellantoni and Cook’s and Leivant’s implicit characterizations of polynomial-time. The time-restricting part is an affine version of Barber and Plotkin’s DILL. Two semantics are constructed for \mathsf{ATR}. The first is a pruning of the naive denotational semantics for \mathsf{ATR}. This pruning removes certain functions that cause otherwise feasible forms of recursion to go wrong. The second semantics is a model for \mathsf{ATR}‘s time complexity relative to a certain abstract machine. This model provides a setting for complexity recurrences arising from \mathsf{ATR} recursions, the solutions of which yield second-order polynomial time bounds. The time-complexity semantics is also shown to be sound relative to the costs of interpretation on the abstract machine.

Availability: Logical Methods in Computer Science.

Adventures in time and space (with J. Royer). In Jones, S.P. (ed.), 33rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL 2006), pp. 168-179, 2006. This is the conference version of our LMCS 2007 paper.
This paper investigates what is essentially a call-by-value version of $\PCF$ under a complexity-theoretically motivated type system. The programming formalism, \mathsf{ATR}_1, has its first-order programs characterize the poly-time computable functions, and its second-order programs characterize the type-2 basic feasible functionals of Mehlhorn and of Cook and Urquhart. (The \mathsf{ATR}_1-types are confined to levels 0, 1, and 2.) The type system comes in two parts, one that primarily restricts the sizes of values of expressions and a second that primarily restricts the time required to evaluate expressions. The size-restricted part is motivated by Bellantoni and Cook’s and Leivant’s implicit characterizations of poly-time. The time-restricting part is an affine version of Barber and Plotkin’s DILL. Two semantics are constructed for \mathsf{ATR}_1. The first is a pruning of the naive denotational semantics for \mathsf{ATR}_1. This pruning removes certain functions that cause otherwise feasible forms of recursion to go wrong. The second semantics is a model for \mathsf{ATR}_1‘s time complexity relative to a certain abstract machine. This model provides a setting for complexity recurrences arising from \mathsf{ATR}_1 recursions, the solutions of which yield second-order polynomial time bounds. The time-complexity semantics is also shown to be sound relative to the costs of interpretation on the abstract machine.

Availability: POPL 2006.

Minimization and \mathbf{NP} multifunctions (with C. Pollett). Theoretical Computer Science 318(1-2):105-119, 2004.
The implicit characterizations of the polynomial-time computable functions \mathbf{FP} given by Bellantoni-Cook and Leivant suggest that this class is the complexity-theoretic analog of the primitive recursive functions. Hence it is natural to add minimization operators to these characterizations and investigate the resulting class of partial functions as a candidate for the analog of the partial recursive functions. We do so in this paper for Cobham’s definition of \mathbf{FP} by bounded recursion and for Bellantoni-Cook’s safe recursion and prove that the resulting classes capture exactly \mathbf{NPMV}, the nondeterministic polynomial-time computable partial multifunctions. We also consider the relationship between our schemes and a notion of nondeterministic recursion defined by Leivant and show that the latter characterizes the total functions of \mathbf{NPMV}. We view these results as giving evidence that \mathbf{NPMV} is the appropriate analog of partial recursive. This view is reinforced by earlier results of Spreen and Stahl who show that for many of the relationships between partial recursive functions and r.e. sets, analogous relationships hold between \mathbf{NPMV} and \mathbf{NP} sets. Furthermore, since \mathbf{NPMV} is obtained from \mathbf{FP} in the same way as the recursive functions are obtained from the primitive recursive functions (when defined via function schemes), this also gives further evidence that \mathbf{FP} is properly seen as playing the role of primitive recursion.
Ramified recurrence with dependent types. In Abramsky, S. (ed.), Typed \lambda-Calculi and Applications (TLCA 2001), vol. 2044 of Lecture Notes in Computer Science, Springer-Verlag, Berlin, pp. 91-105, 2001.
We present a version of Godel’s system T in which the types are ramified in the style of Leivant and a system of dependent typing is introduced. The dependent typing allows the definition of recursively defined types, where the recursion is controlled by ramification; these recursively defined types in turn allow the definition of functions by repeated iteration. We then analyze a subsystem of the full system and show that it defines exactly the primitive recursive functions. This result supports the view that when data use is regulated (for example, by ramification), standard function constructions are intimately connected with standard type-theoretic constructions.

Anonymity

Effectiveness and detection of denial-of-service attacks in Tor (with S. DeFabbia-Kane, D. Krizanc, and M. Liberatore). Transactions on Information Systems Security 15(3):11:1-11:25, 2012.
Tor is one of the more popular systems for anonymizing near real-time communications on the Internet. Borisov et al. proposed a denial of service based attack on Tor (and related systems) that significantly increases the probability of compromising the anonymity provided. In this paper, we analyze the effectiveness of the attack using both an analytic model and simulation. We also describe two algorithms for detecting such attacks, one deterministic and proved correct, the other probabilistic and verified in simulation.

Availability: Transactions on Information Systems SecurityarXiv.