Domains for Higher-Order GamesMatthew Hague and Roland Meyer and Sebastian MuskallaRoyal Holloway University of London and TU Braunschweig and TU Braunschweig |
Abstract: We study two-player inclusion games played over word-generating higher-order recursion schemes. While inclusion checks are known to capture verification problems, two-player games generalize this relationship to program synthesis. In such games, non-terminals of the grammar are controlled by opposing players. The goal of the existential player is to avoid producing a word that lies outside of a regular language of safe words.We contribute a new domain that provides a representation of the winning region of such games. Our domain is based on (functions over) potentially infinite Boolean formulas with words as atomic propositions. We develop an abstract interpretation framework that we instantiate to abstract this domain into a domain where the propositions are replaced by states of a finite automaton. This second domain is therefore finite and we obtain, via standard fixed-point techniques, a direct algorithm for the analysis of two-player inclusion games. We show, via a second instantiation of the framework, that our finite domain can be optimized, leading to a (k+1)EXP algorithm for order-k recursion schemes. We give a matching lower bound, showing that our approach is optimal. Since our approach is based on standard Kleene iteration, existing techniques and tools for fixed-point computations can be applied.
Inclusion checking has recently received considerable attention [55, 23, 1, 2, 37]. One of the reasons is a new verification loop, which invokes inclusion as a subroutine in an iterative fashion. The loop has been proposed by Podelski et al. for the safety verification of recursive programs [33], and then been generalized to parallel and parameterized programs [43, 21, 19] and to liveness [20]. The idea of Podelski’s loop is to iteratively approximate unsound data flow in the program of interest, and add the approximations to the specification. Consider a program with control-flow language CF that is supposed to satisfy a safety specification given by a regular language R. If the check CF⊆ R succeeds, then the program is correct as the data flow only restricts the set of computations. If a computation w∈ CF is found that lies outside R, then it depends on the data flow whether the program is correct. If data is handled correctly, w is a counterexample to R. Otherwise, w is generalized to a regular language S of infeasible computations. We set R = R ∪ S and repeat the procedure.
Podelski’s loop has also been generalized to synthesis [36, 45]. In that setting, the program is assumed to have two kinds of non-determinism. Some of the non-deterministic transitions are understood to be controlled by the environment. They provide inputs that the system has to react to, and are also referred to as demonic non-determinism. In contrast, the so-called angelic non-determinism are the alternatives of the system to react to an input. The synthesis problem is to devise a controller that resolves the angelic non-determinism in a way that a given safety specification is met. Technically, the synthesis problem corresponds to a two-player perfect information game, and the controller implements a winning strategy for the system player. When generalizing Podelski’s loop to the synthesis problem, the inclusion check thus amounts to solving a strategy-synthesis problem.
Our motivation is to synthesize functional programs with Podelski’s loop. We assume the program to be given as a non-deterministic higher-order recursion scheme where the non-terminals are assigned to two players. One player is the system player who tries to enforce the derivation of words that belong to a given regular language. The other player is the environment, trying to derive a word outside the language. The use of the corresponding strategy-synthesis algorithm in Podelski’s loop comes with three characteristics: (1) The algorithm is invoked iteratively, (2) the program is large and the specification is small, and (3) the specification is non-deterministic. The first point means that the strategy synthesis should not rely on costly precomputation. Moreover, it should have the chance to terminate early. The second says that the cost of the computation should depend on the size of the specification, not on the size of the program. Computations on the program, in particular iterative ones, should be avoided. Together with the third characteristic, these two consequences rule out reductions to reachability games. The required determinization would mean a costly precomputation, and the reduction to reachability would mean a product with the program. This discussion in particular forbids a reduction of the strategy-synthesis problem to higher-order model checking [47], which indeed can be achieved (see the full version [29] for a comparison to intersection types [42]). Instead, we need a strategy synthesis that can directly deal with non-deterministic specifications.
We show that the winning region of a higher-order inclusion game w.r.t. a non-deterministic right-hand side can be computed with a standard fixed-point iteration. Our contribution is a domain suitable for this computation. The key idea is to use Boolean formulas whose atomic propositions are the states of the targeted finite automaton. While a formula-based domain has recently been proposed for context-free inclusion games [36] (and generalized to infinite words [45]), the generalization to higher-order is new. Consider a non-terminal that is ground and for which we have computed a formula. The Boolean structure reflects the alternation among the players in the plays that start from this non-terminal. The words generated along the plays are abstracted to sets of states from which these words can be accepted. Determining the winner of the game is done by evaluating the formula when sets of states containing the initial state are assigned the value true. To our surprise, the above domain did not give the optimal complexity. Instead, it was possible to further optimize it by resolving the determinization information. Intuitively, the existential player can also resolve the non-determinism captured by a set. Crucially, our approach handles the non-determinism of the specification inside the analysis, without preprocessing.
Besides offering the characteristics that are needed for Podelski’s loop, our development also contributes to the research program of effective denotational semantics, as recently proposed by Salvati and Walukiewicz [53] as well as Grellois and Melliès [25, 25], with [5, 50] being early works in this field. The idea is to solve verification problems by computing the semantics of a program in a suitable domain. Salvati and Walukiewicz studied the expressiveness of greatest fixed-point semantics and their correspondence to automata [53], and constructions of enriched Scott models for parity conditions [52, 51]. A similar line of investigation has been followed in recent work by Grellois and Melliès [26, 27]. Hofmann and Chen considered the verification of more restricted ω-path properties with a focus on the domain [34]. They show that explicit automata constructions can be avoided and give a domain that directly captures subsets (so-called patches) of the ω-language. The work has been generalized to higher order [35]. Our contribution is related in that we focus on the domain (suitable for capturing plays).
Besides the domain, the correctness proof may be of interest. We employ an exact fixed-point transfer result as known from abstract interpretation. First, we give a semantic characterization showing that the winning region can be captured by an infinite model (a greatest fixed point). This domain has as elements (potentially infinite) sets of (finite) Boolean formulas. The formulas capture plays (up to a certain depth) and the atomic propositions are terminal words. The infinite set structure is to avoid infinite syntax. Then we employ the exact fixed-point transfer result to replace the terminals by states and get rid of the sets. The final step is another exact fixed-point transfer that justifies the optimization. We give a matching lower bound. The problem is (k+1)EXP-complete for order-k schemes.
The relationship between recursion schemes and extensions of pushdown automata has been well studied [16, 17, 38, 30]. This means algorithms for recursion schemes can be transferred to extensions of pushdown automata and vice versa. In the sequel, we will use pushdown automata to refer to pushdown automata and their family of extensions.
The decidability of Monadic Second Order Logic (MSO) over trees generated by recursion schemes was first settled in the restricted case of safe schemes by Knapik et al. [38] and independently by Caucal [14]. This result was generalized to all schemes by Ong [47]. Both of these results consider deterministic schemes only.
Related results have also been obtained in the consideration of games played over the configuration graphs of pushdown automata [54, 13, 39, 30]. Of particular interest are saturation methods for pushdown games [7, 22, 12, 8, 31, 32, 9]. In these works, automata representing sets of winning configurations are constructed using fixed-point computations.
A related approach pioneered by Kobayashi et al. operating directly on schemes is that of intersection types [41, 42], where types embedding a property automaton are assigned to terms of a scheme. Recently, saturation techniques were transferred to intersection types by Broadbent and Kobayashi [10]. The typing algorithm is then a least fixed-point computation analogous to an optimized version of our Kleene iteration, restricted to deterministic schemes. This has led to one of the most competitive model-checking tools for schemes [40].
One may reduce our language inclusion problems to many of the above works. E.g. from an inclusion game for schemes, we may build a game over an equivalent kind of pushdown automaton and take the product with a determinization of the NFA. This obtains a reachability game over a pushdown automaton that can be solved by any of the above methods. However, such constructions are undesirable for iterative invocations as in Podelski’s loop.
We already discussed the relationship to model-theoretic verification algorithms. Abstract interpretation has also been used by Ramsay [49], Salvati and Walukiewicz [52, 51], and Grellois and Melliès [25, 24] for verification. The former used a Galois connection between safety properties (concrete) and equivalence classes of intersection types (abstract) to recreate decidability results known in the literature. The latter two strands gives a semantics capable of computing properties expressed in MSO. Indeed, abstract interpretation has long been used for static analysis of higher-order programs [4].
Let (D, ≤) be a partial order with set D and (partial) ordering ≤ on D. We call (D, ≤) pointed if there is a greatest element, called the top element and denoted by ⊤ ∈ D. A descending chain in D is a sequence (di)i ∈ N of elements in D with di ≥ di+1. We call (D, ≤) ω-complete if every descending chain has a greatest lower bound, called the meet or the infimum, and denoted by ⊓i ∈ N di. If (D, ≤) is pointed and ω-complete, we call it a pointed ω-complete partial order (cppo). In the following, we will only consider partial orders that are cppos. Note, cppo is usually used to refer to the dual concept, i.e. partial orders with a least element and least upper bounds for ascending chains.
A function f : D → D is ⊓-continuous if for all descending chains (di)i ∈ N we have f(⊓i ∈ N di) = ⊓i ∈ N f(di). We call a function f : D → D monotonic if for all d, d′ ∈ D, d ≤ d′ implies f(d) ≤ f(d′). Any function that is ⊓-continuous is also monotonic. For a monotonic function, ⊤ ≥ f(⊤) ≥ f2 (⊤) = f(f(⊤)) ≥ f3 (⊤) ≥ … is a descending chain. If the function is ⊓-continuous, then ⊓i ∈ N fi (⊤) is by Kleene’s theorem the greatest fixed point of f, i.e. f(⊓i ∈ N fi (⊤)) = ⊓i ∈ N fi (⊤) and ⊓i ∈ N fi (⊤) is larger than any other element d with f(d) = d. We also say ⊓i ∈ N fi (⊤) is the greatest solution to the equation x = f(x).
A lattice satisfies the descending chain condition (DCC) if every descending chain has to be stationary at some point. In this case ⊓i ∈ N fi (⊤) = ⊓i = 0i0 fi (⊤) for some index i0 in N. With this, we can compute the greatest fixed point: Starting with ⊤, we iteratively apply f until the result does not change. This process is called Kleene iteration. Note that finite cppos, i.e. with finitely many elements in D, trivially satisfy the descending chain condition.
A non-deterministic finite automaton (NFA) is a tuple A = (QNFA, Γ, δ, q0, Qf) where QNFA is a finite set of states, Γ is a finite alphabet, δ ⊆ QNFA × Γ × QNFA is a (non-deterministic) transition relation, q0 ∈ QNFA is the initial state, and Qf ⊆ QNFA is a set of final states. We write q –[a]→ q′ to denote (q, a, q′) ∈ δ. Moreover, given a word w = a1⋯ aℓ, we write q –[w]→ q′ whenever there is a sequence of transitions, also called run, q1 –[a1]→ q2 –[a2]→ ⋯ –[aℓ]→ qℓ+1 with q1 = q and qℓ+1 = q′. The run is accepting if q = q0 and q′ ∈ Qf. The language of A is L(A) = {w | q0 –[w]→ q ∈ Qf} .
We introduce higher-order recursion schemes, schemes for short, following the presentation in [28]. Schemes can be understood as grammars generating the computation trees of programs in a functional language. As is common in functional languages, we need a typing discipline. To avoid confusion with type-based approaches to higher-order model checking [41, 48, 42], we refer to types as kinds. Kinds define the functionality of terms, without specifying the data domain. Technically, the only data domain is the ground kind o, from which (potentially higher-order) function kinds are derived by composition:
κ ::= o ∣ ( κ1 → κ2 ) . |
We usually omit the brackets and assume that the arrow associates to the right. The number of arguments to a kind is called the arity. The order defines the functionality of the arguments: A first-order kind defines functions that act on values, a second-order kind functions that expect functions as parameters. Formally, we have
| = 0, |
| = 0, | ||||||||||||||||||||||
|
|
|
|
Let Κ be the set of all kinds. Higher-order recursion schemes assign kinds to symbols from different alphabets, namely non-terminals, terminals, and variables. Let Γ be a set of such kinded symbols. For each kind κ, we denote by Γκ the restriction of Γ to the symbols with kind κ. The terms Tκ(Γ) of kind κ over Γ are defined by simultaneous induction over all kinds. They form the smallest set satisfying
If term t is of kind κ, we also write t ∶ κ. We use T(Γ) for the set of all terms over Γ. We say a term is λ-free if it contains no sub-term of the form λ x . t. A term is variable-closed if all occurring variables are bound by a preceding λ-expression.
The semantics of G is defined by rewriting subterms according to the rules in R. A context is a term C[•] ∈ T (Γ ⨃ { • ∶ o }) in which • occurs exactly once. Given a context C[•] and a term t:o, we obtain C[t] by replacing the unique occurrence of • in C[•] by t. With this, t ⇒G t′ if there is a context C[•], a rule F = λ x1 … λ xn.e, and a term F t1 … tn:o such that t = C[F t1 … tn] and t′ = C[e[x1 ↦ t1, …, xn ↦ tn]]. In other words, we replace one occurrence of F in t by a right-hand side of a rewriting rule, while properly instantiating the variables. We call such a replaceable F t1 … tn a reducible expression (redex). The rewriting step is outermost to innermost (OI) if there is no redex that contains the rewritten one as a proper subterm. The OI-language L(G) of G is the set of all (finite, ranked, labeled) trees T over the terminal symbols that can be created from the initial symbol S via OI-rewriting steps. We will restrict the rewriting relation to OI-rewritings in the rest of this paper. Note, all words derivable by IO-rewriting are also derivable with OI-rewriting.
We consider word-generating schemes, i.e. schemes with terminals T⨃ {$:o} where exactly one terminal symbol $ has kind o and all others are of kind o→ o. The generated trees have the shape a1 (a2 (⋯ (ak $))), which we understand as the finite word a1 a2 … ak ∈ T*. We also see L(G) as a language of finite words.
The above schemes are non-deterministic in that several rules may rewrite a non-terminal. We associate with a non-deterministic scheme G=(V, N, T, R, S) a deterministic scheme Gdet with exactly one rule per non-terminal. Intuitively, Gdet makes the non-determinism explicit with new terminal symbols.
Formally, let F:κ be a non-terminal with rules F= t1 to F= tℓ. We may assume each ti = λ x1 … λ xk. ei, where ei is λ-free. We introduce a new terminal symbol opF: o → o → … → o of arity ℓ. Let the set of all these terminals be Tdet={opF | F∈ N}. The set of rules Rdet now consists of a single rule for each non-terminal, namely F= λ x1 … λ xk. opF e1 ⋯ eℓ. The original rules in R are removed. This yields Gdet=(V, N, T⨃ Tdet, Rdet, S). The advantage of resolving the non-determinism explicitly is that we can give a semantics to non-deterministic choices that depends on the non-terminal instead of having to treat non-determinism uniformly.
Let G=(V, N, T, R, S) be a deterministic scheme. A model of G is a pair M=(D, I), where D is a family of domains (D(κ))κ∈Κ that satisfies the following: D(o) is a cppo and D(κ1→ κ2)=Cont(D(κ1), D(κ2)). Here, Cont(A, B) is the set of all ⊓-continuous functions from domain A to B. We comment on this cppo in a moment. The interpretation I: T→ D assigns to each terminal s:κ an element I(s)∈D(κ).
The ordering on functions is defined component-wise, f ≤κ1 → κ2 g if (f x) ≤κ2 (g x) for all x∈D(κ1). For each κ, we denote the top element of D(κ) by ⊤κ. For the ground kind, ⊤o exists since D(κ) is a cppo, and ⊤κ1 → κ2 is the function that maps every argument to ⊤κ2. The meet of a descending chain of functions (fi)i ∈ N is the function defined by (⊓κ1 → κ2 (fi)i ∈ N) x = ⊓κ2 (fi x)i ∈ N. Note that the sequence on the right-hand side is a descending chain. The semantics of terms defined by a model is a function
M〚−〛:T→ (N⨃ V↛ D)→D . |
that assigns to each term built over the non-terminals and terminals again a function. This function expects a valuation ν:N⨃ V↛ D and returns an element from the domain. A valuation is a partial function that is defined on all non-terminals and the free variables. We lift ⊓ to descending chains of valuations with (⊓i ∈ N νi)(y) = ⊓i ∈ N (νi(y)) for y ∈ N ⨃ V. We obtain that the set of such valuations is a cppo where the greatest elements are those valuations which assign the greatest elements of the appropriate domain to all arguments.
Since the right-hand sides of the rules in the scheme are variable-closed, we do not need a variable valuation for them. We need the variable valuation, however, whenever we proceed by induction on the structure of terms. The semantics is defined by such an induction:
|
We show that M〚t〛 is ⊓-continuous for all terms t. This follows from continuity of the functions in the domain, but requires some care when handling application.
Given M, the rules F1=t1,…, Fk=tk of the (deterministic) scheme give a function
rhsM : (N → D) → (N → D) , where rhsM(ν)(Fj) = M〚tj〛 ν . |
Since the right-hand sides are variable-closed, the M〚tj〛 are functions in the non-terminals. Provided M〚t1〛 to M〚tk〛 are ⊓-continuous (in the valuation of the non-terminals), the function rhsM will be ⊓-continuous. This allows us to apply Kleene iteration as follows. The initial value is the greatest element σM0 where σM0(Fj) = ⊤j with ⊤j the top element of D(κj). The (i+1)th approximant is computed by evaluating the right-hand side at the ith solution, σMi+1 = rhsM(σMi). The greatest fixed point is the tuple σM defined below. It can be understood as the greatest solution to the equation ν = rhsM(ν). We call this greatest solution σM the semantics of the scheme in the model.
σM = ⊓i ∈ N σMi = ⊓i ∈ N rhsMi (σM0) |
Our goal is to solve higher-order games, whose arena is defined by a scheme. We assume that the derivation process is controlled by two players. To this end, we divide the non-terminals of a word-generating scheme into those owned by the existential player ◇ and those owned by the universal player □. Whenever a non-terminal is to be replaced during the derivation, it is the owner who chooses which rule to apply. The winning condition is given by an automaton A, Player ◇ attempts to produce a word that is in L(A), while Player □ attempts to produce a word outside of L(A).
A play of the game is a sequence of OI-rewriting steps. Since terms generate words, it is unambiguous which term forms the next redex to be rewritten. In particular, all terms are of the form a1 ( a2 ( ⋯ (ak (t) ) ) ), where t is either $ or a redex F t1 ⋯ tm. If O(F) = ◇ then Player ◇ chooses a rule F = λ x1 … λ xm . e to apply, else Player □ chooses the rule. This moves the play to a1 (a2 (⋯ (ak e[x1 ↦ t1, …, xm ↦ tm]))).
Each play begins at the initial non-terminal S, and continues either ad infinitum or until a term a1 (a2 (⋯ (ak $))), understood as the word w=a1… ak, is produced. Infinite plays do not produce a word and are won by Player ◇. Finite maximal plays produce such a word w. Player ◇ wins whenever w ∈ L(A), Player □ wins if w ∈ L(A). Since the winning condition is Borel, either Player ◇ or Player □ has a winning strategy [44].
Our contribution is a fixed-point algorithm to decide HOG. We derive it in three steps. First, we develop a concrete model for higher-order games whose semantics captures the above winning condition. Second, we introduce a framework that for two models and a mapping between them guarantees that the mapping of the greatest fixed point with respect to the one model is the greatest fixed point with respect to the other model. Finally, we introduce an abstract model that uses a finite ground domain. The solution of HOG can be read off from the semantics in the abstract model, which in turn can be computed via Kleene iteration. Moreover, this semantics can be used to define Player ◇’s winning strategy. We instantiate the framework for the concrete and abstract model to prove the soundness of the algorithm.
Consider a HOG instance G = (G, A, O). Let Gdet be the determinized version of G. Our goal is to define a model MC = (DC, IC) such that the semantics of Gdet in this model allows us to decide HOG. Recall that we only have to define the ground domain. For composed kinds, we use the functional lifting discussed in Section 3.
Our idea is to associate to kind o the set of positive Boolean formulas where the atomic propositions are words in T∗. To be able to reuse the definition, we define formula domains in more generality as follows.
Given a (potentially infinite) set P of atomic propositions, the positive Boolean formulas PBool(P) over P are defined to contain true, every p from P, and compositions of formulas via conjunction and disjunction. We work up to logical equivalence, which means we treat φ1 and φ2 as equal as long as they are logically equivalent.
Unfortunately, if the set P is infinite, PBool(P) is not a cppo, because the meet of a descending chain of formulas might not be a finite formula. The idea of our domain is to have conjunctions of infinitely many formulas. As is common in logic, we represent them as infinite sets. Therefore, we consider the set of all sets of (finite) positive Boolean formulas P(PBool(T*))∖{∅} factorized modulo logical equivalence, denoted (P(PBool(T*))∖{∅}) /⇔. To be precise, the sets may be finite or infinite, but they must be non-empty.
To define the factorization, let an assignment to the atomic propositions be given by a subset of P′ ⊆ P. The atomic proposition p is true if p ∈ P′. An assignment satisfies a Boolean formula, if the formula evaluates to true in that assignment. It satisfies a set of Boolean formulas, if it satisfies all elements. Given two sets of formulas Φ1 and Φ2, we write Φ1⇒ Φ2, if every assignment that satisfies Φ1 also satisfies Φ2. Two sets of formulas are equivalent, denoted Φ1⇔ Φ2, if Φ1⇒ Φ2 and Φ2⇒ Φ1 holds. The ordering on these factorized sets is implication (which by transitivity is independent of the representative). The top element is the set {true}, which is implied by every set. The conjunction of two sets is union. Note that it forms the meet in the partial order, and moreover note that meets over arbitrary sets exist, in particular the domain is a cppo. We will also need an operation of disjunction, which is defined by Φ1∨ Φ2 = {φ1∨ φ2 | φ1∈ Φ1, φ2∈ Φ2}. We will also use disjunctions of higher (but finite) arity where convenient. Note that the disjunction on finite formulas is guaranteed to result in a finite formula. Therefore, the above is well-defined.
In our case, the assignment P′ ⊆ T* of interest is the language of the automaton A. Player ◇ will win the game iff the concrete semantics assigns a set of formulas to S that is satisfied by L(A).
From a ground domain, higher-order domains are defined as continuous functions as in Section 3. Thus we only need
DC(o) = | ⎛ ⎜ ⎝ | P(PBool | ⎛ ⎝ | T* | ⎞ ⎠ | )∖ | ⎧ ⎨ ⎩ | ∅ | ⎫ ⎬ ⎭ | ⎞ ⎟ ⎠ | /⇔ . |
The endmarker $ yields the set of formulas {ε}, i.e. IC($)={ε}. A terminal a:o→ o prepends a to a given word w. That is IC(a) = prependa, where prependa distributes over conjunction and disjunction:
prependa(φ) = | ⎧ ⎪ ⎪ ⎨ ⎪ ⎪ ⎩ |
|
We apply prependa to sets of formulas by applying it to every element. Finally, IC(opF) where opF has arity ℓ is an ℓ-ary conjunction (resp. disjunction) if Player □ (resp. ◇) owns F.
For MC = (DC, IC) to be a model, we need our interpretation of terminals to be ⊓-continuous. This follows largely by the distributivity of our definitions.
|
We need to show that the concrete semantics matches the original semantics of the game.
When σMC(S) is satisfied by L(A) the concrete semantics gives a winning strategy for ◇: From a term t such that MC〚t〛 σMC is satisfied by L(A), Player ◇, when able to choose, picks a rewrite rule that transforms t to t′, where MC〚t′〛 σMC remains satisfied. The proof of Theorem 1 shows this is always possible, and, moreover, Player □ is unable to reach a term for which satisfaction does not hold. This does not yet give an effective strategy since we cannot compute MC〚t〛 σMC. However, the abstract semantics will be computable, and can be used in place of the concrete semantics by Player ◇ to implement the winning strategy.
The proof that σMC(S) being unsatisfied implies a winning strategy for Player □ is more involved and requires the definition of a correctness relation between semantics and terms that is lifted to the level of functions, and shown to hold inductively.
The concrete model MC does not lead to an algorithm for solving HOG since its domains are infinite. Here, we consider an abstract model MA with finite domains. The soundness of the resulting Kleene iteration relies on the two semantics being related by a precise abstraction α. Since both semantics are defined by fixed points, this requires us to prove α (σMC) = σMA. In this section, we provide a general framework to this end.
Consider the deterministic scheme G together with two models (left and right) Ml = (Dl, Il) and Mr = (Dr, Ir). Our goal is to relate the semantics in these models in the sense that σMr=α(σMl). Such exact fixed-point transfer results are well-known in abstract interpretation. To generalize them to higher-order we give easy to instantiate conditions on α, Ml, and Mr that yield the above equality. Interestingly, exact fixed-point transfer results seem to be rare for higher-order (e.g. [48]). Our development is inspired by Abramsky’s lifting of abstraction functions to logical relations [3], which generalizes [11, 4]. These works focus on approximation and the compatibility we need for exactness is missing. Our framework is easier to apply than [15, 6], which are again concerned with approximation and do not offer (but may lead to) exact fixed-point transfer results.
For the terminology, an abstraction is a function α:Dl(o)→ Dr(o). To lift the abstraction to function domains, we define the notion of being compatible with α. Compatibility intuitively states that the function on the concrete domain is not more precise than what the abstraction function distinguishes. This allows us to define the abstraction of a function by applying the function and abstracting the result, α(f) α(vl)=α(f vl). Compatibility ensures the independence of the choice of vl.
By definition, all ground elements vl∈Dl(o) are compatible with α. For function domains, compatibility and the abstraction are defined as follows.
We lift α to valuations ν : N ⨃ V ↛ Dl by α(ν)(F) = α(ν(F)) and similar for x. We also lift compatibility to valuations ν:N⨃ V↛ Dl by requiring ν(F) to be compatible for all F∈ N and similar for x∈ V.
The conditions needed for the exact fixed-point transfer are the following.
(P1) is surjectivity of α. (P2) states that α is well-behaved wrt. ⊓. (P3) says that the greatest element is mapped as expected. Note that (P1)-(P3) are only posed for the ground domain. One can prove that they generalize to function domains by the definition of function abstraction. (P4) is that the interpretations of terminals in MC and MA are suitably related. Finally (P5) is compatibility. (P4) and (P5) are generalized to terms in Lemma 2.
To prove α(σMl)=σMr, we need that rhsMr is an exact abstract transformer of rhsMl. The following lemma states this for all terms t, in particular those that occur in the equations. The generalization to product domains is immediate. Note that the result is limited to compatible valuations, but this will be sufficient for our purposes. The proof proceeds by induction on the structure of terms, while simultaneously proving Ml〚t〛 compatible with α. With this result, we obtain the required exact fixed-point transfer for precise abstractions.
We propose two domains, abstract and optimized, that allow us to solve HOG. The computation is a standard fixed-point iteration, and, in the optimized domain, this iteration has optimal complexity. Correctness follows by instantiating the previous framework.
Our goal is to define an abstract model for games that (1) suitably relates to the concrete model from Section 4 and (2) is computable. By a suitable relation, we mean the two models should relate via an abstraction function. Provided the conditions on precision hold, correctness of the abstraction then follows from Theorem 2. Combined with Theorem 1, this will allow us to solve HOG. Computable in particular means the domain should be finite and the operations should be efficiently computable.
We define the MA = (DA, IA) as follows. Again, we resolve the non-determinism into Boolean formulas. But rather than tracking the precise words generated by the scheme, we only track the current set of states of the automaton. To achieve the surjectivity required by precision, we restrict the powerset to those sets of states from which a word is accepted. Let acc(w) = { q | q –[w]→ qf ∈ Qf }. For a language L we have acc(L) = { acc(w) | w ∈ L }. The abstract domain for terms of ground kind is DA(o) = PBool(acc(T*)). The lifting to functions is as explained in Section 3. Satisfaction is now defined relative to a set Ω of elements of P(QNFA) (cf. Section 4). With finitely many atomic propositions, there are only finitely many formulas (up to logical equivalence). This means we no longer need sets of formulas to represent infinite conjunctions, but can work with plain formulas. The ordering is thus the ordinary implication with the meet being conjunction and top being true.
The interpretation of ground terms is IA($) = Qf and IA(a)= prea. Here prea is the predecessor computation under label a, prea(Q)={q′∈ QNFA | q′–[a]→ q ∈ Q}. It is lifted to formulas by distributing it over conjunction and disjunction. The composition operators are again interpreted as conjunctions and disjunctions, depending on the owner of the non-terminal. Since we restrict the atomic propositions to acc(T*), we have to show that the interpretations use only this restricted set. Proving IA(s) is ⊓-continuous is standard.
Recall our concrete model is MC=(DC, IC), where DC=P(PBool(T*)). To relate this model to MA, we define the abstraction function α:DC(o)→DA(o). It leaves the Boolean structure of a formula unchanged but maps every word (which is an atomic proposition) to the set of states from which this word is accepted. For a set of formulas, we take the conjunction of the abstraction of the elements. This conjunction is finite as we work over a finite domain, so there is no need to worry about infinite syntax. Technically, we define α on PBool(T*) by α(Φ)=∧φ∈Φα(φ) for a set of formulas Φ∈P(PBool(T*)), and
α(φ) = | ⎧ ⎪ ⎪ ⎪ ⎨ ⎪ ⎪ ⎪ ⎩ |
|
This definition is suitable in that α(σMC)=σMA entails the following.
To see that the theorem is a consequence of the exact fixed-point transfer, observe that {Q ∈ acc(T*) | q0 ∈ Q} = acc(L(A)). Then, by σMA=α(σMC) we have acc(L(A)) satisfies σMA(S) iff it also satisfies α(σMC(S)). This holds iff L(A) satisfies σMC(S) (a simple induction over formulas). By Theorem 1, this occurs iff Player ◇ wins the game.
It remains to establish α(σMC)=σMA. With the framework, the exact fixed-point transfer follows from precision, Theorem 2. The proof of the following is routine.
The above model yields a decision procedure for HOG via Kleene iteration. Unfortunately, the complexity is one exponential too high: The height of the domain for a symbol of order k in the abstract model is (k+2)-times exponential, where the height is the length of the longest strictly descending chain in the domain. This gives the maximum number of steps of Kleene iteration needed to reach the fixed point.
We present an optimized version of our model that is able to close the gap: In this model, the domain for an order-k symbol is only (k+1)-times exponentially high. The idea is to resolve the atomic propositions in MA, which are sets of states, into disjunctions among the states. The reader familiar with inclusion algorithms will find this decomposition surprising.
We first define α:PBool(acc(T*))→ PBool(QNFA). The optimized domain will then be based on the image of α. This guarantees surjectivity. For a set of states Q, we define α(Q)=∨Q= ∨q∈ Q q. For a formula, the abstraction function is defined to distribute over conjunction and disjunction. The optimized model is MO = (DO, IO) with ground domain α(PBool(acc(T*))). The interpretation is IO($)=∨Qf. For a, we resolve the set of predecessors into a disjunction, IO(a) q = ∨prea({q}). The function distributes over conjunction and disjunction. Finally, IO(opF) is conjunction or disjunction of formulas, depending on the owner of the non-terminal. Since we use a restricted domain, we have to argue that the operations do not leave the domain. It is also straightforward to prove our interpretation is ⊓-continuous as required.
We again show precision, enabling the required exact fixed-point transfer.
It is sufficient to show σMA(S) is satisfied by {Q ∈ acc(T*) | q0 ∈ Q} iff σMO(S) is satisfied by {q0}. Theorem 3 then yields the statement. Propositions Q in σMA(S) are resolved into disjunctions ∨Q in σMO(S). For such a proposition, we have Q∈ {Q ∈ acc(T*) | q0 ∈ Q} iff ∨Q is satisfied by {q0}. This equivalence propagates to the formulas σMA(S) and σMO(S) as the Boolean structure coincides. The latter follows from α(σMA(S))=σMO(S).
To solve HOG, we compute the semantics σMO and then evaluate σMO(S) at the assignment {q0}. For the complexity, assume that the highest order of any non-terminal in G is k. We show the number of iterations needed to compute the greatest fixed point is at most (k+1)-times exponential. We do this via a suitable upper bound on the length of strictly descending chains in the domains assigned by DO.
The lower bound is via a reduction from the word membership problem for alternating k-iterated pushdown automata with polynomially-bounded auxiliary work-tape. This problem was shown by Engelfriet to be (k+1)EXP-hard. We can reduce this problem to HOG via well-known translations between iterated stack automata and recursion schemes, using the regular language specifying the winning condition to help simulate the work-tape.
Together, these results show the following corollary and final result.
Acknowledgments. This work was supported by the Engineering and Physical Sciences Research Council [EP/K009907/1]. The work instigated while some of the authors were visiting the Institute for Mathematical Sciences, National University of Singapore in 2016. The visit was partially supported by the Institute.
We elaborate on the relation of our work to the influential line of research on intersection types as pioneered by [42]. With intersection types, it is usually proven that there is a word or tree derivable by a HORS that is accepted by an automaton, i.e. a well-typed type environment can be certificate for the non-emptiness of the intersection L(scheme) ∩ L(Automaton) ≠ ∅. If the HORS is deterministic, L(scheme) consists of a single tree, so this is also decides the inclusion L(scheme) ⊆ L(Automaton). If we naively extend intersection types to non-deterministic schemes, this is not true anymore. To prove the inclusion in this case, we will need to complement the automaton and prove the emptiness of the intersection, i.e. L(scheme) ∩ L(Automaton) = ∅. Note that a well-typing (a well-typed type environment) cannot prove the emptiness by itself: If the type for the initial symbol does not contain a transition from a final to an initial state, that can either stem from the non-existence of a such a transition sequence, or from the typing not being strong enough. For example, the empty typing that does not assign any type to any symbol (or the empty intersection, if you want), is a well-typing and does not prove anything. Therefore, an algorithm that decides the non-emptiness of the intersection by using intersection-types has to guarantee that it constructs a well-typing strong enough to prove the existence of an accepting transition sequence if such a sequence exists. Note that algorithms that compute intersection types usually allow alternating automata as the specification. It is conceptually easier to complement an alternating automaton than it is to complement a non-deterministic automaton: The transition for each origin and label is given as a Boolean formula, and we can get the complement automaton by considering the dual formula (i.e. the formula in which conjunctions and disjunctions are swapped). Note that usually, the transition formulas are normalized to disjunctive normal form (DNF), so computing the dual formula (which will then be in CNF) and re-normalizing it to DNF can lead to an exponential blowup.
Work by Neatherway et al. [46] and Ramsay [48] considers schemes with non-determinism in the form of case statements. To handle this non-determinism they introduce union types as a ground type. Neatherway et al. give an optimised algorithm for checking such schemes against deterministic trivial automata (where all infinite runs are accepting – i.e. a Büchi condition where all states are accepting). In his thesis, Ramsay extends this to checking non-deterministic schemes against non-deterministic trivial automata using abstract interpretation from schemes to types. In our work, we generalise non-determinism to games (played over word-generating schemes), with a non-deterministic target language.
Let (νi)i ∈ N be a descending chain of evaluations, i.e. νi ≥ νi+1 for all i ∈ N. It is to show that for all t, M〚t〛 is ⊓-continuous (in the argument ν) over the respective lattice, i.e.
M〚t〛 | ⎛ ⎝ | ⊓i ∈ N νi | ⎞ ⎠ | = ⊓i ∈ N (M〚F〛 νi) . |
We proceed by induction over t.
M〚F〛 (⊓i ∈ N νi) = (⊓i ∈ N νi)(F) = ⊓i ∈ N (νi(F)) = ⊓i ∈ N (M〚F〛 νi) |
M〚s〛 (⊓i ∈ N νi) = I(s) = ⊓i ∈ N (M〚s〛 νi) |
M〚t1 t2〛 (⊓i ∈ N νi) | |
(Definition of semantics) = | (M〚t1〛 (⊓i ∈ N νi)) (M〚t2〛 (⊓i ∈ N νi)) |
(Induction hypothesis) = | (⊓i ∈ N (M〚t1〛 νi)) (⊓i ∈ N (M〚t2〛 νi)) |
(Definition of ⊓ for functions) = | ⊓i ∈ N ( (M〚t1〛 νi) (⊓i ∈ N (M〚t2〛 νi)) ) |
(Continuity of M〚t1〛 νi ∈ D) = | ⊓i ∈ N ⊓j ∈ N ( (M〚t1〛 νi) (M〚t2〛 νj)) ) |
(Argued below) = | ⊓i ∈ N ( (M〚t1〛 νi) (M〚t2〛 νi)) ) |
(Definition of semantics) = | ⊓i ∈ N (M〚t1 t2〛 νi) . |
⊓i ∈ N ⊓j ∈ N ( (M〚t1〛 νi) (M〚t2〛 νj)) ) = ⊓i ∈ N ( (M〚t1〛 νi) (M〚t2〛 νi)) ) . |
((M〚t1〛 νm) (M〚t2〛 νm)) ≤ ((M〚t1〛 νi) (M〚t2〛 νj)) . |
M〚λ x . t′〛 (⊓i ∈ N νi) | |
(Definition of semantics) = | v ↦ (M〚t′〛 (⊓i ∈ N νi)[x ↦ v]) |
(Induction hypothesis) = | v ↦ (⊓i ∈ N (M〚t′〛 νi[x ↦ v])) |
(Definition of ⊓ for functions) = | ⊓i ∈ N ( (v ↦ M〚t1〛 νi[x ↦ v]) ) |
(Definition of semantics) = | ⊓i ∈ N (M〚λ x . t′〛 νi) . |
▫
Since we have not syntactically defined the evaluation of a λ-term, our development will need a simple substitution lemma.
We show that for all ν : N⨃ V↛ D and all suitable terms t, t′, we have
M〚(λ x . t) t′〛 ν = M〚t[x ↦ t′]〛 ν . |
We have by definition
M〚(λ x . t) t′〛 ν = (M〚(λ x . t)〛 ν) (M〚t′〛 ν) = M〚t〛 (ν[x ↦ M〚t′〛 ν]) |
and show by induction over t that
M〚t〛 ν[x ↦ M〚t′〛 ν] = M〚t[x ↦ t′]〛 ν . |
In the base cases we have
Then, for the induction step, we first consider application. That is
M〚t1 t2〛 ν[x ↦ M〚t′〛 ν] = | ⎛ ⎝ | M〚t1〛 ν[x ↦ M〚t′〛 ν] | ⎞ ⎠ | ⎛ ⎝ | M〚t2〛 ν[x ↦ M〚t′〛 ν] | ⎞ ⎠ |
which is equal to, by induction,
⎛ ⎝ | M〚t1[x ↦ t′]〛 ν | ⎞ ⎠ | ⎛ ⎝ | M〚t2[x ↦ t′]〛 ν | ⎞ ⎠ | = M〚t1[x ↦ t′] t2[x ↦ t′]〛 ν = M〚(t1 t2)[x ↦ t′]〛 ν . |
Finally, for abstraction, we can assume by α-conversion that y ≠ x, and we have
M〚λ y . t1〛 ν[x ↦ M〚t′〛 ν] = v ↦ M〚t1〛 ν[x ↦ M〚t′〛 ν, y ↦ v] |
which is by induction equal to the function
v ↦ M〚t1[x ↦ t′]〛 ν[y ↦ v] = M〚λ y . t1[x ↦ t′]〛 ν = M〚(λ y . t1)[x ↦ t′]〛 ν . |
Thus, by induction, we have the lemma as required. ▫
We show that for all non-ground terminals s, IC(s) is ⊓-continuous. We need to treat the terminals a ∶ o → o of the original scheme and the terminals opF that were introduced for the determinisation separately. In each case, assume a descending chain of arguments (xi)i ∈ N.
prependa (⊓i ∈ N xi) = ⊓i ∈ N(prependa xi) |
Now assume opF has arity ℓ+1, and IC(opF) = ∨ℓ+1 is an ℓ+1-fold disjunction. We have
∨ℓ+1 (⊓i ∈ N xi) | |
(Definition of ∨ℓ+1) = | y1, …, yℓ ↦ (⊓i ∈ N xi) ∨ ∨ℓ y1 ⋯ yℓ |
(Distributivity (see below)) = | y1, …, yℓ ↦ ⊓i ∈ N (xi ∨ ∨ℓ y1 ⋯ yℓ) |
(Definition of ⊓ for functions) = | ⊓i ∈ N (y1, …, yℓ ↦ xi ∨ ∨ℓ y1 ⋯ yℓ) |
(Definition of ∨ℓ+1) = | ⊓i ∈ N ∨ℓ+1 xi |
In the above we required ∨ to distribute over ⊓, which can be seen by induction over types. In the base case, that ∨ distributes over ⊓ = ∧ is standard. For the step case, we have for all fi, g, and v
| ||||||
(Definition of ∨ and ⊓) = |
| |||||
(Induction) = |
| |||||
(Definition of ∨ and ⊓) = |
|
▫
We are required to show σMC(S) is satisfied by L(A) iff there is a winning strategy foryer Player ◇. The theorem is shown in the following two lemmas.
First, we introduce some notation. We write prependw for w=a1… an to abbreviate prependa1 ∘ ⋯ ∘ prependan.
In what follows, whenever we refer to a term t, we mean a term built over N⨃ T, but not over Tdet. The terminals opF are excluded because they do not occur in the game, they are only introduced in the determinized scheme.
We will demonstrate a strategy for ◇ that maintains the invariant that the current (variable-free) term t reached is such that MC〚t〛 σMC is satisfied by L(A). All plays are infinite or generate a word w. Since we maintain MC〚t〛 σMC is satisfied by L(A), if t represents a word w, we know w is accepted by A and Player ◇ wins the game.
Initially, we have MC〚S〛 σMC = σMC(S) which is satisfied by L(A) by assumption. Thus, suppose play reaches a term t such that MC〚t〛 σMC is satisfied by L(A). There are two cases.
In the first case t = a1 (⋯ (an $)) and let w = a1 … an. Since
MC〚a1(⋯(an($)))〛 σMC = prependa1… an(ε) = w |
and w is satisfied by L(A), we know w ∈ L(A) and Player ◇ has won the game.
In the second case, we have t = a1(⋯(an(F t1 ⋯ tm))). By assumption, we know
MC〚a1(⋯(an(F t1 ⋯ tm)))〛 σMC = |
prependa1 … an( (MC〚F〛 σMC) (MC〚t1〛 σMC) ⋯ (MC〚tm〛 σMC) ) |
is satisfied by L(A). Let F = e1, …, F = eℓ be the rewrite rules for F. There are two subcases.
prependa1 … an( (MC〚ei〛 σMC) (MC〚t1〛 σMC) ⋯ (MC〚tm〛 σMC) ) |
We need to show the invariant is maintained. Let ei = λ x1, …, xm . e. We have (using the substitution lemma, Lemma 7),
prependa1 … an( (MC〚λ x1, …, xm . e〛 σMC) (MC〚t1〛 σMC) ⋯ (MC〚tm〛 σMC) ) | |
= | prependa1 … an( MC〚(λ x1, …, xm . e) t1 … tm〛 σMC ) |
= | prependa1 … an( MC〚e[x1 ↦ t1, …, xm ↦ tm]〛 σMC ) |
= | MC〚a1(⋯(an(e[x1 ↦ t1, …, xm ↦ tm])))〛 σMC . |
Note that the term a1(⋯(an(e[x1 ↦ t1, …, xm ↦ tm]))) is the result of Player ◇ rewriting F via F = ei. Since the satisfaction by L(A) passes through the equalities, Player ◇’s move maintains the invariant as required.
prependa1 … an( (MC〚ei〛 σMC) (MC〚t1〛 σMC) ⋯ (MC〚tm〛 σMC) ) |
▫
In what follows, whenever we refer to a term t, we mean a term built over N⨃ T, but not over Tdet. The terminals opF are excluded because they do not occur in the game, they are only introduced in the determinized scheme.
For φ ∈ DC(o) and a variable-closed term t of kind o, we define φ to be sound for t, denoted φ ⊢ t, if for all w∈ T* such that prependw(φ) is not satisfied by L(A), Player □ has a winning strategy from term w(t). For w=ε, we set prependε(φ)=φ and let ε(t)=t. We can now restate the lemma as
σMC(S) ⊢ S . |
In particular, since σMC(S) is not satisfied by L(A) it is the case that prependε(σMC(S)) is not satisfied. This means Player □ has a winning strategy from ε(S) = S.
In general, for Ξ ∈ DC(κ1 → κ2), we will also define Ξ ⊢ t for terms of kind κ1 → κ2. That is, for a variable-closed term t of kind κ1 → κ2 and a function Ξ ∈ DC(κ1 → κ2), we define Ξ ⊢ t to hold whenever for all variable-closed terms t′ of kind κ1 and Ξ′ ∈ DC(κ1) such that Ξ′ ⊢ t′ we have Ξ Ξ′ ⊢ t t′:
Ξ ⊢ t, if ∀ Ξ′, t′ such that Ξ′ ⊢ t′, we have Ξ Ξ′ ⊢ t t′ . |
Similarly, we need to extend ⊢ to terms t with free variables x=x1 … xm. Here, we make the free variables explicit and write t (x). We define for Ξ : (V↛ DC) → DC that Ξ ⊢ t(x) by requiring that for any variable-closed terms t1, …, tm and any Ξ1, …, Ξm ∈ DC with Ξj ⊢ tj for all 1≤ j≤ m, we have Ξ ν ⊢ t [∀ j ∶ xj ↦ tj], where ν maps xj to Ξj.
We now show the following. For every number of iterations i in the fixed-point calculation, we have MC〚t〛 σMCi ⊢ t, for all terms t built over the terminals and non-terminals in the scheme of interest. After the induction, we will show that the result holds for the greatest fixed point. Note that we have a nested induction: the outer induction is along i, the inner is along the structure of terms.
Since we are inducting over non-closed terms, we will have to extend σMCi to assign valuations to free-variables. Thus we will write νi to denote a valuation such that νi(F) = σMCi(F) for any non-terminal F.
Base case i.
In the base case, we have i=0 and νi=⊤ for all non-terminals.
We proceed by induction on the structure of terms.
We will emphasize if an argumentation is independent of the iteration count.
This is the case for all terms except non-terminals.
Base case t.
The base cases of the inner induction that are independent of the iteration count are the following.
MC〚a〛 νi Ξ = prependa(Ξ) ⊢ a(t) |
MC〚x〛 νi = νi(x). |
The only base case of the inner induction that depends on the iteration count is t=F. Let F take m arguments and consider variable-closed terms t1, …, tm with corresponding Ξj such that Ξj ⊢ tj. We have
MC〚F〛 ν0 Ξ1 … Ξm = σMC(F)0 Ξ1 … Ξm = true. |
Thus, trivially MC〚F〛 ν0 ⊢ F since true is never unsatisfied.
Step case t.
In both cases, the argumentation is independent of the actual iteration count.
Therefore, we give it for a general i rather than for 0.
MC〚t′〛 νi ⊢ t′ and MC〚t″〛 νi ⊢ t″ . |
MC〚t′ t″〛 νi = (MC〚t′〛 νi) (MC〚t″〛 νi) ⊢ t′ t″ . |
MC〚t′ t″〛 νi = | (MC〚t′〛 νi) (MC〚t″〛 νi) |
⊢ | (t′[∀ j: xj ↦ tj]) (t″[∀ j: xj ↦ tj]) = (t′ t″)[∀ j: xj ↦ tj] . |
MC〚λ x.e〛 νi ⊢ (λ x.e)[∀ j: xj↦ tj] . |
(MC〚λ x.e〛 νi) Ξ⊢ ((λ x.e)[∀ j: xj↦ tj]) t . |
(MC〚λ x.e〛 νi) Ξ = MC〚e〛 νi[x↦ Ξ] . |
((λ x.e)[∀ j: xj↦ tj]) t = (λ x.(e[∀ j: xj↦ tj])) t. |
MC〚e〛 νi[x↦ Ξ] ⊢ e[x↦ t, ∀ j: xj↦ tj] . |
Step case i.
We again do an induction along the structure of terms.
The only case that has not been treated in full generality is F.
We now show that MC〚F〛 νi+1⊢ F.
Let F take m arguments and consider Ξ1⊢ t1 to Ξm⊢ tm.
The task is to prove MC〚F〛 νi+1 Ξ1 … Ξm⊢ F t1 … tm.
To ease the notation, assume there are two right hand sides e1′, e2 for F, i.e. we have the rules
F = λ x1 … λ xm.e1 and F = λ x1 … λ xm.e2.
This means the right-hand side in the determinised scheme is
F = λ x1 … λ xm . (opF e1 e2).
Then,
MC〚F〛 νi+1 | = νi+1(F) | ||||||||||||||||||||||||
= MC〚 λ x1 … λ xm . (opF e1 e2) 〛 νi | |||||||||||||||||||||||||
= v1, …, vm ↦ MC〚(opF e1 e2)[x1 ↦ v1, …, xm ↦ vm]〛 νi | |||||||||||||||||||||||||
| |||||||||||||||||||||||||
|
Here, IC(opF) is a conjunction or disjunction, depending on the owner of F. Recall that the conjunction and disjunction of functions are defined by evaluating the argument functions separately and combining the results. This means
MC〚F〛 νi+1 |
| ||||||||||||||||||||||||
| |||||||||||||||||||||||||
| |||||||||||||||||||||||||
| |||||||||||||||||||||||||
|
With the same reasoning, we obtain
MC〚F〛 νi+1 Ξ1 … Ξm | |
= (MC〚e1′〛 νi Ξ1 … Ξm) (∨/∧) (MC〚e2′〛 νi Ξ1 … Ξm). |
We have to prove that for any w∈ T*, if L(A) does not satisfy the formula
prependw((MC〚e1′〛 νi Ξ1 … Ξm) (∨/∧) (MC〚e2′〛 νi Ξ1 … Ξm)) | |
= | prependw(MC〚e1′〛 νi Ξ1 … Ξm) (∨/∧) prependw(MC〚e2′〛 νi Ξ1 … Ξm), |
then Player □ has a winning strategy from w(F t1 … tm).
Assume Player ◇ owns F and the formula is not satisfied.
If Player □ owns F, the reasoning is similar.
Since we have a disjunction for Player ◇, prependw(MC〚e1′〛 νi Ξ1 … Ξm) is not satisfied.
By the hypothesis of the outer induction, we obtain MC〚e1′〛 νi ⊢ e1 and thus MC〚e1′〛 νi Ξ1 … Ξm ⊢ e1′ t1 … tm.
As in the case of λ-abstraction above, we use that the game identifies e1′ t1 … tm and e1 [x1 ↦ t1, … em ↦ tm].
Hence, Player □ has a winning strategy from w(e1 [x1 ↦ t1, … em ↦ tm]).
The same argumentation applies to prependw(MC〚e2〛 νi Ξ1 … Ξm).
Consequently, whichever move Player ◇ makes at
w(F t1 … tm),
Player □ has a winning strategy.
This finishes the outer induction, proving that MC〚t〛 σMCi ⊢ t for all terms t and all i ∈ N.
We would like to conclude MC〚t〛 σMC ⊢ t.
Since the cppo under consideration is not finite, this needs to be proven separately.
Limit case.
We have shown
MC〚t〛 σMCi ⊢ t
for all i ∈ N; we now show
MC〚t〛 σMC ⊢ t
noting by Kleene that
σMC = ⊓i ∈ N σMCi.
Once we have this we have
σMC(S) ⊢ S
which proves the lemma.
We formulate a slightly more general induction hypothesis for induction over kinds: Given a descending sequence of Ξi for all i ∈ N such that each Ξi ⊢ t, we have ⊓i ∈ N Ξi ⊢ t. In the base case we have t is of kind o and we assume Ξi ⊢ t. We now argue ⊓i ∈ N Ξi ⊢ t.
Take any w and suppose prependw(⊓i ∈ N Ξi) is not satisfied, then we need to show by the definition of ⊢ that Player □ has a winning strategy. Since ⊓ is conjunction, if
prependw(⊓i ∈ N Ξi) = ⊓i ∈ N prependw(Ξi) |
is not satisfied, it must be the case that for some i we have prependw(Ξi) is not satisfied. In this case, we have Ξi ⊢ t by assumption and thus by the definition of ⊢ that Player □ has a winning strategy from w(t). This proves ⊓i ∈ N Ξi ⊢ t.
If t is of kind κ1 → κ2 we need to show for all Ξ ⊢ t′ that (⊓i ∈ N Ξi) Ξ ⊢ t t′. We have by the definition of ⊓ over functions
(⊓i ∈ N Ξi) Ξ = ⊓i ∈ N (Ξi Ξ) |
Since by assumption on Ξi and definition of ⊢ for function kinds, we have Ξi Ξ ⊢ t t′ for each i. By the induction on the kind, we obtain ⊓i ∈ N(Ξi Ξ) ⊢ t t′ . Since (⊓i ∈ N Ξi) σ = ⊓i ∈ N (Ξi σ) we establish the desired statement that finishes the induction.
Finally, since MC〚t〛 σMCi satisfies the conditions of the above induction hypothesis and because we have already shown MC〚t〛 σMCi ⊢ t for all t, we obtain
⊓i ∈ N (MC〚t〛 σMCi) ⊢ t . |
Then, since using continuity of MC〚t〛 we have
MC〚t〛 σMC = MC〚t〛 (⊓i ∈ N σMCi) = ⊓i ∈ N (MC〚t〛 σMCi) |
we obtain the lemma as required. ▫
We show that several properties needed for precision can be lifted from the ground domain to function domains.
We show that, if (P1) holds, then for every κ∈ Κ and every vr∈Dr(κ) there is a compatible vl∈Dl(κ) with α(vl)=vr. We proceed by induction on kinds. The base case is given by the assumption (P1) and the fact that every ground element is compatible. Assume we have the required surjectivity of α for κ1 and κ2 and consider fr∈Dr(κ1→ κ2). The task is to find a compatible function fl so that α(fl)=fr. Assume fr vr = vr′. By surjectivity for κ1, there are compatible elements in α−1(vr), and similar for vr′. Let vl′ be a compatible element that is mapped to vr′ by α. We define fl vl = vl′ for all compatible vl ∈ α−1(vr). Since α is total on D(κ1), this assigns a value to all compatible vl. We do not impose any requirements on how to map elements that are not compatible.
We argue that fl is compatible. To this end, consider compatible vl1 and vl2 with α(vl1)=α(vl2). By definition, both are mapped identically by fl, fl vl1=fl vl2. Hence, in particular the abstractions coincide. Moreover, given a compatible vl, we defined fl vl = vl′ to be a compatible element.
Concerning the equality of the functions, we have α(fl) vr = α(fl vl) = α(vl′) = vr′. The first equality is the definition of abstraction for functions and the fact that α−1(vr) contains compatible elements, one of them being vl, the second is the fact that vl is mapped to vl′, and the last is by vl′ ∈ α−1(vr′). ▫
We proceed by induction on kinds to show that, if (P1) and (P2) hold, then for all kinds κ∈ Κ and for all descending chains of compatible values f1, f2, … ∈D(κ), we have ⊓i∈ Nfi again compatible and α(⊓i∈ Nfi)=⊓i∈ Nα(fi). The base case is the assumption.
In the induction step, let κ=κ1→ κ2 and f1, f2,…∈Dl(κ) be a descending chain of compatible elements. Let vl∈ Dl(κ1) be compatible. The following equalities will be helpful:
α((⊓i∈ N fi) vl)=α(⊓i∈ N (fi vl))=⊓i∈ N α(fi vl) = ⊓i∈ N (α(fi) α(vl))=(⊓i∈ N α(fi)) α(vl). |
The first equality is the definition of ⊓ on functions, the second is the induction hypothesis for κ2, the third is compatibility of the fi and vl, the last is again ⊓ on functions.
To show compatibility, note that the above implies α((⊓i∈ N fi) vl)=α((⊓i∈ N fi) vl′) as long as α(vl)=α(vl′), for all compatible vl, vl′∈Dl(κ1). For compatibility of (⊓i∈ N fi) vl with vl∈Dl(κ1) compatible, note that (⊓i∈ N fi) vl=⊓i∈ N (fi vl). The latter is the meet over a descending chain of compatible elements in κ2. By the induction hypothesis on κ2, it is again compatible.
For ⊓-continuity, consider a value vr∈Dr(κ1). By Lemma 10, there is a compatible vl∈Dl(κ1) with α(vl)=vr. We have
α(⊓i∈ Nfi) vr =α((⊓i∈ Nfi) vl)= (⊓i∈ N α(fi)) α(vl)=(⊓i∈ N α(fi)) vr. |
The first equality is the definition of abstraction on functions. Note that we need here the fact that ⊓i∈ Nfi is compatible by the induction hypothesis. The second equality is the auxiliary one from above. The last equality is by α(vl)=vr. ▫
We show that, if (P3) holds, then α(⊤κl)=⊤κr for all κ∈ Κ. We proceed by induction on kinds. The base case is given by the assumption (P3). Assume for κ2, we have α(⊤κ2l)=⊤κ2r. Consider function ⊤κ1→ κ2l∈Dl(κ1→κ2). We have to show α(⊤κ1→ κ2l)=⊤κ1→ κ2r. If the given top element is not compatible, this holds. Assume it is. For vr ∈ Dr(κ1), there are two cases. If there is no compatible vl∈ Dl(κ1) with α(vl)=vr, we have
α(⊤κ1→ κ2l) vr = ⊤κ2r = ⊤κ1→ κ2r vr. |
If there is such a vl, we obtain
α(⊤κ1→ κ2l) vr =α(⊤κ1→ κ2l vl) = α(⊤κ2l)=⊤κ2r = ⊤κ1→ κ2r vr. |
The first equality is the definition of abstraction for functions, the next is the fact that ⊤κ1→ κ2l maps every element vl ∈ Dl(κ1) to ⊤κ2l. The image of ⊤κ2l is ⊤κ2r by the induction hypothesis. The last equality is the definition of ⊤κ1→ κ2r. ▫
Assume (P1), (P4), and (P5) hold. We show, for all terms t and all compatible ν, Ml〚t〛 ν is compatible and α(Ml〚t〛 ν)=Mr〚t〛 α(ν). We proceed by structural induction on t.
α(Ml〚F〛 ν)=α(ν(F))=α(ν)(F)=Mr〚F〛 α(ν) |
α(Il(s) vl)=Ir(s) α(vl) = Ir(s) α(vl′) = α(Il(s) vl′). |
α(Il(s)) vr = α(Il(s) vl)= Ir(s) α(vl) =Ir(s) vr. |
For the induction step, assume the claim holds for t1 and t2.
Mr〚t1 t2〛 α(ν) = (Mr〚t1〛 α(ν)) (Mr〚t2〛 α(ν)) = α(Ml〚t1〛 ν) α(Ml〚t2〛 ν). |
α(Ml〚t1〛 ν) α(Ml〚t2〛 ν) = α((Ml〚t1〛 ν) (Ml〚t2〛 ν)) = α(Ml〚t1 t2〛 ν). |
α((Ml〚λ x.t1〛 ν) vl) = α(Ml〚t1〛 ν[x↦ vl]) = Mr〚t1〛 α(ν[x↦ vl]) . |
For the second requirement in compatibility, let vl be compatible.
By definition of the semantics, (Ml〚λ x.t1〛 ν) vl = Ml〚t1〛 ν[x↦ vl].
Since ν and vl are compatible, ν[x↦ vl] is compatible.
Hence, Ml〚t1〛 ν[x↦ vl] is compatible by the induction hypothesis.
To prove
Mr〚λ x.t1〛 α(ν) = α(Ml〚λ x.t1〛 ν), consider an arbitrary value vr ∈ Dr(κ).
Let vl∈Dl(κ1) be compatible with α(vl)=vr, which exists by Lemma 10.
We have:
(Mr〚λ x.t1〛 α(ν)) vr = Mr〚t1〛 α(ν)[x ↦ vr] = Mr〚t1〛 α(ν[x↦ vl]) . |
We showed above that Ml〚λ x.t1〛 ν is compatible. Using the definition of abstraction for functions and the definition of the semantics, the other function yields
α(Ml〚λ x.t1〛 ν) vr = α((Ml〚λ x.t1〛 ν) vl)= α(Ml〚t1〛 ν[x↦ vl]) . |
With the induction hypothesis, α(Ml〚t1〛 ν[x↦ vl])=Mr〚t1〛 α(ν[x↦ vl]).
▫
Recall σl0 and σr0 are the greatest elements of the respective domains. We have
α(σl)=α(⊓i∈ N rhsMli(σl0))= ⊓i∈N α(rhsMli(σl0)) = ⊓i∈N rhsMri(σr0) = σr. |
The first equality is Kleene’s theorem. The second equality uses the fact that each rhsMli(σl0) is compatible and that they form a descending chain (both by induction on i), and then applies Lemma 11. The third equality also relies on compatibility of the rhsMli(σl0) and invokes Lemma 2. Moreover, it needs α(σl0)=σr0 by Lemma 12. The last equality is again Kleene’s theorem. ▫
Observe IA($) = Qf = acc(ε). Given a formula φ∈ PBool(acc(T*)), we have to show that prea(φ)∈ PBool(acc(T*)). Since prea distributes over conjunction and disjunction, it is sufficient to show the requirement for atomic propositions. Consider Q=acc(w). We have IA(a) acc(w) = prea(acc(w)) = acc(a.w). Finally, IA(opF) with F∈ N is conjunction or disjunction, and there is nothing to do as the formula structure is not modified. ▫
We require, for all terminals s, IA(s) is ⊓-continuous over the respective lattices. We remark that the case s = opF is identical to Lemma 1. Hence, we show the case s = a ∈ Γ. Given a descending chain (xi)i∈N, we have to show I(a) (⊓i∈N xi) = ⊓i∈N (I(a) xi). Recall that the meet of formulas is conjunction, and that we are in a finite domain. The latter means that the infinite conjunction is really the conjunction of finitely many formulas. Now prea is defined to distribute over finite conjunctions. We have
I(a) (⊓i∈N xi) = prea | ⎛ ⎜ ⎜ ⎝ |
| xi | ⎞ ⎟ ⎟ ⎠ | = |
| prea | ⎛ ⎝ | xi | ⎞ ⎠ | = ⊓i∈N (I(a) xi) |
as required. ▫
To show α is precise, we have to show (P1) to (P5).
For (P1), it is sufficient to argue that for every set of states Q∈ acc(T*) there is a word that is mapped to it — which holds by definition.
For formulas, note that α=acc distributes over conjunction and disjunction, which means we can take the same connectives in the concrete as in the abstract and replace the leaves appropriately. Note that we only need a set consisting of one formula.
(P2) is satisfied by the concrete meet being the union of sets of formulas and α being defined by an element-wise application.
For (P3), note that the greatest elements are {true} for DC(o) and true for DA(o). By definition, α({true})=α(true)=true.
For (P4), consider $.
We have
α(IC($)) = α({ε})=acc(ε)=Qf=IA($).
The first equality is by definition of the concrete interpretation, the second is the definition of α, the third uses the fact that ε is accepted precisely from the final states, and the last equality is the interpretation of the $ in the abstract domain.
For a letter a and a word w⊆ T*, we have
|
The first equality is the interpretation of a in the concrete, the second is the definition of prepending a letter, the third is the definition of the abstraction, the next is how taking predecessors changes the set of states from which a word is accepted, and the last equality is the interpretation of a in the abstract domain and the definition of the abstraction function. The relation generalizes to formulas by noting that both the concrete interpretation and the abstract interpretation of a distribute over conjunction and disjunction. It also generalizes to sets of formulas by noting that prependa is applied to all elements in the set and, in the abstract domain, prea distributes over conjunction.
Let F be a non-terminal owned by □. To simplify the notation, let the associated operation be binary, opF:o→ o → o. Let Φ1,Φ2∈DC(o) be sets of formulas. We have
α(IC(opF) Φ1 Φ2) = α(Φ1⋃ Φ2) |
| |||||||||||
|
The first equality is the concrete interpretation of opF. The second is the definition of the abstraction function. The third equality holds as we work up to logical equivalence. The last is the abstract interpretation of opF and again the definition of the abstraction.
Assume F is owned by ◇ and opF is again binary. Consider Φ1,Φ2∈DC(o). It will be convenient to denote {φ1∨φ2 | φ1∈ Φ1,φ2∈ Φ2} by Φ. We have
α(IC(opF) Φ1 Φ2) = α(Φ) |
| |||||||||||
| ||||||||||||
|
The first equality is the concrete interpretation of opF, the second is the definition of α on sets of formulas.
The third equality is the fact that α distributes over disjunctions and rewrites the iteration over the elements of Φ.
The following equality is distributivity of conjunction over disjunction, and the fact that we work up to logical equivalence.
The last is the abstract interpretation of opF and the definition of the abstraction function.
It remains to show (P5).
For IC($) and IC(a), there is nothing to do as all ground values are compatible.
Assume F is owned by □ and opF is binary.
The proof for ◇ is similar.
We show that, given a set of formulas Φ, the function
Φ∪ − is compatible.
An inspection of the proof of (P4) shows that for any set of formulas φ1, we have
α(Φ⋃ Φ1) = α(Φ)∧ α(Φ1). |
Hence, if α(Φ1)=α(Φ2), then α(Φ∪ Φ1)= α(Φ∪ Φ2). That Φ∪ Φ1 is compatible holds as the element is ground. ▫
v We have IO($)=∨Qf= α(Qf)=α(acc(ε)). For IO(a), we note that both the abstract and the optimized interpretation distribute over conjunctions and disjunctions. Hence, it remains to consider whether the application to leaves results in a disjunction that is the image of an abstract set. Let Q=acc(w). We have
|
| |||||||||||
| ||||||||||||
=∨prea(Q) | ||||||||||||
|
The first equality is the definition of the abstraction function. Then we apply distributivity of the optimized interpretation of a over disjunctions. The following equality is the actual interpretation of a in the optimized model. The next equality uses prea(Q)=∪q∈ Qprea(q). The following is again the definition of the abstraction function. Then we replace Q by its definition. Finally, we note the interplay between prea and acc(−).
For conjunction and disjunction, which are used as the interpretation of opF depending on the player, we note that α distributes to the arguments. Hence, if the arguments are α(φ1) and φ2, we have α(φ1)∧ α(φ2)=α(φ1∧ φ2). ▫
We need, for all terminals s, IO(s) is ⊓-continuous over the respective lattices. We remark that the case s = opF is identical to Lemma 1. The case s = a ∈ Γ follows from distributivity of IO(a) as in the proof of Lemma 4. ▫
We show the optimized abstraction is precise.
Surjectivity in (P1) holds by definition as does (P3).
Also ⊓-continuity in (P2) is by the fact that the meets over the concrete domain are finite, and hence the definition of α already yields continuity.
We argue for (P4).
For $, Lemma 5 yields IO($)=α(Qf), which is α(IA($)) as required.
For a, the same lemma shows IO(a) α(acc(w))=α(prea(acc(w))), which is α(IA(a) acc(w)).
The equality generalizes to formulas as both, the abstraction function and the interpretations distribute over conjunctions and disjunctions.
For opF, assume it is a binary conjunction.
We have
IO(opF) α(φ1) α(φ2)= α(φ1)∧ α(φ2)=α(φ1∧ φ2)=α(IA(opF) φ1 φ2). |
The first equality is the definition of the interpretation in the optimized model, the next is distributivity of α over conjunction.
Finally, we have the interpretation of opF in the abstract model.
For (P5), there is nothing to do for
IC($) and IC(a), as all ground values are compatible.
We consider the conjunctions and disjunctions used to resolve the non-determinism.
Consider a formula φ.
The task is to show that the function φ∧ − is compatible.
Consider φ1 and φ2 with α(φ1)=α(φ2).
Then
α(φ∧ φ1)=α(φ)∧α(φ1)=α(φ)∧α(φ2)=α(φ∧φ2). |
The first equality is distributivity of the abstraction function over conjunctions. The next is the assumed equality. The third is again distributivity. Compatibility of φ∧φ1 holds as ground values are always compatible. ▫
To show the complexity, we argue the upper and lower bounds separately.
[Proof of Proposition 4] We need to argue that σMO can be computed in (k+1)-times exponential time. We have that σMO = ⊓i ∈ N rhsMOi(σl0). Since the domains DO(κ) are finite for all kinds κ, there is an index i0 ∈ N such that σMO = ⊓i = 0i0 rhsMOi(σl0) = rhsMOi0(σl0). In the following, we will see that the number of iterations, i.e. the index i0 is at most (k+1)-times exponential, and that one iteration can be executed in (k+1)-times exponentially many steps.
First, we reason about the number of iterations. For a partial order D, we define its height h(D) as the length of the longest strictly descending chain, i.e. the height is m if the longest such chain is of the shape
x0 > x1 > … > xk . |
The height of the domain is an upper bound for i0 by its definition: If for some index i1 we have rhsMOi1(σl0) = rhsMOi1 + 1(σl0), we know ⊓i = 0i0 rhsMOi(σl0) = rhsMOi01(σl0) and thus i1 = i0. Such an index i1 has to exist and has to be smaller than the height of the domain, otherwise the sequence of the rhsMOi(σl0) would form a chain that is strictly longer than the height, a contradiction to the definition.
It remains to see what the height of our optimized domain is. Recall that rhsMO has the type signature (N → DO) → (N → DO). Our goal in the following is to determine h(N → DO ). We can identify N → DO with DO(F1) × … × DO(Fℓ), where F1, …, Fℓ are the non-terminals of the scheme. The height of this product domain is the sum of its height. We are done if we show that even the domain DO(F) with the maximal height is (k+1)-times exponentially high, since the number of non-terminals is polynomial in the input scheme.
In the following we prove: If kind κ is of order k′, then DO(κ) has (k′+1)-times exponential height. For the induction step, we also need to consider the cardinality of DO(κ), therefore, we strengthen the statement and also prove that the cardinality |DO(κ)| is (k′+2)-times exponential.
We proceed by induction on k′.
In the base case k′ = 0, we necessarily have κ = o, and indeed the domain α(PBool(acc(T*))) ⊆ PBool(QNFA) is singly exponentially high. To see that this is the case, consider a strictly decreasing chain (φj)j ∈ N of positive boolean formulas over QNFA, i.e. a chain where each formula is strictly implied by the next. To each formula, φj, we assign the set Qj = {Q ⊆ QNFA | Q satisfies φj} of assignments under which φj evaluates to true. That φj is strictly implied by φj+1 translates to the fact that Qj is a strict subset of Qj+1. This gives us that the sets Qj themselves form a strictly ascending chain in P(P(QNFA)), and it is easy to see that such a chain has length at most | P(QNFA) | = 2|QNFA|.
Furthermore, we can represent each equivalence class of formulas in PBool(QNFA) by a representative in conjunctive normal form, i.e. by an element of P(P(QNFA)). This shows that the cardinality of the domain is indeed bounded by | P(P(QNFA)) | = 2 |P(QNFA)| = 22|QNFA|.
Now assume the statement holds for k′, and consider κ of order k′+1. We need an inner induction on the arity m of κ.
Since o is the only kind of arity 0, and does not have order k′+1 for any k′, there is nothing to do in the base case.
Now assume that κ = κ1 → κ2. By the definitions of arity and order, we know that κ1 is of order at most k′, therefore we now by the outer induction that the height of DO(κ1) is at most (k′+1)-times exponential. The order of κ2 is at most (k′+1), but the arity of κ2 is strictly less than the arity of κ, thus we get by the inner induction that the height of DO(κ2) is at most (k′+2)-times exponential.
The domain DO(κ1 → κ2) = Cont(DO(κ1), DO(κ2)) is a subset of all functions from DO(κ1) to DO(κ2). Let us reason about the height of this more general function domain. We know that its height is the height of the target times the size of the source, i.e. h(DO(κ2)) · | DO(κ1) |. The induction completes the proof, as both h(DO(κ2)) and | DO(κ1) | are at most (k′+2)-times exponential.
It remains to argue that each iteration can be implemented in at most (k+1)-times exponentially many steps. To this end, we argue that each element of DO(κ) can be represented by an object of size (k′+1)-times exponential, where k′ is the order of κ. It is easy to see that all operations that need to be executed on these objects, namely evaluation, conjunction, disjunction, and predecessor computation can be implemented in polynomial time in the size of the objects.
Let k′ = 0, i.e. κ = o. We again represent each element of DO(o) by a formula over QNFA in conjunctive normal form, i.e. as an element of P( P( QNFA)). In the worst case, one single formula φ contains everyone of the 2QNFA many clauses, each clause having size at most |QNFA|. This means that one formula needs at most singly exponential space.
For the induction step, consider κ of order k+1. As above, we need an inner induction on the arity of κ, for which the base case is trivial.
Let κ = κ1 → κ2. An element of DO(κ) is a function that assigns to each of the |DO(κ1)|-many elements of DO(κ1) an element of DO(κ2). In the previous part of the proof, we have argued, that |DO(κ1)| is at most (k+1) times exponential. By the induction on the arity, we know that each object in DO(κ2) can be represented in at most (k+2)-times exponential space. This shows that objects of DO(κ) can be represented using (k+2)-times exponential space, and finishes the proof. ▫
We show that determining the winner in a higher-order word game is (k+1)EXP-hard for an order-k recursion scheme.
[Proof of Proposition 5] We begin with a result due to Engelfriet [18] that shows alternating k-iterated pushdown automata with a polynomially bounded auxiliary work-tape (k-PDA+) characterize the (k+1)EXP word languages. We fix any (k+1)EXP-hard language and its corresponding alternating k-PDA B. Let L(B) be the set of words accepted by B. Deciding if a given word w is in the language defined by B is (k+1)EXP-hard in the size of w (recall B is fixed). We show that this problem can be reduced in polynomial time to an inclusion problem L(B′) ⊆ L(A) for some k-iterated pushdown automaton (without work-tape) (k-PDA) B′ and NFA A of size polynomial in the length of w. From B′, we can construct in polynomial time an equivalent game over a scheme G. This will show the game language inclusion problem for order-k schemes is (k+1)EXP-hard.
In an alternating k-PDA+, there are two Players ◇ and □. When decided whether a word w is in the language of a k-PDA+, ◇ will attempt to prove the word is in the language, while □ will try to refute it.
We first describe how to obtain B′ from B. Since the word w is fixed, we can force B to output the word w by forming a product of w with the states of B. Call this automaton B × w. This reduces the word membership problem to the problem of determining whether B × w can reach an accepting state. Next, to remove the worktape from B × w (and form B′) we replace the output of B × w (which will always be w or empty) with a series of guesses of the worktape. That is, a transition of B × w will be simulated by B′ by first making a transition as expected, and then outputting a guess (consistent with the transition) of what the worktape of B × w should be. The automaton A will accept a guessed sequence of worktapes iff it is able to find an error in the sequence. The word w will be in the language of B if B′ is able to reach a final state and produce a word w′ that is correct; that is, w′ is not in the language of A.
Note, here, the reversal of the roles of the Players. In B, control states are owned by ◇ or □. When determining if w ∈ L(B) for some w, the first Player ◇ tries to show the word is accepted, while the second Player □ tries to force a non-accepting run. In B′, however, w is accepted iff the output of B′ is not included in the language of A. Thus, □ will effectively be aiming to prove that w ∈ L(B).
In more detail, we take any (k+1)EXP-hard language and its equivalent (fixed) alternating k-PDA+. Given a word w, deciding w ∈ L(B) is (k+1)EXP-hard. We define B′ directly from B rather than going through the intermediate B × w.
A transition (p, a, o, σ, p′) of B means the following. From control state p, upon reading a character a from w, apply operation o to the work-tape (which may become stuck if not applicable) and operation σ to the stack (which may also become stuck if not applicable). Next, move to control state p′, from which the remainder of w is to be read.
Let m be the polynomial bound on the size of the work-tape of A given the input word w. Let Σ be the alphabet of the work-tape. Let the set of work-tape operations O = {o1, …, on} and work-tape positions P = [1..m] be disjoint from Σ. Also, let ∘ ∈ Σ be the initial symbol appearing in each cell of the initial work-tape. We will construct A′ such that
L | ⎛ ⎝ | A′ | ⎞ ⎠ | ⊆ ∘m | ⎛ ⎝ | P O Σm | ⎞ ⎠ | ∗ . |
That is, A′ outputs a sequence of work-tape configurations separated by positions in P and operations in O. That is, A′ will simulate a run of A over w.
For every control state p of A, we will have control states (p, w′) of A′, where w′ is a suffix of w. We will also have (p, w′, o) where o is a work-tape operation to be applied. Then for each transition (p, a, o, σ, p′) of B we have a transition ((p, aw′), ε, σ, (p′, w′, o)) of B × w. From (p′, w′, o) the automaton B′ will output some character from P (a guess at the work-tape head position), followed by o (to indicate the operation applied). It will then be able to output any word from Σm (a guess of the work-tape contents) before moving to (p′, w′) and continuing the simulation. Initially, B′ will simply output ∘m and move to control state (p, w) where p is the initial control state of B.
The final step in defining B′ is to assign ownership of the control states. Recall, we needed to switch the roles of the Players. Thus, we define O((p, w)) = ◇ whenever p belongs to □ in B. All other control states of B′ are owned by □. We define the accepting control states to be those of the form (p, ε) where p is accepting in B. Observe these have no outgoing transitions.
Next we define the regular automaton A which detects mistakes in the work-tape. Such an error is either due to a poorly updated cell, or due to a poorly updated head position. The set of work-tape operations O is such that there is a mapping
π : (P × O → P) ⋃ (P × Σ × P × O → Σ ⋃ {⊥}) |
where ⊥ ∉ Σ and
Thus, we require the following regular language, for which a polynomially-sized regular automaton is straightforward to construct. Let Γ = Σ ∪ P ∪ O.
L | ⎛ ⎝ | A | ⎞ ⎠ | = | ⎛ ⎜ ⎜ ⎝ | Γ∗ | ⎛ ⎜ ⎜ ⎝ |
| i o Σm j | ⎞ ⎟ ⎟ ⎠ | Γ∗ | ⎞ ⎟ ⎟ ⎠ | ⋃ | ⎛ ⎜ ⎜ ⎝ | Γ∗ | ⎛ ⎜ ⎜ ⎝ |
| i o Σj α Γm+2 β | ⎞ ⎟ ⎟ ⎠ | Γ∗ | ⎞ ⎟ ⎟ ⎠ | . |
We have thus defined a k-PDA B′ that produces some word w′ not accepted by A iff w is accepted by B.
The final step is to produce a game over a scheme G that is equivalent to the game problem for k-iterated pushdown automata. This is in fact a straightforward adaptation of the techniques introduced by Knapik et al. [38]. However, we choose to complete the sketch using definitions from Hague et al. [30] as we believe these provide a clearer reference. In particular, we adapt their Definition 4.3.
The key to the reduction is a tight correspondence (given in op. cit.) between configurations (q, s) of a k-iterated pushdown automaton, and terms of the form1 Fqa Ψk−1 ⋯ Ψ0. That is, every configuration is represented (in a precise sense) by such a term and every term of such a form represents a configuration. Moreover, for every transition (q, a, o, σ, q′) of the pushdown automaton, when o ≠ ε we can associate a rewrite rule of the scheme
Fqa = λ |
| . o(e(q′, σ)) |
such that the term obtained by applying the rewrite rule to Fqa Ψk−1 ⋯ Ψ0 is a term o(Fq′b Ψ′k−1 ⋯ Ψ′0) where Fq′b Ψ′k−1 ⋯ Ψ′0 represents the configuration reached by the transition. That is, (q′, σ(s)). When o = ε we simply omit o, that is
Fqa = λ |
| . e(q′, σ) . |
To each non-terminal, we assign O(Fqa) = ◇ whenever q is a ◇ control state. Otherwise, O(Fqa) = □. For every accepting control state q we introduce the additional rule
Fqa = λ |
| . $ . |
Finally, we have an initial rule
S = t |
where t is the term representing the initial configuration.
Given the tight correspondence between configurations and transitions of the k-PDA and terms and rewrite steps of G, alongside the direct correspondence between the owner of a control state q and the owner of a non-terminal of G, it is straightforward to see, via induction over the length of an accepting run in one direction, or derivation sequence in the other, that B′ is able to produce a word not in A iff a word not in A is derivable from S. Thus, we have reduced the word acceptance problem for some alternating k-PDA+ to the game problem for language inclusion of a scheme. This shows the problem is (k+1)EXP-hard. ▫
This document was translated from LATEX by HEVEA.