Introduction

We have been talking about first-order logic as a means to extend our logic-based agents to factored domains.
Recall that the basic building blocks of a first-order formula are constants, variables, and function symbols. Using composition, we can use these to build terms (expressions). Terms can be substituted into predicates to make atomic formulas. First-order formulas are atomic formulas, or are built from other first-order formulas via `^^`, `vv`, `neg`, `forall x`, and `exists x`.
Last week, we described how to determine the truth of a statement in a model with respect to a variable assignment, and how to use model checking to determine whether a first-order formula follows from a knowledge base.
We start today by looking at knowledge bases expressed in first-order logic and at algorithms for first-order theorem proving.
We will then consider one important use of such first-order logic agents: Planning.
This is the search problem of coming up with a sequence of actions that reaches a desired goal state while satisfying, at each time step, the constraints of the problem at hand.
The particular planning language we use is a first-order language called PDDL (Planning Domain Definition Language).
This is an extension of an earlier language called STRIPS (one of the first famous planning languages from 1971).

Examples of 1st Order Knowledge Bases

Any relational database.
The Natural Numbers:
- Has one constant `0` and one function symbol `S`. (The Peano Axioms add `+` and `times` to these.)
- We have two predicate symbols: equality `=` and `NatNum(x)`, where `NatNum` obeys the axioms:
  `NatNum(0).`
  `forall n, NatNum(n) => NatNum(S(n)).`
- So `0`, `S(0)`, `S(S(0))`... are natural numbers
- To constrain the behavior of `S` we need to add the axioms:
  `forall n, 0 ne S(n).`
  `forall m,n, m ne n => S(m) ne S(n)`.
Set theory:
- Has one constant `O/` (the empty set) and one predicate symbol `in` (element of)
- No function symbols
- The knowledge base for set theory consists of formulas like `neg( O/ in O/)`, `exists x neg(x = O/)`,... In English, the axioms of set theory (ZFC) express in addition that: any non-empty set contains an element which is disjoint from that set (foundation); sets with the same elements are the same set (extensionality); there exists a set which is closed under successor sets (infinity); ordered pairs of sets exist; the union of a set of sets exists; cartesian products of sets exist; the power set of a set exists; the range of a definable function whose domain is a set is itself a set; and any set can be well ordered.
- `=` abbreviates `forall z (z in x <=> z in y) ^^ forall w (x in w <=> y in w)`. I.e., `x` and `y` have the same elements and are elements of the same sets.
- For a set `x`, the successor set of `x` is defined to be `x cup {x}`.
The book gives an example of reformulating the Wumpus World in the first-order setting.
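The natural-number knowledge base above can be sketched in Python. This is only an illustration, assuming the intended reading in which the natural numbers are exactly `0`, `S(0)`, `S(S(0))`, ...; the names `ZERO`, `S`, `nat_num`, and `to_int` are our own and not part of the axioms.

```python
# Terms built from the constant 0 and the function symbol S, as nested tuples.
ZERO = "0"


def S(n):
    """Apply the successor function symbol to a term."""
    return ("S", n)


def nat_num(term):
    """NatNum holds of 0, and of S(n) whenever it holds of n."""
    while isinstance(term, tuple) and len(term) == 2 and term[0] == "S":
        term = term[1]
    return term == ZERO


def to_int(term):
    """Interpret a numeral S(...S(0)...) by counting its S symbols."""
    count = 0
    while term != ZERO:
        term = term[1]
        count += 1
    return count


two = S(S(ZERO))
print(nat_num(two), to_int(two))
```

Because terms are compared structurally, `S(n)` is never equal to `0`, and distinct numerals have distinct successors, which mirrors the two axioms constraining `S`.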

KBs, First-Order Proofs

Write `KB |== F` to mean that for all structures `M` such that `M |== KB`, we also have `M |== F`.
Proofs in 1st Order Logic are very much like proofs in propositional logic except now we have some additional axioms.
For example, some possible additional axioms might be:
`(forall x) (neg P) => neg((exists x) P)`
(if every pig cannot fly, then there is no pig that can fly)
`neg(forall x P) =>(exists x) neg P`
(if not every horse is blue, then there is a horse that is not blue)
`A(t) => (exists x) A(x)`
`(forall x) A(x) => A(t)`
etc. In the above, `t` is an arbitrary term, and `x` is a variable.
In many ways, one can treat the behavior of `forall x F(x)` as a big AND over the substitution of values `x` could take in the formula `F`.
Similarly, one can treat the behavior of `exists x F(x)` as a big OR over the substitution of values `x` could take in the formula `F`.
As with OR and AND we can have `exists` and `forall` introduction and elimination rules of inference:
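Exists Introduction: `frac(A(t))((exists x) A(x))`
Forall Introduction: `frac(A(y))(forall x A(x))`, where `y` does not occur free in any assumption used to derive `A(y)`.
Forall Elimination: `frac(forall x A(x))(A(t))`
Exists Elimination: `frac((exists x) A(x) qquad A(y) => gamma)(gamma)`, where `y` does not occur free in `gamma` or in any other assumption.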
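To make the "big AND / big OR" reading concrete, here is a minimal Python sketch over a finite universe. The domain `range(5)` and the sample predicates are hypothetical choices made only for illustration; the sketch checks the first two example axioms above on each predicate.

```python
def forall(domain, pred):
    """forall x F(x) as a big AND over the domain."""
    return all(pred(x) for x in domain)


def exists(domain, pred):
    """exists x F(x) as a big OR over the domain."""
    return any(pred(x) for x in domain)


def implies(a, b):
    return (not a) or b


domain = range(5)  # hypothetical finite universe {0, 1, 2, 3, 4}
predicates = {
    "even": lambda x: x % 2 == 0,   # true of some elements
    "negative": lambda x: x < 0,    # true of no element
    "small": lambda x: x < 10,      # true of every element
}


def check_axioms(P):
    # (forall x)(neg P) => neg((exists x) P)
    ax1 = implies(forall(domain, lambda x: not P(x)), not exists(domain, P))
    # neg(forall x P) => (exists x) neg P
    ax2 = implies(not forall(domain, P), exists(domain, lambda x: not P(x)))
    return ax1 and ax2


results = {name: check_axioms(P) for name, P in predicates.items()}
print(results)
```

This only tests the axioms on one finite structure; it illustrates the reading of the quantifiers and is not a proof of validity.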

Soundness and Completeness

As in the propositional case, one can show that for first-order logic, if `Gamma |-- F` then `Gamma |== F`, where the proof system is extended by axioms for exists and forall like those on the last slide. (Soundness)
Moreover, given a good initial set of propositional axioms, Gödel (1930) showed that whenever `Gamma |== F` holds, our proof system can in fact prove it: `Gamma |-- F`. (Completeness)
It is possible to do model checking in the first-order setting if the universe is finite, but the time complexity is at least as bad as in the propositional setting.
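As a rough illustration of why this is expensive, the sketch below enumerates every interpretation of a single unary predicate `P` over a fixed finite universe. The helper names (`models`, `entails`, `all_p`, `some_p`) and the three-element universe are assumptions made for the example. Note that it only checks structures with this one universe, so it is weaker than `KB |== F` in general.

```python
from itertools import product


def models(universe):
    """Yield every interpretation of one unary predicate P, as a set."""
    for bits in product([False, True], repeat=len(universe)):
        yield {x for x, b in zip(universe, bits) if b}


def entails(universe, kb, query):
    """Check that query holds in every model of kb over this universe."""
    return all(query(universe, P) for P in models(universe) if kb(universe, P))


def all_p(universe, P):
    """forall x P(x)"""
    return all(x in P for x in universe)


def some_p(universe, P):
    """exists x P(x)"""
    return any(x in P for x in universe)


universe = ["a", "b", "c"]
num_models = sum(1 for _ in models(universe))
print(num_models)
print(entails(universe, all_p, some_p), entails(universe, some_p, all_p))
```

With `n` elements there are `2^n` interpretations of one unary predicate alone; a binary predicate would already give `2^(n^2)`.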
So let's consider for a moment how we could make a theorem prover for first-order logic.

Quiz

Which of the following is true?

A Horn clause must have exactly one positive literal.
Under CWA, things we don't know about we assume are false.
It is impossible to come up with a model `M` such that `M |== (exists x) (1+1)cdot x = 1+1+1`.

Theorem Proving in First-Order Logic

The basic idea is we want to reduce the first-order case to the propositional case. To do this:

Convert each formula in the set of formulas `KB cup {neg alpha}`, using some of the operations previously described, into prenex normal form:
`forall vec(x) exists vec(y) forall vec(z) ... G(vec(x), vec(y), vec(z), ...)`
where `G` does not involve quantifiers and may involve `neg, vee, wedge`. The reason why we are working with sets rather than the single formula `KB wedge neg alpha` is to allow for the case where KB is an infinite set of formulas.
Introduce new function symbols to replace the existentially quantified variables in `forall exists` blocks. For example,
`forall x_1 forall x_2, exists y_1, exists y_2, exists y_3 G(x_1, x_2, y_1, y_2, y_3)`
transforms to
`forall x_1 forall x_2,G(x_1, x_2, f_1(x_1, x_2), f_2(x_1, x_2), f_3(x_1, x_2))`
This process is called Skolemization. Given a formula `A`, we denote its Skolemization by `A^S`. For a set of formulas `Gamma`, applying Skolemization to each formula yields a set `Gamma^S`. One can show that, for any formula `B` that does not use the new function symbols, `Gamma |== B` iff `Gamma^S |== B`, so if `KB cup {neg alpha}` has a refutation then so does `(KB cup {neg alpha})^S`.
If all variables are bound, and only by universal quantifiers, we can drop the quantifiers and work with an open formula whose variables are all free.
Once we have open formulas without any quantifiers, convert them to CNF and try to use the resolution algorithm.
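The Skolemization step can be sketched in Python for a formula already in prenex form. The representation is an assumption made for this example: the quantifier prefix is a list of `(quantifier, variable)` pairs, variables are strings, and the matrix is a nested tuple `(symbol, arg, ...)`. The names `skolemize`, `substitute`, and the generated `f_1, f_2, ...` are hypothetical.

```python
from itertools import count


def substitute(expr, mapping):
    """Replace variables (strings) in a nested-tuple expression."""
    if isinstance(expr, str):
        return mapping.get(expr, expr)
    # expr[0] is a predicate or function symbol and is left untouched.
    return (expr[0],) + tuple(substitute(a, mapping) for a in expr[1:])


def skolemize(prefix, matrix, fresh=None):
    """Replace each existential variable by a new function symbol applied
    to the universally quantified variables that precede it."""
    if fresh is None:
        fresh = count(1)
    universals, mapping = [], {}
    for quantifier, var in prefix:
        if quantifier == "forall":
            universals.append(var)
        else:
            name = f"f_{next(fresh)}"
            # With no preceding universals we get a fresh constant.
            mapping[var] = (name, *universals) if universals else name
    return [("forall", v) for v in universals], substitute(matrix, mapping)


prefix = [("forall", "x_1"), ("forall", "x_2"),
          ("exists", "y_1"), ("exists", "y_2"), ("exists", "y_3")]
matrix = ("G", "x_1", "x_2", "y_1", "y_2", "y_3")
new_prefix, new_matrix = skolemize(prefix, matrix)
print(new_prefix)
print(new_matrix)
```

When Skolemizing a whole set of formulas, one would pass a shared `fresh` counter so that different formulas never reuse the same new function symbol.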

Some remarks on the above:

How do we handle formulas with just existential quantifiers and no universal ones? Answer: we view these as functions from no arguments to a value. I.e., we replace the existentially quantified variables with fresh constants.
How are the new function symbols related to the existing terms in the language before adding the function symbols? In general, a new function symbol `f` is connected to existing function and predicate symbols in the language but won't necessarily be expressible as a term in the original language. For example, Skolemizing an induction axiom:
`A(0, vec a) wedge forall x(A(x, vec a) => A(S(x), vec a)) => A(t(vec a), vec a)`
might yield a function depending on the predicate `A` which, when given a `t(vec a)` such that `A(0, vec a) wedge neg A(t(vec a), vec a)` holds, searches for a value `v` as a function of `vec a` such that `A(v, vec a)` holds but `A(S(v), vec a)` does not. This search operation might not be expressible as a single term in the original language.
To do resolution, we might now also need to come up with a substitution list for two atomic formulas in `G`, say `F(vec(t))` and `F(vec(s))` with terms `t_1, ..., t_n` and `s_1, ..., s_n`, so that after applying the substitution they become the same formula and we can resolve on them. Unification is an algorithm for doing this.

Unification Algorithm

If x is a list let head(x) denote the first element of the list, x[0], and tail(x) denote x[1:]. The code below relies on Unify-var which is on the next slide.

Unify(x, y, S)
x - a variable, constant, term or list 
y - a variable, constant, term, or list 
S - a substitution so far
if(S == fail) return fail
if (x(S) == y(S)) return S
else if ( var? (x))
    return Unify-var(x, y, S)
else if ( var? (y))
    return Unify-var(y, x, S)
else if ( term? (x) and term? (y))
    return Unify(args(x), args(y), Unify(op(x), op(y), S))
else if (list? (x) and list? (y) )
    return (Unify(tail(x), tail(y), Unify(head(x), head(y), S)))
else
    return fail

As an example of what I mean by op and args, consider x := ((z*z) + 35). Here op is the top operation, so op(x) := a + b, as + is the outermost function symbol. Here a, b are new variables that don't appear elsewhere. args(x) := (z*z, 35), so it contains two terms: the left and right operands of +.
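One possible way to represent this in Python is with nested tuples whose first entry is the function symbol. This is a simplifying assumption: here `op` returns just the symbol `"+"` rather than the pattern a + b with fresh variables, which serves the same purpose when comparing the outermost symbols of two terms.

```python
def op(term):
    """Outermost function symbol of a compound term."""
    return term[0]


def args(term):
    """List of argument terms of a compound term."""
    return list(term[1:])


# x := ((z*z) + 35), written as nested tuples (symbol, arg, ...)
x = ("+", ("*", "z", "z"), 35)
print(op(x), args(x))
```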

Unify-var

Here Unify-var is defined as:

Unify-var(var, y, S)
var - a variable
y - an expression 
S - a substitution
if(var |-> val) exists in S 
    return Unify(val, y, S)
else if (y |-> val) exists in S 
    return Unify(var, val, S)
else if (occur-ck? (var, y)) return fail
else
    return ( append((var |-> y), S))

Suppose y is f(var). Then var occurs in y, so occur-ck?(var, y) is true and Unify-var returns fail. (Note: by default, Prolog doesn't do occurs checks.)
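The two procedures can be put together in a short Python sketch. The representation is an assumption made for this example and not part of the pseudocode: variables are strings beginning with `?`, compound terms are tuples `(symbol, arg, ...)`, a substitution is a dict, and `None` plays the role of fail. The helper `apply_subst` is an extra, added only to check the result.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("?")


def unify(x, y, s):
    """Return a substitution extending s that unifies x and y, or None."""
    if s is None:
        return None
    if x == y:
        return s
    if is_var(x):
        return unify_var(x, y, s)
    if is_var(y):
        return unify_var(y, x, s)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        # Unify the symbols and then the arguments, threading s through.
        for xi, yi in zip(x, y):
            s = unify(xi, yi, s)
        return s
    return None


def unify_var(var, y, s):
    if var in s:
        return unify(s[var], y, s)
    if is_var(y) and y in s:
        return unify(var, s[y], s)
    if occurs(var, y, s):
        return None  # occurs check: var appears inside y
    return {**s, var: y}


def occurs(var, t, s):
    if t == var:
        return True
    if is_var(t) and t in s:
        return occurs(var, s[t], s)
    if isinstance(t, tuple):
        return any(occurs(var, a, s) for a in t[1:])
    return False


def apply_subst(t, s):
    """Apply substitution s to term t, following chains of bindings."""
    if is_var(t):
        return apply_subst(s[t], s) if t in s else t
    if isinstance(t, tuple):
        return (t[0],) + tuple(apply_subst(a, s) for a in t[1:])
    return t


lhs = ("F", "?x", ("g", "?y"))
rhs = ("F", "a", ("g", "?x"))
subst = unify(lhs, rhs, {})
print(subst)
print(unify("?x", ("f", "?x"), {}))
```

The last call exercises the occurs check: unifying a variable with a term that contains it fails instead of producing a cyclic binding.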

Finish First-order Logic Overview

Outline