Take the phrase “form over function”. It’s used as a pejorative to mean that the design of something is undeservedly given precedence over that thing’s ability to get work done. Yet this is not always true. In fact, form might deserve to take precedence.

Let’s take a person P and two imaginary terms: zetafring and insigmal. P doesn’t know what either term means, but is told that the zetafring is insigmal to X, and that it’s the only thing that is. Now P can answer some questions on this topic. For instance, P can say what’s insigmal to X, whether or not the quirbirt is insigmal to X, and so on.

In this case, P knows the structure of the topic and thus can get by via navigation — the terms are irrelevant. This may seem shallow knowledge: it’s the knowledge associated with superficial tests, with sleep-walking through conversations and with “AI” parlor games like ELIZA (which nonetheless fooled some people).

Yet this knowledge can be powerful. In fact, “formal” analysis means exactly the analysis of form, and it is considered important enough that fields like logic and math are founded on it, while many other fields aspire to it.

For example, take a logic programming language like Prolog. Prolog can implement expert systems, play games and do quite a bit of computational work via formal analysis. Take this Prolog program (you can experiment with Prolog here):

```prolog
parent(john, frank).
parent(john, mary).
parent(frank, jane).

grandparent(X, Y) :- parent(X, Z), parent(Z, Y).
```

This creates a formal system consisting of definitions or axioms (parent) and a derivation rule (grandparent). The parent axioms create the following tree:

*Notice how the terms themselves (e.g.: parent) aren’t needed. The form of this knowledge (the tree) drives inferences.*

Now the system can give you the parent of a given child, the child of a given parent, the grandparent of a given grandchild, the grandchild of a given grandparent, whether any parent-child or grandparent-grandchild pair is valid, and even all such pairs. Yet despite all this, it knows nothing about these concepts, since the terms are just placeholders for the relations.
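Since only the structure matters, the same inferences can be sketched in a few lines of any language. Here is a minimal Python sketch (my own illustration, not the Prolog engine) that derives grandparent pairs purely by joining the parent pairs, with no idea what “parent” means:

```python
# The parent facts as bare pairs; "parent" could be any label at all.
parent = {("john", "frank"), ("john", "mary"), ("frank", "jane")}

def grandparents():
    # Join parent with itself: X -> Z and Z -> Y yields X -> Y.
    # Pure structural navigation, no semantics involved.
    return {(x, y) for (x, z) in parent for (z2, y) in parent if z == z2}

print(grandparents())              # {('john', 'jane')}
print(("john", "mary") in parent)  # True: a valid parent-child pair
```

Swap the tuples for limbs and branches and the code runs unchanged, which is the point.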

In fact, any terms can be used in place of those in the axioms above, and as long as the result is structurally identical, the formal system is just as valid. For instance, what if we replaced parent and grandparent with tree-related terms denoting limbs and branches? Incidentally, this also gives us another way to look at metaphors, but I digress.

This means the formal system is very generic — capable of handling a wide array of objects. Which brings us to math.

The formalist school of math (which long predated Prolog) considered math a symbol derivation game; symbols and rules were provided and one inferred based on the rules. Axioms were not true or false, but taken as givens, and the system was checked for inconsistencies. This lack of content may be why math is so powerful; it focuses on relations rather than objects, which means any object with the right form can be modeled. Sound familiar? To get a taste of axiomatic systems, check out Hilbert’s Axioms of Geometry, Euclid’s Axioms and Peano’s Axioms of Arithmetic.
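As a taste of how little content such axioms need, here is a common modern rendering of Peano’s axioms: the symbols 0 and S (“successor”) are given no meaning beyond the rules themselves, yet arithmetic can be built from them.

```latex
\begin{align*}
&\text{(P1)}\quad 0 \in \mathbb{N} \\
&\text{(P2)}\quad n \in \mathbb{N} \Rightarrow S(n) \in \mathbb{N} \\
&\text{(P3)}\quad \forall n,\; S(n) \neq 0 \\
&\text{(P4)}\quad S(m) = S(n) \Rightarrow m = n \\
&\text{(P5)}\quad \bigl(P(0) \wedge \forall n\,(P(n) \Rightarrow P(S(n)))\bigr) \Rightarrow \forall n\, P(n)
\end{align*}
```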

It seems form can take us very far. Yet, do we even need that much?

In the heyday of AI, a lot of work went into knowledge representation. For instance, natural language processing focused on structures to represent sentence meanings; in fact, this is a niche languages like Prolog were meant to exploit. Unfortunately, this method yielded limited successes. Then statistics was employed and excelled where the knowledge representation method failed. This troubled some people.

Instead of structuring knowledge (e.g.: via grammars), text was analyzed statistically, taking into account things like word collocations and phrase frequencies to drive the appropriate inferences. As such, one doesn’t really understand the data, or at least not in the intuitively satisfying way many people take “understand” to mean.
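As a toy illustration of the statistical route (a made-up miniature of mine, not a real NLP system), one can count which word follows which in a corpus and make predictions with no grammar or meaning in sight:

```python
from collections import Counter

# Toy corpus, invented for illustration; no parsing, no semantics,
# just counting which word tends to follow which.
corpus = "the cat sat on the mat the cat ate the fish".split()

bigrams = Counter(zip(corpus, corpus[1:]))

def most_likely_next(word):
    # Pick the most frequent follower of `word` in the corpus.
    candidates = {b: n for (a, b), n in bigrams.items() if a == word}
    return max(candidates, key=candidates.get)

print(most_likely_next("the"))  # 'cat' (follows "the" twice)
```

The prediction is driven entirely by frequencies; the program has no notion of what a cat is.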

To get a more concrete idea of this, here is a (not necessarily statistical) example. Assume you are given a set of X & Y data, where X represents consumption levels of some compound and Y the corresponding skin tone. You are then asked to produce a theory relating Y to X. One way would be to simply produce a formula relating the observed data. This can be done by plotting the (X,Y) values and fitting a curve to them:

*Points plotted on a curve fitting program and yielding the formula y = -7.75 + 22.85x - 2.75x^2*

In the above, the blue points are observations, and the red curve is a “theory”, represented by the formula above. The curve is not a perfect match, but it may be good enough (and can be refined with a more complex formula). Predictions can be made by plugging X into the formula (theory).

Yet, this curve was not generated through any understanding of the relationship between the real world entities corresponding to the blue dots. Rather, the positions of the dots themselves were entered into a generic curve fitting program which performed a numeric analysis and yielded a formula.
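The fitting step itself can be sketched in a few lines. In this hypothetical example, three points are generated from the fitted formula above (y = -7.75 + 22.85x - 2.75x^2) and the quadratic’s coefficients are recovered by plain linear algebra, with no understanding of compounds or skin tones anywhere:

```python
# Hypothetical observations, generated from y = -7.75 + 22.85x - 2.75x^2.
points = [(1.0, 12.35), (2.0, 26.95), (3.0, 36.05)]

def fit_quadratic(pts):
    # Build the system [1, x, x^2] . [a, b, c] = y for each point and
    # solve by Gaussian elimination: pure numeric form-fitting.
    m = [[1.0, x, x * x, y] for x, y in pts]
    n = 3
    for i in range(n):
        pivot = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[pivot] = m[pivot], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (m[i][n] - s) / m[i][i]
    return coeffs  # [a, b, c]

a, b, c = fit_quadratic(points)
print(round(a, 2), round(b, 2), round(c, 2))  # -7.75 22.85 -2.75
```

The solver treats the dots as nothing but numbers; it would fit astronomical data or stock prices with equal indifference.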

This method is not statistical if the relationship between X & Y is deterministic; however, even then, this method/model may upset some people. Now assume a non-deterministic relationship between X & Y: that’s like adding insult to injury. In fact, some people consider the use of statistics an admission of failure.

So we went from what many people consider knowledge, to an argument of empty form as knowledge, to an apparent abandonment of even that.

What’s left?

**Questions**

- What is knowledge?
- What is the simplest creature, device or process that can have knowledge?
- Does statistics undermine traditional knowledge?

What is knowledge indeed. Ask Theaetetus. On form over function, that plays out differently in various contexts. In the late 1800s someone invented a spout for kettles that never dripped. It functioned perfectly, but it wasn’t much to look at. You won’t find one in a store today. I hope I haven’t been too disagreeable.


As for being “disagreeable”, the point is to share ideas, so I’m definitely interested in dissent. I am not too sure what knowledge is, which is why this article only covered a subset, and was a bit ambiguous (in terms of where I lean) at that.

Brilliant! Sorry I took so long to read this. I’ll try to answer the three questions in discursive mode. I think the point you’re trying to make (if I’m not mistaken) is that knowledge and statistical inference are, after all, one and the same thing, and that people attribute some higher meaning to knowledge and thus get upset when it is reduced to form manipulation or, even worse, to Bayesian inference. However, everything points to our brain just being a gigantic Bayesian processor. Hence, I have no problem with the notion of knowledge being nothing but the statistical processing of sensory data in order to construct predictive models (in our case, models that at least originally were supposed to aid survival).

The fact that we “feel” this inference as knowledge has to be related to the fact that these inferences seem like motive enough to act, thus qualifying as a belief according to your previous article.

Thank you, and don’t worry about how long it takes; I appreciate that you read my articles and reference them 🙂

I’m struggling with the question of what knowledge is, so this article took one path through that question. However, I agree with most of your summary. I would say that statistical inference is a type of knowledge, but whether knowledge and statistical inference are identical… well, I can’t say.

Yes, I thought it significant that the brain is thought by some to be a statistical processing device, which neatly explains why the statistical approach to AI has worked so well :). Also, if this model holds out to be true, then perhaps knowledge and statistical inference ARE identical?

Nice tie in with the previous article!

Enjoyed this BR, but somehow didn’t quite grasp the message. Is all knowledge inferential? For Aristotle true knowledge is identical with its object. Tough topic though.

The message was to question the distinction and value judgments between form/function and true/surface understanding (not in a spiritual sense). If a superficial understanding can lead to genuine gains, then in what sense is it superficial? If analyzing the form of something can take us ahead (more so than taking “meaning” into account), as in logic or math, then how do we justify valuing one over the other?

Oh okay. Got it.

In a sense, the knowledge that humans have of language is simply rule-based. We learn a set of rules (to turn a singular into a plural, add s), and common exceptions (the plural of man is men), plus a huge number of specific facts (a dog is a four-legged creature that barks). This is mostly form, surely?

A statistical approach is different, but may be more elegant.

By facts being form, do you mean something like propositions where truths are related to each other syntactically?

I agree that rules are a part of language, but I also think there’s a statistical element (e.g.: learning idioms, collocations, etc.). However, some of the statistical element might be a function of words having their own grammar (i.e.: we’re applying grammatical rules to the wrong things sometimes).

Also, while I wrote that statistical models have done well, this doesn’t mean that language itself is (purely) statistical. It may be that the statistics approximate the very complex rules involved — rules we don’t even fully understand.

These rules might not even exist in the form we think. If we’re really wired to understand certain language structures, then the right level of rules would be something much lower level than what we’re using to describe language.

I haven’t thought about it in sufficient detail to be pinned down on specifics 🙂 It’s a complex subject.