Posts tagged with fuzzy logic

## Hard Boundaries, Soft Boundaries

Computer scientists use filters, ≥ signs, intersections (`sql`), and other forms of what I would call “hard boundaries”.

• `grep` either finds what you’re looking for, or it doesn’t.
• The condition inside your `while(){` loop either trips `true` and the interior code runs, or it trips `false` and it’s skipped.
• You either crawled a webpage, or you didn’t.
• In exploring a code tree or other graph, you either look at the node, or you don’t.
• Two people either are Facebook friends, or they aren’t.
• The tweet either included a word from this list, or it didn’t.

But, one needn’t be so conceptually constrained. Thinking in a fuzzy logic sense, it’s possible to create a “soft” boundary.

To use a classic example from Bart Kosko's book, although the American legal system imposes a “hard boundary” on adulthood (OK, a series of hard boundaries: 16, 18, 21, 25), one really passes into adulthood gradually over time. (Unless you have your first kid at 16, in which case you grow up real quick. But I’m talking about the upper-middle-class college-enrolled set here: most of them grow up slowly.)

That’s nice in a philosophical, contemplative way. But can we use the soft-boundary concept for anything useful? I think so.

For example, in this neo4j video (minute 5), Marko Rodriguez gives us the following line of Gremlin code:

`g.v(1).outE.filter{it.label=='knows' & it.since > 2006}.count()`

We could either be naïve about this and treat 2006 as a hard boundary, or make it a variable and perform sensitivity analysis. In fact, any time we see a number we could turn it into a parameter, ending up with a whole list of parameters. We could poke about in that parameter space and by doing so get a better idea of the shape of things than by setting a naïve tripwire.
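As a rough illustration of that sensitivity analysis, here is a minimal Python sketch that sweeps the year instead of fixing it at 2006. The edge list and the helper names (`count_knows_since`, `sweep`) are made up for the example; this is not Gremlin and not anything from the video.

```python
# Hypothetical toy data: (label, since) pairs standing in for the out-edges of vertex 1.
edges = [
    ("knows", 2003), ("knows", 2005), ("knows", 2006),
    ("knows", 2007), ("knows", 2009), ("created", 2008),
]

def count_knows_since(edges, year):
    """Count 'knows' edges whose 'since' value is strictly after `year`."""
    return sum(1 for label, since in edges if label == "knows" and since > year)

def sweep(edges, years):
    """Evaluate the same filter at every candidate threshold."""
    return {year: count_knows_since(edges, year) for year in years}

# Instead of one answer at 2006, look at the whole profile of answers.
profile = sweep(edges, range(2002, 2010))
for year, n in profile.items():
    print(year, n)
```

Reading the whole profile shows whether the answer is stable around 2006 or whether the count happens to change right at that tripwire.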

Is there a design pattern for this?

Notice also his gremlins can “be” on multiple nodes at once. That’s certainly not a binary data structure to the codomain. Other non-binary aspects to his graphs:

• different words (“coloured edges” in graph parlance) like “speaks”, “has worked with”, “had a child with” — all of the richness and drama of Quine’s ontology of language wrought in the connectome of the graph
• the network structure itself
• and of course edge weights

Here’s an example from Unix for Poets:

`cat bible | grep Abel | uniq -c`

So-called “bright lines” appear also in the law (married vs not), statistical regression (dummy/indicator variables), and tax brackets (under \$15,000.00 or ≥ \$15,000.01).

They’re frustrating because they’re discontinuous. (Actually, tax brackets themselves are continuous; it’s their first derivative that is discontinuous.)

Imagine the following (non-existent, stupid) tax system:

• If you make under £30,000/year you pay no tax.
• If you make ≥ £30,000/year you pay 50% tax on every pound you made (all the way down to £0.01).
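To make the cliff concrete, here is a minimal sketch of that hypothetical system, treating the threshold as exactly £30,000. The bracket-style `marginal_tax` is my own addition for comparison, and both function names are invented for the example.

```python
THRESHOLD = 30_000  # pounds per year, from the hypothetical system above
RATE = 0.5

def cliff_tax(income):
    """The 'stupid' system: nothing below the threshold, 50% of everything at or above it."""
    return RATE * income if income >= THRESHOLD else 0.0

def marginal_tax(income):
    """A bracket-style alternative (an assumption for comparison): 50% only on the excess."""
    return RATE * max(0.0, income - THRESHOLD)

# Print income, take-home under the cliff system, take-home under the bracket system.
for income in (29_999, 30_000, 30_001):
    print(income, income - cliff_tax(income), income - marginal_tax(income))
```

Under the cliff version, earning one extra pound at the threshold cuts take-home pay roughly in half; under the bracket version the tax owed is continuous and only its slope jumps.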

It’s frustrating because it’s discontinuous. I might not go as far as to say that continuity, smoothness, holomorphicity, analyticity and so on are “natural to the human mind” — if in fact we can just take a monolithic view on “the” mind — but continuity and smoothness certainly seem—to me and to other mathematical writers I’m thinking of—like they’re more fair, just, or sensible.


Imagine you’re trying to catch an email spammer, and you’ve determined that the character ! is a good trigger for spam. You could either

1. set a hard boundary: more than 3 !’s, flagged for spam; ≤ 3 !’s, not flagged
2. or you could count the number of !’s in the text

The latter approach is more flexible:

• you can change the parameter 3 to something else
• you can pass the count through a function (like a sigmoid, monotone convex or monotone concave function, or the cumulative-prospect-theory function)

• As in minute 14 of this d3.js video you can add (something like a) “blending” parameter
• you can set a known algorithm (like logistic regression) to find the optimal parameter value for you
• you can combine the ! count with other variables (like counts of the word herbal or counts of the forenames of people in the mail user’s address book)
• you can combine the ! count with other variables and use a known algorithm (like a backprop net) to set all the optimal values for you
• maybe you can find a way to half-instantiate your desired response when the count is “at half mast” or “in a middling range”.
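The difference between the two approaches can be sketched in a few lines of Python. The names `hard_flag` and `soft_score`, and the `midpoint` and `steepness` parameters, are illustrative assumptions rather than tuned values from any real spam filter.

```python
import math

def hard_flag(text, limit=3):
    """Hard boundary: more than `limit` exclamation marks means spam."""
    return text.count("!") > limit

def soft_score(text, midpoint=3.0, steepness=1.0):
    """Soft boundary: squash the raw count through a logistic (sigmoid) curve.

    `midpoint` and `steepness` are illustrative parameters, not tuned values.
    """
    count = text.count("!")
    return 1.0 / (1.0 + math.exp(-steepness * (count - midpoint)))

# Compare the yes/no answer with the graded score on a few made-up messages.
for msg in ("Hello there", "Buy now!!!", "FREE!!! HERBAL!!! PILLS!!!"):
    print(hard_flag(msg), round(soft_score(msg), 3), msg)
```

The score can then be thresholded later, blended with other signals, or handed to a fitting procedure such as logistic regression to choose `midpoint` and `steepness` from data.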

Back to catching spammers, I drew up an idea for tumblr to catch its spammers a while ago. I noticed a few telltale markers of spam accounts:

• quick liking in succession
• squatting on a hashtag
• high number of likes
• no / low content in the title
• at first the spammers were not reblogging stuff (now they not only reblog but post fakey “original” looking text posts … that’s counter-evolution for ya) so they usually had no posts on their blog page
• ads on the sidebar

They opted for social proof (let people “block” spammy likers from their dashboard and flag them as suspected spammers), which seems to have worked out very well. So I’m not saying “soft boundaries are always better” or something — just that if a “hard boundary” is preventing you from thinking about a problem like you want to, you can get around it pretty easily!


I think computer scientists do use soft boundaries, although they might not draw the same analogy to the “crisp” > sign as I am.

• tag clouds don’t just count words: they increase the display size of the word depending on how large the count is (maybe the `sqrt` of the count?). That tag clouds count many different words, rather than one, could also be construed as a “coloured” codomain.
• you don’t just return a webpage or not return a webpage in your crawler. You might get a 404, or you might get a 302. Or you might get a 200, 500, 303, 504, and so on. Additionally the page might be in HTML, JSON, or might simply flip a switch (“turn on my remote TV recording device”).
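For the tag-cloud case, a square-root scaling might look like the sketch below. I don’t know what any particular tag-cloud library actually does; `font_size`, the pixel range, and the word counts are all assumptions for illustration.

```python
import math

def font_size(count, max_count, min_px=10.0, max_px=40.0):
    """Map a word count to a display size using sqrt scaling (one possible choice)."""
    if max_count <= 0:
        return min_px
    return min_px + (max_px - min_px) * math.sqrt(count / max_count)

# Made-up counts: the most frequent word gets max_px, rarer words shrink smoothly.
counts = {"fuzzy": 100, "logic": 25, "cloud": 4}
biggest = max(counts.values())
sizes = {word: font_size(n, biggest) for word, n in counts.items()}
print(sizes)
```

The square root compresses the range, so a word that is 25 times more frequent is drawn larger without being 25 times larger.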

Business people (I’ve found) think naturally in terms of soft boundaries as well. If your client / boss is using the word “score” you can mimic that directly with what I’m calling a “soft boundary”.

All you’ve got to do is make up a functional that “measures stuff” any way you want, and slide your > sign along the resulting smooth scale.

I learned about Zadeh’s fuzzy logic when I was a graduate student…despite the intrinsic interest of the idea, there didn’t seem to be any really impressive results….

When I first heard about “fuzzy logic” control systems (…about 20 years ago — before Google or Wikipedia), I was puzzled. What exactly does the degree of truth of statements have to do with algorithms for controlling trains or elevators? When I asked this question after a dog-and-pony show at a Japanese research lab in the mid-1980s, I got answers … repeating what I already knew about fuzzy logic, without adding anything convincing about the application to control theory.

It sounded to me like technological double-talk. I was sure that the engineers were doing something relevant to control in complicated situations, but the “fuzzy logic” label seemed like a flack’s evocative slogan for a variety of different technologies that didn’t seem to have anything much to do with logic, fuzzy or otherwise.
A friend with a background in chemical engineering set me straight. His explanation went something like this: Standard control systems are linear. That means that controllable outputs (heating, accelerating, braking, whatever) are calculated as a linear function of available inputs (time series of temperature, velocity, and so on).

Linearity makes it easy to design such systems with specified performance characteristics, to guarantee that the system is stable and won’t go off into wild oscillations, and so on. However, the underlying mechanisms may be highly non-linear, and therefore the optimal coefficient choices for a linear control system may be quite different in different regions of a system’s space of operating parameters.

One possible solution is to use different sets of control coefficients for different ranges of input parameters. However, the transition from one control regime to another may not be a smooth one, and a system might even hover at the boundary for a while, switching back and forth.

So the “fuzzy control” idea is to interpolate among the recipes for action given by different linear control systems. If the measured input variables put us halfway between the center of state A and the center of state B, then we should use output parameters that are halfway between state A’s recipe and state B’s recipe. If we’re 2/3 of the way from A to B, then we mix 1/3 of A’s recipe with 2/3 of B’s; and so on.
In the case of the four stages of rice cooking, I suppose that a fuzzy logic controller is able to treat the process as a series of fuzzy or gradient transitions rather than a series of hard, stepwise transitions. … a vaguely analogous method to fit a smoothed piecewise linear model to data about oil recovery as a function of various independent variables, including oil field “age”.

In both cases, the fuzzy approach might well be appropriate, under whatever name (though here’s an alternative story about heating control…).

… And indeed even plain fuzzy is by no means an entirely positive word. When George Bush famously accused Al Gore of “disparaging my [tax] plan with all this Washington fuzzy math”, it was not a warm fuzzy moment.

[Update: Fernando Pereira emailed

Petroleum geologists have been pioneers on pretty sophisticated spatiotemporal estimation and smoothing techniques, for instance kriging (aka Gaussian process regression for statisticians). There are tight connections between GP regression and spline smoothing (via the theory of reproducing kernel Hilbert spaces). Either the Saudis are not hiring the best petroleum geologists, or they are being deliberately obfuscating with marketroid talk. I can’t think of any situation in which fuzzy ideas (pun intended) would be preferable to Bayesian statistics for inference.

…]

[Update 2: A review article by David Abramowitch, with slides.]

Mark Liberman, in When “Fuzzy” Means “Smoothed Piecewise Linear”
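The interpolation Liberman describes (mix 1/3 of A’s recipe with 2/3 of B’s when you are 2/3 of the way from A to B) can be sketched as follows. The two linear control laws, their coefficients, and the regime centers are invented for the example; a real controller would have more inputs and more regimes.

```python
def controller_a(x):
    """Linear control law tuned for regime A (coefficients are made up)."""
    return 2.0 * x + 1.0

def controller_b(x):
    """Linear control law tuned for regime B (coefficients are made up)."""
    return 0.5 * x + 4.0

def membership_b(x, center_a=0.0, center_b=10.0):
    """Fraction of the way from A's center to B's center, clamped to [0, 1]."""
    t = (x - center_a) / (center_b - center_a)
    return min(1.0, max(0.0, t))

def blended_output(x):
    """Mix the two recipes in proportion to how far x sits between the centers."""
    w = membership_b(x)
    return (1.0 - w) * controller_a(x) + w * controller_b(x)

# Sweep the input across both regimes and watch the output change without a jump.
for x in (0.0, 2.5, 5.0, 7.5, 10.0):
    print(x, blended_output(x))
```

Because the weight changes continuously with the input, there is no boundary at which the system can hover and flip between two regimes.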

One cool thing to imagine: the multi-dimensional space of parameters of the control system, i.e. the space of all possible tunings of the knobs, and how a few multi-dimensional charts link together. How do they meet up in this high-dimensional space?

Briefly: the linear regression model. We suppose we can explain or predict y using a vector of variables x. As in Gauß’ estimation theory, y is supposed to be unobservable, and thus has to be estimated. The assumption that y depends on x is expressed this way: the posterior distribution Prob{ Y | X } is different from the prior distribution Prob{ Y }.

The minimization of variance of the difference between [our estimation of Y given X] and [Y] leads to a unique solution: the conditional expectation.

The linear hypothesis says that the estimated value should be an affine expression of X. Moreover, the affine parameters which minimise the variance of the error are given by:

$\begin{matrix} \beta \text{ (linear coefficients)} \ = \ \mathrm{cov}_{X,Y} \cdot {\mathrm{cov}_{X,X}}^{-1} \\ \alpha \text{ (constant term)} \ = \ E[Y] - \beta \cdot E[X] \end{matrix}$

The above linear model coincides with the optimal conditional expectation model when X,Y are Gaussian.
Michel Grabisch, in Modeling Data by the Choquet Integral
(liberally edited)
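For a single explanatory variable, the variance-minimising affine parameters reduce to the covariance of X and Y divided by the variance of X, plus the matching constant term. A minimal sketch in plain Python, with made-up data and my own helper names (`covariance`, `affine_fit`):

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

def affine_fit(xs, ys):
    """Return (alpha, beta) minimising the variance of y - (alpha + beta * x)."""
    beta = covariance(xs, ys) / covariance(xs, xs)
    alpha = mean(ys) - beta * mean(xs)
    return alpha, beta

# Made-up data lying exactly on the line y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
alpha, beta = affine_fit(xs, ys)
print(alpha, beta)
```

With a vector of variables the division becomes multiplication by the inverse of the covariance matrix of X, which this scalar sketch does not attempt.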

### Fuzzy Logic

Not everything is so simple as true or false. Even declarative statements may evaluate outside {0,1}. So let’s introduce the kind-of: truth ∈ [0,1].

Examples of non-binary declarative statements:

• Shooting trap, my bullet nicked the clay pigeon but didn’t smash it. I 30%-hit the mark.
• I’m not exactly a vegetarian. I purposely eat ⅔ of my meals without meat, but, like yogini Sadie Nardini, I feel weak if I go 100% vegetarian. So I’m ⅔ contributing to the social cause of non-animal-eating, and I’m a ⅔ vegetarian.
• I’m sixteen years old. Am I a child, or an adult? Well, I don’t have a career or a mortgage, but I do have a serious boyfriend. This one is going to be hard to assign a single number as a percentage.

So that’s the motivation for Fuzzy Logic. It sounds compelling. But the academic field of fuzzy logic seems to have achieved not-very-much, although there are practical applications. Hopefully it’s just not-very-much-yet (Steven Vickers and Ulrich Höhle have two interesting-looking papers I want to read).

I see three problems which a Sensible Fuzzy Logic must overcome:

1. Implication. Classical logic (“the propositional calculus”) uses a screwed up version of “If A, then B”. It equates “if” to “Either not A, or else B is true, or else both.”

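Fuzzy logic inherits this problem, but also lacks one clear, convincing rule for fuzzy implication. (A “t-norm” is the fuzzy logic word for a fuzzy AND; each t-norm induces its own implication, and there are many t-norms to choose from.) Can you come up with a sensible rule for how this should work?: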
• A implies B, and A is 70% true. How true is B?
• Furthermore, should there be different numbers attached to “implies” ? Should we have “strongly implies” and “weakly implies” or “strongly implies if Antecedent is above 70% and does not imply at all otherwise” ?

You can see where I’m going here. There is an ℵ₂ of choices for the number of possible curves / distributions which could be used to define “A implies B”.

2. Too specific. Fuzzy logic uses real numbers, which include transcendental numbers, which are crazy. Bart Kosko’s book explains FL with familiar two-digit percentages, which are for the most part intuitive. So I can accept that something might be 79% true — but what does it mean for something to be π/4 % true? Or e^e^π^e / 22222222222 % true?

We’re encumbering the theory with all of these unneeded, unintuitive numbers.

3. One-dimensional. For all of the space, breadth, depth, and spaceship adventures contained in the interval [0,1], it’s still quite limited in terms of the directions it can go. That is, [0,1] comprises a total order with an implied norm. Again, why assume distance exists and why assume unidimensionality, if you don’t actually mean to? There are alternatives.

• N/A
• I don’t know
• Sort of
• Yes and no
• It’s hard to say
• I’m in a delicate superposition
A single number in [0,1] cannot express answers like these, or rather it maps effectively different answers onto the same number.

• Sometimes things are both good and bad;
• sometimes they are neither good nor bad;
• sometimes things are not up for evaluation;
• sometimes a generalised function (distribution) expresses the membership better than a single number;
• sometimes the ideas are topologically related or order related but not necessarily distance related;
• sometimes an incomplete lattice might be best.
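Returning to the first gripe, the lack of one agreed rule for “A implies B” is easy to see by evaluating a few textbook fuzzy implication operators on the same inputs. The truth values 0.7 and 0.4 are arbitrary, and this is only a sketch of four common definitions, not an exhaustive list.

```python
def lukasiewicz(a, b):
    """Lukasiewicz implication: min(1, 1 - a + b)."""
    return min(1.0, 1.0 - a + b)

def goedel(a, b):
    """Goedel implication: fully true if a <= b, otherwise the truth of b."""
    return 1.0 if a <= b else b

def goguen(a, b):
    """Goguen (product) implication: fully true if a <= b, otherwise b / a."""
    return 1.0 if a <= b else b / a

def kleene_dienes(a, b):
    """Kleene-Dienes implication: the fuzzy reading of 'not A, or B'."""
    return max(1.0 - a, b)

# Same antecedent and consequent, four different degrees of "A implies B".
a, b = 0.7, 0.4
operators = [lukasiewicz, goedel, goguen, kleene_dienes]
for implies in operators:
    print(implies.__name__, round(implies(a, b), 3))
```

All four agree with classical implication when the inputs are exactly 0 or 1, yet they disagree in between, which is the ambiguity complained about here.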

So those are my gripes with fuzzy logic. At the same time, Kosko’s book was my introduction to an interesting, new way of thinking. It definitely set my mind spinning. For the logical mind that wants a rigorous framework for understanding ambiguity, vagueness, and gray areas, fuzzy logic is a good start.

## Clouds

You can see the “edge” of a cloud from far away so it should be obvious what ∂cloud means. But up close (from an airplane) you can see there is no edge. The mist fades gradually into blue sky.

Here’s another job for Schwartz functions: to define a “fuzzy boundary” that looks sharp from far away but blurred up close. In other words, to map each Cartesian 3-point to a fuzzy inclusion % in the set {this cloud}.

Jan Koenderink, in his masterpiece Solid Shape, notes that a typical European cumulus cloud has density 𝓞(100 droplets) per cm³ (times 16 in inch⁻³). Droplets are 3–30 microns in diameter (thinner than a typical human hair). Typical clouds have a density of 0.4 g/m³, or 674 pounds of water per cubic football field of cloud.

To lift directly from page 508:

What is actually meant by “density” here? Clearly the answer depends on the inner scale or resolution.

At a resolution of 1 μm the density is either that of liquid water or that of air, depending critically on the position within the cloud. At a resolution of ten miles the density is near zero because the sample in the window is diluted.

Both results are essentially useless. The right scale is about a meter, with maybe an order of magnitude play on both sides.

Rather than having just one sharp boundary, ∂cloud is a sequence of level surfaces that enclose a given density at a given resolution. To avoid having to choose an arbitrary resolution parameter, we can define the fuzzy inclusion with a compactly supported Schwartz function (a “bump” function). We get a definite beginning and end (compact support) without going too far into particulars (like the rate of the % dropoff), and this is true at any sensible resolution.

We can’t say exactly where the boundary is, but we can point to a spot in the sky that’s not cloud and we can point to a spot in the sky that is cloud.
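One way to get “definitely cloud here, definitely not cloud there, smooth in between” is a bump function: smooth everywhere and exactly zero outside a finite radius. (A Gaussian is also a Schwartz function, but it never reaches zero exactly.) The sketch below assumes a spherical cloud; the `center`, the `radius`, and the function names are made up.

```python
import math

def bump(r):
    """Smooth bump: 1 at r = 0, exactly 0 for |r| >= 1, infinitely differentiable."""
    if abs(r) >= 1.0:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - r * r))

def cloud_inclusion(point, center=(0.0, 0.0, 0.0), radius=500.0):
    """Degree (0..1) to which a 3-D point belongs to a hypothetical spherical cloud."""
    distance = math.dist(point, center)
    return bump(distance / radius)

# Inclusion fades from 1 at the center to exactly 0 at and beyond the radius.
for x in (0.0, 250.0, 450.0, 500.0, 800.0):
    print(x, cloud_inclusion((x, 0.0, 0.0)))
```

The particular dropoff rate baked into `bump` is arbitrary; any other smooth, compactly supported profile would serve the same purpose.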

Leonardo da Vinci’s ability to embrace uncertainty, ambiguity, and paradox was a critical characteristic of his genius. —J Michael Gelb

Say you want to use a mathematical metaphor, but you don’t want to be really precise. Here are some ways to do that:

• Tack a onto the end of an equation.
• Use bounds (“I expect to make less than a trillion dollars over my lifetime and more than \$0.”)
• Speak about a general class without specifying which member of the class you’re talking about. (The members all share some property like, being feminists, without necessarily having other properties like, being women or being angry.)
• Use fuzzy logic (the ∈ membership relation gets a percent attached to it: “I 30%-belong-to the class of feminists | vegetarians | successful people.”).
• Use a specific probability distribution like Gaussian, Cauchy, Weibull.
• Use a Schwartz function (the smooth, rapidly decaying functions against which tempered distributions are defined).

Tempered distributions are my current favourite way of thinking mathematically imprecisely, thanks to this book: Theory of Distributions, a non-technical introduction.

Tempered distributions have exact upper and lower bounds but an inexact mean and variance. T.D.’s also shoot down very fast (like exp(−x²), the Gaussian), which makes them tractable.

For example I can talk about the temperature in the room (there is not just one temperature since there are several moles of air molecules in the room), the position of a quantum particle, my fuzzy inclusion in the set of vegetarians, my confidence level in a business forecast, … with a definite, imprecise meaning.

Classroom mathematics usually involves precise formulas but the level of generality achieved by 20th century mathematicians allows us to talk about a cobordism between two things without knowing precisely everything about them.

It’s funny, the more “advanced” and general the mathematics, the more casual it can become. Even post calc 1, I can speak about “a concave function” without saying whether it’s log, sqrt, or some non-famous power series.

Our knowledge of the world is not only piecemeal, but also vague and imprecise. To link mathematics to our conceptions of the real world, therefore, requires imprecision.

I want the option of thinking about my life, commerce, the natural world, art, social networks, and ideas using manifolds, metrics, groups, functors, topological connections, lattices, orthogonality, linear spans, categories, geometry, and any other metaphor, if I wish.

In some major oil companies the overall, gas and oil well drilling success rates have risen to an average of 47 percent in 1996 from 3-30 percent in the early 1990’s. `(SOURCE OF THIS DATA?)` For example, in US only, by year 2010, these innovative techniques are expected to contribute over 2 trillion cubic feet (Tcf) /year of additional gas production and …