Stationary Waves: More Wrong

The following is a synopsis and expansion of a series of comments I left under a post at SlateStarCodex.com. Toward the end of the debate, a few fellow commentators remarked that I was making the case for frequentist inference, which - because I am not an academic statistician or philosopher - is something I hadn't heard of until yesterday.

Now, I'm not very fond of putting all opinions into categories, and I reiterate that I only heard of frequentist inference yesterday, so I'm not ready to declare to the world, "World, I am a frequentist inferror!" But I did a little reading up, and it does seem to reflect my approach to probability. Moreover, I'm pleased to learn that I wasn't just spouting a bunch of crazy-talk, and that intelligent people had traversed that path before me.

Enough preamble, though. Let's get to it.

* * *

My core claim: I think LessWrong-ers set themselves into a pattern of thinking that is ill-suited to the majority of the human experience.

* * *

Here's a quick example of why I think so, to help initiate the discussion:

Suppose Eliezer Yudkowsky asked me to assign a probability to whether a sentient robot will murder a human being during the next, say, 20 years. Setting aside the fact that technological advances are not a probability (because they are the product of deliberate human action) and focusing solely on the question of the robot itself – assumed to exist – choosing to murder a human being, this is not a question of probability, because volitional acts don’t "just happen." It's not randomness that causes someone to murder someone else; there are deliberate thoughts and actions going on, and these things cannot be assigned "likelihoods" based on anything rational.

Now, it's possible to suggest that all things that human beings do are purely random phenomena, as some eventually claim. They say that human brain functions are subject to quantum mechanics, and there is randomness involved there. But it would be disturbing to use that fact to suggest that, at any given moment, there is an X% chance that you will murder someone (even if the chance is very small). At the risk of sounding harsh, such a belief sounds a lot like a psychotic break to me.

On the other hand, we could indeed observe that, each year, X% of people commit a murder. We can ask, “What is the probability that next year, the number will be Y% instead?” The reason we can ask that is that we're no longer asking about the probability of a particular murder involving specific people. Instead, we’re asking about the likelihood that a sample mean will differ from a historical population mean. That, my friends, is indeed a question of probability, and we can do valid statistical analysis on a question like that.
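
To make the sample-mean question concrete, here is a rough Python sketch. Everything in it is an assumption of mine for illustration: the function name, the made-up rates and population size, the simplification that each person's outcome is independent, and the use of a normal approximation to the binomial. It is one way to frame the question, not the only one.

```python
import math

def prob_rate_at_least(p_hist, y, n):
    """Approximate P(observed rate >= y) among n people, assuming each
    person independently commits a murder with probability p_hist.
    Uses the normal approximation to the binomial distribution."""
    # Standard error of a sample proportion under the historical rate
    se = math.sqrt(p_hist * (1 - p_hist) / n)
    # How many standard errors the candidate rate y sits above p_hist
    z = (y - p_hist) / se
    # Upper tail of the standard normal distribution
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical numbers, chosen only to show the arithmetic
p_hist = 0.00005   # historical rate: 5 per 100,000 (assumed)
y = 0.00006        # the rate we are asking about (assumed)
n = 1_000_000      # population size (assumed)

print(prob_rate_at_least(p_hist, y, n))
```

Notice that nothing in this calculation refers to any particular person. It only describes how far an aggregate rate can be expected to wander from its historical value.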

Proponents of Bayesian inference - such as the "Less Wrong community" - like to say things like, "but I know how often murders occur, and I know what the demographics of a murderer are, and I can compare the prevalence of murder among certain demographics to a particular person and arrive at a forecast for how likely I think it is that the person will commit murder..." And I can see how that does become a probability problem, but it only really works for a random observation.

What I mean is, if I put a random person named Joe in front of you along with some demographic data, you can come up with a statistical model that can do a best-possible job of predicting whether that random person is going to become a murderer at some point in the future. But if I task you to predict whether someone you know is going to murder someone else you know, then we're no longer talking about randomness or probability. We're talking about a couple of people that you know, and people do things by choice, not by probability (unless you're having a psychotic break and you've convinced yourself that everything you think and do is the whim of the random forces of molecules colliding inside of you, otherwise called "Because Quantum Mechanics! Nihilism," or BQM Nihilism for short).
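
For the random-Joe case, the calculation the Bayesian has in mind looks roughly like the sketch below. It is plain Bayes' rule; the function name and every number in it are hypothetical, invented only to show the arithmetic, and do not come from any real data.

```python
def posterior(prior, p_profile_given_murderer, p_profile_given_innocent):
    """Bayes' rule: P(murderer | fits the demographic profile)."""
    # Probability of being a murderer AND fitting the profile
    hit = p_profile_given_murderer * prior
    # Probability of being innocent AND fitting the profile
    false_alarm = p_profile_given_innocent * (1 - prior)
    return hit / (hit + false_alarm)

# Hypothetical inputs (assumed, not real statistics)
base_rate = 0.0001                # share of people who commit murder
p_profile_given_murderer = 0.60   # share of murderers who fit the profile
p_profile_given_innocent = 0.05   # share of non-murderers who fit the profile

print(posterior(base_rate, p_profile_given_murderer, p_profile_given_innocent))
```

With these made-up inputs the answer comes out well under one percent. The point is that the number describes a randomly drawn person who fits the profile. It says nothing about what you know of a specific Joe.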

In short, there are two questions here:

1. Will Joe murder someone?
2. What is the probability that I can correctly guess whether someone fitting a particular demographic profile is a murderer?

My position: Only question #2 is a question of probability. Only question #2 is appropriate for statistics.

* * *

But still, suppose I had to predict whether Joe was a murderer. Then, isn't coming up with a Bayesian prior, and fitting it into a predictive model and conducting some analytics the best I can do, given the information that I have?

Here's where things get more interesting to me...

One of the marks of a truly wise person, in my opinion, is the ability to say (honestly), "Gee, I just don't know." Being comfortable with the fact that there are some things out there that are just simply unknowable is part of being a grown-up. It's a sign of emotional maturity. We all wish we knew everything there was to know, but no matter how smart we are, no matter what kind of Bayesian games we play with ourselves, we'll never know everything. We'll never even come close! It's just not possible.

It's admirable to try to expand human knowledge, of course, and it's a wonderful character trait to have a thirst for knowledge. But it's mature to accept your limitations.

Back to Joe: If you know Joe, and you need to predict whether he will become a murderer at some point in the future, then sure, you could assign a bunch of probabilities and update your priors in an ongoing "virtual Markov Chain," but fundamentally we're asking about Joe's character, and that's not subject to probability. Either you've got Joe's number, or you don't, but you didn't get to where you are by running a Bayesian model against your every interaction with him.

And if you did, then you're not human.

* * *

When I’m uncertain about something, I just say, “I don’t really know for sure.” Then I either choose to guess, or choose not to guess. If I choose to guess, I take stock of the available information, but I don’t delude myself into thinking that there is a cardinal number attached to my guess when I’m talking about situations in which cardinal numbers do not apply.

Theists use physics right up until they don’t understand the physics anymore and then say, “The rest is a miracle of god!”

Over-use of probability is a similar kind of thing. It’s just something LW-ers do to grapple with that whole “Incomplete Other” thing that fascinated Jacques Lacan so much.

I’m not going to say that it’s true in all cases, but hopefully you can see how this kind of thinking is susceptible to producing an obsessional neurosis. Obsessional neurosis occurs when someone engages in some compulsive activity in lieu of gaining real control over his or her life. Developing a giant Bayesian statistical model for life is the ultimate neurosis for a person inclined to formalized logic. Hell, somebody even made a movie about it:

You can imagine some poor schmuck trying to estimate the year of his death using a Markov Chain Monte Carlo simulation and choosing when the best time to sire a child might be…

* * *

But wait - there are even more problems with this kind of thinking.

One of them is that, when you're building a predictive model about something, you're engaged in a priori theorizing. You're implicitly saying, "This thing that I have chosen to include in my model is relevant to the question I am trying to answer." Similarly, by not including something, you are implicitly suggesting that it's not very relevant, or not statistically significant, to your question.

So, when we build a model to predict whether Joe is a murderer, we include a certain set of demographic information, but we may exclude other sets of information, and in doing so, we've biased our analysis with our opinions. We've expressed a "Bayesian prior" subconsciously, and that excluded prior is basically this: "There is a zero percent chance that the thing I have excluded from my model is relevant to the question I purport to answer."

Maybe it is irrelevant. But maybe not. More to the point, if you haven't included it in your model for the first run of the analysis, then you've biased your model unfairly - according to the rules of Bayesian inference itself! And since no one could ever hope to begin with a model that includes everything, there is no possible way that Bayesian analysis improves on our ability to solve common, everyday problems any more than any other biased method of cognition.
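
If you take the "zero percent" framing literally, the arithmetic of Bayesian updating shows why it matters. The sketch below is my own illustration, not anybody's official method: it uses the odds form of Bayes' rule (posterior odds equal prior odds times a likelihood ratio), and the function name and all the numbers are assumptions.

```python
def update(prior, likelihood_ratio):
    """One Bayesian update in odds form.
    likelihood_ratio = P(evidence | relevant) / P(evidence | not relevant)."""
    if prior == 1.0:
        return 1.0  # certainty cannot be moved, and the odds are undefined
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

# Hypothetical: ten pieces of evidence, each 20 times more likely
# if the excluded factor really is relevant (assumed numbers)
excluded = 0.0     # "zero percent chance it is relevant"
skeptical = 0.001  # a small but nonzero prior, for comparison

for _ in range(10):
    excluded = update(excluded, 20.0)
    skeptical = update(skeptical, 20.0)

print(excluded)   # still exactly zero, whatever the evidence
print(skeptical)  # climbs toward 1
```

A prior of exactly zero multiplies every piece of evidence by zero, so no amount of data can ever bring the excluded factor back in. A tiny nonzero prior, by contrast, recovers.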

Period.

* * *

This blog post is already long enough, and I've left out the important criticism that statistical modeling almost always implies some sort of linear relationship between the predictive variables and the variable being predicted. Who among us is prepared to claim that human behavior is always and everywhere a continuous function of physical inputs?

And yet, when we attempt to subject all decision-making to Bayesian inference, that is exactly what we're suggesting.

Now, the funny part there is that I often encounter people - behavioral economists, for example - who are happy to suggest that any type of human preference that doesn't behave according to a modelable continuous function is "irrational." Gleefully, they proclaim that humans are not rational animals because, look here, people smoke cigarettes even though they know cigarettes are unhealthy, and look over there, people take on more debt than they can afford, even when it's clear that the debt is unaffordable.

Something about the Less Wrong crowd makes me think that this is their view, too. I get the impression that they simply feel that they are combating their irrational tendencies with a sublime brand of rationalism.

It's that underlying sense of transcendence, the suggestion that you might be able to achieve some sort of higher state of existence by putting into practice Eliezer Yudkowsky's principles of rationalism - which he himself describes in quasi-religious language, like "The Way" - that gives people like me the heebie-jeebies.

When a community of people offers you a chance at achieving a higher state of being by becoming a little less human, it starts to look more like a religion than a science. Is this a fair criticism of the Less Wrong community? I have no idea, but I do know that I'm not the only one to have made it.

* * *

What's the point? Why write all this when I don't really have any skin in the game? What do I care if people follow some Silicon Valley thirty-something with a quasi-religious fervor based on some basically sound and convincing mathematical source material?

Recall that I come from a place in the world where both religions and cults thrive. Recall that my blog has in many ways become a place to engage in self-analysis, self-criticism, and hopefully, the end of the kind of illusions we tell ourselves. Personal growth (not transcendence) requires that we always try to grow and develop.

We'll always be wrong - usually we'll be more wrong, not less. Trying to be less wrong is okay if it means gaining some new knowledge and using it to improve the results of your day-to-day action. But there is only so much knowledge you can have. It's tempting in today's world of "big data" and big processors and big Markov Chains to believe that all we need to do is model the possible outcomes of any scenario and update our priors.

But it's also vain. Growing up means accepting your limitations and working with them. In some cases, that might mean letting go of Bayesian inference and restricting your statistical analyses to problems that can actually be solved with statistics.

2015-08-05
