Talk:Bayes' theorem/Archive 5
This is an archive of past discussions about Bayes' theorem. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Archive 1 | ← | Archive 3 | Archive 4 | Archive 5 | Archive 6 | Archive 7 | Archive 8 |
Bayes' theorem or Bayes's theorem?
I am not a native English speaker, but should the title of this article be Bayes's theorem instead of Bayes' theorem? I think Bayes' theorem would be correct if there were a Baye Sr. and a Baye Jr. and the theorem honoured two or more Bayes :-) Albmont 03:38, 15 January 2007 (UTC)
- That might make some kind of sense. But habits vary. Compare St James's Park with St James' Park and St James Park. I would prefer "Bayes theorem" as most people say, but Wikipedia puts in an apostrophe. Henrygb 09:25, 15 January 2007 (UTC).
Yes, Bayes's is grammatically correct. But don't change it in the body until we can get the title changed. Pianoroy 08:56, 19 January 2007 (UTC)
If it were a question of ownership this might be relevant, but in mathematics, far more often than not, the possessive is not used for naming theorems. Thus we have the Gauss lemma (not Gauss's lemma), the Smale theorem (not Smale's theorem), the Whitehead theorem (not Whitehead's theorem), the Wiles theorem (not Wiles's theorem), and so on... Furthermore, the more significant the result, the more likely the possessive is not used when naming or ascribing the theorem. If WikiGrammarians persist, won't we soon be seeing the King James's Version of the Bible? The article title and references to the theorem should be Bayes theorem. By the way, I'm a (now retired) professional academic mathematician. Chuck 15:24, 25 January 2007 (UTC)
- Oddly, though, every instance you mention uses the article 'the' before the name of the person responsible for the theorem. But the usual context for citing Bayes' theorem doesn't use the article 'the'; similarly, I've never seen Zorn's lemma cited as "the Zorn lemma," but have often seen it cited as "Zorn's lemma." I've seen Smale's theorem cited as such; as for the rest, I have no information. But I do have many other counter-examples. So my conclusion is that it is legitimate in many cases to use the construction "the X theorem" (or lemma), but it is also legitimate to use the construction "X's theorem" (or lemma). This brings it back to the realm of grammar. Bill Jefferys 01:55, 5 March 2007 (UTC)
Thanks Chuck for the clarification. I'll switch it back to 'Bayes theorem' (title and body) when I get a chance (unless someone else is willing to oblige). Pianoroy 00:10, 30 January 2007 (UTC)
Nu? Drop the apostrophe or keep it, but don't leave this grammatical error for so long! Zin 19:02, 7 April 2007 (UTC)
My understanding is the both Bayes' and Bayes's theorem are correct, but it is just a matter of taste which one you use. See rule 6.1 on http://jonathan.rawle.org/hyperpedia/apostrophe.php Forwardmeasure 19:32, 7 April 2007 (UTC)
My understanding is that: A) Bayes' is a grammatically correct, but slightly archaic, way of denoting a possessive; B) s' (rather than s's) was customary at the time when the theorem was published, and has thus become ingrained. Bayes' Theorem cannot therefore be dismissed as incorrect. Moreover, it has the advantage of being the most commonly used form. As such, I would argue in favour of retaining it in this form. PRJ 15:50, 22 February 2009 (UTC)
Since when did using "Bayes'" become archaic? Using an apostrophe after an "s" is proper formal English, plain and simple. Jason Quinn (talk) 15:41, 4 May 2010 (UTC)
- I agree, FWIW (native british english speaker). "Bayes'" is grammatical and is also the way we say it when discussing it in the department --mcld (talk) 16:28, 4 May 2010 (UTC)
Obvious CONTRADICTION in the theorem itself
The Bayesian theorem places an obstacle to itself. It should define the area of results to be distinct from Pr(A). It defines the normalisation constant as dependent on Pr(B) (the prior or marginal probability of B, which acts as a normalizing constant) and on Pr(A) (the prior probability or marginal probability of A; it is "prior" in the sense that it does not take into account any information about B).
In fact, Bayes is wrong from the start, because he has proven his theorem BECAUSE of the interdependence of Pr(A) and Pr(B).
That is inadmissible:
See [[Bayes' theorem#Derivation from conditional probabilities|Derivation from conditional probabilities]].
(Goran Arsov: arsov1978@yahoo.com)
Discussed thoroughly at www.MySpace.com, Groups » Philosophy of Psychology » Topics » Representational theory of mind and Jerry Fodor
—The preceding unsigned comment was added by 62.162.223.243 (talk) 12:01, 16 January 2007 (UTC).
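For the record, Pr(B) in the denominator is not an assumption independent of Pr(A); both are fixed by the same joint distribution, so the derivation from conditional probabilities cannot produce a contradiction. A minimal numeric check in Python (all probabilities made up):
 # made-up joint probabilities of two binary events A and B
 p_ab, p_a_nb, p_na_b, p_na_nb = 0.12, 0.18, 0.28, 0.42
 p_a = p_ab + p_a_nb         # P(A) = 0.30
 p_b = p_ab + p_na_b         # P(B) = 0.40
 p_a_given_b = p_ab / p_b    # P(A|B) read directly off the joint
 p_b_given_a = p_ab / p_a    # P(B|A) read directly off the joint
 print(p_a_given_b, p_b_given_a * p_a / p_b)  # both 0.3: Bayes' theorem is an identity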
Could you explain what you mean by inadmissible - is it a mathematical term I am unaware of, or are you pointing out that Bayes' law defies intuition? I guess everyone should agree that Bayes' theorem can prove anything based on any data as long as the prior is chosen in the right way. This may make Bayes' law unphysical, but it doesn't make it unmathematical: the outcome of Bayes' law is quite clear, but it is easily misinterpreted. I got these ideas from a book by A. Tarantola called "Inverse Problem Theory".
I added a reference to another problem with Bayes' theorem, the Borel paradox: conditional probability density functions are not invariant under coordinate transformations.
Jelmer Wind --
Mailing lists / forums
Are there any mailing lists or forums where Bayesian topics are discussed? So far I have mainly found mailing lists for different software packages. ParkerJones2007 16:30, 2 June 2007 (UTC)
Will someone take all of the gibberish off of this article? What kind of jerks are you? Ron Levy Ronaldlevy@gmail.com 716.990.2649 —Preceding unsigned comment added by 74.78.114.129 (talk) 05:32, 5 September 2007 (UTC)
Incomprehensible
I am a fairly well-educated non-statistician. This page is unintelligible to me. —Preceding unsigned comment added by 24.195.211.159 (talk) 16:34, 5 November 2007 (UTC)
90% in Monty Hall
It is entirely unclear to me, and appears to contradict Monty Hall problem, where the 90% value for the Monty Hall problem came from. I removed it until someone explains why the example given was right. It was tagged with "verify" and I couldn't nut it out, so please have a look and see if anyone can explain it.
Thanks User A1 (talk) 21:42, 5 December 2007 (UTC)
I think it is better without that section - it only confuses the problem (imo) and takes away from the point of the example. PhySusie (talk) 00:13, 6 December 2007 (UTC)
Whacky
Reading this certainly causes eyes to cross and brows to furrow. You all really should try to make it even more abstract and ridiculous than it is right now (Feb 2, 2008). Why stop at absurdly arcane? By the way, the a priori probability of voters in a referendum voting Yes or No should include other possibilities, including hanging chads, voting "present" or possibly voting Yes AND No. This means that the statement that the probability of a Yes vote is a priori 0.5 is just plain wrong (and worse, it is contrary to real-world reality). Outcomes of a real coin flip: H, T, edge, lost, unknown (disrupted). Hence the a priori probability of Heads is 20%, not 50% (that's assuming we eliminate things like the coin never landing, dimensional warps, etc.) —Preceding unsigned comment added by 69.40.243.166 (talk) 02:20, 3 February 2008 (UTC)
Present a simple example earlier on?
The "general reader" has to read rather far into the article before seeing a simple example illustrating the rule. A simple illustration could and should be given immediately after the statement of the theorem.
I further think the cookie example is not particularly good for the purpose, because (in spite of the concrete terminology) it is actually rather abstract. Thinking about a more evocative example, one that came to mind is that of a school having 60% boys and 40% girls as students. The girl students wear pants or skirts in equal numbers; the boys all wear pants. You see a (random) student from a distance; all you can see is that this student is wearing pants. What is the probability this student is a girl?
Two questions. Number one: do you all agree that we should have a simple illustration early on? Number two: is the above example indeed better than that with the cookies?
--Lambiam 19:40, 3 February 2008 (UTC)
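For what it's worth, the proposed example works out to a tidy answer; a minimal sketch of the arithmetic in Python:
 p_girl, p_boy = 0.4, 0.6                     # student body composition
 p_pants_given_girl, p_pants_given_boy = 0.5, 1.0
 p_pants = p_pants_given_girl * p_girl + p_pants_given_boy * p_boy  # 0.8
 print(p_pants_given_girl * p_girl / p_pants)  # P(girl | pants) = 0.25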
- I'd answer 'yes' to both your questions. I'd suggest a change of wording, however, due to the different meaning of "pants" in British English! --Qwfp (talk) 18:37, 4 February 2008 (UTC)
Change of terms in the Monty Hall section
The Monty Hall problem section begins by talking about a "presenter" but then moves on to refer to a "host". I believe that these are meant to refer to the same person, and if so, the same term should be used throughout:
Let us call B "the presenter opens the blue door". Without any prior knowledge, we would assign this a probability of 50%.
In the situation where the prize is behind the red door, the <s>host</s> presenter is free to pick between the green or the blue door at random. Thus, P(B | Ar) = 1 / 2
In the situation where the prize is behind the green door, the <s>host</s> presenter must pick the blue door. Thus, P(B | Ag) = 1
In the situation where the prize is behind the blue door, the <s>host</s> presenter must pick the green door. Thus, P(B | Ab) = 0
Sylvie369 (talk) 16:42, 3 May 2008 (UTC) Sylvie369
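Plugging the quoted values into Bayes' theorem gives the familiar posteriors, whichever word we settle on; a minimal sketch in Python:
 p_prize = {'red': 1/3, 'green': 1/3, 'blue': 1/3}      # prior P(A_i)
 p_b_given = {'red': 1/2, 'green': 1.0, 'blue': 0.0}    # P(B | A_i) as quoted above
 p_b = sum(p_b_given[d] * p_prize[d] for d in p_prize)  # P(B) = 1/2
 for d in p_prize:
     print(d, p_b_given[d] * p_prize[d] / p_b)  # red 1/3, green 2/3, blue 0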
ad Example 2: Bayesian inference incorrectly stated?
"From Bayes' theorem we can calculate the probability distribution function for r using ...." should not there be density instead of distribution, for the f() is density? 84.16.123.194 (talk) 12:08, 20 December 2008 (UTC)
Mark-up looks incorrect in Example 2
The numerator is listed as: f(m=7 | r,n=10)*f(r). It should be: f(m=7, n=10 | r)*f(r).
The m=7, n=10 part is the observed outcome. The r is the posited population distribution. —Preceding unsigned comment added by Rdhettinger (talk • contribs) 09:19, 24 February 2009 (UTC)
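Whatever the preferred notation, the posterior in Example 2 can be sanity-checked on a grid; a minimal sketch in Python, assuming a flat prior f(r) and a binomial likelihood (the grid resolution is arbitrary):
 import numpy as np
 from math import comb
 r = np.linspace(0, 1, 1001)                    # grid over the population proportion r
 prior = np.ones_like(r)                        # flat prior density f(r) (assumption)
 likelihood = comb(10, 7) * r**7 * (1 - r)**3   # f(m=7 | r, n=10)
 posterior = prior * likelihood
 posterior /= np.trapz(posterior, r)            # normalise so it integrates to 1
 print(np.trapz(r * posterior, r))              # posterior mean, about 8/12 = 0.667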
The Monty Hall Problem
I'm probably just being stupid here but I have a slight confusion on this point.
I get the idea of it in general - the article's discussion seems to me to centre on the idea that the presenter *knows* which door the prize is behind. That seems pivotal - because if he doesn't know, the information cannot influence his decision.
Therefore the 'another way to think about it' part is confusing to me. It basically says "the chance of it being the red door was 1/3, so when the green door is taken away, the blue door's chance becomes 2/3" ... this defies the idea that the presenter's knowledge is essential. It implies that it doesn't matter at all whether or not the presenter knows which door it is behind. If that's true, I don't understand the first part.
Could someone please help me with this? I'm just curious about it. Thanks,
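One way to see why the presenter's knowledge is pivotal is to simulate both variants and keep only the rounds where the prize is not accidentally revealed; a minimal sketch in Python (the convention that the player always picks red and then switches is an assumption of the sketch):
 import random
 def trial(host_knows):
     doors = ['red', 'green', 'blue']
     prize = random.choice(doors)
     pick = 'red'                          # player always starts with red (assumption)
     others = [d for d in doors if d != pick]
     if host_knows:                        # informed host never opens the prize door
         opened = random.choice([d for d in others if d != prize])
     else:                                 # ignorant host picks blindly
         opened = random.choice(others)
     if opened == prize:
         return None                       # round void: the prize was revealed
     switched = next(d for d in others if d != opened)
     return switched == prize
 for knows in (True, False):
     wins = [w for w in (trial(knows) for _ in range(100000)) if w is not None]
     print(knows, sum(wins) / len(wins))   # about 0.667 with knowledge, 0.5 without
With an informed host, switching wins about 2/3 of the time; with an ignorant host, conditioning on the prize not having been revealed, it wins only about 1/2. So the 'another way to think about it' shortcut silently relies on the host's knowledge.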
Application to the 'Fine-Tuning' Argument for the Existence of God?
This is my first edit of anything, ever, so forgive me if this comes out looking/sounding weird etc. It seems that Bayes' Theorem is used in philosophy of religion to back up 'fine-tuning' variants of the teleological/design argument for the existence of the theistic God, in the sense that one's belief that 'A' (the existence of a designer God) is 'updated' (to use the language of the article) by having observed 'B' (the very specific values of the boundary conditions of the universe that allow life to exist). Debate about the fine-tuning argument/anthropic principle is recent, and academic interest in it, particularly from philosophers of religion and science, is growing, which I believe justifies a reference to it from this page. The article "Fine-tuned Universe" references the article 'Bayesian probability' and the prosecutor's fallacy but not this article. It at least seems that the relevance of Bayes' Theorem to the articles 'Fine-tuned universe' and 'Teleological argument' means it deserves to reference one or the other, or both, of them. Thanks, Lekr17 (talk) 10:26, 30 March 2009 (UTC)
- >>It seems that Bayes' Theorem is used in philosophy of religion to back up 'fine-tuning' variants of the teleological/design argument ... one's belief that 'A' (the existence of a designer God) is 'updated' (to use the language of the article) by[after] having observed 'B' (the very specific values of the boundary conditions of the universe that allow life to exist).
- Perhaps an article this long should include some mention of this. It is not a "historical remark" in the sense of the last section, which includes the argument for the existence of God by Bayes' editor Richard Price. Maybe it belongs here only if it does fit there. Is this modern argument descended from Price's argument, that is, the arguments by Price and his contemporary Reverends?
- I disagree with the point in one sense. Inference of the kind Bayes' theorem quantifies must be the essence of the design argument for God, not mere 'backup' support for the argument. Right? --P64 (talk) 20:02, 10 March 2010 (UTC)
P(A') etc not used in the example
I really appreciate the example, and have no strong feelings about this, but it seems to me P(A') and P(B|A') are being defined without playing any role in the following. Doceddi (talk) 11:36, 19 May 2009 (UTC)
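They do enter, implicitly: the denominator P(B) is usually computed by the law of total probability as P(B|A)P(A) + P(B|A')P(A'). A minimal sketch in Python with made-up numbers:
 p_a = 0.3                                  # P(A)
 p_not_a = 1 - p_a                          # P(A')
 p_b_given_a, p_b_given_not_a = 0.9, 0.2    # made-up P(B|A) and P(B|A')
 p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a  # law of total probability
 print(p_b_given_a * p_a / p_b)             # P(A|B) = 0.27 / 0.41, about 0.66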
Abstract
Given two absolutely continuous probability measures P ~ Q etc...
: E_P[X | G] = E_Q[X dP/dQ | G] / E_Q[dP/dQ | G] (G a sub-sigma-algebra).
>>>I don't know whether this is the right formulation; at least two things seem to interfere: the conditional expectation, given a sub-sigma-algebra, and the relation between two measures P and Q. As for the proof:
Proof:
by definition of conditional probability,
>>>it is not the cond. prob. that is at stake here, but the cond. expectation. Further I very much doubt the next statement:
>>>At least the following definition holds: for any set A in the sub-sigma-algebra G: E_P[X 1_A] = E_P[E_P[X | G] 1_A]
>>>and I can't see the relation between the two.
We further have that
: E_Q[(dP/dQ) 1_A] = P(A) for every A in F.
>>>What is the role of this statement in the proof? Nijdam (talk) 16:55, 29 May 2009 (UTC)
- I agree, there are definitely problems with the proof. The actual definition of conditional expectation is more general than the one given in the proof, which calls the result into question. On a more general note, it is difficult to see the value of having the Abstract Form section right now. It would be very useful (provided the proof is cleaned up/corrected) if someone could show the relevance of the abstract form to other forms (discrete and density), at least sketching the derivation. At the moment the abstract section does nothing (in terms of adding to the understanding of the subject of the article) for the less technical readers and potentially misleads the more technical ones. Shureg81 (talk) 16:24, 14 July 2009 (UTC)
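For what it's worth, the identity under discussion does hold in the elementary finite setting, where conditional expectation given a partition is just a per-block weighted average; a minimal numerical sketch in Python (all numbers made up):
 import numpy as np
 q = np.full(6, 1/6)                           # measure Q (uniform on six points)
 p = np.array([.05, .10, .15, .20, .25, .25])  # measure P, absolutely continuous w.r.t. Q
 X = np.array([1., 4., 2., 8., 5., 7.])        # a random variable
 blocks = [[0, 1, 2], [3, 4, 5]]               # partition generating the sub-sigma-algebra
 dPdQ = p / q                                  # Radon-Nikodym derivative dP/dQ
 for B in blocks:
     lhs = (X[B] * p[B]).sum() / p[B].sum()             # E_P[X | G] on the block
     num = (X[B] * dPdQ[B] * q[B]).sum() / q[B].sum()   # E_Q[X dP/dQ | G]
     den = (dPdQ[B] * q[B]).sum() / q[B].sum()          # E_Q[dP/dQ | G]
     print(lhs, num / den)                              # the two sides agree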
The cap is a barrier to NPOV
In Bayesian interpretations, probability attaches to propositions. In the frequentist interpretation, probability attaches to sets of outcomes of random variables. This has implications for formalism, because propositions are joined by logical conjunction (ampersand or pointy hat) whereas sets are joined by intersection (the cap). Thus how we choose to formalise the theorem favours one side or another - presently the frequentist approach. I'm flagging this issue up although I don't have any suggestions for how to proceed. MartinPoulter (talk) 16:00, 3 June 2009 (UTC)
- I doubt that the distinction between proposition and set, between "and" and intersection, between ampersand symbol and cap symbol is important in probability and statistics, only in the foundations of probability. Right? I envision a quagmire of tautologies (or instances of axioms, or theorems, I have forgotten the kinds) expressed in the second-order predicate calculus.
- When they step up from the foundation to a quantitative world, that is, where propositions are about quantities (other than the probabilities of propositions), the propositions and sets are equivalent. They are only formally different. For example, it's literally true but not importantly true that P(x ≤ a) is about a proposition and P{x | x ≤ a} is about a set. It's a matter of formal style whether one permits P{x | x ≤ a}, a probability attached to a set, as well as P(x ≤ a) and P(y in {x | x ≤ a}), probabilities attached to propositions. Right?
- The Bayesian can't really escape attaching probability to sets where intersection is appropriate, nor can the frequentist escape attaching probability to propositions where "and" is appropriate. --P64 (talk) 21:09, 10 March 2010 (UTC)
Conditional
@MartinPoulter. Thank you for your speedy answer. My actual concern is: I don't like the added sentence about objective Bayesians. It doesn't contribute to the understanding of Bayes' law; it merely is the starting point for Bayesian probability. The form I added to the article is of more use to a reader. Nijdam (talk) 10:53, 20 June 2009 (UTC)
Now we're on it, I do not quite understand your remark above about the 'cap'. Is it about the notation P(A ∩ B), where Bayesians often use a comma instead: P(A, B)?
And further I made a remark about the section with the abstract formulation. I very much doubt its correctness. Do you? Nijdam (talk) 11:27, 20 June 2009 (UTC)
Prior × likelihood
This article gives the simple discrete form of the result that secondary-school pupils understand, and later gives an "abstract version" involving sigma-algebras, etc., but it doesn't mention the ordinary workhorse version that gets relied on incessantly and heavily, which says:
- prior probability distribution × likelihood function × normalizing constant = posterior probability distribution.
That needs to be given great prominence in the article. It's what Bayes' theorem says. Michael Hardy (talk) 17:26, 14 July 2009 (UTC)
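In the discrete case that recipe is literally three lines; a minimal sketch in Python (the coin-bias grid and the data are made up):
 import numpy as np
 theta = np.array([0.25, 0.5, 0.75])   # candidate coin biases (made up)
 prior = np.array([1/3, 1/3, 1/3])     # prior probability distribution
 likelihood = theta**2 * (1 - theta)   # likelihood of observing heads, heads, tails
 posterior = prior * likelihood        # prior times likelihood...
 posterior /= posterior.sum()          # ...times the normalizing constant
 print(posterior)                      # [0.15, 0.4, 0.45]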
ad introduction
the introduction is too dense for an encyclopedic entry. controversy on matters of use is important, but should be reserved for a section AFTER the theory has been explained, as in an Applications section. As someone who came looking for information on the topic, I had to make my way through that whole second paragraph, which isn't really relevant to someone just trying to find the mathematical representation of the first paragraph. I would suggest moving the second and third paragraphs to a later "utility" section, and leaving only a one- or two-sentence summary like "Bayes' theorem is valid in all common interpretations of probability, and has significant applications in science and engineering." actually I'm just going to do it. I also think there's an excess of examples, but I'm not going to waste my time arguing with whoever spent months crafting them, mainly b/c of the medical/cookie wars above.
I'm also tempted to remove the first sentence of the statement section, given that "Bayes gave a special case involving continuous prior and posterior probability distributions and discrete probability distributions of data." isn't followed by an explanation of what this means. maybe I just don't know anything about statistics.
Hateweaver (talk) 23:48, 11 September 2009 (UTC)hateweaver
just to explain my changes in brief, and avoid an instant revert: I see a large amount of discussion between a group that wants things simple for a general audience seeking information on statistics, and a group that, while not opposed to that, still wants an encyclopedic article with extensive information expounding on the topic. My edits were an attempt to rearrange the article such that the group more likely to give up when seeing extensive jargon would not be presented with it right away, and the group that wants more information could read the simplified introductory section, then scroll on down. Hateweaver (talk) 00:39, 12 September 2009 (UTC)
the definition of simple
..... the 'simple' sections are possibly not quite as simple as one might think. the language of bayes' original paper is, in fact, probably more intelligible to the average lay reader than the 'simple' language presented in this wikipedia entry, despite the best intentions and labors of the wikipedia authors. i am sorry that i cannot improve upon their work, but feel rather unqualified as i am not well acquainted with the theorem itself. however i would express a great desire for the authors and editors to continue in their endeavors to pare down the language so that it is intelligible to the ordinary person....
I would like to remember Mr Faraday (of whom Einstein kept a picture)... "Faraday was an excellent experimentalist who conveyed his ideas in clear and simple language. However, his mathematical abilities did not extend as far as trigonometry or any but the simplest algebra." - wikipedia entry on Michael Faraday
"Faraday did not have a background in sophisticated mathematics consequently he does not articulate his discovery through a complex equation." -- columbia university
thank you Decora -- (talk) 16:12, 19 September 2009 (UTC)
Popper-Miller argument
I feel this topic is too philosophical to belong in this article on Bayesian Theory. It belongs in its own article. Jwesley78 (talk) 23:55, 12 October 2009 (UTC)
- I should've said "Bayes' Theorem". Jwesley78 (talk) 23:58, 12 October 2009 (UTC)
- The whole introduction is philosophical. On the other hand, the argument is strictly mathematical and logical. (and thus has been published as a letter in Nature, not in a philosophical journal) --rtc (talk) 00:25, 13 October 2009 (UTC)
- The citation I gave ([http://www.jstor.org/stable/187924 "In Defense of the Popper-Miller Argument"]) even calls their result an "argument", not a "theorem" (and it was written by someone who supported their claim). I don't think including it in this article does anything to enhance the reader's understanding of this topic. I created a "dead link" in the See also section for someone to create an article about the Popper-Miller argument. Perhaps having a link to that article from here would be a good compromise? —Preceding unsigned comment added by Jwesley78 (talk • contribs) 00:45, 13 October 2009 (UTC)
- It is both a theorem and an argument. It is a theorem that shows that p(h ∨ ¬e | e) ≤ p(h ∨ ¬e). This theorem serves as an argument against the claim that evidence E can inductively support a hypothesis H in a probabilistic way. The argument is central to the philosophical views implicit in the introduction and the rest of the article. --rtc (talk) 00:52, 13 October 2009 (UTC)
- I think that it will only confuse the reader. (It confused me at first.) Their result is too deep for this article. See Wikipedia:Make technical articles accessible. Jwesley78 (talk) 01:01, 13 October 2009 (UTC)
- However it is by no means a proof, in the mathematical sense. The relative merit of such discussions, I would hazard, is difficult to gauge for even the most expert of readers. My take on this is that it is (1) complicated, (2) tangential and (3) semi-mathematical; indeed it is, as the journal would say, philosophical in nature. I originally read the initial posting and was unsure whether the method was or was not fringe science. User A1 (talk) 07:35, 13 October 2009 (UTC)
- Of course it is a proof. That proof is a proof of mathematics and logic, not "semi-mathematical". It serves as a philosophical argument in the philosophical debate surrounding Bayes' Theorem. The whole introduction and a lot of the article is no less philosophical in nature! I mean, how can a statement like "after evidence E is observed" be considered non-philosophical? There is nothing in Bayes' theorem that talks about such things as observation or evidence; it merely talks about statements and probabilities assigned to them. In the same way, Popper's and Miller's proof merely talks about statements and the probabilities assigned to them (and some of the deductive relations between the statements that are basic and undisputed facts of logic). It is only their interpretation of this theorem in the context of the philosophy under discussion (the philosophy mentioned before, which says that e has something to do with observation and evidence) which makes it also a philosophical argument. If you wish to call it fringe science, sure! But statements like "after evidence E is observed" are certainly fringe science, too, and the introduction is full of them! --rtc (talk) 14:32, 13 October 2009 (UTC)
- Bayes' Theorem is well established, and so is the concept of Conditional probability. Any "introduction to probability" textbook will discuss both. However, I have yet to see an "introduction to probability" textbook that discusses the Popper-Miller theorem. If you can find it in some introductory textbook, I'll reconsider my position. —Preceding unsigned comment added by Jwesley78 (talk • contribs) 16:01, 13 October 2009 (UTC)
- Bayes' Theorem is well established, and so is the concept of Conditional probability, yes. But the philosophy (called Bayesianism) that is usually associated with it (and is in the introduction of this article, too), and that some people identify as natural for or inherent in Bayes' Theorem, is not well established. It is in fact heavily disputed. You certainly won't find the Popper-Miller theorem in any introductory textbook, because they are biased towards that philosophy. This is well known... Wikipedia is an encyclopedia. It is not bound to what is in the introductory textbooks. It is bound to neutrality. This article (like introductory textbooks on the matter) philosophically propagates the Bayesianistic interpretation of Bayes' theorem, and so is not neutral. How about a non-Bayesianistic interpretation of Bayes' rule? The article does not mention that this has even been, or can at all be, considered. But how about the propensity interpretation? Miller's article "Propensities May Satisfy Bayes's Theorem" says exactly what some people (like Paul Humphreys, who argues along such lines in his article "Why Propensities Cannot be Probabilities") consider absurd: that there may be a non-Bayesianistic way to understand Bayes' Theorem. --rtc (talk) 17:09, 13 October 2009 (UTC)
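Setting the philosophy aside, the inequality p(h ∨ ¬e | e) ≤ p(h ∨ ¬e) is easy to sanity-check numerically; a minimal sketch in Python over random joint distributions of two binary propositions h and e:
 import random
 for _ in range(10000):
     w = [random.random() for _ in range(4)]              # random weights (made up)
     total = sum(w)
     p_he, p_hne, p_nhe, p_nhne = (x / total for x in w)  # joint of (h, e)
     p_e = p_he + p_nhe
     p_disj = p_he + p_hne + p_nhne           # p(h or not-e)
     p_disj_given_e = p_he / p_e              # (h or not-e) and e is just h and e
     assert p_disj_given_e <= p_disj + 1e-12  # the inequality always holds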