Talk:Maximum parsimony (phylogenetics)

Latest comment: 4 years ago by Daniele.Catanzaro in topic Minimum Evolution

comments


Good article, anon, but I wish to improve it to clarify some points and possibly make it more accessible.

OLD: Maximum parsimony

Maximum parsimony is a simple but popular technique in bioinformatics to predict the best phylogenetic tree for an organism.

NEW: Introduction

Maximum parsimony is a simple but popular technique used in cladistics to predict an accurate phylogenetic tree for a set of taxa (commonly a set of species or reproductively isolated populations of a single species).

Comments: The level 2 heading 'maximum parsimony' is redundant, so I have replaced it with 'Introduction'. 'Bioinformatics' has been replaced with 'cladistics' because the former is more of a broad approach to biological research than a specific field of research. I have removed 'best' from the introduction, as it does not scientifically describe the criteria for an optimal tree; 'accurate' is better. In addition, 'best' could be read as stating that MP is the best method for producing a tree, when there is no consensus that any method is best under all circumstances. I have also replaced 'organism' with 'set of taxa' and given two examples of appropriate sets to compare.


--ChrisJMoor 00:59, 16 Feb 2005 (UTC)


There needs to be some mention of bootstrapping. Jim Bowery 18:57, 18 July 2005 (UTC)

NEW VERSION


I've done the best I can with what is here, but it would be great if, in the future, someone could break this out into separate articles for different classes of data, types of analysis, and optimality criteria. As it is, the MP entry is pretty much holding down the entire subject of character-based phylogenetics, which isn't ideal. And, of course, citations would really help a lot. Everything in the article is supported, honest. Google Hillis. 24.174.131.158 02:26, 1 January 2007 (UTC)

Cleanup taskforce: you are doing a really good job with this article! 70.113.48.142 15:06, 10 July 2007 (UTC)

redo


The WP:LEDE is probably too long... and many other nitpicks to be repaired... I'll try to add some cites as time permits but will be busy this week... --Ling.Nut 06:32, 26 August 2007 (UTC)

  • Mmmm, also looks to me like there may be more than a little shtuff that needs to be moved to some other article(s), such as the paragraph beginning: "Neighbor-joining is a form of star decomposition..." Will do as time permits. --Ling.Nut 07:31, 26 August 2007 (UTC)

feedback


Great job editing this page!

- Agree on the "alternatives" material: I do hope some of it can be salvaged...

- The "criticism" is not Original Research... most of this criticism IS in fact published, but finding references is not going to be pleasant. Some of the RESPONSES may be even more difficult to find references for.

- I haven't read the "self-referencing" guidelines, but I never actually CITED myself. For what it is worth.

Again, you are doing a great job! -- Preceding unsigned comment added by 128.83.166.125 (talk) 16:04, 7 September 2007 (UTC)

'Maximum' considered unnecessary


"Maximum parsimony" is not the correct title for this article - the word "maximum" is implicit in the term "parsimony" and its usage... So many seem to get this wrong!' Ryan —Preceding unsigned comment added by 41.245.132.41 (talkcontribs) 26 March 2008

clarify "Problems with maximum parsimony phylogeny estimation"


I have a hard time understanding the example in this section. The first paragraph refers to "characters (A & C)", but surely these are taxa?

Then, is the figure supposed to represent the actual relationship, or the calculated (MP) tree?

While I see the point with only one character, won't the many substitutions (another word could profitably be used here, this seems to assume we're working on a molecular level) between branches A and D ensure that the tree will be more accurate when more characters are taken into account?

kzm (talk) 11:16, 23 May 2008 (UTC)

repetitive


Several times throughout the article, it is said that it's not good to exclude input data only because they have gaps/wildcards. Probably a valid point, but there's no need to hammer it home. (I also wonder how biologists can use such ad-hoc elimination methods on their data in seemingly arbitrary fashion, without a mathematical proof that they are useful, but maybe I'm misinterpreting the text.) --87.162.58.83 (talk) 04:38, 12 January 2009 (UTC)

Citation Needed


There's a citation needed tag about inconsistency of ML with suboptimal models. The citation is Farris, James S., 1999. Likelihood and Inconsistency. Cladistics 15, 199–204. -- Preceding unsigned comment added by 76.192.51.242 (talk) 15:58, 3 August 2011 (UTC)


Minimum Evolution


Minimum evolution redirects here, but I don't think it should. MP and ME are different approaches, and there is no mention of ME in this article. In the absence of a separate article on 'minimum evolution', how about having a min evolution section in 'computational phylogenetics', and redirecting there? Objections? tom fisher-york (talk) 01:55, 5 March 2012 (UTC)

I agree in part: minimum evolution should definitely have a dedicated page, and balanced minimum evolution should be sufficiently highlighted, as it is the most accurate statistically consistent estimation model among distance methods (and in phylogenetics broadly).

However, it should be noted that in Kidd and Sgaramella-Zonta's article the minimum evolution framework actually relies implicitly upon the lex parsimoniae (Occam's razor). This particular detail connects ME to MP. I invite editors to read that article again, because this issue is delicate. Daniele.Catanzaro (talk) 20:15, 5 June 2020 (UTC)


Minimum Evolution - Part 2


I rewrote the previous paragraph on Minimum Evolution, mainly because it contained numerous inaccuracies and nonsensical statements. I have also proposed the creation of a main article on Minimum Evolution (to be completed slowly and accurately over time). Minimum Evolution is the ONLY method currently available to analyze very massive datasets such as those relating to the SARS-CoV-2 pandemic (see https://nextstrain.org/ncov/global).

The removed text is the following:

The minimum-evolution tree-optimality criterion is similar to the maximum-parsimony criterion in that the tree that has the shortest total branch lengths is said to be optimal. The difference between these two criteria is that minimum evolution is calculated from a distance matrix, whereas maximum parsimony is calculated directly using the character matrix. Like the maximum-parsimony tree, the minimum-evolution tree must be sought in "tree space", typically using a heuristic search method. A "fast minimum evolution" method can be quicker than the usual neighbor-joining algorithm for (greedily) generating such a tree.[1]
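For readers following this thread, here is a minimal sketch of what "calculated directly using the character matrix" means in practice. It uses Fitch's algorithm (my choice for illustration, not something taken from the article text) to count the minimum number of state changes on each of the three possible four-taxon topologies. The toy matrix and the names `fitch_score` and `parsimony_length` are made up for this example.

```python
def fitch_score(tree, column):
    """Fitch pass for one character on a rooted binary tree.

    `tree` is either a leaf name (str) or a pair (left, right).
    Returns (possible ancestral states, minimum number of changes).
    """
    if isinstance(tree, str):
        return {column[tree]}, 0
    left_set, left_cost = fitch_score(tree[0], column)
    right_set, right_cost = fitch_score(tree[1], column)
    common = left_set & right_set
    if common:
        # The children can agree on a state: no extra change needed here.
        return common, left_cost + right_cost
    # The children disagree: one change is forced at this node.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_length(tree, matrix):
    """Sum the Fitch change counts over all characters (equal weights)."""
    n_sites = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in matrix.items()}
        total += fitch_score(tree, column)[1]
    return total


# Hypothetical toy character matrix (four taxa, four characters).
matrix = {"A": "AAGT", "B": "AAGC", "C": "GTGC", "D": "GTAC"}

# The three unrooted four-taxon topologies, rooted arbitrarily on the
# internal branch (the Fitch count does not depend on the root position).
topologies = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

scores = {name: parsimony_length(t, matrix) for name, t in topologies.items()}
best = min(scores, key=scores.get)
print(scores, best)
```

With four taxa the whole "tree space" is three topologies, so it can be searched exhaustively; the heuristic searches mentioned in the removed text only become necessary as the number of taxa grows.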


The new text is the following:

Among the distance methods, there exists a phylogenetic estimation criterion, known as Minimum Evolution (ME), that shares with maximum-parsimony the aspect of searching for the phylogeny that has the shortest total sum of branch lengths.[2][3]

A subtle difference distinguishes the maximum-parsimony criterion from the ME criterion: while maximum-parsimony is based on an abductive heuristic, i.e., the plausibility of the simplest evolutionary hypothesis of taxa with respect to the more complex ones, the ME criterion is based on Kidd and Sgaramella-Zonta's conjectures (proven true 22 years later by Rzhetsky and Nei[4]), which state that if the evolutionary distances among taxa were unbiased estimates of the true evolutionary distances, then the true phylogeny of the taxa would have a length shorter than any other alternative phylogeny compatible with those distances. Rzhetsky and Nei's results free the ME criterion from Occam's razor and give it a solid theoretical and quantitative basis.[5]
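To make the distance-based side concrete, here is a rough sketch of scoring the three four-taxon topologies by tree length computed from a distance matrix alone. It assumes the balanced ME length estimate (Pauplin's formula, length = sum over leaf pairs of 2^(1 - number of edges between them) times their distance), which is my choice for illustration; the classical ME criterion estimates branch lengths by least squares instead. The distances are hypothetical, built to be additive on the tree AB|CD with total length 15, and `bme_length` is an illustrative name.

```python
from itertools import combinations


def bme_length(split, dist):
    """Balanced tree-length estimate for a four-taxon topology.

    `split` is ((x, y), (z, w)), the two cherries of the quartet.
    Leaves in the same cherry are 2 edges apart, leaves in different
    cherries are 3 edges apart; each pair is weighted by 2**(1 - edges).
    """
    left, right = split
    total = 0.0
    for x, y in combinations(sorted(left + right), 2):
        same_side = (x in left) == (y in left)
        edges = 2 if same_side else 3
        total += 2.0 ** (1 - edges) * dist[frozenset((x, y))]
    return total


# Hypothetical additive distances generated from the tree AB|CD with
# pendant branches A=1, B=2, C=3, D=4 and internal branch 5.
dist = {
    frozenset("AB"): 3.0, frozenset("CD"): 7.0,
    frozenset("AC"): 9.0, frozenset("AD"): 10.0,
    frozenset("BC"): 10.0, frozenset("BD"): 11.0,
}

splits = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

lengths = {name: bme_length(s, dist) for name, s in splits.items()}
shortest = min(lengths, key=lengths.get)
print(lengths, shortest)
```

In this toy case the generating topology gets exactly its true length and the two alternatives come out longer, which is the behaviour the conjecture describes when the distances are accurate.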

References

  1. ^ Desper R, Gascuel O (March 2004). "Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting". Molecular Biology and Evolution. 21 (3): 587–98. doi:10.1093/molbev/msh049. PMID 14694080.
  2. ^ Catanzaro, Daniele (2010). Estimating phylogenies from molecular data, in Mathematical approaches to polymer sequence analysis and related problems. Springer, New York.
  3. ^ Catanzaro D (2009). "The minimum evolution problem: Overview and classification". Networks. 53 (2): 112–125.
  4. ^ Rzhetsky A, Nei M (1993). "Theoretical foundations of the minimum evolution method of phylogenetic inference". Molecular Biology and Evolution. 10 (5): 1073–1095.
  5. ^ Desper R, Gascuel O (March 2004). "Theoretical foundation of the balanced minimum evolution method of phylogenetic inference and its relationship to weighted least-squares tree fitting". Molecular Biology and Evolution. 21 (3): 587–98. doi:10.1093/molbev/msh049. PMID 14694080.

A stab at a simplified lead


I didn't think the lead was serving the needs of an intelligent middle-schooler at all well, so I made rather aggressive structural changes to the lead, while denting the actual text as little as possible, though I did add my own explanatory text concerning the core challenge of phylogenetics to help set the stage. Having not massaged the original text, the revised structure is not as smooth as it might be (and there are some obvious warts). I encourage anyone with a sense of appropriate language in this subject domain to further polish the flow. -- MaxEnt 10:51, 12 January 2015 (UTC)

technical content swizzle in lead


I took a second stab to make the lead more comprehensive and less essayish in tone, bumping the last two paras off the lead into a new section titled "Alternate characterization and rationale" (admittedly an ugly limbo) and replacing these two paragraphs with substantive concerns lifted from the main article and mildly restructured for reading level. -- MaxEnt 11:23, 12 January 2015 (UTC)