Talk:Floating point operations per second

Latest comment: 10 months ago by 5.178.188.143 in topic Origin of term?

FLOPS comparison

The rounding in the FLOPS comparison prices is too coarse. Rounding two items to the nearest cent can introduce an inaccuracy of more than 10%, which is pretty bad. For example, the gap between the PS5 and Xbox Series X is significant, but they're both listed at 4¢ per gigaflop. As computers get faster this rounding will make the comparison useless, because it is already at 1¢ and cannot go any lower without the cost displaying as zero. I think the solution is to measure the cost per teraflop, or to show fractions of cents: for example, display $40/teraflop instead of 4¢/gigaflop.
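
A minimal Python sketch of the rounding problem, for illustration only. The two machines, their prices and their GFLOPS figures are made-up values (not the article's data), chosen so that both round to the same whole-cent figure.

```python
def cents_per_gflop(price_usd, gflops):
    """Cost in US cents per GFLOPS."""
    return price_usd * 100 / gflops


def dollars_per_tflop(price_usd, gflops):
    """Cost in US dollars per TFLOPS (1 TFLOPS = 1000 GFLOPS)."""
    return price_usd / (gflops / 1000)


# Hypothetical machines: (price in USD, peak GFLOPS). Not real data.
machine_a = (440, 10_000)
machine_b = (432, 12_000)

for name, (price, gflops) in (("A", machine_a), ("B", machine_b)):
    cents = cents_per_gflop(price, gflops)
    dollars = dollars_per_tflop(price, gflops)
    # Whole cents hide the gap; dollars per TFLOPS keep it visible.
    print(name, round(cents), "cents/GFLOPS vs", round(dollars), "$/TFLOPS")
```

Both hypothetical machines round to the same whole-cent value even though their unrounded costs differ by roughly 20%, while the per-teraflop figures remain distinct.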

Processor table

The AMD section does not include 3DNow! which was AMD's equivalent (and then some) of Intel's MMX for floating point acceleration. I believe they also had the equivalent of an x87 unit in their earlier processors?

I'm not sure how necessary the table truly is to the article, however. — Preceding unsigned comment added by 75.138.33.10 (talk) 02:01, 14 January 2024 (UTC)

Records Section

The records section has some significant errors in it. The section itself is completely speculative, as no actual BlueGene/P systems have been deployed yet. Currently, no computational system operates at sustained speeds in excess of a single petaflop. The IBM press release being used is not objective, and it should not be stated as a record until the speed is demonstrated publicly, or through an independent entity such as TOP500.

Really, the whole section needs to be cleaned up, as the SX-9 has no public benchmarks available either.

In fact, if one wants to speculate about the fastest computers, it would perhaps be more useful to include a discussion of the NSF Track 2 systems, as well as the proposed Track 1 system (which is specifically being designed to operate in excess of one petaflop, sustained). The currently operational Track 2 system, RANGER, is operating at approximately half a petaflop, putting it around the current speeds of the BlueGene systems that have been deployed.


I have a comment on the records section, in regards to this paragraph:

"As of 2008, the fastest PC processors (quad-core) perform over 51 GFLOPS(QX9775)[13]. GPUs in PCs are considerably more powerful in pure FLOPS. For example, in the GeForce 8 Series the nVidia 8800 Ultra performs around 576 GFLOPS on 128 Processing elements. This equates to around 4.5 GFLOPS per element, compared with 2.75 per core for the Blue Gene/L. It should be noted that the 8800 series performs only single precision calculations, and that while GPUs are highly efficient at calculations they are not as flexible as a general purpose CPU."

If you want to show that graphics cards can perform considerably more flops, why not compare to the ATI Radeon 4870X2, which can do 2.4 teraflops? That's over four times faster than the 8800 ultra that's listed. —Preceding unsigned comment added by 128.205.179.184 (talk) 20:33, 14 August 2008 (UTC)Reply

Yes, let's update it every time a new GPU comes out, and then spark a fanboy riot whenever we pick ATI or Nvidia because nobody can conclusively decide which cards are faster anymore. —Preceding unsigned comment added by 67.185.55.105 (talk) 18:14, 19 October 2009 (UTC)

"As of 2010, the fastest six-core PC processor has a theoretical peak performance of 107.55 GFLOPS (Intel Core i7 980 XE) in double precision calculations." I suspect this number is wrong, because even Intel seems to think that architecture can theoretically do only 4 FLOPS/core/cycle.[1] The correct number for the i7 980 XE should be closer to 80 GFLOPS. At any rate, the current record for a single PC processor is probably ~108 GFLOPS (Intel Core i7 2600K). It only has 4 cores, but the new Sandy Bridge architecture lets it do 8 FLOPS/core/cycle. I can't find the equivalent numbers for AMD processors, but IF the Opteron 6100 can also do 4 FLOPS/core/clock, then it should be able to outdo the Sandy Bridge and hit 120 GFLOPS. φ (talk) —Preceding undated comment added 10:20, 10 March 2011 (UTC).Reply

June 24, 2011: http://www.intel.com/support/processors/sb/cs-023143.htm#1 Intel provides specifications for several processors in GFLOPS. Admittedly, they do not cover every processor they produce. They do call out the i7-975 @ 3.46 GHz as being capable of 55.36 GFLOPS. This is a quad-core processor. I doubt that 50% more cores with a slower clock would double the GFLOPS. It may be better to simply go with something attributable here. — Preceding unsigned comment added by 134.223.116.201 (talk) 13:59, 24 June 2011 (UTC)
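
For reference, the peak figures traded in this thread all appear to follow from cores × clock × FLOPs per core per cycle. The Python sketch below redoes that arithmetic. The i7-975 clock (3.46 GHz) and the FLOPs-per-cycle values are taken from the comments above; the 3.33 GHz, 3.4 GHz and 2.5 GHz clocks and the 12-core Opteron configuration are assumptions filled in for illustration, and the Opteron's 4 FLOPs/cycle is the same "IF" as in the comment.

```python
def peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in GFLOPS: cores x clock (GHz) x FLOPs per core per cycle."""
    return cores * clock_ghz * flops_per_cycle


# name: (cores, clock in GHz, FLOPs per core per cycle)
# Clocks marked "assumed" do not appear in the discussion above.
chips = {
    "Core i7-975": (4, 3.46, 4),       # clock as quoted from Intel's page
    "Core i7 980 XE": (6, 3.33, 4),    # assumed clock
    "Core i7 2600K": (4, 3.4, 8),      # assumed clock
    "Opteron 6100": (12, 2.5, 4),      # assumed cores/clock, hypothetical 4 FLOPs/cycle
}

for name, spec in chips.items():
    print(f"{name}: {peak_gflops(*spec):.2f} GFLOPS")
```

Under these assumptions the i7-975 comes out at Intel's published 55.36 GFLOPS and the 980 XE at about 80 GFLOPS, which is consistent with the doubt raised about the 107.55 figure.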

What about the 3960X or the 3930K? I've seen 216 GFLOPS (overclocked and hyperthreading off). Should be fixed, right? 86.177.65.29 (talk) 20:51, 15 February 2012 (UTC)

Zettaflops

Zettaflops means 10^21 FLOPS, a thousand petaflops

No it doesn't—it's a million petaflops. Anyway, zettaflops appears nowhere else in Wikipedia, so we may safely assume the statement irrelevant. —Herbee 19:28, 2004 May 22 (UTC)
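
A quick way to check the prefix arithmetic, sketched in Python. The table holds the standard SI decimal exponents; the helper name `prefix_ratio` is just an illustrative choice.

```python
# Standard SI prefixes and their powers of ten.
PREFIX_EXPONENT = {"giga": 9, "tera": 12, "peta": 15, "exa": 18, "zetta": 21}


def prefix_ratio(larger, smaller):
    """How many `smaller`-prefixed units fit into one `larger`-prefixed unit."""
    return 10 ** (PREFIX_EXPONENT[larger] - PREFIX_EXPONENT[smaller])


print(prefix_ratio("zetta", "peta"))  # one zettaflops expressed in petaflops
print(prefix_ratio("exa", "peta"))    # one exaflops expressed in petaflops
```

A factor of a thousand separates exa from peta; zetta is one step further up.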

Reference ?

It is interesting to note that the combined calculating power of all the computers on the planet is only several petaflops.

Interesting indeed. Does anyone have a reference? —Herbee 21:41, 2004 May 22 (UTC)

Why is that interesting? It's just a total, what did you expect? --Gunter 18:18, 8 Jan 2005 (UTC)
These are interesting numbers for estimating the real potential of distributed computing projects, but the actual figures are much bigger.
An average computer has a speed of about 1 GFLOPS. There are about 1 billion computers in the world. So the total speed of all computers is about 1000 PFLOPS, or 1 exaFLOPS. That is roughly 4000 times faster than the fastest supercomputer, IBM Blue Gene/L (280 TFLOPS). --Alexey Petrov 19:09, 13 April 2006 (UTC)
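
The estimate above can be reproduced in a few lines of Python. All three inputs are the rough figures from the comment (1 GFLOPS per machine, one billion machines, 280 TFLOPS for Blue Gene/L); none of them is a measured value.

```python
avg_flops_per_computer = 1e9   # about 1 GFLOPS per machine (rough guess from the comment)
computers_worldwide = 1e9      # about one billion machines (rough guess from the comment)
blue_gene_l_flops = 280e12     # 280 TFLOPS, as quoted in the comment

total_flops = avg_flops_per_computer * computers_worldwide

print(total_flops / 1e15, "PFLOPS in total")
print(total_flops / blue_gene_l_flops, "times Blue Gene/L")
```

The ratio comes out between 3500 and 3600, which is the same order of magnitude as the "4000 times" quoted.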

Question

Why do people write FLOPS and MIPS rather than flops and mips? I cannot think of a good reason for the capitals. And besides, people write mph (miles per hour) and dpi (dots per inch) rather than MPH and DPI. I rewrote the article with flops and it doesn't look bad, I think. Is there a reason why I shouldn't move this article to Flops? —Herbee 22:09, 2004 May 22 (UTC)

FLOPS is the correct abbreviation; flops is not a word. Non-technical people usually write FLOPS as flops out of ignorance of what the abbreviation actually means. --Gunter 18:18, 8 Jan 2005 (UTC)
It is FLOPS in Computer Architecture by Hennessy and Patterson and flop/s or Flop/s or FLOPS in Sourcebook of Parallel Computing by Dongarra et al. Personally I think it should be FLOPS, but there is no consistency in respected publications. By the way, it is MPH in my car. --Koobas 17:15, 7 August 2006 (UTC)Reply

Metric

As a mathematician, the use of the term 'metric' makes me uneasy. Isn't there a better, less jargon-y, alternative: 'measure' perhaps? mat_x 19:46, 31 Dec 2004 (UTC)


There is a new winner http://news.bbc.co.uk/1/hi/technology/4379261.stm

Removal of Irrelevant and Ambiguous Material

I recently removed the following:

The computers that generated Lord of the Rings characters and places - Gollum, the Balrog, and Middle Earth - are now available for hire, for example. The cluster of 1,008 computers in New Zealand can be rented on-demand, on a per hour, per processor basis.

Already the supercomputer is being used to design a super yacht and test gene sequencing algorithms.

The first paragraph is not related to FLOPS and is somewhat abrupt in context.

The second paragraph is ambiguous about which supercomputer it refers to, and is probably irrelevant in any case.

--Zdude255 21:56, 6 Apr 2005 (UTC)


I removed the following, compare MDGRAPE-3 and Blue-Gene/L:

"and, rather than costing billions of dollars, the machine only costs 7 million dollars to build."

This is quite a misleading figure, as the Japanese government got quite a bit of help from the corporations involved, none of which are included in that figure. The true cost of the MDGRAPE-3 machine is much, much higher, and unknown.

--ww.ellis 30 Aug 2006

NVIDIA vs. VIA

Removed from the article:

Nvidia work out performance by adding pixel shader and vertex shader performance together (in the PS3s case this is 1.8 TFLOPs). ATI work out performance by taking the average performance of the two shader types (quoted at 0.9 TFLOPs), and so if the ATI performance was taken by the same method as Nvidia then the performance would be the same. And so the total performance of the Xbox 360 would be around the 2 TFLOPs mark too. However as it stands Microsoft have released figures based on ATI's figures (1 TFLOPs).

This seems to be a marchitecture-based discussion. Can we let this be fought out on the games console article talk pages, please? This article now, in any case, compares manufacturer-announced FLOPS in each case: marchitecture vs. marchitecture. More in the article. -- Karada 12:30, 23 May 2005 (UTC)

PS3 teraflops misquoted

I changed 2.18 to 2.0, the figure stated in their press release. I note that other sources do claim the number to be 2.18 but I thought it best to use the number from their press release.

http://www.us.playstation.com/Pressreleases.aspx?id=279

Irrelevance

I removed the following because they are, for the most part, irrelevant.

Furthermore, the system memory bandwidth on each console is a major factor to its performance. Even though some sources claim that
Xbox 360's overall system memory bandwidth eclipses PS3's by a factor of five, its actually untrue or, if you want, half-true. This
huge bandwith is only between the GPU and its 10MB of Embedded RAM which can not be compared to overall system bandwith. Moreover,
Xbox360 will have a shared memory architecture for CPU and GPU. Playstation 3 uses different memory pool for each, CPU and GPU,
which, along with faster XDR memory, means more than twice the bandwith of Xbox360. All in all, it can be said, that memory
architecure and bandwidth for both consoles is equally efficient. Only time will tell though.

Memory bandwidth may affect the amount of FLOPS achievable in practice, but the paragraph does not describe how.

Human mind in FLOPS?

Humans are even worse floating-point processors. If it takes a person a quarter of an hour to carry out a pencil-and-paper long division problem with 10 significant digits, that person would be calculating in the milliFLOPS range. Bear in mind, however, that a purely mathematical test may not truly measure a human's FLOPS, as a human is also processing smells, sounds, touch, sight and motor coordination. This takes an average human's FLOPS up to an estimated 10 quadrillion FLOPS (roughly 10 PFLOPS). [1]

Wasn't that in MIPS? As the article says:

We might take a rough estimate and say it is handling 10 quadrillion instructions per second, but it really is hard to say.

I believe we should assume that it's about 10 GMIPS, not 10 PFLOPS. --83.11.55.87 17:10, 14 February 2006 (UTC)

Based on some rough math using OpenWorm as a reference point, modeling the human brain will take in excess of 63.8 exaflops (10^18). This could be wildly underestimated, as I may not be accounting for the 100 trillion synapses[2]. If we are to believe the microtubular computing theory, then this figure could rise to 10^24 [3]. 92.25.228.128 (talk) 15:31, 4 May 2012 (UTC)

Distributed computing

Quote from this article:
Distributed computing uses the Internet to link personal computers to achieve a similar effect: Folding@home, the most powerful distributed computing project, has been able to sustain over 200 TFLOPS. SETI@home computes data at more than 100 TFLOPS.

And from another article at Wikipedia (Supercomputer):
One such example, is the BOINC platform which is a host for a number of distributed computing projects recorded on March 25th 2006 processing power of over 418 TFLOPS spread over 930,000+ computers on the network [3]. Its largest single project the SETI@home project has a reported processing power of 238 TFLOPS on March 25th 2006 [4].

On May 16, 2005, the distributed computing project Folding@home reported a processing power of 195 TFLOPS on their CPU statistics page.

It seems that the second variant is right, because SETI@home is a slightly bigger/faster project than Folding@home. So this article contains incorrect numbers. --Alexey Petrov 19:00, 13 April 2006 (UTC)

I agree, but there is some controversy over it. I changed the article to be more neutral towards both projects, since they are both the most seen and most powerful networks. Requen 08:22, 28 June 2006 (UTC)

"With roughly 40 000 TFLOPS in June, 2011 BitCoin Mining uses more computing power than all the leading competitors combined (more than double the two greatest ones: BOINC and Folding@home, which include over a dozen separarate distributed projects, see below) [16] (Although admittedly WoW or other graphical video game usage was not investigated.)" - Is the source for this trustworthy? How is the network hashrate actually calculated? Regardless, I updated the value from 90 000 TFLOPS to the 40 000 the site is actually stating. Onissum (talk) 09:00, 15 June 2011 (UTC)

Just checked the calculated hashrate again, and now bitcoinwatch is reporting 155 petaFLOPS. The value fluctuates far too greatly to report it meaningfully, and I also seriously question how they are measuring the hashrate, and if it's accurate in the slightest. It went from 90 petaFLOPS, to 30, and now the measurement is sitting on 155.4, all within the span of 24 hours. Added a dubious tag to the page. Onissum (talk) —Preceding undated comment added 20:57, 15 June 2011 (UTC).Reply

The recent fluctuations in the hashrate and FLOPS on the 15-16th June were probably caused by the increase in the difficulty. The FLOPS are most likely calculated from the network hashrate, which in turn is probably calculated from the number of hashes (on average) needed to solve a block, and the blocks solved in a certain time frame.
I (too) have my doubts on the accuracy of the FLOPS value. Especially since computing SHA hashes, to my understanding, is pretty much just integer operations. I have no idea how the FLOPS can be accurately estimated in this case. Perhaps someone more knowledgeable on this topic could explain?
edit: erm, yeah forgot to explain; the huge spike in hashrate was caused (most likely), because the difficulty increased, but the calculation still had the blocks/time value from the previous difficulty. We should see the hashrate dropping over time back to the "normal" level.
82.181.165.53 (talk) 17:02, 16 June 2011 (UTC)Reply

To explain a bit: that's right, expressing the computing power of the whole Bitcoin mining grid in FLOPS is not exact, as no floating-point operations are done; it's all integer. I think the FLOPS value is there just to give people an idea of the total computing power involved. For example, we are doing 8 Thash/s right now, and yes, those are all integer operations, but it would require enough GPUs to make 101TFLOPs if you want to make 8 Thash/s. For example, an AMD 6990 makes about 800 Mhash/s (or 0.8 Ghash/s), so it takes about 10000 AMD 6990s to make 8 Thash/s. And since each AMD 6990 has a compute power of 6 TFLOPS, the total computing power involved in Bitcoin mining, expressed in FLOPS, is about 60000 TFLOPS. Hope my explanation helped a bit. — Preceding unsigned comment added by 87.0.38.83 (talk) 17:06, 17 June 2011 (UTC)

Assuming a network hashrate of 8 THash/s, using the formula mentioned above:
GTX 580, 1581 Single precision GFLOPS[4], 140 MHash/s[5]:
(8000000 MHash/s)/(140 MHash/s) ~= 57 143
57143*(1581 GFLOPS) ~= 90 PFLOPS
HD 5830, 1792 Single precision GFLOPS[6], 244 MHash/s:
(8000000 MHash/s)/(244 MHash/s) ~= 32 787
32787*(1792 GFLOPS) ~= 59 PFLOPS
That's a pretty big difference. I think the bitcoin network performance is wildly inaccurate, and thus shouldn't be included in this article.
82.181.165.53 (talk) 22:45, 17 June 2011 (UTC)Reply
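
The estimation method used in this thread (divide the network hashrate by one card's hashrate, then multiply by that card's rated FLOPS) can be sketched in Python as below. The card figures are the ones quoted in the comment above; the function name `implied_pflops` is an illustrative choice. This only reproduces the arithmetic and makes no claim that the conversion is meaningful.

```python
NETWORK_MHASH_PER_S = 8_000_000  # 8 THash/s expressed in MHash/s, as assumed above


def implied_pflops(card_gflops, card_mhash_per_s, network_mhash_per_s=NETWORK_MHASH_PER_S):
    """FLOPS rating of the number of identical cards needed to match the network hashrate."""
    cards_needed = network_mhash_per_s / card_mhash_per_s
    return cards_needed * card_gflops / 1e6  # GFLOPS -> PFLOPS


# name: (single-precision GFLOPS, MHash/s), figures as quoted in the comment
cards = {
    "GTX 580": (1581, 140),
    "HD 5830": (1792, 244),
}

for name, (gflops, mhash) in cards.items():
    print(f"{name}: {implied_pflops(gflops, mhash):.1f} PFLOPS")
```

The same network hashrate maps to very different FLOPS totals depending only on which card is taken as the reference, which is the inconsistency the comment points out.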
The difference is big because NVidia GPUs have poor integer performance which is what SHA256 is all about. Thus an equivalent AMD GPU will be superior for this case and used for Bitcoin mining. 1Node42 (talk) 16 Sep 2012 (UTC)
Well, even using the lower value, 59 PFLOPS is much more than the computing power of any other supercomputer or distributed computing project. And after all, FLOPS are the most widely used unit for computer performance. — Preceding unsigned comment added by 87.0.38.83 (talk) 11:07, 18 June 2011 (UTC)
So yeah... Just did some researching. To give some perspective, the combined (LINPACK) performance of the TOP500 computers is 43.67 PFLOPS[7]. So currently the bitcoin network is more powerful than the 500 fastest supercomputers AND the biggest distributed networks combined. Now that's something I call, if you'll excuse me, bullshit. I edited the article to include this comparison, to give some perspective... I really think the bitcoin network performance, at its current accuracy, doesn't belong in the article.
Also, please indent your posts.
82.181.165.53 (talk) 20:34, 18 June 2011 (UTC)Reply
That number is accurate. The sheer number of distributed GPUs/FPGAs used in Bitcoin mining greatly outnumbers coalesced 2k-10k supercomputing clusters. The Bitcoin network has an economic incentive to run the machines, so they are run in far greater numbers than in charity computing projects. 1Node42 (talk) 16 Sep 2012 (UTC)
I've reverted another attempt to spam the article with the previously-rejected cheerleading content about Bitcoin. The justification that was given this time was “Its [sic] still relevant, if one distributed project belongs they all do”. Sorry, but every distributed project in computing does not belong in an article about floating-point operations, and the Bitcoin project, however large and impressive it may be, is based on integer operations — a very large number of integer operations performed at an impressive rate, but integer nonetheless. And having to resort to vague hand-waving claims of equivalence — “Bitcoin … is estimated to be utilizing hardware capable of 300.59 petaFLOPS” — just serves to emphasize that Bitcoin's calculations are not floating-point. It's not bias to insist that content stay on-topic. Perhaps all of this information on Bitcoin would be better placed in List of distributed computing projects? 76.100.23.153 (talk) 23:29, 22 November 2012 (UTC)Reply
1) It's not a cheerleading attempt, it's an attempt to rectify censorship on the topic. 2) FLOPS are used colloquially and universally as a measure of computing speed. Many of the distributed computing projects listed also use integer operations for the majority of their computing, and none of them are held to the scrutiny or standard you are attempting to hold Bitcoin to. 3) Bitcoin is significant and notable in its size (larger than the top 500 supercomputers in the world COMBINED) and people expect its speed to be measured and quantified in a category of its peers. 4) Its estimates are multi-sourced and balanced and provided in far greater depth than those of other projects due to its contention; each project should be equally scrutinized. 5) I concede that it would be best if we simply had a link to List of distributed computing projects with a separate section mentioning the history of records, but this would push information people may want readily available away from where they expect it. Again, if other computing projects that contain unscrutinized types of calculations and also present FLOPS estimates are allowed here, so should Bitcoin be. If neither belongs, that should be reflected. Unobjective bias, however, is not what Wikipedia is about. 1Node42 (talk) Nov 27 2012 —Preceding undated comment added 17:30, 27 November 2012 (UTC)
Oh, please don't get yourself blocked, under either WP:EW or WP:POINT, by admitting to adding known off-topic material, repeatedly, on the vague and irrelevant claim that other projects use integer operations, too, so you're going to add a more egregious example to prove a point. If it helps, the reason that such justification is irrelevant is that FLOPS only counts floating-point operations; a computing task is welcome to use as many integer operations as it likes along with the floating-point operations, but the FLOPS rate only counts the floating-point operations. The error in the repeatedly-deleted BitCoin text is that the integer operation rate is being reported as FLOPS, which is simply incorrect — there is no “equivalent” FLOPS rate for integer calculations. Even the author of the claimed equivalence at BitCoinTalk.org questions both its accuracy and its validity, e.g., “Many here among us question the validity of this estimation method”. Integer and floating-point operations are simply different, as recognized by the many dual benchmark instruments (e.g., Whetstone and Dhrystone; SPECint and SPECfp), and ignoring that distinction is just the ancient error of counting apples as oranges. 76.100.23.153 (talk) 23:19, 8 December 2012 (UTC)Reply
It's also largely taking place on hardware that's not even capable of floating-point operations, because it consists of ASICs built only to perform hashing operations, as if that wasn't bad enough. 128.95.2.29 (talk) 00:04, 17 April 2014 (UTC)

Computing in One Operation

"Pocket calculators are at the other end of the performance spectrum. Each calculation request to a typical calculator requires only a single operation..."

How is it possible to compute something in one operation? It first has to store the first operand that's typed in before even starting to compute. Or am I misunderstanding the meaning of the word "operation"? 83.118.38.37 19:28, 25 August 2006 (UTC)

Gave it some second thought and I see what is meant now. 83.118.38.37 03:18, 26 August 2006 (UTC)
It's the difference between "operations" and "instructions"... the device may process a few hundred instructions per second, but only manage ten or twenty operations. Or in the case of an everyday mid-level scientific calculator (say a $20 Casio 85fx or the like), maybe a couple of KIPS but only 50 or so FLOPS... as seen when issuing an instruction like "!69" (it's an integer operation, but the calculator will treat all numbers as fixed-precision floats, so...) which involves 68 separate multiplication operations, each involving a good number of separate instructions, and needs more than a second to calculate (blanking the display whilst it works). Or much longer if power is low (they seem to work asynchronously and on the basis of how much current is available - not an issue if the user is just doing simple sums and it's more important to keep the LCD contrast high, but the scientific operations can become quite drawn-out). But it's no bad thing; it doesn't need to go any faster, and it makes the batteries last an awful long time. 193.63.174.10 (talk) 12:31, 17 November 2010 (UTC)

Data

The article says that the character "Data" in Star Trek was as fast as 60 trillion operations per second, or "60 TIPS". My question is, what is this word "TIPS"? Should it not be properly explained in the article? Also, it would be cool if someone with knowledge about computers wrote something about how fast that would be in comparison to today's supercomputers. --Mailerdaemon 13:12, 15 November 2006 (UTC)

The Trek writers' policy is actually to prevent such comparisons, because they know their imagination will probably be hilariously off the mark at the supposed time of events happening, or even long before. "TIPS" could be "tera", or it could be some other unit entirely. Compare with "Quads" of storage - the T could be "Trios" or somesuch. A completely made-up measurement. Anyhow, regardless of the deliberate neutering of such attempts, coming up with comparisons between our real-world hardware (hard enough to benchmark at the best of times) and completely fictional and poorly defined stuff stated to be first used after our likely lifespans isn't really of any use. My 2100 MHz, dual core, high cycle efficiency, high RAM, hard disk equipped, dual hi-rez full colour monitor workstation is approximately 1000x better in all meaningful stats than my first "proper" (non-"toy") computer. For the majority of stuff I do with it, it's not really more than 3 or 4x better, and for some things, it's about the same or maybe worse. Power doesn't necessarily imbue utility. 193.63.174.10 (talk) 13:07, 17 November 2010 (UTC)
Actually in the show Data says literally "60 trillion operations per second". His storage capacity is also literally stated as "800 quadrillion bits". --Colin Barrett (talk) 18:07, 9 August 2011 (UTC)Reply

Game Consoles

I'd like to see some sources regarding the performance of game consoles. Those ratings seem a little far fetched to me. Those ratings would put the 360 ahead of the top quad core processors and two 8800GTXs (fastest consumer GPU on the market as of right now) in SLI according to the figures given in the article. Since the PPC Xenon in the 360 has a theoretical maximum performance of 116 GFLOPS and the ATI Xenos is capable of 240, the actual power of the 360 is less than half of what's claimed in the article. I assume the results are similar with the PS3. --Mphilp 04:24, 12 February 2007 (UTC)Reply

Indeed, I wholeheartedly agree: I would actually state that those figures presented by Sony and Microsoft are outright lies, and roughly 10 times what they should be.
  • For the Xbox 360:
    • The CPU has a theoretical maximum performance of 19.2 GFLOPS. (3 cores x 2 FP units per @3.2GHz)
    • The GPU has a theoretical maximum performance of 96 GFLOPS. (48 stream processors, each capable of handling a 4-wide FP vector calculation, @500MHz)
    • The total of the two is 115.2 GFLOPS.
  • For the Playstation 3:
    • The CPU has a theoretical maximum performance of 92.8 GFLOPS. (7 SPEs with 4 FP units per, and one PPE with 1 FP unit per, all @3.2GHz)
    • The GPU has a theoretical maximum performance of 105.6 GFLOPs. (24 pixel shader units with 2 4-wide FP vector units per, @550MHz)
    • The total of the two is 198.4 GFLOPs.
I found it rather interesting that the figures I came up with are roughly 1/10th of what the consoles' respective makers claimed for them. Though I'll admit that I may have some data errors in my numbers... But I still think it brings up a valid point, that the gaming consoles, which are PRIMARILY based off of off-the-shelf PC components, technically CANNOT be as fast as the best PC hardware made at a later point than the console hardware itself. However, for the time being, the 'false' figures stand, given that those are the "official" claims of those companies. Nottheking (talk) 20:59, 29 November 2007 (UTC)Reply
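
The arithmetic behind the bulleted figures above can be checked with a short Python sketch. The unit counts and clocks are the ones given in the list (and are disputed further down); the assumption that every FP unit or vector lane completes exactly one operation per cycle is what the list's totals imply, not an established hardware fact.

```python
def unit_peak_gflops(units, flops_per_unit_per_cycle, clock_ghz):
    """Peak GFLOPS = units x FLOPs per unit per cycle x clock in GHz."""
    return units * flops_per_unit_per_cycle * clock_ghz


# Xbox 360, using the unit counts from the list above
xbox_cpu = unit_peak_gflops(3, 2, 3.2)    # 3 cores, 2 FP units each, 3.2 GHz
xbox_gpu = unit_peak_gflops(48, 4, 0.5)   # 48 stream processors, 4-wide, 500 MHz

# PlayStation 3, using the unit counts from the list above
ps3_cpu = unit_peak_gflops(7, 4, 3.2) + unit_peak_gflops(1, 1, 3.2)  # 7 SPEs + 1 PPE
ps3_gpu = unit_peak_gflops(24, 2 * 4, 0.55)  # 24 shaders, two 4-wide units, 550 MHz

print(f"Xbox 360: {xbox_cpu + xbox_gpu:.1f} GFLOPS")
print(f"PS3:      {ps3_cpu + ps3_gpu:.1f} GFLOPS")
```

The totals match the list, so any disagreement with the manufacturers' numbers lies in the assumed unit counts and operations per cycle rather than in the arithmetic.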
Does anyone know if the Cell processor performs FLOPs on both the up and down cycles of the clock? That would explain the almost doubled values. Anonymous —Preceding unsigned comment added by 216.16.237.2 (talk) 20:41, 10 September 2008 (UTC)

Wrong, wrong.

  • Xbox 360:
    • The Xenon has a theoretical maximum performance of 115.2 GFLOPS. (3 cores x 8 VMX128 units per core plus 4 parallel FP units, @3.2GHz)
    • The Xenos has a theoretical maximum performance of 240 GFLOPS. (48 unified shader units, each capable of handling vector-4 MADD operations plus a scalar special function, @500MHz)
  • Playstation 3:
    • The Cell Broadband Engine has a theoretical maximum performance of 204.8 GFLOPS. (7 SPEs with 8 FP and integer SIMD instructions and one PPE with 8 FP, @3.2GHz)
    • The Reality Synthesizer has a theoretical maximum performance of 255.2 GFLOPS. (24 shader units, one vector-4 operation and 2 MADDs each, plus 8 vertex units handling one vector-4 op plus a scalar parallel op, @550MHz)

Both stand way below a single GTX 8800 (518.4 GFLOPS @ 1350MHz) and are far, far from a Radeon HD 4890 (1.36 TFLOPS @ 850MHz).

Equivalence to Hertz

Are the two terms FLOPS and Hertz interconvertable? If so, what is the ratio? Jack · talk · 20:46, Monday, 12 February 2007

No, they aren't. While one (hypothetical) processor may only do one FLOP/s, another processor with different architecture could do many (many) more. Dalef 07:08, 8 May 2007 (UTC)Reply

Actually, it's possible to use the clock speed of a processor to determine its theoretical maximum floating-point performance, which, indeed, would be measured in FLOPS. To get such a figure, simply multiply the number of floating-point calculations the device can perform per clock cycle by the clock rate. However, it would be just that: a theoretical figure. The ACTUAL floating-point performance (peak, average, etc.) would be impossible to accurately determine in this manner, since it relies on the entire computer system, not just the floating-point calculation hardware itself. Hence, the only reliable way to get THAT figure would be an actual benchmark, such as LINPACK. Nottheking (talk) 21:01, 29 November 2007 (UTC)
Remember that different processors perform different numbers of floating-point operations per cycle, so even though you can calculate it for one processor, that value does not hold for a processor based on a different micro-architecture. The matter also gets more complicated with multi-core processors: for the same clock speed, two cores can (ideally, not in the real world) do twice as many calculations.
So there is no way to convert between FLOPS and Hz that holds for all architectures. Dalef (talk) 04:17, 9 December 2007 (UTC)

Etymology

Does anybody know who "named" the acronym? -scottc229


Human FLOPS

The article states the following: "a purely mathematical test will not truly measure a human's FLOPS, as a human is also processing thoughts, consciousness, smells, sounds, touch, sight and motor coordination." Since the definition of FLOPS is floating-point operations per second, I fail to see how thought processing, consciousness, smells, sounds, touch etc. have anything to do with FLOPS. Unless someone can give me a reason for not doing so, I will delete this phrase after a couple of days. Epachamo 00:36, 8 April 2007 (UTC)

I believe the idea is that if the human mind quit doing everything else and devoted all its capacity to calculating, then it would be faster, or rather do more per second. However, we cannot turn our senses off, nor can we stop doing any of the other things our brain does in the background... hence, measuring the mind's FLOPS, while giving a number, has a result that fails to correspond to the actual power of the mind. (Consider: this measurement is clearly an indicator of performance speed... as it fails to correspond to such in the human case, this measure is misleading.) --71.61.48.109 02:51, 23 April 2007 (UTC) Phoenix1177

The fact that we have many background processes and cannot turn off these background processes is one reason WHY we cannot do many FLOPS, but does not change the fact that we CAN'T do many FLOPS. If FLOPS are your measuring stick, then we stink. However, we rock computers at image processing still ;) (and swimming) Epachamo 01:42, 24 April 2007 (UTC)

flops

  • single-purpose hardware can never be included in an honest FLOPS figure.

If you include MDGRAPE-3 then other single-purpose hardware must be included also. It is also debatable whether the new GPUs are really single-purpose, as they can perform both graphics rendering and Folding. Also, please sign your comments in future; this is Wikipedia policy. Jaganath 09:28, 8 May 2007 (UTC)

perhaps this needs a list of processors and flop figures? or perhaps a link?

Records

Does anyone else think the records section should be in order of flops? Craig Mayhew 16:33, 3 July 2007 (UTC)Reply


Precision?

Shouldn't the article differentiate between double precision and single precision floating point operations? A processor designed specifically to perform single precision operations may not perform so well on double precision operations, and vice versa. Adding a section explaining the difference between the two may also clear up some common misconceptions, such as only the FLOPS count being important. Rilak (talk) 07:31, 9 December 2007 (UTC)Reply

What kind of floating point operations are we talking about ?


What type of floating point operation is actually meant by the acronym FLOPS? For example, isn't non-binary multiplication simply a series of addition operations, and therefore more expensive than addition alone? (At least in integer arithmetic, I assume floating point arithmetic is similar.)--Bradley Mitchell (talk) 20:48, 9 December 2007 (UTC)Reply

According to this article, flops measurement seems to be based mainly on the LINPACK benchmark, which uses Gaussian elimination. According to the Gaussian elimination article, there are 1/3 n^3 multiplications, 1/3 n^3 subtractions, and only 1/2 n^2 divisions. So I understand that 2 flops would include one multiplication, one subtraction and all the loading and storing necessary.
If everyone agrees with my definition, and we find a clear source about that, it would be nice to include it in the article, as it is currently far from clear what a flop actually is.--Yitscar (talk) 14:11, 28 September 2009 (UTC)Reply
The type of float operation is unimportant. The important difference is between an INT and a FLOAT operation. See the explanation in the next section. INT operations are always faster by design. But even that doesn't matter in how FLOPS or IOPS are measured or compared respectively. LINPACK uses a mixed measurement for practical reasons important for timing the duration of scientific experiments, which translates into real money costs. As far as computer science goes, what matters is the number of instructions (IPS) or operations (OPS) being pushed each cycle through a specific unit (execution or arithmetic unit) in a processor. The time difference between different types of operation is just a time lag. Each cycle the same number of results goes out as is being fed in. Hence, what matters is how many OPS a processor can execute, regardless of OPS type. The same goes for instructions or INT measurement. To give a real-world example, a factory is often rated by how many items (e.g. cars) it produces in a certain time. Here, too, the actual production time of a single item doesn't matter.Mightyname (talk) 16:29, 25 August 2014 (UTC)Reply
Even more, the mix of addition/subtraction, multiplication, and division doesn't change all that much for common matrix operations. As noted above, division is often much less common. (Matrix processing routines that divide a matrix row or column will instead multiply the whole row or column by the reciprocal, so one divide and n multiplies.) Given a good benchmark (and SPEC is better than LINPACK), it is only the ratios that matter. Gah4 (talk) 19:38, 27 February 2018 (UTC)Reply
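For anyone who wants to check the operation counts quoted from the Gaussian elimination article, here is a minimal Python sketch. It assumes plain forward elimination on an n x n matrix with no pivoting and no right-hand side; the function name `elimination_op_counts` is made up for illustration and is not taken from LINPACK. It only counts operations, it does not run a benchmark.

```python
def elimination_op_counts(n):
    """Count arithmetic operations in forward elimination of an n x n matrix.

    Assumption: textbook elimination without pivoting, matrix entries only.
    """
    mults = subs = divs = 0
    for k in range(n - 1):              # pivot column
        for i in range(k + 1, n):       # each row below the pivot
            divs += 1                   # factor = a[i][k] / a[k][k]
            for j in range(k + 1, n):   # remaining columns of row i
                mults += 1              # factor * a[k][j]
                subs += 1               # a[i][j] - (that product)
    return mults, subs, divs


n = 100
mults, subs, divs = elimination_op_counts(n)
print(mults, subs, divs)
# Ratios against the leading-order estimates n^3/3 and n^2/2
print(mults / (n**3 / 3), divs / (n**2 / 2))
```

For large n the two ratios approach 1, which is consistent with the 1/3 n^3 and 1/2 n^2 figures and with the point that divisions are a small share of the total.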

What is floating point operation?


The reference I included below is a good starting point to understand the difference between floating point operation and integer (fixed-point) operation. Floating point vs fixed-point.Ryoohkies 11:02, 25 December 2009 (UTC)

Integer. These designations refer to the format used to store and manipulate numeric representations of data. Fixed-point formats are designed to represent and manipulate integers – positive and negative whole numbers – using, for example, 16 bits, yielding up to 65,536 possible bit patterns (2^16). Integer Ryoohkies 11:02, 25 December 2009 (UTC)
Floating-point (Real Numbers). The encoding scheme for floating point numbers is more complicated than for fixed point. The basic idea is the same as used in scientific notation, where a mantissa is multiplied by ten raised to some exponent. For instance, 5.4321 × 10^6, where 5.4321 is the mantissa and 6 is the exponent. Scientific notation is exceptional at representing very large and very small numbers. For example: 1.2 × 10^50, the number of atoms in the earth, or 2.6 × 10^-23, the distance a turtle crawls in one second, compared to the diameter of our galaxy. Notice that numbers represented in scientific notation are normalized so that there is only a single nonzero digit left of the decimal point. This is achieved by adjusting the exponent as needed. Floating point representation is similar to scientific notation, except everything is carried out in base two, rather than base ten. While several similar formats are in use, the most common is ANSI/IEEE Std. 754-1985. This standard defines the format for 32 bit numbers called single precision, as well as 64 bit numbers called double precision. Floating point can support a much wider range of values than fixed point, with the ability to represent very small numbers and very large numbers.
With fixed-point notation, the gaps between adjacent numbers always equal a value of one, whereas in floating-point notation, gaps between adjacent numbers are not uniformly spaced – the gap between any two numbers is approximately ten million times smaller than the value of the numbers (ANSI/IEEE Std. 754 standard format), with large gaps between large numbers and small gaps between small numbers. Floating Point Ryoohkies 11:02, 25 December 2009 (UTC)
Dynamic Range and Precision. The exponentiation inherent in floating-point computation assures a much larger dynamic range – the largest and smallest numbers that can be represented - which is especially important when processing extremely large data sets or data sets where the range may be unpredictable. As such, floating-point processors are ideally suited for computationally intensive applications. It is also important to consider fixed and floating-point formats in the context of precision – the size of the gaps between numbers. Every time a processor generates a new number via a mathematical calculation, that number must be rounded to the nearest value that can be stored via the format in use. Rounding and/or truncating numbers during processing naturally yields quantization error or ‘noise’ - the deviation between actual values and quantized values. Since the gaps between adjacent numbers can be much larger with fixed-point processing when compared to floating-point processing, round-off error can be much more pronounced. As such, floating-point processing yields much greater precision than fixed-point processing, distinguishing floating-point processors as the ideal CPU when computational accuracy is a critical requirement. Fixed point (integer) vs Floating point Ryoohkies 11:02, 25 December 2009 (UTC)
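The claim above that the gap between adjacent single-precision numbers is roughly ten million times smaller than the numbers themselves can be checked with a short Python sketch. It assumes positive finite inputs and uses only the standard `struct` module; the helper name `float32_gap` is invented for this example.

```python
import struct


def float32_gap(x):
    """Distance from x (rounded to float32) to the next larger float32.

    Assumption: x is positive and finite.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]      # raw 32-bit pattern
    lo = struct.unpack("<f", struct.pack("<I", bits))[0]     # x as a float32
    hi = struct.unpack("<f", struct.pack("<I", bits + 1))[0] # next float32 up
    return hi - lo


for x in (1.0, 1000.0, 1.0e6):
    gap = float32_gap(x)
    print(f"x = {x:>9}: gap = {gap:.3e}, x / gap = {x / gap:,.0f}")
```

The ratio x / gap stays between 2^23 (about 8.4 million) and 2^24 (about 16.8 million) however large x gets, whereas a fixed-point format has the same absolute gap everywhere.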

Cost of computing


In the cost of computing section, the calculation assumes continuous 135 watt consumption for the PS3 console. Even when in operation and under load, it is unlikely that the console continuously consumes 135 watts, so the claim that it would consume $118 worth of electricity at average US electricity prices is misleading. —Preceding unsigned comment added by 68.145.105.18 (talk) 16:28, 22 April 2008 (UTC)Reply

Quote: "Approximate cost per GFLOPS 1961: US$1,100,000,000,000,000, ($1.1 trillion), or US$1,100 per FLOPS". If 1 GFLOPS = 1,000,000,000 (10^9) FLOPS, and 1 FLOPS costs US$1,100, then 1 GFLOPS costs US$1,100,000,000,000 (US$1,100 * 10^9 = US$1.1 * 10^12) and not US$1,100,000,000,000,000 (US$1.1 * 10^15) as in the quote.--87.182.37.35 (talk) 11:02, 18 November 2009 (UTC)Reply

The gap between 1961 and 1984 is too wide. As a matter of fact, in 1961, IBM Stretch delivered ~0.5 MFLOPS for $7.78M, which is $15.5B per GFLOPS. In 1964, CDC 6600 gave 3 MFLOPS and cost $8M [8], which is $2.667B per GFLOPS. In 1967, the Soviet BESM-6 delivered 1 MIPS, and at most 0.5 MFLOPS. It cost 600 thousand roubles (pers. comm. - this is a problem), and could be bought for that price in convertible rubles (about $670K to $900+K, and there was at least one BESM-6 outside of the Eastern bloc in India, see [9]). This brings the cost to slightly higher than $1B/GFLOPS. Leob (talk) 01:55, 9 July 2017 (UTC)Reply
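The cost figures in the two comments above are easy to recompute. The following Python sketch simply divides the quoted purchase prices by the quoted performance; the prices and MFLOPS values are the ones given in this thread, not independently verified, and `cost_per_gflops` is a helper name made up for the example.

```python
def cost_per_gflops(price_usd, mflops):
    """Dollars per GFLOPS for a machine costing price_usd and delivering mflops MFLOPS."""
    gflops = mflops / 1000.0
    return price_usd / gflops


# (price in USD, performance in MFLOPS), as quoted in the comments above
machines = {
    "IBM Stretch (1961)": (7.78e6, 0.5),
    "CDC 6600 (1964)": (8.0e6, 3.0),
    "BESM-6 (1967), low price estimate": (670e3, 0.5),
    "BESM-6 (1967), high price estimate": (900e3, 0.5),
}
for name, (price, mflops) in machines.items():
    print(f"{name}: ${cost_per_gflops(price, mflops) / 1e9:.3f}B per GFLOPS")

# The 1961 figure quoted from the article: US$1,100 per FLOPS
print(f"${1100 * 10**9:,} per GFLOPS")
```

The last line gives 1.1 * 10^12, which agrees with the correction in the earlier comment rather than with the 10^15 figure in the quote.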

Silly units


It is a bit silly that the side panel goes up to the unit of xeraflop, given that 1 xeraflop is about a billion times more than the sum of the entire computing power on Earth. It is not particularly useful to include units which have not yet found a use. Even statements about hypothetical computing power in the future are much clearer when expressed in terms of units that are actually used now. For example "The supercomputers of 2020 may be a thousand petaflops" gets the message across better than "The supercomputers of 2020 may be one exaflop", since the exaflop unit has no real reason to be used yet. Elroch (talk) 11:51, 13 May 2008 (UTC)Reply

This argument would apply to Yottaflop and the measurements are simply there to show the path of growth -- even if we won't exist to enjoy it. Lordvolton (talk) 15:26, 24 May 2008 (UTC)Reply
Never mind that we've already easily managed such a magnitude increase of computing capacity once over the past century? These units will be in use at some point, possibly sooner than might be easily imagined, so why not use them? Furthermore they're simply part of the normal SI range of exponential prefixes, it's not like some bored computer engineer has dreamt them up off his own back. Same as going in the other direction down to micro, nano, pico, femtoFLOPs (etc) if for some reason an exceptionally slow computing device (but maybe one that did an awful lot with that one operation!) was to be built or discovered... 193.63.174.10 (talk) 13:16, 17 November 2010 (UTC)Reply

Xera is a fake prefix


"Xera" is not currently an SI prefix. See, for example, the official NIST page on prefixes.

I'm therefore going to remove it. If anyone objects, we can bring it back, but I'll have to ask for an official source claiming "Xera" is an SI prefix.--Pmetzger (talk) 16:23, 9 June 2008 (UTC)Reply

There are quite a few examples of it being used, here are a couple. You should google “xeraflop” if you want more references.
http://www.neoseeker.com/news/8159-military-supercomputer-reaches-new-milestone/
http://www.ancientrails.com/?p=689

Lordvolton (talk) 19:13, 10 June 2008 (UTC)Reply

I'm sorry but no, those are not valid references. You'll notice both of those links are from "articles" written on the 9th of June - i.e. yesterday, both of them about the Roadrunner computer and quoting the same piece of text. If you google "xeraflop" all the results are also from the same piece of text copied verbatim on about a thousand or more different "news" sites and blogs. And where do you think the original author of that piece got his FLOPS prefix information? I'll bet 1000 simoleons it came from this very Wikipedia article. This appears to be a classic case of unverified information making it onto Wikipedia, a bad journalist using Wikipedia as their sole source of information, and the resulting article subsequently being used as a reference for maintaining the incorrect information on Wikipedia. It's a circular reference. If you have any good references then I'd like to see them, but those you've provided don't mean anything. Frankly though I think we both know this is a non-starter: The FLOPS prefixes are based on SI unit prefixes and there simply isn't an SI prefix called "xera" -- ExNihilo (talk) 21:50, 10 June 2008 (UTC)Reply
Are they the modern day equivalent of the council of Nicaea? Everyone else is calling it xeraflop and you offer no alternative. What is your brotherhood at the council of SI calling it? If what you're saying is true then there should be an alternative reference to their preferred name. Of course, you don't provide those citations or references because they don't exist. Absent any evidence supporting your claim to an alternate name it will remain xeraflop. —Preceding unsigned comment added by Lordvolton (talkcontribs) 22:45, 10 June 2008 (UTC)Reply
What the hell are you talking about? There's no need for an alternative, there simply is no recognised name for 10^27, just as there is no recognised name for 10^30, 10^33, or 10^30000 - there are an infinite number of powers of ten, do you think there are an infinite number of words to describe them? Do you even know what SI is? And of course there's no reference for a non-existent prefix. As always the burden of proof (or rather, burden of reference) lies on the person adding something. And it's becoming quite clear you have no reference because none exists. -- ExNihilo (talk) 22:53, 10 June 2008 (UTC)Reply
@Lordvolton but now in 2023, there is ronna- as an alternative 92.24.91.127 (talk) 09:44, 13 June 2023 (UTC)Reply
This ^ ... and really, what's wrong with "kiloyotta"? or megazetta ;-) 193.63.174.10 (talk) 13:19, 17 November 2010 (UTC)Reply
If "xera" is not an official SI prefix, then it should not be mentioned. Anyone can make up their own units, and the press is particularly well known for doing so to make their stories more sensational, thus invalidating the provided references. Adding "xera" is as ridiculous as adding whatever unit of the week The Register is using - the last time I checked it was "half a rat brain" or something similar being equal to roughly 0.35 teraFLOPs. Rilak (talk) 07:18, 11 June 2008 (UTC)Reply

IBM Roadrunner


Supercomputer sets petaflop pace http://news.bbc.co.uk/1/hi/technology/7443557.stm —Preceding unsigned comment added by 99.247.28.157 (talk) 18:19, 9 June 2008 (UTC)Reply

Excellent citation paper


[10] -- has all sorts of information comparing the first list to the 30th and lists all sorts of records that were broken Altonbr (talk) 15:34, 10 June 2008 (UTC)Reply

When will we reach a xeraflop?


Assuming Moore's law holds and computational power doubles every 18 months, we will break the xeraflop barrier in the year 2068.

Note: to put it into perspective I've listed the equivalent number of petaflops. A xeraflop is over a trillion petaflops!

2008 – 1 petaflop

2011 – 4 petaflops

2014 – 16 petaflops

2017 – 64 petaflops

2020 – 256 petaflops

2023 – 1,024 petaflops (1 exaflop)

2026 – 4,096 petaflops

2029 – 16,384 petaflops

2032 – 65,536 petaflops

2035 – 262,144 petaflops

2038 – 1,048,576 petaflops (1 zettaflop)

2041 – 4,194,304 petaflops

2044 – 16,777,216 petaflops

2047 – 67,108,864 petaflops

2050 – 268,435,456 petaflops

2053 – 1,073,741,824 petaflops (1 yottaflop)

2056 – 4,294,967,296 petaflops

2059 – 17,179,869,184 petaflops

2062 – 68,719,476,736 petaflops

2065 – 274,877,906,944 petaflops

2068 – 1,099,511,627,776 petaflops (1 xeraflop)

Lordvolton (talk) 00:33, 11 June 2008 (UTC)Reply
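The list above can be regenerated with a few lines of Python. This is only a sketch of the stated assumption (1 petaflop in 2008 and a doubling every 18 months, i.e. a factor of 4 every 3 years); it says nothing about whether that growth rate will actually hold, and the names below are made up for the example. The target of 10^27 FLOPS is the magnitude the list labels "xeraflop".

```python
import math

START_YEAR = 2008
START_PFLOPS = 1        # 1 petaflop in 2008, as in the list above
DOUBLING_MONTHS = 18    # assumed doubling period


def projected_petaflops(year):
    """Projected petaflops, rounded down to the last completed 3-year step."""
    steps = (year - START_YEAR) // 3    # two doublings (x4) every 3 years
    return START_PFLOPS * 4 ** steps


def year_reaching(target_flops):
    """Fractional year at which the projection reaches target_flops."""
    doublings = math.log2(target_flops / (START_PFLOPS * 1e15))
    return START_YEAR + doublings * DOUBLING_MONTHS / 12


for year in (2023, 2038, 2053, 2068):
    print(year, f"{projected_petaflops(year):,} petaflops")

# Year in which exactly 10^27 FLOPS would be crossed under the same assumption
print(round(year_reaching(1e27), 1))
```

Because each labelled milestone in the list uses a power of two (1,024 rather than 1,000), the exact power-of-ten thresholds are crossed slightly before the listed years.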

I'm not sure where you're going with this. Are you suggesting that this be added to the article? If so then, as we've already established, there's no such thing as a xeraflop. Secondly there's no reason to believe that Moore's Law will hold for another 60 years so it's pretty pointless to show an obvious mathematical progression based on that assumption. -- ExNihilo (talk) 01:47, 11 June 2008 (UTC)Reply
Ray Kurzweil would ground you for two weeks if he read this blasphemy. He'd say, "Young Jedi, it's time I taught you about accelerating returns." Of course, the accelerating returns camp would complain that I am being too conservative. But like you they would complain.
Fear not, we'll get to a xeraflop sooner or later. The good news is that you'll be ungrounded way before then. =-)
Lordvolton (talk) 05:14, 11 June 2008 (UTC)Reply
We will never reach a xeraflop, because there is no such thing as a xeraflop. There is no official SI prefix for 10^27, and if there ever is one it is unlikely it will be "xera". Pmetzger (talk) 22:29, 16 June 2008 (UTC)Reply
Another wikipedian beat you to the punch. http://en.wiki.x.io/?title=Talk:SI_prefix
It looks like xeraflop has become the default reference, but the point is to show how quickly we'll reach seemingly mind boggling speeds. A xeraflop will be here in only 60 years based on Moore's Law. Lordvolton (talk) 00:01, 19 June 2008 (UTC)Reply
Default reference? If I am not mistaken, the editor describes "xeraflop" at the talk page you mentioned as "pseudostandard". In my view, that is very different from an approved international standard set by international standards authorities. Rilak (talk) 05:54, 19 June 2008 (UTC)Reply
It's become nothing of the sort. The article he mentions is the exact same article we talked about in the above discussion. It was used in a single "article" which obviously sourced this Wikipedia page when XeraFLOP was incorrectly listed as an SI prefix. That is all. There's no "default reference" (whatever that means), no de facto prefix, no de jure prefix, nothing. As best as I can tell you are the one who first added XeraFLOP to the Wikipedia article in the first place so I suspect you know all this already. -- ExNihilo (talk) 11:52, 19 June 2008 (UTC)Reply
You remind me of those English teachers who hated slang references. I actually googled "xeraflop" and "yottaflop". There are more references for xeraflop than yottaflop, which you believe is part of the Queen's English. If it's the common usage you'll have fun telling everyone to stop using the word "ain't" instead of "isn't". Well, except you have no replacement term, so we're left with xeraflop and lots of hand wringing. Even so, in 60 short years we'll have xeraflop supercomputers.
I wonder what we'll do with them? Lordvolton (talk) 22:56, 19 June 2008 (UTC)Reply
LV ... I think you're having serious issues with logical thought here. Slang may work its way into the official definitions and dictionaries of a language over time, because that's how etymology and linguistics work. It is not, however, how science works, and certainly the internationally held definitions of mathematical terms are not governed by what some person makes up to stick in a single online article because they haven't the mental agility to, I dunno, put two existing prefixes together (one of which already defines a number so huge it's possible other physical laws such as planck time and the like will constrain our computers before we get there) or just give up on increasingly confusing verbal powers-of-1000 definitions altogether and switch over to more clear and compact Standard Form (e.g. our computer has a speed of 1.57 x 10^29 FLOPS... or maybe Star Trek quads etc). If you can't understand these concepts, then you don't have a great deal of business trying to redefine how the global scientific community deals with numbers and should maybe disappear back to the Time Cube from whence you came. Oh and maybe repeat the last year of troll school, as I don't think you've earned your diploma yet. 193.63.174.10 (talk) 13:32, 17 November 2010 (UTC)Reply
I honestly don't know if you're trolling or you really just don't understand how stupid this is. As we've been through at least twice now: there is only one actual reference to XeraFLOPS - all of the sites on which that word occurs are quoting directly from that single "article", so saying that there are x amount of references when they're all the same is simply rubbish, and more to the point it is irrelevant. This has nothing to do with slang or "the Queen's English", it's to do with SI prefixes, which are what's used in conjunction with FLOPS and just about any other computer-related measurement. There's no need for a replacement term because there's no need for a word for it at all. If computers (or anything else) ever reach such a scale that a 10^27 of some unit is a practical magnitude of measurement then I'm sure SI will come up with one. There's simply no need to make things up. -- ExNihilo (talk) 00:11, 20 June 2008 (UTC)Reply
It is common to mix up "there" with "their". Therefore, "their" will now have the definition of "there", and "there" will now have the definition of "their". Now this would be silly, wouldn't it? Just because it is common does not make it right. Anyways, we can always worry about what to call 10^27 when it actually happens and that will be in 60 years, IF Moore's law does not implode and suck the universe into a vortex before 2068. Rilak (talk) 06:26, 20 June 2008 (UTC)Reply

Incorrect $/GFLOP for IBM Roadrunner?


Under "Cost of Computing" it is stated that IBM Roadrunner costs $0.13 per GFLOPS. The cost is actually closer to $130 per GFLOPS, if I am not mistaken.

Cost to build Roadrunner: $133 million USD
Roadrunner performance: 1.026 petaFLOPS (1026000 GFLOPS)
$133,000,000 / 1026000 = $129.6296 per GFLOPS

Perhaps I am wrong, someone please double check the math. Remain nameless (talk) 14:19, 19 July 2008 (UTC)Reply

Your math is right. The Roadrunner entry should be either clarified or removed. Remain nameless (talk) 03:07, 29 July 2008 (UTC)Reply
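As a double check of the division above, here is a small Python sketch that expresses the same cost at several prefixes. The $133 million and 1.026 petaFLOPS inputs are the ones quoted in this thread; `cost_per_unit` is a name invented for the example.

```python
def cost_per_unit(cost_usd, flops, unit_flops):
    """Cost in dollars per unit of performance, where unit_flops is the size of the unit."""
    return cost_usd / (flops / unit_flops)


COST_USD = 133_000_000   # build cost quoted above
FLOPS = 1.026e15         # 1.026 petaFLOPS

for name, unit in (("MFLOPS", 1e6), ("GFLOPS", 1e9), ("TFLOPS", 1e12)):
    print(f"${cost_per_unit(COST_USD, FLOPS, unit):,.2f} per {name}")
```

The per-GFLOPS line agrees with the roughly $130 figure. The per-MFLOPS line comes out at about $0.13, so one possible explanation for the article's number is a prefix mix-up, though that is a guess and not something the article's source confirms.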

'Cost of Computing' section is confusing


This section of the article has me confused. I don't know whether the prices represent the cost of individual components, specific machines or anything else for that matter. Could someone please clear this up by explaining what the prices are linked to. I have also marked it as confusing on the article. Supersword (talk) 23:20, 16 November 2008 (UTC)Reply

I've reworked that section a bit; I think I consider it to be clear enough now, so I'm (tentatively) removing the tag. - Reinderientalk/contribs 21:39, 28 February 2009 (UTC)Reply

PC Speed


"As of 2008, the fastest PC processors (quad-core) perform over 70 GFLOPS" I have credibility issues with this. Given a speed of 3.2GHz with 4 cores and completing one operation per cycle this gives 12.8 GFLOPS. Even assuming perfect hyperthreading that doubles the effective number of cores this is still a factor of 3 short. The reference given in the quote goes to a website that shows a graph of CPU speed giving 70 GFlops for a top spec processor but is this believable? Mtpaley (talk) 23:03, 16 August 2009 (UTC)Reply

You have to calculate the peak performance of an Intel Core2/i7 like this: SSE2 => 2 doubles in each vector; 2 different ports and 2 different execution units, each with a throughput of one µop per cycle, for floating point addition and multiplication => one packed double addition and one packed double multiplication per clock cycle. Hyperthreading shares the execution units and other resources, so it does not result in a higher FLOPS count. So you get 4 DP FLOP per core and clock cycle. This results in FLOPS = frequency*(# of cores)*4 = 79.9 GFLOPS for a six core i7-980X. —Preceding unsigned comment added by 88.67.216.116 (talk) 19:51, 23 April 2010 (UTC)Reply
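The peak formula in the comment above (frequency times cores times FLOP per core per cycle) can be written as a short Python sketch. The 3.33 GHz clock for the i7-980X is an assumption on my part, chosen because it reproduces the 79.9 GFLOPS figure; the comment itself does not state the clock rate. The function name is made up for the example.

```python
def peak_gflops(clock_ghz, cores, flops_per_core_per_cycle):
    """Theoretical peak in GFLOPS: clock rate x core count x FLOP per core per cycle."""
    return clock_ghz * cores * flops_per_core_per_cycle


# Original estimate in this thread: 3.2 GHz, 4 cores, 1 operation per cycle
print(f"{peak_gflops(3.2, 4, 1):.1f} GFLOPS")

# SSE2 estimate: 2 doubles per vector x (1 add + 1 multiply) = 4 DP FLOP per cycle
# Assumed clock for the six-core i7-980X: 3.33 GHz
print(f"{peak_gflops(3.33, 6, 4):.1f} GFLOPS")
```

This is a theoretical ceiling only: it assumes every cycle issues one packed add and one packed multiply, which real code rarely sustains.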

So if "frequency*(# of cores)*4 = 79.9 GFLOPS for a six core i7-980X" is true, how come the article claims it's 107.55 GFLOP/s? Also the 980X==980XE

Given such a complicated way to calculate the theoretical speed by assuming multiple parallel operations... it does not sound like a real number at all. I mean, how would you program such a thing in real life? How many things/equations can actually make use of such functionality? Also, it totally ignores the fact that the numbers must be packed. Well, guess what, converting numbers to/from packed format takes quite a bit of time as well... OldCodger2 (talk) 04:28, 17 April 2013 (UTC)Reply

On the other hand, if they had been talking about the video cards in PCs they could have a valid point. I recently read about a new video card that is said to do 200 GigaFLOPS. People have been reprogramming video cards to act as computation units for many years, but it requires very specialized programming techniques; it is far more doable, though, than the above convoluted calculation. OldCodger2 (talk) 04:37, 17 April 2013 (UTC)Reply

Aw, not this again...


That blasted Xera- prefix showed up again. I nuked it. <soapbox> Ever since the Tera- prefix the standards people at SI have based the new prefix on Greek or Latin (mostly Greek) terms relating to the number of 000s. The next one not yet defined is 9 sets of 000s. So if Greek is used again for nine, εννέα or ennéa, we might have enyaFLOPS, which is cool if you like her music. Or if Latin, it's either nove or nona. I say go with Latin, combine the two forms, and we get NovaFLOPS, which sounds pimp. Either way, it's moot, because SI has not established a 10^27 prefix yet. At all. My personal best guess is that someone's signature on the Starcraft forums (I'm not making this part up) who even acknowledged that it's a bogus word, got propagated into some crease of the public consciousness and found its way here. It's still wrong. I've seen people talking on here mention that a Google of "xeraflops" gets lots of hits, but a search of "xera" with "prefix" gets only a few thousand and most on the first page do not have any relation to numbers or SI. </soapbox> Still doesn't exist. Thank you. -:-  AlpinWolf   -:- 06:11, 8 January 2010 (UTC)Reply

List of current processors


There should be a list of current processors (Intel Core i7, Athlon 64, Cell, etc.) and their computing power in this article, shouldn't it? --bender235 (talk) 15:53, 4 February 2010 (UTC)Reply

Presactly. It's somewhat glaring by its absence - and the linked sites with supposed benchmark results on are baffling, lack easy to read summaries and are practically useless unless you're really wanting to see which small variant of a processor is marginally better at one element of an obscure test versus another slightly different one. They're complicated enough that I'm not even 100% sure what they show, and I'm certainly not motivated enough to pull all the data off them, learn what it means and reprocess it. A simple table, maybe even line/bar chart (logarithmically scaled?) showing fastest desktop CPU in terms of FLOPS measured by a widely regarded benchmark that spits out that figure (with all floating point accelerating features available and turned on) should do. Maybe have additional lines/bars for AMD vs Intel vs..., or premium vs budget lines... whatever. Extra data of interest could be FLOPS/Hz, and FLOPS/retail cost (everyday processors appear to be the missing link in the existing cost-per-flop table, it goes straight from huge mainframes into desktop-machine-based clusters and then to GPUs (without the cost of the necessarily attached other hardware!). Even though they obviously wouldn't be the fastest, cheapest or simplest way of getting your supercomputing oomph, it would demonstrate what this power means in a way that the lay user could understand - relating it to the device that they're using to access the article. Information on its own without understanding is effectively meaningless (and not having the information at all is worse). 193.63.174.10 (talk) 13:42, 17 November 2010 (UTC)Reply

Article describes what a floating point number is, but does it say what an op is?


For example, does a FLOP involve one or two floating point numbers? Assuming two, is it the addition of two FP numbers? Is it a multiplication? Division? Subtraction? If this was answered, I apologize, that I didn't see it. —Preceding unsigned comment added by BobEnyart (talkcontribs) 16:38, 1 June 2010 (UTC)Reply

I've tried to answer it higher in the page, but I didn't know where to add the information on the main page, plus it looked like original research to meYitscar (talk) 14:02, 1 July 2010 (UTC)Reply
It's probably not clarified because it strongly depends on the context. In digital signal processing a flop often refers to a multiplication plus an addition because it's a common operation in filtering. Note that special architectures are often used here, with dedicated chips for the CPU-consuming operations (like hard-wired multiplications), introducing different FLOPS "levels" on the same board. In classrooms additions and subtractions are often ignored and only multiplications and/or divisions are counted. Is there an official definition of "FLoating point OPeration"?(MaenINoldo (talk) 08:33, 12 October 2010 (UTC))Reply
As above, and as well as I know, it is one addition, subtraction, multiplication, or division. For comparable benchmarks, the ratios of those operations will be about the same, such that the numbers are comparable. Gah4 (talk) 02:02, 18 July 2018 (UTC)Reply

Double precision (dp) FLOPS of Radeon HD 5970


The Radeon HD 5970 has got 2*320 dp-capable ALUs running at 725 MHz. Each of them can calculate one dp FMA (fused multiply add) per cycle. So the computing power is 2*320*725*2 MFLOPS = 928 GFLOPS and not 1.09 TFLOPS (citation of reference no. 38). (87.180.223.105 (talk) 19:10, 14 October 2010 (UTC))Reply
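The arithmetic in the comment above can be reproduced in Python. This sketch uses only the numbers given in the comment (2 GPUs, 320 dp-capable ALUs each, 725 MHz) and the usual convention that one fused multiply-add counts as 2 floating point operations; `peak_mflops` is a name made up for the example.

```python
def peak_mflops(gpus, alus_per_gpu, clock_mhz, flops_per_alu_per_cycle):
    """Theoretical peak in MFLOPS from unit counts and clock rate."""
    return gpus * alus_per_gpu * clock_mhz * flops_per_alu_per_cycle


fma_as_two_ops = peak_mflops(2, 320, 725, 2)   # one FMA counted as 2 FLOP
fma_as_one_op = peak_mflops(2, 320, 725, 1)    # one FMA counted as 1 operation

print(fma_as_two_ops / 1000, "GFLOPS")
print(fma_as_one_op / 1000, "GFLOPS")
```

With FMA counted as two operations the result is the 928 GFLOPS stated above; neither convention yields 1.09 TFLOPS from these inputs.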

New record


http://news.yahoo.com/s/afp/20101028/tc_afp/chinatechnologyitworld According to this news story a new computer in China is taking that #1 spot. Good idea to keep an eye out for updates worth mentioning in the article.Donhoraldo (talk) 13:10, 28 October 2010 (UTC)Reply

Auto archive


Any objections to me setting up auto-archive on this page? —me_and 14:34, 17 November 2010 (UTC)Reply

Bitcoin "FLOPS" computation on bitcoinwatch


bitcoinwatch.com/ calculates the PFLOPS of the bitcoin network as follows: take the number of hashes per second (terahashes/s of SHA256) and multiply by 12700 to get a "Single-precision FLOPS estimate". One hash calculation is considered as 6350 32-bit integer operations, and each integer operation is considered equal to two single-precision flops. Source of constants is: http://forum.bitcoin.org/index.php?topic=4689.0 (with reference to bitcoinwatch's admin). Actual bitcoin mining contains no (or almost no) floating-point calculations.
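The conversion described above amounts to one multiplication. The Python sketch below uses only the two constants quoted in the comment (6350 integer operations per hash, 2 flops per integer operation); the function and constant names are made up, and the result is an equivalence estimate, not a measurement of floating-point work.

```python
INT_OPS_PER_HASH = 6350   # 32-bit integer operations per SHA256 hash (quoted above)
FLOPS_PER_INT_OP = 2      # assumed exchange rate: 1 integer op = 2 single-precision flops


def estimated_pflops(terahashes_per_second):
    """Convert a hash rate in TH/s into the 'FLOPS estimate' in PFLOPS."""
    flops_per_hash = INT_OPS_PER_HASH * FLOPS_PER_INT_OP   # 12700
    teraflops = terahashes_per_second * flops_per_hash
    return teraflops / 1000.0


print(INT_OPS_PER_HASH * FLOPS_PER_INT_OP)   # the 12700 multiplier
print(estimated_pflops(1.0), "PFLOPS per TH/s")
```

Each terahash per second therefore shows up as 12.7 "PFLOPS" in the estimate, regardless of whether the hardware can perform floating-point operations at all.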

Thanks for constants! Some refs and quotes for such calculations and about their incorrectness: `a5b (talk) 03:38, 17 November 2013 (UTC)Reply
  1. [11] // Gizmodo: "Because Bitcoin miners actually do a simpler kind of math (integer operations), you have to do a little (messy) conversion to get to FLOPS. And because the new ASIC miners—machines that are built from scratch to do nothing but mine Bitcoins—can't even do other kinds of operations, they're left out of the total entirely."
  2. [12] // SlashGear: "Bitcoin mining technically doesn’t operate using FLOPS, but rather integer calculations, so the figures are converted to FLOPS for a conversion that most people can understand more. Since the conversion process is a bit weird, it’s led to some experts calling foul on the mining figures."
  3. [13] // ExtremeTech: "As Bitcoin mining doesn’t rely on floating-point operations, these estimates are based on opportunity costs. Now that we have hardware with application-specific integrated circuits (ASIC) designed from the ground up to do nothing but mine Bitcoins, these estimates become even more fuzzy."
  4. [14] // CoinDesk: "Two, the estimates used to convert hashes to flops (resulting in about 12,700 flops per hash) date to 2011, before ASIC devices became the norm for bitcoin mining. ASICs don’t handle flops at all, so the current comparison is very rough."
  5. [15] // VR-Zone: "A conversion rate of 1 hash = 12.7K FLOPS is used to determine the general speed of the network contribution. The estimate was created in 2011, before the creation of ASIC hardware solely designed for bitcoin mining. ASIC doesn’t use floating point operations at all,... Thus, the estimate doesn’t have any real-world meaning for such hardware."

ISRO >100 EFlops by 2017 is highly unlikely


While the statement is indeed sourced, it seems strange that India will accomplish 100x what supercomputer leaders plan in the same timeframe. If this is indeed the case, there should be more sources available than the single article that has been provided. Moreover, http://www.hpcwire.com/hpcwire/2012-01-04/india_aims_to_double_r_d_spending_for_science.html specifically mentions that there are dubious articles surrounding the matter. This source can be found on the Wikipedia page Supercomputing in India. —(shaun) Preceding unsigned comment added by 71.53.29.50 (talk) 00:16, 3 May 2012 (UTC)Reply

The following http://asiancorrespondent.com/73169/a-1bn-supercomputer-no-that’s-not-what-the-pm-said/ article is apparently critical of the claim as well. -shaun

Lots of people have planned lots of things. And per WP:Crystal I zapped the futuristic part. Plans can fail. History2007 (talk) 17:50, 28 May 2012 (UTC)Reply

Sorry -- what I mean to say is the statement is just not correct: I think India does NOT plan to build such a supercomputer. I think the source is just plain wrong. Somebody flubbed it. There isn't a single other corroborating article. -shaun — Preceding unsigned comment added by 71.53.27.76 (talk) 08:16, 31 May 2012 (UTC)Reply

edit

I removed the following section:

It is also important to consider fixed and floating-point formats in the context of precision – the size of the gaps between numbers. Every time a processor generates a new number via a mathematical calculation, that number must be rounded to the nearest value that can be stored via the format in use. Rounding and/or truncating numbers during processing naturally yields quantization error or ‘noise’ - the deviation between actual values and quantized values. Since the gaps between adjacent numbers can be much larger with fixed-point processing when compared to floating-point processing, round-off error can be much more pronounced. As such, floating-point processing yields much greater precision than fixed-point processing, distinguishing floating-point processors as the ideal CPU when computing accuracy is a critical requirement.

The above paragraph is totally wrong, even though it is a direct quote from Summary: Fixed-point (integer) vs floating-point. The claim holds in that article but is wrong when quoted here because they are comparing their 16-bit fixed-point product to their 32-bit floating-point product. Their marketing is actually being a little deceptive in this regard. Given the same number of bits, floating point has greater dynamic range but less precision than fixed point.

I am also concerned about copyright issues: large chunks of that article, "Fixed-point (integer) vs floating-point", have been copied and pasted directly into this wiki article. OldCodger2 (talk) 22:07, 22 June 2012 (UTC)Reply
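
A rough sketch of the same-bit-width point, written in Python. It assumes IEEE 754 binary32 for the floating-point side and a signed Q16.16 layout for the fixed-point side; both are chosen for illustration and neither is named in the quoted article. The helper name float32_spacing is made up for the example.

```python
import math


def float32_spacing(x: float) -> float:
    """Gap between adjacent IEEE 754 binary32 values near x (normal range only)."""
    _, exponent = math.frexp(abs(x))  # abs(x) = m * 2**exponent, with 0.5 <= m < 1
    return 2.0 ** (exponent - 24)     # binary32 carries 24 significant bits


# Assumed fixed-point layout: signed Q16.16, so the gap is constant everywhere.
FIXED_SPACING = 2.0 ** -16

for value in (0.001, 1.0, 1000.0, 30000.0):
    print(
        f"{value:>8}: float32 gap {float32_spacing(value):.3e}, "
        f"Q16.16 gap {FIXED_SPACING:.3e}"
    )
```

Under these assumptions the floating-point gap grows with magnitude (wider than the fixed-point gap near the top of the Q16.16 range, narrower near 1.0 and below), while the fixed-point gap stays constant but the representable range is far smaller.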

"Computing" [non-]formula

Really? GFLOPS = (something) × FLOPS? Uh, yes, GFLOPS = (1/1000) × FLOPS, always, by definition.

First, the author of the original reference at Dell was sloppy in using "FLOPs" as a plural, which is so easily confused with "FLOPS", a rate, differentiated only by capitalization; and then second, a Wikipedia author copied the original formula incompletely, and didn't notice the glaring inconsistency which that left.  :(

It would be easy enough to correct the units and missing terms, but that would still leave a formula which (1) is overly-simplistic, as it doesn't consider threads, pipelines, functional units, or any other hardware details beyond mass-market home PC advertising; (2) doesn't consider asymmetric or hybrid mixed-processor machines, or even MPP machines with non-identical nodes; and (3) at best only calculates an absolute upper limit of FLOPS or GFLOPS, which can only ever be achieved with a very limited optimum instruction sequence, and then only for an ultra-brief time while the FP inputs can be taken from CPU registers. In short, we'll get a significantly more complicated formula to calculate a result that borders on useless.

This brief section was only added to the article within the past week or so. Is there general consensus that it needs to be a part of this article? And does it need to be the lead section of the article?

98.218.86.55 (talk) 23:47, 8 October 2012 (UTC)Reply


I agree, the equation is nonsense without a proper context. The equation describes how to calculate the theoretical performance of one specific hardware architecture (without regard to memory bandwidth or software overhead). Also, the term Socket by itself is ambiguous. My initial thought was that it was describing network sockets in a compute cluster, but upon reading the referenced article I see that in this case it refers to the number of IC chip sockets, which is a huge assumption about the hardware. GFLOPS doesn't need an equation: it is simply one billion FLOPS. It is a unit, and to talk about dividing by one thousand only adds confusion. If you want to compare computers, you just compare their benchmark performance, not some hardware-specific conceptual value of what that performance might be. The reference itself was somewhat interesting, so perhaps it can be retained. As to the equation itself, it either needs to be accompanied by a lot of explanation and moved further down the page, or else I too vote to delete it. Any dissenters? OldCodger2 (talk) 01:02, 13 December 2012 (UTC)Reply

Fixed and floating point section

The explanation about fixed point arithmetic in this article is not very good in my opinion, and it is also partially wrong. The actual article on the subject is much better. The same goes for floating point arithmetic. Since this is explained better in other places on wikipedia, I think this section should be deleted. Amaurea (talk) 16:23, 20 March 2013 (UTC)Reply

As a Flop is a measure to compare different computer systems, someone really should explain how to measure in practice. I can't understand that here. How many bits are to be used? Is it about adding two numbers?

Calculating Flops

The equation for calculating theoretical FLOPS is for the number of logical cores, not physical cores (unless 1 logical per physical). The numbers Intel provides support this statement. I will leave it to the editors to decide how to incorporate this information into the article. http://www.intel.com/support/processors/sb/CS-032814.htm 76.198.38.250 (talk) 20:45, 23 June 2013 (UTC)Reply

Reference Number 2

Reference number 2 is no longer valid. 76.198.38.250 (talk) 06:29, 30 June 2013 (UTC)Reply

Intro section

In computing, FLOPS (also flops, for FLoating-point Operations Per Second, or flop/s (< this last choice can't be correct, as the / sign means per, so it basically says FLoating-point Operations Per per Second. Makes no sense, on the same level as ATM machine or PIN number.) is a measure of computer performance.


My comment is in (). — Preceding unsigned comment added by 153.25.178.61 (talk) 02:22, 2 July 2013 (UTC)Reply

Incorrect data in cost per flop

The PS3 is listed at the same 1.84TFLOP, yet costs $250. Plenty of machines costing less than $750 also provide more than 4TFLOP (using a cheap processor + AMD 7970 GHz Ed.), so the PS4 does not belong there at all; it is neither an average nor the best performance for price. — Preceding unsigned comment added by 60.43.54.2 (talk) 15:02, 20 October 2013 (UTC)

I think you mean PS4, which costs $400.Reply

Please update the chart for cost per flop. AMD just released the R9-290 GPU, which has 4.8TFlops at a cost of $400.

In addition, the price listed / GFLOP for the quad 7970 system is the double precision value, while the PS4 is single precision taken from a marketing release. They're not comparable. — Preceding unsigned comment added by 74.94.87.220 (talk) 19:10, 13 November 2013 (UTC)Reply

Additionally, the 2007 data is grossly inaccurate. The cost per gigaflop at that time was around $2, not $50, as graphics cards delivering several hundred gigaflops were available at the time for only 500-600 dollars. The entire chart needs to be recalculated using lowest cost per gigaflop data. — Preceding unsigned comment added by 71.210.192.124 (talk) 05:30, 28 January 2014 (UTC)Reply

Smartphone FLOPS... now surpass a supercomputer

I was wondering what a typical computer, device or tablet out there in the wild is capable of performing now. It would appear that a high-end smartphone should outperform a Cray X-MP supercomputer (from the 1980s) within the next few months, if not already... Using Linpack, a good speed was 200 MFLOPS on Android when I wrote this. Using a native app and multithreading, some ARM CPUs have surpassed the 400 to 800 MFLOPS of a Cray X-MP. Many smartphones outperform the Cray 1 already at floating point. It would be interesting to compare their I/O speed and storage, which for the Cray X-MP were based on solid state drives, 1.2 GB hard drive systems, and streaming mag tapes. I wonder which is faster: Bluetooth or mag tape? Near-field communication or modems? I like to saw logs! (talk) 09:01, 28 May 2014 (UTC)Reply

Hardware costs

The section Hardware costs includes a table of milestones, and a heading paragraph which doesn't match the table. The heading claims that “the least expensive computing platform able to achieve one GFLOPS is listed”, which is manifestly untrue, as the actual listings show milestones in price/performance instead.
Example: The 2011 entry, at $30,000 overall cost, was added because it was an enormous improvement in price/performance over the 2007 entry, yet that 2007 entry, at $1256, was still the “least expensive”.
If there are no objections, I'd like to rewrite that heading paragraph to more accurately describe the content it introduces.
 Unician   21:45, 28 May 2014 (UTC)Reply

The PlayStation 4 was not available in June 2013. The earliest release date was in November 2013 in North America and Europe, and then three months later in Japan. Should the cost per GFLOPS really be cited in June 2013 if no one could purchase it for another 5 months? — Preceding unsigned comment added by AlgeBrad (talkcontribs) 23:28, 29 September 2014 (UTC)Reply

Hardware costs 2

I would like to add that the numbers seem inaccurate in general, which might result from simply integrating information randomly found on the internet. For instance, the last row (October 2007) claims that "Three AMD RX Vega 64 graphics cards provide just over 75 TFLOPS half precision [...] at ~$2,050 for the complete system". The given reference (https://pcpartpicker.com/user/mattebaughman/saved/8DQZ8d) does not give any information regarding the full price of the system. Further, the cost of one AMD RX Vega 64 graphics card alone is far above 1000 USD (Amazon (link visited 13.12.2018)). You might get one for ~630 EUR (roughly 710 USD) at Newegg.com (visited 13.12.2018); however, 3 x 710 = 2130 USD for the graphics cards alone. 134.28.120.119 (talk) 09:50, 13 December 2018 (UTC)Reply

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on FLOPS. Please take a moment to review my edit. If necessary, add {{cbignore}} after the link to keep me from modifying it. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true to let others know.

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 5 June 2024).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers. —cyberbot IITalk to my owner:Online 03:29, 29 August 2015 (UTC)Reply

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on FLOPS. Please take a moment to review my edit. You may add {{cbignore}} after the link to keep me from modifying it, if I keep adding bad data, but formatting bugs should be reported instead. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether, but should be used as a last resort. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

Cheers.—cyberbot IITalk to my owner:Online 06:16, 28 March 2016 (UTC)Reply

History

This article seems to be more about the use of the term than the term itself. I see no history of the term. I am interested in who invented the term. I think IBM invented the term (the idea or whatever) to provide a more useful comparison of their processors compared to Amdahl and other competitors. Sam Tomato (talk) 01:41, 5 May 2016 (UTC)Reply

External links modified

Hello fellow Wikipedians,

I have just modified 4 external links on FLOPS. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.—InternetArchiveBot (Report bug) 21:15, 28 December 2016 (UTC)Reply

GFLOPS is (incorrectly?) used for both "giga" and "geo"

In the first table, the same abbreviation (GFLOPS) is given to two terms. Since "geo" is not an official SI prefix, it's highly doubtful – and indeed not possible if ambiguity is to be avoided – that "G" would be used to abbreviate "geo". I'm going to remove the last two abbreviations from the table (for both "bronto" and "geo"), as both are unofficial prefixes and as such don't have official abbreviations. cherkash (talk) 06:39, 7 January 2017 (UTC)Reply

Hardware costs - misleading comparisons

The list of cost per GFLOP starts with supercomputers which were the absolute fastest of their time, then switches to clusters of commodity hardware, and then to console and PC hardware. All of these are different types of computers, and it is misleading to compare FLOPs between them. A supercomputer for example is more expensive per FLOP, but offers the biggest absolute computing power at a given time in one system, and meets much higher stability and availability requirements compared to a PC.

Further, even ignoring this over-simplification, the cost per GFLOP numbers from the last years are misleading. One would have a hard time even installing an operating system, and a graphics driver on the hardware that is given as source for the $0.08/GFLOP (http://www.freezepage.com/1420850340WGSMHXRBLE) because of the small SSD and low amount of RAM.

I suggest either reworking the section to give a more realistic view of the cost of computing power (i.e. comparing the same type of systems over time), or just removing it.

--Aquaschaf (talk) 22:45, 24 May 2017 (UTC)Reply

In the 60s, there were only mainframes practically, micro or even mini computers not being invented yet. The table tries to show the cheapest way to achieve GFLOPS speed. It does have some room for improvement, so feel free to add to it. By the way, installing Linux on a 32 GB drive is no problem. Additionally, you could easily exchange the SSD for a larger HDD (and save!) without loss of FLOPS. --Zac67 (talk) 07:45, 25 May 2017 (UTC)Reply

External links modified

Hello fellow Wikipedians,

I have just modified 4 external links on FLOPS. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:

When you have finished reviewing my changes, you may follow the instructions on the template below to fix any issues with the URLs.

Cheers.—InternetArchiveBot (Report bug) 04:05, 27 September 2017 (UTC)Reply

What I meant...

...was that even a simple programmable calculator can perform thousands of consecutive floating point operations without manual intervention. It's only when a person punches at least one button between each operation that 1/10 of a second is negligible. - Tournesol (talk) 15:59, 16 November 2017 (UTC)Reply

Even complex or programmed calculations are usually noticeably slow on a pocket calculator. It's not about manual intervention, the article is about float operations per second. Do you have a source for a simple calculator performing more than "relatively few" FLOPS? Unless it's many KFLOPS that's relatively few for me. For today's standard and in the context of the paragraph, relatively few is everything not measured in GFLOPS. --Zac67 (talk) 17:52, 16 November 2017 (UTC)Reply

You appear to agree with what I initially wrote in the article, so I don't understand why you reverted my edit. I'm not spending more time on this, though. Have it your way. - Tournesol (talk) 15:49, 23 November 2017 (UTC)Reply

Aaaall right, now I understand. I meant that a calculator normally carries out rather few consecutive floating point operations (not per second, but the actual number of operations carried out) without pausing for manual input, so the number of operations per second isn't really perceived by the user. Any response within 200 ms will be perceived as instantaneous, or at least faster than the display, so whether it takes 1 ms or 100 ms to do the calculation doesn't matter. In a case where a calculator is programmed to do a hundred such operations, however, the difference becomes 100 ms or 10,000 ms.

What I realize now is that the final "S" isn't a plural S (operations) but a per second S, so my edit was indeed erroneous. - Tournesol (talk) 15:59, 23 November 2017 (UTC)Reply

Great to hear you've sorted it out. ;-) --Zac67 (talk) 18:23, 23 November 2017 (UTC)Reply

base two?

One sentence indicates that FLOPS are base two, the next includes IBM base 16 floating point. This then ignores that IEEE 754-2008 includes decimal (base 10) floating point formats. Gah4 (talk) 19:46, 27 February 2018 (UTC)Reply

Cost of Computing section is woefully inaccurate

The numbers, particularly past 2000, are way off. The 2007 $57 figure is off by an order of magnitude - calculating the price per GFLOPS performance of an 8800 Ultra from that year yields a $2.1 price tag per GFLOPS and even that's very unlikely to be the best performance per dollar considering it is an enthusiast tier card. Another example in 2011: a GTX 480 is capable of $0.37 per GFLOPS compared to the $1.98 listed. I am unsure of all the figures, but I suspect these are not the only inaccuracies considering that the examples usually use high end consumer graphics cards that are going to be price-inefficient. The whole chart needs a rework past 2000 at least, no idea about other eras. --Krispion (talk) 18:56, 23 September 2018 (UTC)Reply

"Cycle"

The word "cycle" is used in the "FLOPs per cycle" section without explanation. If "cycle" is equivalent to one tick of the processor clock (i.e., the inverse of the clock speed in GHz), this should be stated explicitly. If not, further explanation should be given. Also, I think the "s" in FLOPs should be capitalized. Jess_Riedel (talk) 15:04, 26 September 2018 (UTC)Reply

First Commercial Single-Chip Processor capable of 1 Gigaflop?

I've been looking for an answer to, what seems like, a simple question, and coming up empty handed. What was the first commercial, off the shelf, single chip, stick it in your favorite motherboard, processor that was capable of a sustained 1 gigaflop speed? That wasn't a one-off special chip, that didn't require special/custom hardware, that wasn't a special research project, or required an N-way crossbar connecting X processors with Y memory arrays, in a Beowulf cluster, in order to win the Gordon GoFaster prize? Just whatever advancement/next-step in Pentium/AMD/PowerPC/Whatever came along that allowed the average college-student/gamer to put 1 gigaflop on their desk? I suspect this point was reached right around the year 2000, but I haven't found anything more exact than that. Gcronau (talk) 16:12, 14 January 2019 (UTC)Reply

New FLOPS formula

I found an article that seems to give a new formula for calculating FLOPS: https://cs.iupui.edu/~fgsong/cs590HPC/how2decide_peak.pdf. I would like to know what you think about it. I'm working on applying it to every CPU that can be evaluated in FLOPS. For example, here is the number of FLOPS for the i7-2600:

According to the Wiki article https://en.wiki.x.io/wiki/Sandy_Bridge, the Whetstone benchmark measures 83 GFLOPS for this CPU.

The CPU has 4 cores and 8 threads (Hyper-Threading Technology available). The CPU runs at 3.40 GHz. It has the SSE and AVX instruction sets.

So, for SSE scalar operations, we have: 4 (cores) * 2 (number of threads per core) * 1 (instruction set specific multiplier) * 3.40 = 27.20 GFLOPS

For SSE vector operations (128 bits wide) in double precision, we have: 4 * 2 * 2 (instruction set specific multiplier) * 3.40 = 54.40 GFLOPS

For SSE vector operations (128 bits wide) in single precision, we have: 4 * 2 * 4 * 3.40 = 108.80 GFLOPS

For AVX vector operations (256 bits wide) in single precision, we have: 4 * 2 * 8 * 3.40 = 217.60 GFLOPS

Jithel (talk) 13:49, 15 March 2019 (UTC)Reply

It is partially correct. However, the FLOPs per core are for the real cores only, not for HT. The two HT siblings share just one FP unit. So your Sandy Bridge at 3.4 GHz with 4 cores with AVX theoretically does 3.4 x 4 x 8 = 108.8 GFLOPS. Jfikar (talk) 12:16, 6 November 2023 (UTC)Reply
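
The corrected calculation can be written out as a short Python sketch. It assumes, following the reply above, that only physical cores count, and it takes the per-cycle multipliers (1, 2, 4 and 8) as given in this thread rather than checking them against Intel documentation. The function name peak_gflops is made up for the example.

```python
def peak_gflops(physical_cores: int, flops_per_cycle: int, clock_ghz: float) -> float:
    """Theoretical peak = physical cores x FLOPs per cycle per core x clock (GHz)."""
    return physical_cores * flops_per_cycle * clock_ghz


# i7-2600 figures from this thread: 4 physical cores at 3.40 GHz.
CORES = 4
CLOCK_GHZ = 3.40

# Instruction-set multipliers as quoted in the thread (assumed, not verified).
MULTIPLIERS = {
    "SSE scalar": 1,
    "SSE 128-bit, double precision": 2,
    "SSE 128-bit, single precision": 4,
    "AVX 256-bit, single precision": 8,
}

for label, per_cycle in MULTIPLIERS.items():
    print(f"{label}: {peak_gflops(CORES, per_cycle, CLOCK_GHZ):.1f} GFLOPS")
```

This gives an upper bound only; a measured result such as the 83 GFLOPS Whetstone figure mentioned above will normally come in lower.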

FLOPS estimation for historical/well known processors

Besides record values, wouldn't it be interesting to mention estimated FLOPS values of some well known or "historic" processors: Zilog Z80, Intel 4004, Intel 8080, Motorola 68000, 80x86 series, PDP-11,... --Olivier Debre (talk) 18:08, 25 February 2021 (UTC)Reply

@Olivier Debre: If you find a reliable website that talks about it, why not, that would be interesting. Bensuperpc (talk) 18:44, 25 February 2021 (UTC)Reply
Most of the processors in the list do not have floating point implementations in hardware. Intel offered the x87 coprocessors for this function. 198.85.71.19 (talk) 14:22, 5 June 2023 (UTC)Reply

FLOPS per cycle per core

Floating Point Operations Per Second per cycle. That's like saying meters per second per hour. It doesn't make any sense. There has to be a better way of phrasing that 24.192.184.78 (talk) 20:33, 22 June 2022 (UTC)Reply

Bothers me as well. Can we use FLOP (for FLoating-point OPeration) here? TOP500 uses flop/s. The formula above uses FLOPs, apparently as plural, but I don't think that's really working either. Basically, it boils down to floating-point IPC. I guess we just shouldn't try to abbreviate. --Zac67 (talk) 05:25, 23 June 2022 (UTC)Reply
I think the title was just wrong and the table is actually floating-point operations per cycle (per core). The linked reference shows eg "64 FLOPs/cycle", and further up that page, it says "FLOPS = FLOPs/cycle x cycles/second". I've gone ahead and changed the title to "Floating-point operations per clock cycle for various processors". We should probably make this more concise, but at least it's not wrong now. 158.106.213.34 (talk) 01:33, 15 July 2022 (UTC)Reply

Origin of term?

The article states without citation that Frank McMahon at LLNL invented the term. (added here) McMahon wrote a book on benchmarking in the 80s mentioned here, but I can't access it.

While the derivation from "FLoating point OPerationS" is obvious, and I'm sure the term was used informally before it ever appeared in print, I am wondering if this couldn't be figured out in more detail.

I ask because I happened to be reading "Nineteen Dubious Ways to Compute the Exponential of a Matrix" by Moler and van Loan, and it defines the term without citing any previous definitions. That's a 1978 paper that I don't think originated from LLNL. 2603:6081:2340:23A:E47A:E4C7:965F:C2AF (talk) 15:59, 4 December 2022 (UTC)Reply

I looked on Google Books, and while there is a lot of talk about McMahon popularizing MFLOPS in the 1980s, and a 2020 obituary attributing the term to him (probably due to Wikipedia), the OED first attests it way back in 1976, and I found the actual first attestation in 1974 by David Kuck, which is written as if nobody had used that term before. That's why I removed McMahon and replaced him with Kuck. 5.178.188.143 (talk) 12:45, 15 January 2024 (UTC)Reply

Cost per gigaflops is very erratic, requires changes.

It ought to be supercomputers only or personal computers only or gaming consoles only. And FP32 only or FP64 only. For example, for 4x 7970 FP64 is counted and for the PS4 FP32 is counted. Why? Makes no sense.

GameCube in 2001 had 9.4 gigaflops for $199 ($21.17 per gigaflops) in 2003 for only $99 ($10.53 per gigaflops). In 2013 PlayStation 4 was 1843 gigaflops for $399 (meaning $0.216 per gigaflops, about 100x cheaper than GameCube 12 years earlier). In 2020 Xbox Series X was 12150 gigaflops for $499 (so $0.041 per gigaflops, 5.27x cheaper than PS4 7 years earlier). This is how it really went so far in the 21st century. Xbox Series X in 2020 had 904x more gigaflops per $ than GameCube in 2001 when adjusted for inflation. That's the real, actual progress. The article suggests 25,525x between 2000 and 2020, I disagree with that. Why not use PlayStation 2 for example? Vizorstamp2 (talk) 16:50, 23 December 2022 (UTC)Reply
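
A quick way to recheck the dollars-per-gigaflops arithmetic above is sketched below in Python. The figures are the ones quoted in this comment (not independently verified, and not inflation-adjusted, so the 904x figure is not reproduced here); the function name dollars_per_gflops is made up for the example.

```python
# (system, peak GFLOPS, price in USD), as quoted in the comment above.
SYSTEMS = [
    ("GameCube (2001)", 9.4, 199.0),
    ("GameCube (2003)", 9.4, 99.0),
    ("PlayStation 4 (2013)", 1843.0, 399.0),
    ("Xbox Series X (2020)", 12150.0, 499.0),
]


def dollars_per_gflops(gflops: float, price_usd: float) -> float:
    """Nominal (not inflation-adjusted) price divided by peak GFLOPS."""
    return price_usd / gflops


for name, gflops, price in SYSTEMS:
    print(f"{name}: ${dollars_per_gflops(gflops, price):.3f} per GFLOPS")

# Ratio between the two most recent consoles in the list.
ps4 = dollars_per_gflops(1843.0, 399.0)
xsx = dollars_per_gflops(12150.0, 499.0)
print(f"PS4 to Xbox Series X improvement: {ps4 / xsx:.2f}x")
```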

"It ought to be supercomputers only or personal computers only or gaming consoles only. "
There is little difference in basic architecture between these groups for the past ~15 years. The modern difference is programmable versus fixed function accelerators. GPUs further blur this definition with OpenCL/CUDA, general purpose vector processors, and fixed function ray-trace and AI units. FPGA's are a class of their own.
"And FP32 only or FP64 only"
Comparing numbers for half(16), single(32) and double(64) precision values makes no sense, apples to oranges.
Each of the FLOPS presented needs to have its precision, format and method, specified and calculated separately. 198.85.71.19 (talk) 15:03, 5 June 2023 (UTC)Reply

Update "Cost of computing" metric to $/TFLOPS?

With the lowest price per GFLOP being one cent now, I think it may be appropriate to update the chart's metric to $/TFLOPS for more meaningful comparisons going forward that won't require fractions of cents. The fact that it hasn't been done yet is already kinda weird, since the table ends in several entries that each only have one significant figure. Sizniche (talk) 22:00, 27 May 2023 (UTC)Reply
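
To show why whole-cent rounding stops being useful, here is a small Python sketch. The Xbox Series X figures are the ones quoted earlier on this page; the second system is hypothetical and exists only to show two different machines collapsing to the same rounded value. The function names are illustrative.

```python
def cents_per_gflops(price_usd: float, gflops: float) -> int:
    """Price per GFLOPS, rounded to the nearest whole cent."""
    return round(100 * price_usd / gflops)


def dollars_per_tflops(price_usd: float, gflops: float) -> float:
    """Price per TFLOPS (1 TFLOPS = 1000 GFLOPS)."""
    return price_usd / (gflops / 1000)


# (price in USD, peak GFLOPS); the second entry is a made-up system.
systems = {
    "Xbox Series X (figures quoted on this page)": (499.0, 12150.0),
    "Hypothetical faster system": (499.0, 14000.0),
}

for name, (price, gflops) in systems.items():
    print(
        f"{name}: {cents_per_gflops(price, gflops)} cents/GFLOPS, "
        f"${dollars_per_tflops(price, gflops):.2f}/TFLOPS"
    )
```

Both systems round to the same whole-cent figure per GFLOPS, while the per-TFLOPS figures remain distinguishable.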