Talk:List of file signatures

Actually, d64 and d81 have no signature

The bytes given here are what CBM DOS writes there when formatting a disk. They are part of the directory header and can be changed to anything, which is frequently done. No software should rely on any particular bytes being present at those offsets. The common way to detect these images is by their size and file extension. For details and variants, look here: http://ist.uwaterloo.ca/~schepers/formats/D64.TXT LogicDeLuxe (talk) 20:47, 11 July 2022 (UTC)
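
To illustrate the size-plus-extension approach described above, here is a minimal sketch. The helper name `guessCbmImageType` is hypothetical, and the size table is an assumption based on the commonly documented variants (35 or 40 tracks for D64, each with or without appended error bytes); check it against the linked D64.TXT before relying on it.

```javascript
// Image sizes in bytes. Assumed values: sectors * 256, optionally plus one
// error byte per sector. Not taken from the comment above.
const CBM_IMAGE_SIZES = new Map([
  ['d64', [174848, 175531, 196608, 197376]], // 35/40 tracks, without/with error bytes
  ['d81', [819200, 822400]],                 // without/with error bytes
]);

// Hypothetical helper: guess the image type from extension and size only,
// deliberately ignoring the file contents.
function guessCbmImageType(fileName, sizeInBytes) {
  const ext = fileName.split('.').pop().toLowerCase();
  const sizes = CBM_IMAGE_SIZES.get(ext);
  return sizes !== undefined && sizes.includes(sizeInBytes) ? ext : null;
}
```

A call such as `guessCbmImageType('GAME.D64', 174848)` would report `'d64'` under these assumptions, while an unexpected size yields `null`.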

Incomplete?

Isn't this list VERY incomplete?! OK, I can post it here... List of file signatures —Preceding unsigned comment added by 178.120.99.208 (talk) 18:42, 11 June 2010 (UTC)

  • Gary Kessler's list seems to contain errors though. For example, "46 4F 52 4D 00" is stated to be an IFF-AIFF file. "46 4F 52 4D" just shows that it's an IFF file; the next "00" is part of the unsigned 32-bit length indicator and could be any value. So the correct file signature for IFF AIFF is "46 4F 52 4D .. .. .. .. 41 49 46 46". I wonder how many other errors might be in there. JoaCHIP (talk) 11:41, 2 January 2015 (UTC)
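
The "46 4F 52 4D .. .. .. .. 41 49 46 46" pattern above can be checked with a signature that contains wildcard positions. This is only a sketch: `matchesSignature` and the use of `null` as the "any value" marker are illustrative choices, not something from Kessler's list or the article.

```javascript
// null marks a "don't care" byte (here, the 32-bit length field).
const IFF_AIFF_SIGNATURE = [
  0x46, 0x4f, 0x52, 0x4d, // "FORM"
  null, null, null, null, // length indicator, any value
  0x41, 0x49, 0x46, 0x46, // "AIFF"
];

// Returns true if every non-wildcard byte of the signature matches
// the data starting at the given offset.
function matchesSignature(bytes, signature, offset = 0) {
  if (bytes.length < offset + signature.length) return false;
  return signature.every(
    (expected, i) => expected === null || bytes[offset + i] === expected
  );
}
```

With this approach a file starting with "FORM", any four length bytes, and "AIFF" matches, while another IFF type (for example one carrying "8SVX" in the last four positions) does not.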

Kessler's list

Isn't Gary Kessler's list rather unofficial? I know, there might not be a real "official" list, but his site seems informal. --Lance E Sloan (talk) 14:30, 9 February 2011 (UTC)

No merge please

The article currently has a merge template. I think this is an informative and expandable list, which will be too long to be merged once it is complete. Thue | talk 19:56, 8 March 2011 (UTC)

I agree with Thue. This should NOT be merged with Magic number (programming); if with anything, perhaps with List of file formats. Jahibadkaret (talk) 11:10, 29 March 2011 (UTC)
[EDIT] I also added an extended (public domain) list of the file extension magic numbers. — Preceding unsigned comment added by Jahibadkaret (talk • contribs) 19:10, 29 March 2011 (UTC)
I am also in agreement that this should not be merged. This is a useful list and is far too long to be included in that page. Zell Faze (talk) 03:26, 7 September 2011 (UTC)
No Merge. While many file signatures are instances of magic numbers, not all of them are. The reverse is not true either: there are far more uses of magic numbers than just file signatures. These are different topics covering different areas of computer science. — Loadmaster (talk) 20:03, 7 September 2011 (UTC)

I'm going to take the initiative and close this merge request. There doesn't seem to be any consensus to merge. I'll remove the merge template as well. —Tom Morris (talk) 14:28, 21 September 2011 (UTC)

This edit removed a lot of information from the article, claiming it is a copyright violation. The same edit also added an external link apparently run by the person responsible for the edit. How do we know this is true and not a scam to charge readers $25 for the same info? Astronaut (talk) 16:36, 5 March 2012 (UTC)

The edit summary said that he was removing updates by Franz Waldmann (talk · contribs), but he removed a lot more. My guess is that this was merely confusion about the edit history? - David Biddulph (talk) 17:09, 5 March 2012 (UTC)
I've politely asked the user to proceed with one of the processes at WP:CPI. In the meantime, we should probably leave it off until they respond. --NYKevin @048, i.e. 00:09, 6 March 2012 (UTC)
How much of the original list is available in public published sources? Surely those items can't be construed as copyright violations since they are not proprietary. Individual phrases describing the file contents can (probably) be copyrighted, but isn't it a stretch to say that a phrase such as "PGP Public Key-ring File" violates copyright? Or is there some bigger issue at stake here? — Loadmaster (talk) 02:20, 6 March 2012 (UTC)
Personally, I don't put too much stock into these sorts of claims, but if it turns out this person doesn't own the copyrights in the first place, we won't have to fight that fight as strenuously. I did specifically mention that concern, however, when contacting this individual. We'll see what happens. --NYKevin @178, i.e. 03:16, 6 March 2012 (UTC)
Thank you, but don't expect much of a reply. The editor's only edit to date has been to remove the alleged copyright violation. Astronaut (talk) 07:00, 6 March 2012 (UTC)

Looking through the history, I think this user might actually be complaining about a single action: [1]. This looks similar to the removed contents, and the history makes the net change look like a no-op, considering the other edits in between these two: [2]. In other words, this might actually be a real complaint about a specific set of information. OTOH, the idea-expression divide makes me continue to be skeptical that this is actually infringing, regardless of the intent of the remover. If they don't contact us soon, I would not oppose restoring the information. --NYKevin @845, i.e. 19:16, 14 March 2012 (UTC)

The page that I assume is related to this complaint, filesig.co.uk, is a bit painful to use but does seem to be established and probably legitimate. Whether the material is from there I don't know. I do know that a trivial Google search shows that an exact copy of what was posted has been online at http://code.google.com/p/baraeco/source/browse/trunk/utils/headers.txt since mid-2009. Is this the original version? Your guess is as good as mine. It is the only hit I can see on Google that is a duplicate of the uploaded data though. Is the material copyrighted? I counter this with another question: is a virus signature database copyrighted? I'm not a lawyer, but to me a file signature this trivial seems like nothing more than public domain knowledge, which I would think makes the copyright claim hinge on the quantity of original work put into curating the list. But don't quote me on that, ask an expert. 89.238.157.212 (talk) 03:31, 27 April 2012 (UTC)

JPEG signature

This edit adds E0 as the fourth byte in the JPEG signature. However, Kessler's list notes that it can be D8, E0, E1, E2, E3 or E8, and recommends using only the first three of them. I have heard there can be mistakes in that list, but right now I have some JPEGs that begin with FF-D8-FF-E2.
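
One way to reflect this in a detector is to match only on the leading FF D8 FF and treat the fourth byte as informational. The sketch below assumes that reading of the recommendation; `checkJpegPrefix` is a hypothetical helper, and the set of fourth-byte values is just the one quoted in the comment above, not a verified or exhaustive list.

```javascript
// Fourth-byte values as quoted in the comment above (unverified).
const REPORTED_FOURTH_BYTES = new Set([0xd8, 0xe0, 0xe1, 0xe2, 0xe3, 0xe8]);

// Decide on the three leading bytes only; report separately whether the
// fourth byte is one of the quoted values.
function checkJpegPrefix(bytes) {
  const hasCommonPrefix =
    bytes.length >= 3 && bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff;
  return {
    looksLikeJpeg: hasCommonPrefix,
    fourthByteReported:
      hasCommonPrefix && bytes.length >= 4 && REPORTED_FOURTH_BYTES.has(bytes[3]),
  };
}
```

Under this scheme a file starting with FF-D8-FF-E2 is accepted, and so is one with an unlisted fourth byte; only the `fourthByteReported` flag differs.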

List order

Given that this is a list of signatures, may I suggest that new entries be inserted in the list ordered by hex signature, as if compared with strcmp? Ant diaz (talk) 19:09, 7 June 2016 (UTC)
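
As a rough illustration of that ordering, the comparator below sorts hex signature strings the way strcmp would. It assumes signatures are written as two-digit hex bytes separated by spaces, and it normalises case and spacing first; `compareHexSignatures` is a hypothetical name.

```javascript
// Compare two signatures such as "FF D8 FF" character by character.
// For ASCII text, JavaScript's < and > on strings give the same order as strcmp.
function compareHexSignatures(a, b) {
  const x = a.toUpperCase().replace(/\s+/g, ' ').trim();
  const y = b.toUpperCase().replace(/\s+/g, ' ').trim();
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

const sortedExample = ['FF D8 FF', '25 50 44 46', '4D 5A'].sort(compareHexSignatures);
```

Because digits sort before letters in ASCII, this matches numeric byte order for plain hex; a shorter signature that is a prefix of a longer one sorts first. Wildcard markers such as "??" would need an explicit rule.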

ISO-8859-1 is sometimes incorrect/incomplete

Many ISO-8859-1 signatures are incomplete (using ASCII instead), and some are even wrong (e.g. MP3 was ˙ű instead of ÿû). Maybe this column should be completed and checked for errors (plus I would advocate for Windows-1252 which contains more printable characters than ISO-8859-1). Seems like a hard task though; is there a way to automate this? —Cousteau (talk) 17:20, 23 August 2016 (UTC)

I think it should be possible to automate with a bit of JavaScript. I also think most cells have too much spacing, another candidate for an automated edit.
Done. For reference, here's the JS I used: .split(/ |\n/g).map(x=>x=='??'?-1:parseInt(x,16)).map(x=>x<32?x|0x2400:x==127?0x2421:x>=127&&x<=160?'␡€□‚ƒ„…†‡ˆ‰Š‹Œ□Ž□□‘’“”•–—˜™š›œ□žŸ⍽'[x-127].codePointAt(0):x).map(x=>x==-1?'?':x==173?'-':String.fromCodePoint(x))
I then cleaned up the result.

XML and character encoding

It seems to me that the XML "magic number" is not accurate, because XML documents are allowed to start with a Unicode byte-order mark (BOM). For now I will change the XML row to mention that this is for ASCII XML files, because UTF-16 XML is also a thing and would not have this byte sequence. — Preceding unsigned comment added by 2605:6000:1019:129:4970:5EE5:E42F:9E59 (talk) 00:27, 2 September 2016 (UTC)
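
To make the point concrete, here is a sketch of a check that looks for the "<?xml" declaration after an optional BOM, including the UTF-16 forms. The function names and the returned labels are illustrative assumptions. Note also that the XML declaration itself is optional, so a `null` result does not prove a file is not XML.

```javascript
const XML_DECL_ASCII = [0x3c, 0x3f, 0x78, 0x6d, 0x6c]; // "<?xml"

// True if `prefix` occurs in `bytes` at `offset`.
function startsWith(bytes, prefix, offset = 0) {
  return (
    bytes.length >= offset + prefix.length &&
    prefix.every((b, i) => bytes[offset + i] === b)
  );
}

// Returns a label describing how the declaration was found, or null.
function sniffXmlDeclaration(bytes) {
  if (startsWith(bytes, [0xef, 0xbb, 0xbf])) { // UTF-8 BOM
    return startsWith(bytes, XML_DECL_ASCII, 3) ? 'utf-8 with BOM' : null;
  }
  if (startsWith(bytes, [0xfe, 0xff])) { // UTF-16 big-endian BOM
    const be = XML_DECL_ASCII.flatMap((b) => [0x00, b]);
    return startsWith(bytes, be, 2) ? 'utf-16be' : null;
  }
  if (startsWith(bytes, [0xff, 0xfe])) { // UTF-16 little-endian BOM
    const le = XML_DECL_ASCII.flatMap((b) => [b, 0x00]);
    return startsWith(bytes, le, 2) ? 'utf-16le' : null;
  }
  return startsWith(bytes, XML_DECL_ASCII) ? 'ascii-compatible' : null;
}
```

A plain "3C 3F 78 6D 6C" match corresponds to the last branch only, which is why the table row is accurate just for ASCII-compatible files without a BOM.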

MOD (file format) is missing?

Any reason the .mod file format is not in the list? Its magic number is "M.K." Xenophil (talk) 19:25, 4 April 2022 (UTC)
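
For anyone wanting to test this, a minimal sketch follows. The comment above gives only the tag, not its position; the offset of 1080 bytes used here is an assumption based on the usual description of the ProTracker module layout, and `hasModTag` is a hypothetical helper. Other tracker variants are known to use different four-character tags at the same place, which this sketch does not enumerate.

```javascript
// Assumption: the tag sits at byte offset 1080 (not stated in the comment above).
const MOD_TAG_OFFSET = 1080;

// True if the four-character tag is found at the assumed offset.
function hasModTag(bytes, tag = 'M.K.') {
  if (bytes.length < MOD_TAG_OFFSET + tag.length) return false;
  for (let i = 0; i < tag.length; i++) {
    if (bytes[MOD_TAG_OFFSET + i] !== tag.charCodeAt(i)) return false;
  }
  return true;
}
```

Unlike most entries in the list, this signature is not at the start of the file, so a detector has to read more than a thousand bytes before it can decide.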

DF BF 34 EB CE for PDFs seems incorrect

DF BF 34 EB CE was added as an alternate magic number for PDFs by an anonymous user: https://en.wiki.x.io/w/index.php?title=List_of_file_signatures&diff=prev&oldid=1075014939

A brief search for that sequence of numbers did not come up with any relevant results. The existing citation for PDFs also did not contain any reference to DF BF 34 EB CE, so that addition seems bogus, and I have removed it.

Ryan 1729 (talk) 21:54, 15 April 2022 (UTC)

cdi signature is incorrect

at the very least, the magic is "CD-I " (trailing space) - and the offset is almost certainly variable. may or may not be worth trying to verify tuxy's other edits to this page?

spec (??) is here: https://www.lscdweb.com/data/downloadables/2/8/cdi_may94_r2.pdf - the part about "CD-I " is at page 80 out of 1000. the header is also mentioned here: http://fileformats.archiveteam.org/wiki/CD-i Somebody1235 (talk) 09:33, 19 October 2022 (UTC)
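
If the offset really is variable, a fixed-offset table entry cannot express the signature, and a detector would have to search for the bytes instead. The following is only a sketch of that idea: `findAscii` is a hypothetical helper, and the optional search limit is an arbitrary illustration, not a value from the spec.

```javascript
// Return the first offset at which the ASCII text occurs, or -1.
// `limit` caps how many bytes from the start are searched.
function findAscii(bytes, text, limit = bytes.length) {
  const needle = Array.from(text, (ch) => ch.charCodeAt(0));
  const last = Math.min(limit, bytes.length) - needle.length;
  for (let start = 0; start <= last; start++) {
    if (needle.every((b, i) => bytes[start + i] === b)) return start;
  }
  return -1;
}
```

Searching for `'CD-I '` with the trailing space included avoids matching a bare "CD-I" that is followed by some other character.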

IANA Content Types

Would it make sense to add the standard IANA content types to this list? Extensions are fast and loose: for example, a PEM file often goes by .crt, .key, .cert, etc. Shawngmc (talk) 15:41, 29 March 2023 (UTC)