Talk:List of file signatures
This article is rated List-class on Wikipedia's content assessment scale.
This article links to one or more target anchors that no longer exist.
Please help fix the broken anchors. You can remove this template after fixing the problems.
Actually, d64 and d81 have no signature
Those given here are what CBM DOS writes there when formatting a disk. They are part of the directory header and can be changed to anything, which is frequently done. No software should rely on any particular bytes being present at those offsets. The common way to detect those images is by their size and file extension. For details and variants, look here: http://ist.uwaterloo.ca/~schepers/formats/D64.TXT LogicDeLuxe (talk) 20:47, 11 July 2022 (UTC)
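The size-and-extension check described above could be sketched as follows. This is only an illustration: the function name `guessCbmImage` is hypothetical, and the sector counts (683 and 768 for D64, 3200 for D81, 256 bytes each, optionally followed by one error byte per sector) are assumptions taken from commonly documented image layouts rather than from this thread, so the linked D64.TXT should be consulted for the actual variants.

```javascript
// Assumed geometry, not taken from this thread: 256-byte sectors,
// 683 sectors on a 35-track D64, 768 on a 40-track D64, 3200 on a D81.
// Some images append one error byte per sector.
const SECTOR_SIZE = 256;
const SECTOR_COUNTS = new Map([
  ["d64", [683, 768]],
  ["d81", [3200]],
]);

// Returns "d64", "d81", or null when extension and size do not agree.
// No bytes of the image are inspected, since there is no reliable signature.
function guessCbmImage(fileName, byteLength) {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return null;
  const ext = fileName.slice(dot + 1).toLowerCase();
  const counts = SECTOR_COUNTS.get(ext);
  if (!counts) return null;
  const plausible = counts.some(
    (n) => byteLength === n * SECTOR_SIZE || byteLength === n * SECTOR_SIZE + n
  );
  return plausible ? ext : null;
}
```

A caller would pass the file name and the size reported by the file system, for example `guessCbmImage("GAME.D64", 174848)`.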
Incomplete?
Isn't this list VERY incomplete?! OK, I can post it here... List of file signatures —Preceding unsigned comment added by 178.120.99.208 (talk) 18:42, 11 June 2010 (UTC)
- Gary Kessler's list seems to contain errors though, for example "46 4F 52 4D 00" is stated to be an IFF-AIFF file. "46 4F 52 4D" just shows that it's an IFF file, the next "00" is a part of the unsigned 32-bit length indicator, and could be any value. So the correct file signature for IFF AIFF is "46 4F 52 4D .. .. .. .. 41 49 46 46". I wonder how many errors might also be in there. JoaCHIP (talk) 11:41, 2 January 2015 (UTC)
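The wildcard form of the IFF AIFF signature given above can be checked with a small matcher. This is a minimal sketch: the names `parsePattern` and `matchesSignature` are made up for illustration, and `..` is treated as "any byte", following the notation used in the comment.

```javascript
// Turns "46 4F 52 4D .. .. .. .. 41 49 46 46" into an array of byte values,
// with null standing for a position where any byte is accepted.
function parsePattern(pattern) {
  return pattern
    .trim()
    .split(/\s+/)
    .map((tok) => (tok === ".." ? null : parseInt(tok, 16)));
}

// True when `bytes` (an array or Uint8Array) matches the pattern at `offset`.
function matchesSignature(bytes, pattern, offset = 0) {
  const sig = parsePattern(pattern);
  if (bytes.length < offset + sig.length) return false;
  return sig.every((b, i) => b === null || bytes[offset + i] === b);
}

// "FORM", four length bytes of any value, then "AIFF".
const IFF_AIFF = "46 4F 52 4D .. .. .. .. 41 49 46 46";
```

With this pattern, a file whose length field does not start with 00 still matches, which the shorter "46 4F 52 4D 00" entry would miss.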
Kessler's list
Isn't Gary Kessler's list rather unofficial? I know, there might not be a real "official" list, but his site seems informal. --Lance E Sloan (talk) 14:30, 9 February 2011 (UTC)
No merge please
The file currently has a merge template. I think this is an informative and expandable list, which will be too long when completed to be merged. Thue | talk 19:56, 8 March 2011 (UTC)
- I agree with Thue. This should NOT be merged with Magic number (programming), if with anything, perhaps with List of file formats. Jahibadkaret (talk) 11:10, 29 March 2011 (UTC)
- [EDIT] I also added an extended (public domain) list of the file extension magic numbers. — Preceding unsigned comment added by Jahibadkaret (talk • contribs) 19:10, 29 March 2011 (UTC)
- I am also in agreement that this should not be merged. This is a useful list and is far too long to be included in that page. Zell Faze (talk) 03:26, 7 September 2011 (UTC)
- No Merge. While many file signatures are instances of magic numbers, not all of them are. The reverse is not true either: there are far more other uses of magic numbers than just file signatures. These are different topics covering different areas of computer science. — Loadmaster (talk) 20:03, 7 September 2011 (UTC)
I'm going to take the initiative and close this merge request. There doesn't seem to be any consensus to merge. I'll remove the merge template as well. —Tom Morris (talk) 14:28, 21 September 2011 (UTC)
Copyright violation or scam?
This edit removed a lot of information from the article, claiming it is a copyright violation. The same edit also added an external link apparently run by the person responsible for the edit. How do we know this is true and not a scam to charge readers $25 for the same info? Astronaut (talk) 16:36, 5 March 2012 (UTC)
- The edit summary said that he was removing updates by Franz Waldmann (talk · contribs), but he removed a lot more. My guess was that this was merely confusion about the edit history? - David Biddulph (talk) 17:09, 5 March 2012 (UTC)
- I've politely asked the user to proceed with one of the processes at WP:CPI. In the meantime, we should probably leave it off until they respond. --NYKevin @048, i.e. 00:09, 6 March 2012 (UTC)
- How much of the original list is available in public published sources? Surely those items can't be construed as copyright violations since they are not proprietary. Individual phrases describing the file contents can (probably) be copyrighted, but isn't it a stretch to say that a phrase such as "PGP Public Key-ring File" violates copyright? Or is there some bigger issue at stake here? — Loadmaster (talk) 02:20, 6 March 2012 (UTC)
- Personally, I don't put too much stock into these sorts of claims, but if it turns out this person doesn't own the copyrights in the first place, we won't have to fight that fight as strenuously. I did specifically mention that concern, however, when contacting this individual. We'll see what happens. --NYKevin @178, i.e. 03:16, 6 March 2012 (UTC)
- Thank you, but don't expect much of a reply. The editor's only edit to date has been to remove the alleged copyright violation. Astronaut (talk) 07:00, 6 March 2012 (UTC)
Looking through the history, I think this user might actually be complaining about a single action: [1]. This looks similar to the removed contents, and the history makes the net change look like a no-op, considering the other edits in between these two: [2]. In other words, this might actually be a real complaint about a specific set of information. OTOH, the idea-expression divide makes me continue to be skeptical that this is actually infringing, regardless of the intent of the remover. If they don't contact us soon, I would not oppose restoring the information. --NYKevin @845, i.e. 19:16, 14 March 2012 (UTC)
The page that I assume is related to this complaint, filesig.co.uk, is a bit painful to use but does seem to be established and probably legitimate. Whether the material is from there I don't know. I do know that a trivial Google search shows that an exact copy of what was posted has been online at http://code.google.com/p/baraeco/source/browse/trunk/utils/headers.txt since mid 2009. Is this the original version? Your guess is as good as mine. It is the only hit I can see on Google that is a duplicate of the uploaded data though. Is the material copyrighted? I counter this with another question: is a virus signature database copyrighted? I'm not a lawyer, but to me a file signature this trivial seems like nothing more than a list of public domain knowledge, which I would think makes the copyright claim hinge on the quantity of original work put into curating the list. But don't quote me on that, ask an expert. 89.238.157.212 (talk) 03:31, 27 April 2012 (UTC)
JPEG signature
This edit adds E0 as the fourth byte of the JPEG signature. However, Kessler's list notes that it can be D8, E0, E1, E2, E3 and E8, and recommends using only the first three of them. I have heard there can be mistakes in that list, but right now I have some JPEGs that begin with FF-D8-FF-E2.
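One way to read the point above is that a detector should key on the first three bytes (FF D8 FF) and treat the fourth byte as informational only. The sketch below takes that reading as an assumption; the function names are hypothetical and the code does not validate the rest of the file.

```javascript
// True when the data begins with FF D8 FF, regardless of the fourth byte.
function looksLikeJpeg(bytes) {
  return (
    bytes.length >= 3 &&
    bytes[0] === 0xff &&
    bytes[1] === 0xd8 &&
    bytes[2] === 0xff
  );
}

// Returns the fourth byte (for example 0xE0 or 0xE2) so it can be reported,
// or null when the data does not look like a JPEG or is too short.
function jpegFourthByte(bytes) {
  return looksLikeJpeg(bytes) && bytes.length >= 4 ? bytes[3] : null;
}
```

Under this approach the FF-D8-FF-E2 files mentioned in the comment are accepted, whereas a fixed four-byte entry ending in E0 would reject them.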
List order
Given that this is a list of signatures, may I suggest that new entries are inserted in the list ordered by hex signature as if compared with strcmp? Ant diaz (talk) 19:09, 7 June 2016 (UTC)
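The proposed ordering could look like the sketch below. It assumes every signature is written as two-digit hex bytes separated by spaces; the helper names are hypothetical. Plain string comparison is used instead of `localeCompare`, because comparing code units is the closest JavaScript equivalent to strcmp.

```javascript
// Upper-cases the hex digits and collapses runs of whitespace to one space,
// so "ff  d8" and "FF D8" compare as equal.
function normalizeSignature(sig) {
  return sig.trim().toUpperCase().split(/\s+/).join(" ");
}

// strcmp-style comparison: negative, zero, or positive.
function compareSignatures(a, b) {
  const x = normalizeSignature(a);
  const y = normalizeSignature(b);
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

// Returns a sorted copy and leaves the input list untouched.
function sortSignatures(list) {
  return [...list].sort(compareSignatures);
}
```

A signature that is a prefix of another sorts first, so "FF D8" would be listed before "FF D8 FF".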
ISO-8859-1 is sometimes incorrect/incomplete
Many ISO-8859-1 signatures are incomplete (using ASCII instead), and some are even wrong (e.g. MP3 was ˙ű instead of ÿû). Maybe this column should be completed and checked for errors (plus I would advocate for Windows-1252, which contains more printable characters than ISO-8859-1). Seems like a hard task though; is there a way to automate this? —Cousteau (talk) 17:20, 23 August 2016 (UTC)
- I think it should be possible to automate with a bit of JavaScript. I also think most cells have too much spacing, another candidate for an automated edit.
- Done. For reference, here's the JS I used:
.split(/ |\n/g).map(x=>x=='??'?-1:parseInt(x,16)).map(x=>x<32?x|0x2400:x==127?0x2421:x>=127&&x<=160?'␡€□‚ƒ„…†‡ˆ‰Š‹Œ□Ž□□‘’“”•–—˜™š›œ□žŸ⍽'[x-127].codePointAt(0):x).map(x=>x==-1?'?':x==173?'-':String.fromCodePoint(x))
- I then cleaned up the result.
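For checking individual cells of the ISO-8859-1 column, a more readable variant of the idea might look like the following. It is a sketch under stated assumptions, not the script used for the edit: it covers plain ISO-8859-1 only (not Windows-1252), shows C0 control bytes as Unicode control pictures, and uses a placeholder box for bytes 80 to 9F. The function name `renderLatin1` is hypothetical.

```javascript
// Renders a space-separated hex signature as ISO-8859-1 text.
// "??" marks an unknown byte and is shown as "?".
function renderLatin1(hex) {
  return hex
    .trim()
    .split(/\s+/)
    .map((tok) => {
      if (tok === "??") return "?";
      const b = parseInt(tok, 16);
      if (b < 0x20) return String.fromCodePoint(0x2400 + b); // control pictures
      if (b === 0x7f) return "\u2421"; // symbol for DEL
      if (b >= 0x80 && b <= 0x9f) return "\u25a1"; // placeholder for C1 controls
      return String.fromCodePoint(b); // Latin-1 bytes equal their code points
    })
    .join("");
}
```

For the MP3 example in the comment, `renderLatin1("FF FB")` yields ÿû, because bytes FF and FB map directly to U+00FF and U+00FB.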
XML and character encoding
It seems to me that the XML "magic number" is not accurate, because XML documents are allowed to have a byte-order mark (BOM) Unicode character as the first character. For now I will change the XML row to mention that this is for ASCII XML files, because UTF-16 XML is also a thing and would not have this byte sequence. — Preceding unsigned comment added by 2605:6000:1019:129:4970:5EE5:E42F:9E59 (talk) 00:27, 2 September 2016 (UTC)
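The effect of a BOM or a UTF-16 encoding on the leading bytes can be illustrated with a small sniffer. This is a rough sketch with a hypothetical function name and return labels; it only looks at the first few bytes and handles just the plain, UTF-8 with BOM, and UTF-16 with BOM cases.

```javascript
// Returns a label describing how the "<?xml" declaration is encoded,
// or null when none of the handled byte patterns is found.
function sniffXmlStart(bytes) {
  const startsWith = (sig, offset = 0) =>
    bytes.length >= offset + sig.length &&
    sig.every((b, i) => bytes[offset + i] === b);

  const DECL = [0x3c, 0x3f, 0x78, 0x6d, 0x6c]; // "<?xml" in ASCII

  if (startsWith([0xef, 0xbb, 0xbf])) {
    return startsWith(DECL, 3) ? "utf-8 with BOM" : null;
  }
  if (startsWith([0xfe, 0xff])) {
    return startsWith([0x00, 0x3c, 0x00, 0x3f], 2) ? "utf-16be" : null;
  }
  if (startsWith([0xff, 0xfe])) {
    return startsWith([0x3c, 0x00, 0x3f, 0x00], 2) ? "utf-16le" : null;
  }
  return startsWith(DECL) ? "ascii-compatible, no BOM" : null;
}
```

Only the last case begins with the bytes 3C 3F 78 6D 6C, which is why a single fixed signature does not cover every XML file.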
External links modified
Hello fellow Wikipedians,
I have just modified one external link on List of file signatures. Please take a moment to review my edit. If you have any questions, or need the bot to ignore the links, or the page altogether, please visit this simple FaQ for additional information. I made the following changes:
- Added archive https://web.archive.org/web/20100502014229/http://docsrv.sco.com:507/en/man/html.C/compress.C.html to http://docsrv.sco.com:507/en/man/html.C/compress.C.html
Cheers.—InternetArchiveBot (Report bug) 08:36, 30 December 2017 (UTC)
MOD (file format) is missing?
Any reason the .mod file format is not in the list? Its magic number is "M.K." Xenophil (talk) 19:25, 4 April 2022 (UTC)
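Unlike most entries in the list, this tag does not sit at the start of the file. The sketch below assumes the commonly documented position at byte offset 1080 for 31-sample modules; that offset is not stated in the comment above and should be verified against a format description before being added to the article. The function names are hypothetical.

```javascript
// Assumption: the four-character tag sits at offset 1080 (0x438).
const MOD_TAG_OFFSET = 1080;

// True when the ASCII string `tag` appears in `bytes` at `offset`.
function hasAsciiTagAt(bytes, offset, tag) {
  if (bytes.length < offset + tag.length) return false;
  for (let i = 0; i < tag.length; i++) {
    if (bytes[offset + i] !== tag.charCodeAt(i)) return false;
  }
  return true;
}

// Checks only for the "M.K." tag mentioned above, not for other variants.
function looksLikeMod(bytes) {
  return hasAsciiTagAt(bytes, MOD_TAG_OFFSET, "M.K.");
}
```

A list entry for this format would therefore need a non-zero value in the offset column.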
DF BF 34 EB CE for PDFs seems incorrect
DF BF 34 EB CE was added as an alternate magic number for PDFs by an anonymous user:
https://en.wiki.x.io/w/index.php?title=List_of_file_signatures&diff=prev&oldid=1075014939
A brief search for that sequence of numbers did not come up with any relevant results. The existing citation for PDFs also does not contain any reference to DF BF 34 EB CE, so that addition seems bogus, and I have removed it.
cdi signature is incorrect
at the very least, the magic is "CD-I " (trailing space) - and the offset is almost certainly variable. may or may not be worth trying to verify tuxy's other edits to this page?
spec (??) is here: https://www.lscdweb.com/data/downloadables/2/8/cdi_may94_r2.pdf - the part about "CD-I " is at page 80 out of 1000. the header is also mentioned here: http://fileformats.archiveteam.org/wiki/CD-i Somebody1235 (talk) 09:33, 19 October 2022 (UTC)
IANA Content Types
Would it make sense to add the standard IANA content types to this list? Extensions are fast and loose - for example, a PEM file often goes by .crt, .key, .cert, etc... Shawngmc (talk) 15:41, 29 March 2023 (UTC)