Text watermarking is a technique for embedding hidden information within textual content to verify its authenticity, origin, or ownership.[1] Research on text watermarking began in 1997.[1] With the rise of generative AI systems using large language models (LLMs), significant development has focused on watermarking AI-generated text.[2] Potential applications include detecting fake news and academic cheating, and excluding AI-generated material from LLM training data.[3] With LLMs, the focus is on linguistic approaches that involve selecting words to form patterns within the text that can later be identified.[1] The results of the first reported large-scale public deployment, a trial using Google's Gemini chatbot, appeared in October 2024: across 20 million responses, users found watermarked and unwatermarked text to be of equal quality.[3]
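
The word-selection idea can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the scheme used in the Gemini trial or any method described in the cited sources: a secret key and the preceding word decide, via a hash, whether a candidate word counts as "green", the writer prefers green words among interchangeable candidates, and a detector holding the same key measures how many words are green. The key, the function names, and the synonym slots are all hypothetical.

```python
import hashlib


def is_green(prev_word: str, word: str, key: str = "demo-key") -> bool:
    """Keyed pseudo-random test: roughly half of all words are 'green' in any context."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{word}".encode("utf-8")).digest()
    return digest[0] % 2 == 0


def choose_word(prev_word: str, candidates: list[str], key: str = "demo-key") -> str:
    """Prefer the first green candidate; fall back to the first candidate if none is green."""
    for word in candidates:
        if is_green(prev_word, word, key):
            return word
    return candidates[0]


def embed(slots: list[list[str]], key: str = "demo-key") -> list[str]:
    """Build a text from slots of interchangeable words, biased toward green words."""
    words: list[str] = []
    prev = ""
    for candidates in slots:
        word = choose_word(prev, candidates, key)
        words.append(word)
        prev = word
    return words


def green_fraction(words: list[str], key: str = "demo-key") -> float:
    """Detector: fraction of words that are green given their predecessor (0.0 if empty)."""
    prev = ""
    hits = 0
    for word in words:
        if is_green(prev, word, key):
            hits += 1
        prev = word
    return hits / len(words) if words else 0.0


# Hypothetical example: each slot lists words assumed to be interchangeable.
slots = [
    ["The", "This"],
    ["quick", "fast", "rapid"],
    ["reply", "response", "answer"],
    ["appeared", "arrived", "came"],
    ["today", "now"],
]
marked = embed(slots)
print(" ".join(marked), green_fraction(marked))
```

Unmarked text would be expected to score near 0.5 under this test, while text produced by `embed` tends to score higher. A practical detector would apply a statistical test over much longer passages rather than compare a raw fraction.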

References

  1. Kamaruddin, Nurul Shamimi; Kamsin, Amirrudin; Por, Lip Yee; Rahman, Hameedur (2018). "A Review of Text Watermarking: Theory, Methods, and Applications". IEEE Access. 6: 8011–8028. Bibcode:2018IEEEA...6.8011K. doi:10.1109/ACCESS.2018.2796585. ISSN 2169-3536.
  2. Liu, Aiwei; Pan, Leyi; Lu, Yijian; Li, Jingjing; Hu, Xuming; Zhang, Xi; Wen, Lijie; King, Irwin; Xiong, Hui; Yu, Philip (2024-09-03). "A Survey of Text Watermarking in the Era of Large Language Models". ACM Computing Surveys. 57 (2): 1–36. arXiv:2312.07913. doi:10.1145/3691626. ISSN 0360-0300.
  3. Gibney, Elizabeth (2024-10-23). "Google unveils invisible 'watermark' for AI-generated text". Nature. 634 (8036): 1027–1028. Bibcode:2024Natur.634.1027G. doi:10.1038/d41586-024-03462-7. PMID 39443774. Retrieved 2024-10-26.