Language creation in artificial intelligence

In artificial intelligence, researchers can induce AI systems to develop their own means of communication by having them cooperate on tasks and exchange symbols that become parts of a new language. These languages may grow out of human languages or be constructed from scratch. When AI is used to translate between languages, it can even create a shared intermediate language to make the process easier. Natural language processing (NLP) enables these systems to understand and generate human-like language, allowing AI to interact with people more naturally.

Evolution from English

In 2017 Facebook Artificial Intelligence Research (FAIR) trained chatbots on a corpus of English text conversations between humans playing a simple trading game involving balls, hats, and books.[1] When programmed to experiment with English and tasked with optimizing trades, the chatbots seemed to evolve a reworked version of English to better solve their task. In some cases the exchanges seemed nonsensical:[2][3][4]

Bob: "I can can I I everything else"
Alice: "Balls have zero to me to me to me to me to me to me to me to me to"

Facebook's Dhruv Batra said: "There was no reward to sticking to English language. Agents will drift off understandable language and invent codewords for themselves. Like if I say 'the' five times, you interpret that to mean I want five copies of this item."[4] It is often unclear exactly why a neural network produces the output that it does.[2] Because the agents' evolved language was opaque to humans, Facebook modified the algorithm to explicitly reward mimicking human language. Although the modified algorithm scores lower on task effectiveness than the opaque one, it is preferable wherever clarity to humans matters.[1]

In The Atlantic, Adrienne LaFrance analogized the wondrous and "terrifying" evolved chatbot language to cryptophasia, the phenomenon of some twins developing a language that only the two children can understand.[5]

Beginnings of AI language creation

In 2017, researchers at OpenAI demonstrated a multi-agent environment and learning methods that bring about the emergence of a basic language ab initio, without starting from a pre-existing language. The language consists of a stream of "ungrounded" (initially meaningless) abstract discrete symbols uttered by agents over time, which gradually evolves a defined vocabulary and syntactic constraints. One token might come to mean "blue-agent", another "red-landmark", and a third "goto", in which case an agent says "goto red-landmark blue-agent" to ask the blue agent to go to the red landmark. In addition, when visible to one another, the agents could spontaneously learn nonverbal communication such as pointing, guiding, and pushing. The researchers speculated that the emergence of AI language might be analogous to the evolution of human communication.[2][6][7]
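
How initially meaningless symbols can acquire shared meaning through reinforcement can be sketched with a toy Lewis signaling game. This is a deliberately minimal illustration, not the actual setup from the OpenAI paper; the game size, the Roth-Erev update rule, and all names are invented for the example:

```python
import random

def lewis_signaling_game(n=2, rounds=3000, seed=0):
    """Sender and receiver reinforce whichever symbol/action pairings
    happen to succeed, until a shared 'vocabulary' emerges."""
    rng = random.Random(seed)
    # Weights start uniform: every symbol is equally (un)likely,
    # i.e. the symbols are initially ungrounded.
    sender = [[1.0] * n for _ in range(n)]    # meaning -> symbol weights
    receiver = [[1.0] * n for _ in range(n)]  # symbol -> action weights
    history = []
    for _ in range(rounds):
        meaning = rng.randrange(n)
        symbol = rng.choices(range(n), weights=sender[meaning])[0]
        action = rng.choices(range(n), weights=receiver[symbol])[0]
        if action == meaning:  # success: reinforce the pairing that worked
            sender[meaning][symbol] += 1.0
            receiver[symbol][action] += 1.0
        history.append(action == meaning)
    return sender, receiver, history

sender, receiver, history = lewis_signaling_game()
early_accuracy = sum(history[:500]) / 500
late_accuracy = sum(history[-500:]) / 500  # typically far above early_accuracy
```

With two meanings and two symbols a consistent mapping almost always emerges; with more meanings, agents can settle into partial "pooling" codes that are efficient but opaque to outsiders, loosely mirroring the drift observed in the Facebook experiments.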

Similarly, a 2017 study from Abhishek Das and colleagues demonstrated the emergence of language and communication in a visual question-answer context, showing that a pair of chatbots can invent a communication protocol that associates ungrounded tokens with colors and shapes.[5][8]

Together, these experiments show how communication protocols can be generated and how models can be trained from scratch, laying groundwork that later systems build on for communication and understanding between humans and AI.

Interlingua

In 2016, Google deployed to Google Translate an AI designed to directly translate between any of 103 different natural languages, including pairs of languages that it had never before seen translated between. Researchers examined whether the machine learning algorithms were choosing to translate human-language sentences into a kind of "interlingua", and found that the AI was indeed encoding semantics within its structures. The researchers cited this as evidence that a new interlingua, evolved from the natural languages, exists within the network.[2][9]

Current state of language generation in AI

The development of natural language processing (NLP) has transformed language generation, which now underpins generative AI chatbots such as ChatGPT, Bing AI, and Bard AI. Language generation rests on training computer models and algorithms that learn from large datasets. For example, sentence-level models tend to perform better than word-level ones because they learn from larger units of context than individual words[10]. These models continue to develop as more data is integrated, allowing for better communication over time as the AI learns from new information.
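
The underlying idea — training a model on a dataset of text and generating language from the learned statistics — can be sketched with a minimal bigram model. This is a toy illustration of the principle only, nothing like the scale of a modern chatbot; the corpus and function names are invented for the example:

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count which word follows which, including sentence boundaries."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def generate(counts, max_words=10):
    """Greedily follow the most frequent continuation of each word."""
    word, output = "<s>", []
    for _ in range(max_words):
        word = counts[word].most_common(1)[0][0]
        if word == "</s>":  # end-of-sentence marker: stop
            break
        output.append(word)
    return " ".join(output)

corpus = ["the cat sat", "the cat ate fish"]
model = train_bigram_model(corpus)
# "the" follows "<s>" twice and "cat" follows "the" twice, so the
# greedy generator produces "the cat sat".
```

Training on whole sentences (with the <s>/</s> boundary markers) gives the model context that bare word counts would miss, which is one reason sentence-level models outperform word-level ones. Real chatbots replace the frequency table with a large neural network and greedy lookup with more sophisticated decoding, but the train-on-data, predict-the-next-word principle is the same.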

The image on the right portrays how these models are deployed to answer users' questions about the world.

Live use of ChatGPT to learn about the Wales Tiananmen Square.

Applications of Generative AI

Generative AI for language has been applied across industries and markets worldwide, including customer service, gaming, translation, and technical tasks such as making sense of large volumes of data. In customer service, AI chatbots such as ChatGPT and Bard AI use natural language processing (NLP) to understand users' questions and respond to them live with answers and opinions. They not only mimic human interaction but present themselves as distinct conversational agents, enabling one-on-one exchanges in which each system develops its own way of talking. In gaming, generative language models give non-player characters (NPCs) richer dialogue, improving the in-game experience in many story-mode and first-person shooter (FPS) games. In translation, these systems can work across many languages to help users understand information, broadening their potential audience. These applications continue to evolve and illustrate the varied uses of AI-generated language in industry, markets, and daily life.

Challenges and Limitations of AI Language Creation

Although AI appears to be evolving rapidly, it still faces many technical challenges. The language an AI produces is sometimes so vague that users struggle to understand what it is trying to explain. There is also a "black-box problem"[11][10]: a lack of transparency and interpretability in how AI systems arrive at their outputs. Furthermore, as premium versions of AI chatbots gain the ability to scrape data from the web, biases in that data can carry over into the information they present. Because these models are trained on words and sentences, they can inadvertently absorb opinions from their training material and so reproduce stereotypes that a neutral system should not support.

These limitations and challenges are expected to diminish over time as the models learn from further conversations and information. This should strengthen language creation and improve the AI's conversational ability and understanding, bringing it closer to the standard humans expect.

Ethical Risks in AI Language Development

Beyond these technical challenges, AI language development carries ethical risks. Chatbots can be misused to create fake information or to manipulate people. Privacy is another strong concern: many users worry about AI systems saving and selling their information. Guidelines from bodies such as the IEEE and the EU describe measures needed "to ensure privacy preservation...involving sensitive information"[12][11], calling for responsible AI use, especially when handling sensitive medical data.

As these technologies advance, meeting ethical standards is critical both to protect private information and to maintain a neutral standpoint in language development and communication with users.[10][12][11][13]

Future of AI Language Creation

As AI technology continues to evolve, the goal is to develop refined systems that remain neutral yet informative. Upcoming deep learning and neural network models add multiple layers of checking, which should help NLP systems ensure higher-quality interactions with users. These stronger models promise safer communication with fewer biases and irrational claims, and better experiences in games, customer service, VR/AR systems, and translation across many languages. Possible future directions include medical scribing and communication with doctors during live surgeries. The future of generative AI language is promising, as models continue to be trained on millions of new words, sentences, and dialects through increasingly intricate computational models[14].

File:Deep Learning in Natural Language Processing.jpeg (this image portrays the intricate modeling of NLP and how it ensures its accuracy during communication)

References

  1. ^ a b "Chatbots learn how to negotiate and drive a hard bargain". New Scientist. 14 June 2017. Retrieved 24 January 2018.
  2. ^ a b c d Baraniuk, Chris (1 August 2017). "'Creepy Facebook AI' story sweeps media". BBC News. Retrieved 24 January 2018.
  3. ^ "Facebook robots shut down after they talk to each other in language only they understand". The Independent. 31 July 2017. Retrieved 24 January 2018.
  4. ^ a b Field, Matthew (1 August 2017). "Facebook shuts down robots after they invent their own language". The Telegraph. Retrieved 24 January 2018.
  5. ^ a b LaFrance, Adrienne (20 June 2017). "What an AI's Non-Human Language Actually Looks Like". The Atlantic. Retrieved 24 January 2018.
  6. ^ "It Begins: Bots Are Learning to Chat in Their Own Language". WIRED. 16 March 2017. Retrieved 24 January 2018.
  7. ^ Mordatch, I., & Abbeel, P. (2017). Emergence of Grounded Compositional Language in Multi-Agent Populations. arXiv preprint arXiv:1703.04908.
  8. ^ Das, A., Kottur, S., Moura, J. M., Lee, S., & Batra, D. (2017). Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning. arXiv preprint arXiv:1703.06585.
  9. ^ Johnson, M., Schuster, M., Le, Q. V., Krikun, M., Wu, Y., Chen, Z., ... & Hughes, M. (2016). Google's multilingual neural machine translation system: enabling zero-shot translation. arXiv preprint arXiv:1611.04558.
  10. ^ a b Khan, Bangul; Fatima, Hajira; Qureshi, Ayatullah; Kumar, Sanjay; Hanan, Abdul; Hussain, Jawad; Abdullah, Saad (2023-02-08). "Drawbacks of Artificial Intelligence and Their Potential Solutions in the Healthcare Sector". Biomedical Materials & Devices (New York, N.Y.): 1–8. doi:10.1007/s44174-023-00063-2. ISSN 2731-4812. PMC 9908503. PMID 36785697.
  11. ^ a b Martinelli, Fabio (28 September 2020). "Enhanced Privacy and Data Protection using Natural Language Processing and Artificial Intelligence". Institute of Electrical and Electronics Engineers.
  12. ^ Goodman, Joshua (2001-08-09), A Bit of Progress in Language Modeling, doi:10.48550/arXiv.cs/0108005, retrieved 2024-10-14
  13. ^ Rita, Mathieu; Michel, Paul; Chaabouni, Rahma; Pietquin, Olivier; Dupoux, Emmanuel; Strub, Florian (2024-03-18), Language Evolution with Deep Learning, doi:10.48550/arXiv.2403.11958, retrieved 2024-10-14