User:WeatherWriter/LLM Experiment 2

This experiment's purpose is to determine whether large language models (i.e. Chat-GPT4) can be used to locate sourced and unsourced information within the following Wikipedia article: 2024 western Russia incursion.

Previous experiment

Brief summary

This experiment will be run on 2024 western Russia incursion, an article documenting an ongoing event. For this experiment, I will provide the large language model with a direct copy/paste of the sources used for a particular section of the Wikipedia article, followed by a copy/paste of that section, so that it can identify what is sourced and what is unsourced.

Experiment

For this experiment, I will be testing the 13 March section in the article as of this perm-linked version.

Section Copy/Paste

The Freedom of Russia Legion urged civilians in Belgorod and Kursk to evacuate amid them being "forced to inflict fire damage on military positions in Belgorod and Kursk." They also reported to have destroyed a Russian control centre.[1]

In the morning the Legion published a video with "greetings from the liberated part of Tyotkino", announcing a raid deep into the village. However, analysis by Agentstvo showed that the footage was filmed in the Ukrainian border village of Ryzhivka, 300 meters from Tyotkino. A representative of the Freedom of Russia Legion told Agentstvo that the video with the fighters was indeed recorded on the edge of Ryzhivka, even "before the storming of Tyotkino". According to his data, at the time the video was published, the military were already in Tyotkino, but he could not provide the video from there, explaining that he risked calling fire on the fighters. There was no independent confirmation that the Freedom of Russia Legion entered Tyotkino.[2][3]

  1. ^ Orlova, Alisa; Struck, Julia (13 March 2024). "'We Urge Everyone to Leave Cities Immediately' – Pro-Kyiv Militias Warn of Strikes on Russian Border Cities". Kyiv Post. Retrieved 13 March 2024.
  2. ^ Cite error: The named reference :1 was invoked but never defined (see the help page).
  3. ^ Bailey, Riley; Harward, Christina; Wolkov, Nicole; Hird, Karolina; Kagan, Frederick W. "Russian Offensive Campaign Assessment, March 13, 2024". Institute for the Study of War. Retrieved 14 March 2024.

Details

Note that reference 2 in this direct copy/paste was not defined within the section itself: it is invoked by a ref name ("<ref name=":1" />") whose full citation sits outside the section. The first test therefore checks only what can be located in the sources whose content is present in the section, and "<ref name=":1" />" is removed from the copy/paste given to the LLM. In the actual article, source 1 is reference 20, source 2 (the one removed in Test 1) is reference 18, and source 3 is reference 21.
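The bookkeeping described above can be sketched in Python. This is only an illustration and was not part of the experiment, which was done by hand: the names `LOCAL_TO_ARTICLE_REF` and `strip_named_refs` are hypothetical, and the regular expression is an assumption that covers only self-closing named references of the form <ref name="..." />.

```python
import re

# Citation numbers in the section copy/paste mapped to the reference
# numbers in the full article, as stated in the Details section.
LOCAL_TO_ARTICLE_REF = {1: 20, 2: 18, 3: 21}

# Matches self-closing named references such as <ref name=":1" />.
NAMED_REF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+?)"?\s*/>')


def strip_named_refs(wikitext, names):
    """Remove self-closing named refs whose name is listed in `names`."""
    def replace(match):
        # Drop the tag only if its name was requested; keep all others.
        return "" if match.group(1) in names else match.group(0)
    return NAMED_REF.sub(replace, wikitext)
```

For Test 1, calling `strip_named_refs(section_wikitext, {":1"})` would remove only the reference defined outside the section, where `section_wikitext` stands for the section's wikitext.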

Test 1

Step 1

  1. After beginning a new chat with Chat-GPT4, the A.I. was provided with the following message: You can going to be provided with accurate information related to the 2024 western Russia incursion. Each source provided to you will be labeled with the author's name as well as a [#]. You must keep track of the information from each source. Do not mix information together from different sources. Do you understand your task?

Step 2

  2. To this prompt, the A.I. responded with: "Yes, I understand the task. I'll keep track of information from each source individually and reference them as specified. Please provide the sources or tell me how to proceed."

Step 3

Sources are given to Chat-GPT4
3.1. Chat-GPT4 was provided with the information from the first source, the Kyiv Post, with the following prompt: Source #1: Keep track of this source as the following:

Kyiv Post [20]
Title: ‘We Urge Everyone to Leave Cities Immediately’ – Pro-Kyiv Militias Warn of Strikes on Russian Border Cities
Subtitle: This announcement comes one day after the groups claimed to have seized Tyotkino village in Russia’s Kursk region.
Content: (Copy/paste of news article. Content of exact copy/paste is not being posted for copyright regulations.)

3.11. To this prompt, it responded:

Understood, I'll keep track of the information from Kyiv Post [20] as provided. If there's more information or another source to be added, please let me know.

3.2. Chat-GPT4 was provided with the information from the second source, the Institute for the Study of War, with the following prompt: Source #2: Keep track of this source as the following:

Institute for the Study of War [21]
Title: Russian Offensive Campaign Assessment, March 13, 2024
Content: (Copy/paste of news article. Content of exact copy/paste is not being posted for copyright regulations.)

Note: The source is very long, so only the section directly related to this region (the source is divided by region) was included.

3.21. To this prompt, it responded:

Noted, the information from the Institute for the Study of War [21] is now recorded. If there's more to add or another source to track, just let me know!
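The source messages in Step 3 all follow the same layout, which could be generated rather than typed by hand. The sketch below is a hypothetical helper (`format_source_prompt` is not from the experiment) that reproduces that layout; the article text itself is passed in as `content` and is not included here.

```python
def format_source_prompt(index, publisher, ref_number, title, content, subtitle=None):
    """Build a 'Source #N' message in the layout used in Step 3."""
    lines = [
        f"Source #{index}: Keep track of this source as the following:",
        "",
        f"{publisher} [{ref_number}]",
        f"Title: {title}",
    ]
    # Only some sources (here, the Kyiv Post) have a subtitle.
    if subtitle is not None:
        lines.append(f"Subtitle: {subtitle}")
    lines.append(f"Content: {content}")
    return "\n".join(lines)
```

Using the article's own reference number (20, 21) as the label, as the experiment did, means the model's answers can be compared directly against the article's citations.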

Step 4

4. The following prompt was given to Chat-GPT4: All sources have been provided. You will now be given a text of information. Your task will be to look at the information given to you and identify what source or sources state the specific information in the text. Do you understand your task?

4.1. To this prompt, the A.I. responded: Yes, I understand the task. Go ahead and provide the text, and I'll identify which source or sources state the specific information it contains.

Step 5

5. Chat-GPT4 was provided with this prompt: Here is the text. Please identify which source or sources mention each sentence of this text. Each source was assigned a [#]. List the "[#]"s after each sentence... After this prompt, the section's text was copied and pasted into the A.I. without any source numbers ([#]) after the sentences.

5.1. To this prompt, Chat-GPT4 responded with:

The provided text includes details not directly found in the sources you've given me ([20] and [21]). Here's a breakdown based on the information from the sources provided:
"The Freedom of Russia Legion urged civilians in Belgorod and Kursk to evacuate amid them being 'forced to inflict fire damage on military positions in Belgorod and Kursk.' They also reported to have destroyed a Russian control centre." This information aligns with Kyiv Post [20], where the Freedom of Russia Legion, alongside other groups, warned of strikes on Belgorod and Kursk and reported the destruction of a Russian control center.
The rest of the text, discussing the video's location being misrepresented as Tyotkino when it was actually filmed in Ryzhivka, the admission from a representative of the Freedom of Russia Legion about the video's true location, and the lack of independent confirmation of the Legion's presence in Tyotkino, does not match the specific details from either source [20] or [21] you provided. This additional narrative suggests information that might come from another, unprovided source, as it discusses specifics about video footage analysis and locations that were not mentioned in the provided summaries from Kyiv Post [20] or Institute for the Study of War [21].
Therefore, for the first part of your text, the source is Kyiv Post [20]. The second part does not directly correspond to information from the provided sources.
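The preparation described in Step 5, pasting the section text without its [#] markers, can be sketched as follows. Both function names are hypothetical, and the sentence splitter is a deliberately naive assumption (it splits after ".", "!" or "?", optionally followed by a closing double quote) that will mis-split abbreviations.

```python
import re

# Matches bracketed numeric citation markers such as [1] or [20].
CITATION_MARKER = re.compile(r"\[\d+\]")


def strip_citation_markers(text):
    """Remove [#] markers so the model cannot read the answer off the text."""
    return CITATION_MARKER.sub("", text)


def split_sentences(text):
    """Naively split on whitespace that follows sentence-ending punctuation."""
    parts = re.split(r'(?<=[.!?])\s+|(?<=[.!?]")\s+', text.strip())
    return [part for part in parts if part]
```

Splitting into sentences is optional; in the experiment the whole section was pasted at once and the model grouped its answer by paragraph.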

Step 6

Testing the third source. To test the third source, the citation ([1]) was retrieved from the 12 March section of the article.

6.1. Chat-GPT4 was then provided with the following prompt: I am going to give you an additional source. Please test the content of that source against the text above in the same way you tested the others. Do you understand your task?

6.11. To this, Chat-GPT4 responded with: Yes, I understand the task. Please provide the additional source, and I'll test the content of that source against the text above in the same manner as before.

6.2. Chat-GPT4 was then given the following:

Агентство [18]
Title: Опубликованное легионом «Свобода России» «видео из Теткино» оказалось снято в Украине
Content: (Copy/paste of news article. Content of exact copy/paste is not being posted for copyright regulations.)

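NOTE: This source was in Russian, not English. The publisher's name is Agentstvo, and the title translates as: The "video from Tyotkino" published by the Freedom of Russia Legion turned out to have been filmed in Ukraine.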

6.21. To this prompt, it responded:

Based on the additional source provided, Агентство [18], here's how the text aligns with the sources:
"The Freedom of Russia Legion urged civilians in Belgorod and Kursk to evacuate amid them being 'forced to inflict fire damage on military positions in Belgorod and Kursk.' They also reported to have destroyed a Russian control centre." This is confirmed by Kyiv Post [20].
"In the morning the Legion published a video with 'greetings from the liberated part of Tyotkino', announcing a raid deep into the village. However, analysis by Agentstvo showed that the footage was filmed in the Ukrainian border village of Ryzhivka, 300 meters from Tyotkino. A representative of the Freedom of Russia Legion told Agentstvo that the video with the fighters was indeed recorded on the edge of Ryzhivka, even 'before the storming of Tyotkino'. According to his data, at the time the video was published, the military were already in Tyotkino, but he could not provide the video from there, explaining that he risked calling fire on the fighters. There was no independent confirmation that the Freedom of Russia Legion entered Tyotkino." This is confirmed by Агентство [18].
Therefore, for the specific details regarding the video's location and the commentary from the representative of the Freedom of Russia Legion, it aligns with Агентство [18].

Conclusion

Editors should always be aware that large language models can provide false information. However, as seen in this experiment, one use for large language models is to help identify which sources the content in a Wikipedia article originates from. Part of the proposed Wikipedia policy for large language models (Wikipedia:Large language models) specifically states that editors can use them if the content is scrutinized heavily. The results of this experiment showed that editors can copy/paste a source into a large language model, followed by a portion of a Wikipedia article, and use the large language model to help identify what information is supported by which sources and what is unsourced.
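One way an editor might record the model's answers for later manual checking is a simple mapping from each passage to the reference numbers the model named. The sketch below is hypothetical (the function `find_unsourced` and the shortened passage labels are illustrative only) and encodes the outcome of Test 1 after Step 5, before the third source was supplied.

```python
def find_unsourced(attributions):
    """Return passages for which the model named no supporting source."""
    return [passage for passage, refs in attributions.items() if not refs]


# Outcome after Step 5: the first paragraph was matched to Kyiv Post [20];
# the second paragraph matched neither [20] nor [21].
after_step_5 = {
    "Paragraph 1 (evacuation warning, control centre)": [20],
    "Paragraph 2 (Tyotkino video filmed in Ryzhivka)": [],
}

print(find_unsourced(after_step_5))
```

Any passage the helper returns still needs a human check, as does every attribution the model did make, since the model's matches are claims to verify rather than confirmations.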

For transparency and verification, the complete Chat-GPT4 conversation can be found here: https://chat.openai.com/share/47a3733e-1ae9-4830-9576-b981be27d76f