Artificial intelligence: Difference between revisions

Line 1:

{{ToneWarning}}~~{{Incomplete|Issue 1=Article is currently under renovation.}}~~

'''Artificial intelligence''' (AI) is a field of computer science producing software. Since the November 2022 launch of [[ChatGPT]], [[wikipedia:Large language model|large language model]] (LLM) chatbots have been a main focus of the industry with billions of dollars in funding allocated to producing more "intelligent" LLMs. Also a significant focus are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image using written instructions, and [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.

~~The current well-funded, lucrative industry of artificial intelligence tools has resulted in rampant unethical use of content, both in its creation and in its implementation.~~

'''Artificial intelligence''' (AI) is a field of computer science producing software that aims to ultimately replace all manual labor. AI is not a new concept; it has been of interest since as early as the 1950s. Since the November 2022 launch of [[ChatGPT]], [[wikipedia:Large language model|large language model]] (LLM) chatbots have been a main focus of the industry, with billions of dollars in funding allocated to producing more "intelligent" LLMs. Other significant focuses are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image from written instructions, and [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.

~~== Background ==~~

[[wikipedia:Generative artificial intelligence|Generative artificial intelligence]] models are trained on vast amounts of existing human-generated content. An LLM, for example, learns common patterns in sentence structure, which lets it form complete sentences and show artificial "knowledge" of a topic. Because that knowledge is artificial, the model may [[wikipedia:Hallucination (artificial intelligence)|hallucinate]], producing confidently written output that is mostly or entirely incorrect.

The current well-funded, lucrative industry of artificial intelligence tools has resulted in rampant unethical use of content. Startups intending to produce AI services have been rapidly scraping the internet for content to train future models, and members of the field are concerned that they are approaching the limit of publicly available content to train on.<ref>{{Cite web |last=Tremayne-Pengelly |first=Alexandra |date=16 Dec 2024 |title=Ilya Sutskever Warns A.I. Is Running Out of Data—Here’s What Will Happen Next |url=https://observer.com/2024/12/openai-cofounder-ilya-sutskever-ai-data-peak/ |website=Observer}}</ref>

==Unethical website scraping==

While "mainstream" companies such as [[OpenAI]], [[Anthropic]], and [[Meta]] appear to correctly follow industry-standard practice for web crawlers~~{{Citation needed}}~~, others ignore them, causing [[wikipedia:Denial-of-service attack|distributed denial of service attacks]] which damage access to freely-accessible websites. This is particularly an issue for websites that are large or contain many dynamic links.

Further Reading: [[Nonconsensual Scraping|Nonconsensual scraping]]

While "mainstream" companies such as [[OpenAI]], [[Anthropic]], and [[Meta]] appear to correctly follow industry-standard practice for web crawlers, others ignore it, generating enough traffic to amount to [[wikipedia:Denial-of-service attack|distributed denial-of-service attacks]] that degrade access to freely accessible websites. This is a particular problem for websites that are large or contain many dynamic links.

Ethical website scrapers, known as "spiders" because they crawl the web, follow a minimum set of guidelines. Specifically, they follow [[wikipedia:robots.txt|robots.txt]], a text file found at the root of a domain that indicates:

Line 95:

Line 97:

In some cases, these AI models can also be hijacked for malicious purposes. As demonstrated with Comet ([[Perplexity]]), attackers can feed arbitrary prompts to the browser's built-in AI assistant by hiding text in HTML comments, non-visible webpage text, or simple comments on a webpage.<ref>{{Cite web |date=Aug 20, 2025 |title=Tweet from Brave |url=https://xcancel.com/brave/status/1958152314914508893#m |access-date=Aug 24, 2025 |website=X (formerly [[Twitter]])}}</ref> These arbitrary prompts can then be abused to steal sensitive information or, worse, break into high-value accounts, such as those for banking or game libraries.<ref>{{Cite web |date=Aug 23, 2025 |title=Tweet from zack (in SF) |url=https://xcancel.com/zack_overflow/status/1959308058200551721 |access-date=Aug 24, 2025 |website=X (formerly [[Twitter]])}}</ref>
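
The gap between what a reader sees and what an assistant might be given can be illustrated with a short sketch. The example below is hypothetical and is not how Comet or any other browser is known to work: the names <code>VisibleTextExtractor</code> and <code>visible_text</code> are made up for illustration, and the filter only assumes that hidden text lives in HTML comments, in elements with the <code>hidden</code> attribute, or in elements with an inline <code>display:none</code> or <code>visibility:hidden</code> style. It cannot catch text hidden by external stylesheets, and it does nothing about instructions placed in ordinary visible comments.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
# Elements whose contents are never rendered as page text.
SKIPPED_TAGS = {"script", "style", "template", "noscript"}


def _is_hidden(tag, attrs):
    """Return True if this element is (by assumption) invisible to a reader."""
    attr_map = {name: (value or "") for name, value in attrs}
    if tag in SKIPPED_TAGS or "hidden" in attr_map:
        return True
    style = attr_map.get("style", "").replace(" ", "").lower()
    return "display:none" in style or "visibility:hidden" in style


class VisibleTextExtractor(HTMLParser):
    """Hypothetical helper: collects only text outside hidden subtrees."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.hidden_depth = 0  # > 0 while inside a hidden subtree

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth > 0 or _is_hidden(tag, attrs):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        # An unclosed tag inside a hidden subtree keeps the depth above zero,
        # so malformed markup errs toward dropping text rather than leaking it.
        if self.hidden_depth > 0:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if self.hidden_depth == 0 and data.strip():
            self.chunks.append(data.strip())

    # handle_comment is deliberately not overridden, so HTML comments are dropped.


def visible_text(html):
    """Return the page text with comments and hidden elements removed."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

Calling <code>visible_text</code> on a page that carries an instruction inside an HTML comment or a <code>display:none</code> element returns only the text a reader would see, which shows how much extra input an assistant receives when it is handed the raw page instead.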

~~== Wider impact ==~~

~~=== Enshittification ===~~

~~The most widespread usage and direct consumer relationship that affects even those who do not directly engage with LLM's is the furthering enshittification of the internet at large.~~

==References==

@@ Line 1: / Line 1: @@
-{{ToneWarning}}{{Incomplete|Issue 1=Article is currently under renovation.}}
+{{ToneWarning}}
-'''Artificial intelligence''' (AI) is a field of computer science producing software. Since the November 2022 launch of [[ChatGPT]], [[wikipedia:Large language model|large language model]] (LLM) chatbots have been a main focus of the industry with billions of dollars in funding allocated to producing more "intelligent" LLMs. Also a significant focus are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image using written instructions, and [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.
-The current well-funded, lucrative industry of artificial intelligence tools has resulted in rampant unethical use of content, both in its creation and in its implementation.
+'''Artificial intelligence''' (AI) is a field of computer science producing software that aims to ultimately replace all manual labor. AI is not a new concept; it has been of interest since as early as the 1950s. Since the November 2022 launch of [[ChatGPT]], [[wikipedia:Large language model|large language model]] (LLM) chatbots have been a main focus of the industry, with billions of dollars in funding allocated to producing more "intelligent" LLMs. Other significant focuses are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image from written instructions, and [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.
-== Background ==
+[[wikipedia:Generative artificial intelligence|Generative artificial intelligence]] models are trained on vast amounts of existing human-generated content. An LLM, for example, learns common patterns in sentence structure, which lets it form complete sentences and show artificial "knowledge" of a topic. Because that knowledge is artificial, the model may [[wikipedia:Hallucination (artificial intelligence)|hallucinate]], producing confidently written output that is mostly or entirely incorrect.
+The current well-funded, lucrative industry of artificial intelligence tools has resulted in rampant unethical use of content. Startups intending to produce AI services have been rapidly scraping the internet for content to train future models, and members of the field are concerned that they are approaching the limit of publicly available content to train on.<ref>{{Cite web |last=Tremayne-Pengelly |first=Alexandra |date=16 Dec 2024 |title=Ilya Sutskever Warns A.I. Is Running Out of Data—Here’s What Will Happen Next |url=https://observer.com/2024/12/openai-cofounder-ilya-sutskever-ai-data-peak/ |website=Observer}}</ref>
 ==Unethical website scraping==
-While "mainstream" companies such as [[OpenAI]], [[Anthropic]], and [[Meta]] appear to correctly follow industry-standard practice for web crawlers{{Citation needed}}, others ignore them, causing [[wikipedia:Denial-of-service attack|distributed denial of service attacks]] which damage access to freely-accessible websites. This is particularly an issue for websites that are large or contain many dynamic links.
+ Further Reading: [[Nonconsensual Scraping|Nonconsensual scraping]]
+While "mainstream" companies such as [[OpenAI]], [[Anthropic]], and [[Meta]] appear to correctly follow industry-standard practice for web crawlers, others ignore it, generating enough traffic to amount to [[wikipedia:Denial-of-service attack|distributed denial-of-service attacks]] that degrade access to freely accessible websites. This is a particular problem for websites that are large or contain many dynamic links.
 Ethical website scrapers, known as "spiders" because they crawl the web, follow a minimum set of guidelines. Specifically, they follow [[wikipedia:robots.txt|robots.txt]], a text file found at the root of a domain that indicates:
@@ Line 95: / Line 97: @@
 In some cases, these AI models can also be hijacked for malicious purposes. As demonstrated with Comet ([[Perplexity]]), attackers can feed arbitrary prompts to the browser's built-in AI assistant by hiding text in HTML comments, non-visible webpage text, or simple comments on a webpage.<ref>{{Cite web |date=Aug 20, 2025 |title=Tweet from Brave |url=https://xcancel.com/brave/status/1958152314914508893#m |access-date=Aug 24, 2025 |website=X (formerly [[Twitter]])}}</ref> These arbitrary prompts can then be abused to steal sensitive information or, worse, break into high-value accounts, such as those for banking or game libraries.<ref>{{Cite web |date=Aug 23, 2025 |title=Tweet from zack (in SF) |url=https://xcancel.com/zack_overflow/status/1959308058200551721 |access-date=Aug 24, 2025 |website=X (formerly [[Twitter]])}}</ref>
-== Wider impact ==
-=== Enshittification ===
-The most widespread usage and direct consumer relationship that affects even those who do not directly engage with LLM's is the furthering enshittification of the internet at large.
 ==References==