Artificial intelligence: Difference between revisions

Drakeula (talk | contribs)
Unethical website scraping: Moved web scraping content to sub-article about training
-u-n- (talk | contribs)
m Unethical website scraping: Rename section, add brief information
 
(One intermediate revision by the same user not shown)
Line 1: Line 1:
{{Irrelevant}}{{ToneWarning}}


'''Artificial intelligence''' (AI) is a field of computer science producing systems that aim to solve problems which humans solve by using intelligence. So far, no AI solutions are intelligent. AI is not a new concept: it has been of interest since at least the 1950s. AI is a catch-all term that encompasses many areas and techniques, so merely saying that something uses AI tells one little about it.
'''Artificial intelligence''' (AI) is a field of computer science producing systems that aim to solve problems which humans solve by using intelligence. In the consumer and industry space, the term commonly refers to chatbots or [[wikipedia:Large language model|large language models]] (LLMs), which have been a main focus of industry since the November 2022 launch of [[ChatGPT]], with tens of billions of dollars in funding allocated to producing more popular LLMs. Also a significant focus are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image from a written prompt, and, less commonly, [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.


Since the November 2022 launch of [[ChatGPT]], [[wikipedia:Large language model|large language model]] (LLM) chatbots have been a main focus of industry, with tens of billions of dollars in funding allocated to producing more popular LLMs. Also a significant focus are [[wikipedia:Text-to-image model|text-to-image models]], which "draw" an image from a written prompt, and [[wikipedia:Text-to-video model|text-to-video models]], which extend the text-to-image concept across several smooth video frames.
So far, no AI solutions are intelligent. AI is not a new concept: it has been of interest since at least the 1950s. AI is a catch-all term that encompasses many areas and techniques, so merely saying that something uses AI tells one little about it.


[[wikipedia:Generative artificial intelligence|Generative artificial intelligence]] models are trained on vast amounts of existing human-generated content. Using the example of an LLM, by gathering statistics on patterns of words that people use, the model can generate sequences of words that seem similar to what a person might have written. LLMs do not understand anything and cannot reason. Everything they generate is just a randomly modulated pattern of tokens. People reading sequences of tokens sometimes see things they think of as being true. Sequences which do not make sense to the reader, or which are false, are called [[wikipedia:Hallucination (artificial intelligence)|hallucinations]]. LLMs are typically trained to produce output which is pleasing to people, and they exhibit [[dark patterns]]: for example, they often produce output which seems confidently written, praises the user (sycophancy), and uses emotionally manipulative language.
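The idea of generating text from word-pattern statistics can be illustrated with a toy sketch. The example below is only an illustration under strong simplifying assumptions: it counts which word follows which in a tiny made-up corpus (a bigram model) and samples from those counts, whereas real LLMs use neural networks over tokens and far larger training sets. The function names <code>train_bigrams</code> and <code>generate</code> and the corpus are hypothetical and chosen for this example only.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count how often each word is followed by each other word."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, start, length, seed=0):
    """Build a word sequence, picking each next word in proportion to its count."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    output = [start]
    for _ in range(length - 1):
        followers = counts.get(output[-1])
        if not followers:
            break  # the last word was never followed by anything in the corpus
        choices, weights = zip(*followers.items())
        output.append(rng.choices(choices, weights=weights, k=1)[0])
    return output


# Made-up corpus for illustration only.
corpus = "the cat sat on the mat and the dog sat on the rug"
model = train_bigrams(corpus)
sample = generate(model, "the", 8)
print(" ".join(sample))
```

The output can look like a plausible sentence even though the program only replays word pairs it has counted; nothing in it represents meaning or checks whether the result is true.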
Line 13: Line 14:
The current, well-funded industry of artificial intelligence tools has resulted in rampant unethical use of content. Startups intending to produce AI services have been scraping the internet at a fast pace for content to train future models, and members of the field are concerned that they are approaching the limit of publicly available content to train on.<ref>{{Cite web |last=Tremayne-Pengelly |first=Alexandra |date=16 Dec 2024 |title=Ilya Sutskever Warns A.I. Is Running Out of Data—Here’s What Will Happen Next |url=https://observer.com/2024/12/openai-cofounder-ilya-sutskever-ai-data-peak/ |website=Observer}}</ref>


==Unethical website scraping==
==Why it is a problem==
===Unethical training on data===
:Further reading: [[Artificial intelligence/training]]


Further Reading: [[Artificial intelligence/training]]
Users' works are sometimes silently used for training without the users' explicit consent, as was the case with [[Adobe's AI policy]].


==Privacy concerns of online AI models==
===Privacy concerns of online AI models===
There are several concerns with using online AI models like [[ChatGPT]] ([[OpenAI]]), not only because they are proprietary, but also because there is no guarantee as to where your data ends up being stored or what it is used for. Recently developed local AI models offer an alternative to these online AI models, as they work offline once they are downloaded from platforms like [https://huggingface.co/ HuggingFace]. Commonly run models include Llama ([[Meta]]), DeepSeek ([[DeepSeek]]), Phi ([[Microsoft]]), Mistral ([[Mistral AI]]), and Gemma ([[Google]]).