Generative AI: Difference between revisions

Line 29:

==Specific controversies involving generative AI==

===Reddit ~~training~~ AI on ~~posts~~===

===Reddit - Training AI on user-generated content===

In late 2024, [[Reddit]] announced the release of 'Reddit Answers,' a large language model (LLM) that was publicly stated<ref>https://support.reddithelp.com/hc/en-us/articles/32026729424916-Reddit-Answers-Currently-in-Beta</ref> to use content created by users to train the tool, without requiring prior consent or prior public notice.

===DeviantArt DreamUp===

===DeviantArt - DreamUp===

Although this is more speculative, it is reasonable for users to assume<ref>https://www.reddit.com/r/AI_Generator_Guide/comments/167sbit/what_i_think_about_deviantarts_ai_choices/</ref> that, because [[DeviantArt]] initially opted all users in automatically to allowing their work to be used as training data for generative AI<ref>https://www.deviantart.com/izzy-paw/journal/deviantart-s-AI-art-program-and-how-to-opt-out-936581886</ref><ref>https://en.digitalreport.com.tr/deviantart-dreamup-how-to-use-ai-opt-out/</ref>, all content uploaded to DeviantArt was used as training data for its DreamUp tool. However, according to statements from DeviantArt CEO Moti Levy<ref>https://www.theverge.com/2022/11/15/23449036/deviantart-ai-art-dreamup-training-data-controversy</ref>, DeviantArt did not plan or intend to train its tool on user-generated works, and any user-generated works that were used in the model were introduced by Stability AI. Regardless, the introduction of DreamUp to the art-sharing platform has both stirred controversy on the platform<ref>https://arstechnica.com/information-technology/2022/11/deviantart-upsets-artists-with-its-new-ai-art-generator-dreamup/</ref> and fractured it into two camps<ref>https://www.youtube.com/watch?v=IGj_3OhMrAU</ref>: those for generative AI (typically holders of newer accounts) and those against (typically users who have been on the platform far longer).
Since the introduction of DreamUp, the platform has been cluttered with AI-generated images, and staff have repeatedly and intentionally featured users who exclusively upload generative AI content<ref>https://www.deviantart.com/team/art/DeviantArt-Seller-StygianAI-1077776294</ref><ref>https://www.deviantart.com/team/art/Create-on-DeviantArt-VeilAI-1108146133</ref><ref>https://www.deviantart.com/team/art/DeviantArt-Seller-ExeFelix-1075192370</ref> or who post content that uses generated content as a base<ref>https://www.deviantart.com/team/art/Create-on-DeviantArt-AKoukis-1108151629</ref>, with a majority of featured creators uploading AI-generated content exclusively or nearly exclusively.<!-- I was scrolling through their gallery and most featured artist posts were about AI creators, I stopped my search when I reached posts that released before the generative AI controversies on the platform occurred, which had a rough stopping point of around Q4 2022.

https://www.deviantart.com/team/gallery -->

Line 39:

Many users have had their content scraped by LAION for its training dataset, and the only way to opt out is through a third-party service<ref>https://haveibeentrained.com/</ref>.

====<big>~~STACKOVERFLOW~~ training data</big>====

====<big>StackOverflow - Overflow AI training data based on user generated content</big>====

In ~~mid~~ 2023, ~~StackOverflow~~ released [https://stackoverflow.blog/2023/07/27/announcing-overflowai/ ~~Overflow AI~~] ~~which uses all~~ users ~~questions and~~ answers ~~for their neural network that they charge enterprises for~~. ~~Despite their strong stance on [~~https://stackoverflow.com/help/gen-ai-policy ~~Generated AI Post Policy] they still require~~ users to not ~~generate AI responses despite scraping your human~~ content for ~~profit. As of now there still~~ is ~~no official way~~ to ~~opt out officially other than deleting~~ all ~~your~~ posts/topics ~~manually~~ before ~~permanently~~ deleting your account~~. But even then~~ their [https://stackoverflow.com/help/delete-content ~~Get Out Clause] still~~ allows them to ~~use it even after you thought you deleted it. They have completely burned their bridge of privacy and user trust for the platform and brand and the users~~ content that ~~the site~~ was ~~built by~~.

In late July 2023, [[Overflow AI]] was released by [[StackOverflow]]. The AI was trained on questions and answers contributed by the StackOverflow community.<ref>{{Cite news |last=StackOverflow |date=Jul 27, 2023 |title=Announcing OverflowAI |url=https://stackoverflow.blog/2023/07/27/announcing-overflowai/ |access-date=Mar 30, 2025 |work=StackOverflow blog}}</ref> This effectively contradicts an existing [[StackOverflow]] policy that prohibits users from posting AI-generated answers.<ref>{{Cite web |last=StackOverflow |title=Generative AI Policy |url=https://stackoverflow.com/help/gen-ai-policy |access-date=Mar 30, 2025 |website=StackOverflow help center}}</ref> Currently, the only way for users to keep their content from being scraped for Overflow AI is to manually delete all of their posts and topics before deleting their account. However, the effectiveness of this is questionable given the site's [[Get Out Clause]],<ref>{{Cite web |last=StackOverflow |title=How do I delete all my contributions? |url=https://stackoverflow.com/help/delete-content |access-date=Mar 30, 2025 |website=StackOverflow help center}}</ref> which allows it to retrieve content that was deleted.

===~~Microsofts Github Copilot training~~ on free user ~~repos~~===

===Microsoft - GitHub CoPilot trained on free user repositories===

~~According to this~~ [https://copilot.github.trust.page/faq?s=v2qe7voltpwtv2usl4ikhs#ip-and-open-source ~~Copilot FAQ] topic it will~~ not ~~train on people that are on PRO or ENTERPRISE plans but says nothing about~~ the ~~people using Github under~~ a ~~free account~~. ~~With~~ that ~~being said we can assume that any and all accounts that copilot is~~ in ~~currently is being used to further train the model. As of now there is no known way of opting out~~ of this ~~besides privatizing your repo (from context~~ it ~~seems only public ones are targeted for now) making the reach of your open source project limited~~ to ~~individuals you share it with~~. In some ~~cases~~ this has eroded ~~community~~ [https://github.com/orgs/community/discussions/152229 ~~Users Trust] with the platform that is built around developers content.~~

According to the FAQ for [[GitHub CoPilot]],<ref>{{Cite web |last=CoPilot |title=FAQ |url=https://copilot.github.trust.page/faq?s=v2qe7voltpwtv2usl4ikhs#ip-and-open-source |access-date=Mar 30, 2025 |website=github}}</ref> users who pay for a ''Pro'' or ''Enterprise'' tier plan do not have their repositories (''repos'') scanned for the purpose of training CoPilot. This can be considered a form of [[racketeering]], as consumers are forced to pay if they do not want [[Microsoft: Family 365 subscripcion forced upsell|Microsoft]] to profit indirectly from their content. There are theories that private repos may not be used for training purposes,{{Citation needed}} but this cannot be verified at this time. Some users on the platform have pushed back, as this has eroded their trust in [[GitHub]].<ref>{{Cite web |last=Dlindmark |date=Feb 23, 2025 |title=Does GitHub Copilot use any code from individual users to train GitHub's model (or any successor model)? #152229 |url=https://github.com/orgs/community/discussions/152229 |access-date=Mar 30, 2025 |website=GitHub}}</ref>

==References==

@@ Line 29: / Line 29: @@
 ==Specific controversies involving generative AI==
-===Reddit training AI on posts===
+===Reddit - Training AI on user-generated content===
 In late 2024, [[Reddit]] announced the release of 'Reddit Answers,' a large language model (LLM) that was publicly stated<ref>https://support.reddithelp.com/hc/en-us/articles/32026729424916-Reddit-Answers-Currently-in-Beta</ref> to use content created by users to train the tool, without requiring prior consent or prior public notice. <!-- Needs further coverage here -->
-===DeviantArt DreamUp<!-- Considering the over 2 year long history that continues to have new drama stir from this, we should look into eventually making a dedicated article focused on DreamUp -->===
+===DeviantArt - DreamUp<!-- Considering the over 2 year long history that continues to have new drama stir from this, we should look into eventually making a dedicated article focused on DreamUp -->===
 Although this is more speculative, it is reasonable for users to assume<ref>https://www.reddit.com/r/AI_Generator_Guide/comments/167sbit/what_i_think_about_deviantarts_ai_choices/</ref> that, because [[DeviantArt]] initially opted all users in automatically to allowing their work to be used as training data for generative AI<ref>https://www.deviantart.com/izzy-paw/journal/deviantart-s-AI-art-program-and-how-to-opt-out-936581886</ref><ref>https://en.digitalreport.com.tr/deviantart-dreamup-how-to-use-ai-opt-out/</ref>, all content uploaded to DeviantArt was used as training data for its DreamUp tool. However, according to statements from DeviantArt CEO Moti Levy<ref>https://www.theverge.com/2022/11/15/23449036/deviantart-ai-art-dreamup-training-data-controversy</ref>, DeviantArt did not plan or intend to train its tool on user-generated works, and any user-generated works that were used in the model were introduced by Stability AI. Regardless, the introduction of DreamUp to the art-sharing platform has both stirred controversy on the platform<ref>https://arstechnica.com/information-technology/2022/11/deviantart-upsets-artists-with-its-new-ai-art-generator-dreamup/</ref> and fractured it into two camps<ref>https://www.youtube.com/watch?v=IGj_3OhMrAU</ref>: those for generative AI (typically holders of newer accounts) and those against (typically users who have been on the platform far longer).
Since the introduction of DreamUp, the platform has been cluttered with AI-generated images, and staff have repeatedly and intentionally featured users who exclusively upload generative AI content<ref>https://www.deviantart.com/team/art/DeviantArt-Seller-StygianAI-1077776294</ref><ref>https://www.deviantart.com/team/art/Create-on-DeviantArt-VeilAI-1108146133</ref><ref>https://www.deviantart.com/team/art/DeviantArt-Seller-ExeFelix-1075192370</ref> or who post content that uses generated content as a base<ref>https://www.deviantart.com/team/art/Create-on-DeviantArt-AKoukis-1108151629</ref>, with a majority of featured creators uploading AI-generated content exclusively or nearly exclusively.<!-- I was scrolling through their gallery and most featured artist posts were about AI creators, I stopped my search when I reached posts that released before the generative AI controversies on the platform occurred, which had a rough stopping point of around Q4 2022.
 https://www.deviantart.com/team/gallery --><!-- Due to my close familiarity with the situation, yes, I developed this section a lot more than initially planned. -->
@@ Line 39: / Line 39: @@
 Many users have had their content scraped by LAION for its training dataset, and the only way to opt out is through a third-party service<ref>https://haveibeentrained.com/</ref>.
-====<big>STACKOVERFLOW training data</big>====
+====<big>StackOverflow - Overflow AI training data based on user generated content</big>====
-In mid 2023, StackOverflow released [https://stackoverflow.blog/2023/07/27/announcing-overflowai/ Overflow AI] which uses all users questions and answers for their neural network that they charge enterprises for. Despite their strong stance on [https://stackoverflow.com/help/gen-ai-policy Generated AI Post Policy] they still require users to not generate AI responses despite scraping your human content for profit. As of now there still is no official way to opt out officially other than deleting all your posts/topics manually before permanently deleting your account. But even then their [https://stackoverflow.com/help/delete-content Get Out Clause] still allows them to use it even after you thought you deleted it. They have completely burned their bridge of privacy and user trust for the platform and brand and the users content that the site was built by.
+In late July 2023, [[Overflow AI]] was released by [[StackOverflow]]. The AI was trained on questions and answers contributed by the StackOverflow community.<ref>{{Cite news |last=StackOverflow |date=Jul 27, 2023 |title=Announcing OverflowAI |url=https://stackoverflow.blog/2023/07/27/announcing-overflowai/ |access-date=Mar 30, 2025 |work=StackOverflow blog}}</ref> This effectively contradicts an existing [[StackOverflow]] policy that prohibits users from posting AI-generated answers.<ref>{{Cite web |last=StackOverflow |title=Generative AI Policy |url=https://stackoverflow.com/help/gen-ai-policy |access-date=Mar 30, 2025 |website=StackOverflow help center}}</ref> Currently, the only way for users to keep their content from being scraped for Overflow AI is to manually delete all of their posts and topics before deleting their account. However, the effectiveness of this is questionable given the site's [[Get Out Clause]],<ref>{{Cite web |last=StackOverflow |title=How do I delete all my contributions? |url=https://stackoverflow.com/help/delete-content |access-date=Mar 30, 2025 |website=StackOverflow help center}}</ref> which allows it to retrieve content that was deleted.
-===Microsofts Github Copilot training on free user repos===
+===Microsoft - GitHub CoPilot trained on free user repositories===
-According to this [https://copilot.github.trust.page/faq?s=v2qe7voltpwtv2usl4ikhs#ip-and-open-source Copilot FAQ] topic it will not train on people that are on PRO or ENTERPRISE plans but says nothing about the people using Github under a free account. With that being said we can assume that any and all accounts that copilot is in currently is being used to further train the model. As of now there is no known way of opting out of this besides privatizing your repo (from context it seems only public ones are targeted for now) making the reach of your open source project limited to individuals you share it with. In some cases this has eroded community [https://github.com/orgs/community/discussions/152229 Users Trust] with the platform that is built around developers content.<!-- Possibly move this topic into a main page as forms of racketeering is pretty serious -->
+According to the FAQ for [[GitHub CoPilot]],<ref>{{Cite web |last=CoPilot |title=FAQ |url=https://copilot.github.trust.page/faq?s=v2qe7voltpwtv2usl4ikhs#ip-and-open-source |access-date=Mar 30, 2025 |website=github}}</ref> users who pay for a ''Pro'' or ''Enterprise'' tier plan do not have their repositories (''repos'') scanned for the purpose of training CoPilot. This can be considered a form of [[racketeering]], as consumers are forced to pay if they do not want [[Microsoft: Family 365 subscripcion forced upsell|Microsoft]] to profit indirectly from their content. There are theories that private repos may not be used for training purposes,{{Citation needed}}<!-- Mentioned in previous version of this section --> but this cannot be verified at this time. Some users on the platform have pushed back, as this has eroded their trust in [[GitHub]].<ref>{{Cite web |last=Dlindmark |date=Feb 23, 2025 |title=Does GitHub Copilot use any code from individual users to train GitHub's model (or any successor model)? #152229 |url=https://github.com/orgs/community/discussions/152229 |access-date=Mar 30, 2025 |website=GitHub}}</ref><!-- Possibly move this topic into a main page as forms of racketeering is pretty serious -->
 ==References==
 <references />