aifaq.wtf

"How do you know about all this AI stuff?"
I just read tweets, buddy.


AI is already reshaping newsrooms, AP study finds - Poynter

#journalism   #link  

Nearly 70% of newsroom staffers from a variety of backgrounds and organizations surveyed in December say they’re using the technology for crafting social media posts, newsletters and headlines; translation and transcribing interviews; and story drafts, among other uses. One-fifth said they’d used generative AI for multimedia, including social graphics and videos.

OpenAI transcribed over a million hours of YouTube videos to train GPT-4

#training data   #youtube   #openai   #ethics   #link  

FABLES: Evaluating faithfulness and content selection in book-length summarization

#hallucinations   #summarization   #context   #context window   #link   #claude   #openai   #mixtral   #gpt-4  

An analysis of the annotations reveals that most unfaithful claims relate to events and character states, and they generally require indirect reasoning over the narrative to invalidate.

What kinds of things are AI tools especially bad at?

Something about calling an AI's work "well-done" feels far more anthropomorphic than it should.

While LLM-based auto-raters have proven reliable for factuality and coherence in other settings, we implement several LLM raters of faithfulness and find that none correlates strongly with human annotations, especially with regard to detecting unfaithful claims
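
What does "doesn't correlate strongly" cash out to? Once you line the auto-rater's verdicts up against the human ones, it's a small computation. A quick sketch with made-up labels (nothing here is from the paper's data):

```python
# A minimal, made-up sketch of the comparison (not the paper's code):
# parallel verdicts on the same claims, 1 = faithful, 0 = unfaithful.
human_labels      = [1, 1, 0, 1, 0, 0, 1, 1]  # human annotators
auto_rater_labels = [1, 1, 1, 1, 0, 1, 1, 1]  # an LLM judge

# Raw agreement can look fine...
agreement = sum(h == a for h, a in zip(human_labels, auto_rater_labels)) / len(human_labels)

# ...while the judge still misses most of what humans flagged as unfaithful,
# which is the failure mode the paper highlights.
flagged_by_humans = [a for h, a in zip(human_labels, auto_rater_labels) if h == 0]
unfaithful_recall = flagged_by_humans.count(0) / len(flagged_by_humans)

print(f"agreement: {agreement:.2f}")                        # 0.75
print(f"unfaithful-claim recall: {unfaithful_recall:.2f}")  # 0.33
```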

Of course this needs a link to my favorite hallucination leaderboard. It's tough, since it costs money to do this in a way that doesn't rely on LLMs to create and score the dataset. Which leads to...

Collecting human annotations on 26 books cost us $5.2K, demonstrating the difficulty of scaling our workflow to new domains and datasets.

$5k is somehow cost-prohibitive between UMass, Princeton, Adobe, and an AI institute? That... I don't know, seems like not very much money. I get that this is "best" done for pennies, but if someone had to cough up $5k each year to repeat this with newly-unknown data, I don't think it would be the worst thing in the world.
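
For a rough sense of scale, here's the back-of-the-envelope math (mine, not the paper's):

```python
# Back-of-the-envelope on the quoted figure: $5.2K across 26 books.
total_cost_usd = 5200
books = 26

print(f"about ${total_cost_usd / books:.0f} per book of human annotation")  # ~$200
```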

Finally, we move beyond faithfulness by exploring content selection errors in book-length summarization: we develop a typology of omission errors related to crucial narrative elements and also identify a systematic over-emphasis on events occurring towards the end of the book.
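
That end-of-book bias is the sort of thing you can eyeball once summary claims are mapped back to where they sit in the source text. A hypothetical sketch, with invented positions:

```python
# Hypothetical check for end-of-book bias: positions of summary claims in
# the source text, normalized to [0, 1]. These numbers are invented.
claim_positions = [0.12, 0.55, 0.71, 0.83, 0.88, 0.91, 0.95, 0.97]

mean_position = sum(claim_positions) / len(claim_positions)
final_third_share = sum(p > 2 / 3 for p in claim_positions) / len(claim_positions)

# A balanced summary would hover near 0.5 and ~33%; an end-heavy one won't.
print(f"mean position: {mean_position:.2f}")                # 0.74
print(f"share from final third: {final_third_share:.0%}")   # 75%
```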

Here are the omission types:

Elon Musk's X pushed a fake headline about Iran attacking Israel. X's AI chatbot Grok made it up.

#misinformation and disinformation   #journalism   #grok   #twitter   #link  

Permission is hereby granted | Suno

#audio   #music   #song   #link  

LEDITS++ - a Hugging Face Space by editing-images

#leditplusplus   #image editing   #link  

It's just a fun little image editing space.

Google Books Is Indexing AI-Generated Garbage

#training data   #model collapse   #google books   #dead internet theory   #spam content and pink slime   #google ngram viewer   #link  

How Tech Giants Cut Corners to Harvest Data for A.I.

#training data   #youtube   #openai   #ethics   #link  

If it exists online, they're gonna take it.

Some OpenAI employees discussed how such a move might go against YouTube’s rules, three people with knowledge of the conversations said. YouTube, which is owned by Google, prohibits use of its videos for applications that are “independent” of the video platform.

Ultimately, an OpenAI team transcribed more than one million hours of YouTube videos, the people said. The team included Greg Brockman, OpenAI’s president, who personally helped collect the videos, two of the people said. The texts were then fed into a system called GPT-4, which was widely considered one of the world’s most powerful A.I. models and was the basis of the latest version of the ChatGPT chatbot.

Doesn't matter what original license you might have granted or what makes sense, it's alllll theirs.

Last year, Google also broadened its terms of service. One motivation for the change, according to members of the company’s privacy team and an internal message viewed by The Times, was to allow Google to be able to tap publicly available Google Docs, restaurant reviews on Google Maps and other online material for more of its A.I. products.

I think my favorite point in the piece is how Meta came from a weaker position because Facebook users don't post essay-like content.

world_sim

#uncategorized   #link  

Models All The Way Down

#uncategorized   #link  

AI NPCs Have Potential, But Not The Kind Big Video Game Companies Want - Aftermath

#uncategorized   #link  

Diving Deeper into AI Package Hallucinations

#uncategorized   #link  

Extracting Social Support and Social Isolation Information from Clinical Psychiatry Notes: Comparing a Rule-based NLP System and a Large Language Model

#uncategorized   #link  

Can you trust ChatGPT’s package recommendations?

#uncategorized   #link  

Pluralistic: Humans are not perfectly vigilant (01 Apr 2024) – Pluralistic: Daily links from Cory Doctorow

#uncategorized   #link