This is dumb. But, there’s a hint of an interesting idea in there. If LLMs sample all human text and produce statistical averages from it, there’s a sense in which they contain a statistically average opinion.
It’s basically like how if you use Google to search for “how many calories are there in a” it will suggest the next word. The word it suggests is the statistically average way to finish that sentence. That also means it’s the food item for which people most want to know the calories. At least, it’s the item they most type into a Google search box. It’s just matching text patterns, but it reveals something about people that, say, a fast food company might find useful.
If you scale up the population of humanity to 8 trillion people and have thousands of years of data in these LLMs, maybe you actually do get useful insights about what people care about. And maybe that’s how you get psychohistory from Asimov’s Foundation series.
Even if it did spit out the average value, it would be the average of the training data. I don’t think people/opinions are evenly distributed in LLM training data.
Just look into how racist computer vision models can be.
It’s one thing if you present your findings like that, but pretending it’s a poll is complete bullshit.
Yeah, definitely.