Wisdom of the silicon crowd: LLM ensemble prediction capabilities rival human crowd accuracy
The authors show that an ensemble of LLMs making probabilistic predictions on 31 binary questions is statistically indistinguishable from a human crowd: they find no significant difference between the LLM and human predictions. They also find that the LLMs exhibit human-like biases, and that LLM predictions improve when the models are shown the median human prediction.
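To make the setup concrete, here is a minimal sketch of how an ensemble forecast on one binary question could be formed and scored. It rests on assumptions the summary above does not state: the median as the aggregation rule, the Brier score as the accuracy measure, and a simple weighted average as the way a forecast is revised after seeing the human median. All function names and numbers are hypothetical and are not taken from the paper.

```python
from statistics import median


def ensemble_forecast(probabilities):
    """Aggregate individual probability forecasts into one crowd forecast.

    Assumption: the median is used as the aggregation rule.
    """
    return median(probabilities)


def brier_score(forecast, outcome):
    """Squared error between a probability forecast and a binary outcome (0 or 1).

    Lower is better; always answering 0.5 scores 0.25.
    """
    return (forecast - outcome) ** 2


def update_toward(own_forecast, human_median, weight=0.5):
    """Hypothetical revision rule: move a forecast part of the way toward the human median."""
    return (1 - weight) * own_forecast + weight * human_median


# Made-up forecasts from five models for a question that resolved "yes" (outcome = 1).
model_forecasts = [0.60, 0.70, 0.55, 0.80, 0.65]
crowd = ensemble_forecast(model_forecasts)
score = brier_score(crowd, 1)

# A single model at 0.80 revising after being shown a human median of 0.60.
revised = update_toward(0.80, 0.60)
```

Comparing the ensemble with a human crowd would then amount to computing such scores per question for both groups and testing whether the difference is statistically significant.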