04:00
Gata releases ChatGPT-RealUser-2.2M, a large-scale dataset of real-user ChatGPT dialogues
ChainCatcher reports that Gata, a decentralized AI infrastructure company, has announced the launch of ChatGPT-RealUser-2.2M, a global, large-scale dataset of real-user ChatGPT dialogues. The dataset was collected through Gata's GPT-to-Earn program, in which users participate voluntarily. It contains over 2.24 million real dialogues and nearly 3.56 million Q&A pairs from more than 15,000 users, covering interactions with GPT-3.5, GPT-4, and o1.
According to the announcement, the dataset is about twice the size of previous comparable datasets from the Allen Institute for AI and covers real-world scenarios and multi-turn dialogues. Because of the on-chain incentive mechanism, it also contains a large number of cryptocurrency-related interactions. A preview version has been released.