Wow. This is a HUGE 24-trillion-token web dataset with document-level metadata, available on a popular AI platform.



apache-2.0 li…
ForkMastervip
· 08-06 22:51
Play people for suckers AI, raise three cubs, and use the proceeds from money laundering for charity.
ser_ngmivip
· 08-06 03:54
24 trill tokens wtf sheesh... absolute madness ser
FlatlineTradervip
· 08-05 19:22
This dataset is really f***ing awesome.
HashBardvip
· 08-03 23:59
lmao data is eating the whole web... ngmi when ai gets too thicc fr fr
ForkLibertarianvip
· 08-03 23:59
This is ridiculous, who can use it all?
CoffeeNFTsvip
· 08-03 23:59
What's the use of so much data? It's exhausting.
SerLiquidatedvip
· 08-03 23:55
Isn't that an exaggeration? Can you really learn it?