Quentin Lhoest
Articles (8)
Croissant: A Metadata Format for ML-Ready Datasets
9 December 2024 by Mubashara Akhtar and others at Google
Machine Learning, Artificial Intelligence
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages
4 April 2023 by Chris Chinenye Emezue and others
Computation and Language
The BigScience ROOTS Corpus: A 1.6TB Composite Multilingual Dataset
7 March 2023 by Hugo Laurençon and others at Leipzig
Computation and Language, Artificial Intelligence
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
11 December 2022 by BigScience Workshop and others
Computation and Language
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements
6 October 2022 by Leandro Von Werra and others
Machine Learning
Training Transformers Together
7 July 2022 by Alexander Borzunov and others
Machine Learning, Distributed, Parallel, and Cluster Computing
Distributed Deep Learning in Open Collaborations
8 November 2021 by Michael Diskin and others
Machine Learning, Distributed, Parallel, and Cluster Computing
Datasets: A Community Library for Natural Language Processing
7 September 2021 by Quentin Lhoest and others
Computation and Language
Topics
Machine Learning
Computation and Language
Artificial Intelligence
Distributed, Parallel, and Cluster Computing
Databases
Information Retrieval