Synthical
Tim Dettmers
Articles
18
Holistically Evaluating the Environmental Impact of Creating Language Models
3 March 2025 by Jacob Morrison and others
Computers and Society, Artificial Intelligence
OLMoE: Open Mixture-of-Experts Language Models
3 March 2025 by Niklas Muennighoff and others at University of Washington
Computation and Language, Artificial Intelligence
MatFormer: Nested Transformer for Elastic Inference
15 December 2024 by Devvrit and others
Machine Learning, Computation and Language
OLMoE: Open Mixture-of-Experts Language Models
3 September 2024 by Niklas Muennighoff and others
Computation and Language, Artificial Intelligence
Scaling Retrieval-Based Language Models with a Trillion-Token Datastore
9 July 2024 by Rulin Shao and others
Computation and Language, Artificial Intelligence
Distributed Inference and Fine-tuning of Large Language Models Over The Internet
13 December 2023 by Alexander Borzunov and others
Machine Learning, Distributed, Parallel, and Cluster Computing
Towards A Unified View of Sparse Feed-Forward Network in Pretraining Large Language Model
24 October 2023 by Zeyu Leo Liu and others
Computation and Language, Machine Learning
Stable and low-precision training for large-scale vision-language models
17 October 2023 by Mitchell Wortsman and others
Machine Learning, Computer Vision and Pattern Recognition
SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient
29 June 2023 by Max Ryabinin and others
Distributed, Parallel, and Cluster Computing, Machine Learning
SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
5 June 2023 by Tim Dettmers and others at University of Washington
Computation and Language, Machine Learning
QLoRA: Efficient Finetuning of Quantized LLMs
23 May 2023 by Tim Dettmers and others at University of Washington
Machine Learning
Petals: Collaborative Inference and Fine-tuning of Large Models
2 March 2023 by Alexander Borzunov and others at University of Washington
Machine Learning, Distributed, Parallel, and Cluster Computing
The case for 4-bit precision: k-bit Inference Scaling Laws
28 February 2023 by Tim Dettmers and Luke Zettlemoyer
Machine Learning, Neural and Evolutionary Computing
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
11 December 2022 by BigScience Workshop and others
Computation and Language
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
10 November 2022 by Tim Dettmers and others
Machine Learning, Artificial Intelligence
Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models
5 August 2022 by Margaret Li and others
Computation and Language
Training Transformers Together
7 July 2022 by Alexander Borzunov and others
Machine Learning, Distributed, Parallel, and Cluster Computing
BASE Layers: Simplifying Training of Large, Sparse Models
30 March 2021 by Mike Lewis and others
Computation and Language
Sparse Networks from Scratch: Faster Training without Losing Performance
23 August 2019 by Tim Dettmers and Luke Zettlemoyer
Machine Learning, Neural and Evolutionary Computing
Jack the Reader - A Machine Reading Framework
20 June 2018 by Dirk Weissenborn and others
Computation and Language, Machine Learning
Convolutional 2D Knowledge Graph Embeddings
8 July 2017 by Tim Dettmers and others
Machine Learning
8-Bit Approximations for Parallelism in Deep Learning
22 November 2015 by Tim Dettmers
Neural and Evolutionary Computing, Machine Learning
Topics
Machine Learning
Computation and Language
Distributed, Parallel, and Cluster Computing
Neural and Evolutionary Computing
Computer Vision and Pattern Recognition
Artificial Intelligence