Synthical
Articles by Martin Jaggi
Could ChatGPT get an Engineering Degree? Evaluating Higher Education Vulnerability to AI Assistants
27 November 2024 by Beatriz Borges and others
Computers and Society, Artificial Intelligence
Analyzing & Reducing the Need for Learning Rate Warmup in GPT Training
31 October 2024 by Atli Kosson and others
Machine Learning
QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs
29 October 2024 by Saleh Ashkboos and others
Machine Learning
Improving Stochastic Cubic Newton with Momentum
25 October 2024 by El Mahdi Chayti and others
Optimization and Control, Machine Learning
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations
17 October 2024 by Alexander Hägele and others
Machine Learning
HyperINF: Unleashing the HyperPower of the Schulz's Method for Data Influence Estimation
7 October 2024 by Xinyu Zhou and others
Machine Learning
On-Device Collaborative Language Modeling via a Mixture of Generalists and Specialists
1 October 2024 by Dongyang Fan and others
Machine Learning, Computation and Language
CoBo: Collaborative Learning via Bilevel Optimization
9 September 2024 by Diba Hashemi and others
Machine Learning, Distributed, Parallel, and Cluster Computing
A New First-Order Meta-Learning Algorithm with Convergence Guarantees
5 September 2024 by El Mahdi Chayti and Martin Jaggi at EPFL
Machine Learning, Optimization and Control
Unified Convergence Theory of Stochastic and Variance-Reduced Cubic Newton Methods
5 September 2024 by El Mahdi Chayti and others at EPFL
Optimization and Control, Machine Learning
Personalized Collaborative Fine-Tuning for On-Device Large Language Models
6 August 2024 by Nicolas Wagner and others at EPFL
Computation and Language, Machine Learning
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
3 June 2024 by Atli Kosson and others at EPFL
Machine Learning
Effective Interplay between Sparsity and Quantization: From Theory to Practice
31 May 2024 by Simla Burcu Harma and others
Machine Learning, Artificial Intelligence
Accuracy Booster: Enabling 4-bit Fixed-point Arithmetic for DNN Training
31 May 2024 by Simla Burcu Harma and others
Machine Learning
Deep Grokking: Would Deep Neural Networks Generalize Better?
29 May 2024 by Simin Fan and others at EPFL
Machine Learning
InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts
29 May 2024 by Vinitra Swamy and others at EPFL
Machine Learning, Computers and Society
The Privacy Power of Correlated Noise in Decentralized Learning
3 May 2024 by Youssef Allouah and others
Machine Learning, Cryptography and Security
DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging
21 March 2024 by Matteo Pagliardini and others at EPFL
Computation and Language, Machine Learning
Layer-wise Linear Mode Connectivity
19 March 2024 by Linara Adilova and others
Machine Learning
Towards an empirical understanding of MoE design choices
20 February 2024 by Dongyang Fan and others
Machine Learning, Artificial Intelligence
On Convergence of Incremental Gradient for Non-Convex Smooth Functions
12 February 2024 by Anastasia Koloskova and others
Machine Learning, Optimization and Control
Spectral Preconditioning for Gradient Methods on Graded Non-convex Functions
7 February 2024 by Nikita Doikov and others at EPFL
Optimization and Control
Attention with Markov: A Framework for Principled Analysis of Transformers via Markov Chains
6 February 2024 by Ashok Vardhan Makkuva and others at EPFL
Machine Learning, Computation and Language
LASER: Linear Compression in Wireless Distributed Optimization
6 February 2024 by Ashok Vardhan Makkuva and others
Neural and Evolutionary Computing, Artificial Intelligence
DoGE: Domain Reweighting with Generalization Estimation
5 February 2024 by Simin Fan and others at EPFL
Machine Learning, Artificial Intelligence
Ghost Noise for Regularizing Deep Neural Networks
19 December 2023 by Atli Kosson and others
Machine Learning
Provably Personalized and Robust Federated Learning
18 December 2023 by Mariel Werner and others
Machine Learning, Distributed, Parallel, and Cluster Computing
MEDITRON-70B: Scaling Medical Pretraining for Large Language Models
27 November 2023 by Zeming Chen and others at EPFL
Computation and Language, Artificial Intelligence
Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing
22 November 2023 by Sai Praneeth Karimireddy and others
Machine Learning
Landmark Attention: Random-Access Infinite Context Length for Transformers
20 November 2023 by Amirkeivan Mohtashami and Martin Jaggi at EPFL
Computation and Language, Machine Learning