Entropy Blog
2024
LogitLens from scratch with Hugging Face Transformers
In this short tutorial, we’ll implement LogitLens to inspect the internal representations of a pre-trained Phi-1.5 model. LogitLens is a straightforward yet effective...
Vision Transformer in pure JAX.
I decided to do this for two reasons. The first reason is that, for years, I had to put up with my Ph.D. advisor coming into the lab while I was happily coding my P...
Visualizing attention maps in pre-trained Vision Transformers from timm
Goal: Visualizing the attention maps for the CLS token in a pretrained Vision Transformer from the timm library.
Short notes on types of parallelism for training neural networks
As neural networks grow larger (see LLMs, though it now looks like there is also a trend towards smaller models such as Gemma2-2b) and datasets become more mass...
Efficiency Metrics in Machine Learning
In the world of machine learning, efficiency is a buzzword we hear all the time. New methods or models often come with the claim of being more efficient than...
FLOPs with PyTorch's built-in FLOPs counter
It is becoming more and more common to use FLOPs (floating-point operations) to measure the computational cost of deep learning models. For PyTorch users, un...
Adaptive Computation Modules
This brief post summarizes a project I have been working on over the past few months. You can find further details about this work here.
2023
Manifold learning
I have stumbled across this concept many times, so here I am writing a brief recap for myself.
Explainability for Graphs with PyTorch Geometric and Captum
In this Colab notebook, we show how to use explainability methods on Graph Neural Networks.
Entropy and Self-Information
This post contains short notes on entropy and self-information, and why machine learning adopted them from information theory.
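As a quick illustration of the two quantities this post covers, here is a minimal sketch in plain Python (the function names are my own, not from the post): self-information measures the surprise of a single outcome, and entropy is its expectation over a distribution.

```python
import math

def self_information(p, base=2):
    # Self-information I(x) = -log_b p(x): rarer events carry more information.
    return -math.log(p, base)

def entropy(dist, base=2):
    # Shannon entropy H(p) = sum_x p(x) * I(x): the expected self-information.
    # Terms with p(x) = 0 are skipped, following the convention 0 * log 0 = 0.
    return sum(p * self_information(p, base) for p in dist if p > 0)

# With base 2, a fair coin flip carries exactly 1 bit,
# while a biased coin carries less.
print(self_information(0.5))   # 1 bit
print(entropy([0.5, 0.5]))     # 1 bit
print(entropy([0.9, 0.1]))     # less than 1 bit
```

A uniform distribution over four outcomes gives 2 bits, and a degenerate (certain) distribution gives 0 bits, which matches the intuition that certain outcomes are uninformative.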
A Primer on Graph Neural Networks with PyTorch Geometric
In this Colab notebook, we show how to train a simple Graph Neural Network on the MUTAG dataset.