5. April 2025

Parallellm Pump

Large Language Model (LLM) tools, such as ChatGPT and DeepSeek, have become a key part of people’s workflows, both professional and everyday. However, dozens of providers now offer a myriad of options at different price points; even a single provider has a multitude of models to choose from.

So where do you begin? The Parallellm Pump offers developers a power tool for comparing responses from several providers asynchronously, letting you be the judge of which one returns the best result. Still not sure? You can even ask the LLMs themselves to make the decision for you!
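To give a flavour of what an asynchronous comparison looks like, here is a minimal sketch using only Python's standard `asyncio` module. It is not the Parallellm Pump API: `fake_provider`, `compare` and the provider names are hypothetical stand-ins, and a real version would await an HTTP or SDK call where this one sleeps.

```python
import asyncio


async def fake_provider(name: str, delay: float, prompt: str) -> tuple[str, str]:
    """Hypothetical provider call; the sleep stands in for network latency."""
    await asyncio.sleep(delay)
    return name, f"[{name}] answer to: {prompt}"


async def compare(prompt: str) -> dict[str, str]:
    """Send the same prompt to every provider concurrently and collect replies."""
    tasks = [
        fake_provider("provider_a", 0.02, prompt),
        fake_provider("provider_b", 0.01, prompt),
    ]
    # gather() runs the coroutines concurrently and keeps the input order.
    results = await asyncio.gather(*tasks)
    return dict(results)


if __name__ == "__main__":
    for name, reply in asyncio.run(compare("What is 2 + 2?")).items():
        print(name, "->", reply)
```

Because the calls run concurrently, the total wait is roughly that of the slowest provider rather than the sum of all of them.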

19. February 2025

DeepSeek in the Cloud

In this post, I will share my experiences of running one of the DeepSeek open-weights models (DeepSeek-R1-Distill-Qwen-32B) directly on AWS hardware in the cloud - no need for API tokens.

The good news is that it’s easier than you think - modern libraries, such as PyTorch and the Hugging Face (🤗) transformers package, do much of the heavy lifting. I found some extra tips and tricks along the way to speed things up, and I will share these with you in this post.

20. December 2024

Let it Segment: A Gift from SAM

When Meta AI Research released the Segment Anything Model¹ (SAM) last year, the lie of the land in Computer Vision changed quite substantially: images could now be segmented easily, with great results even zero-shot. With the release of SAM2² earlier this year, I wanted to get hands-on and experiment with these models myself.

This post walks you through how SAM2 could be used in practice, provides a mini analysis of segmentation results and will be released with code so that you can explore further if you want to. This could be expanded to interesting use cases, such as facilitating object grasping in robotic systems, branded product addition or removal in marketing images, or mapping changes in forested areas from satellite imagery over time for environmental monitoring.

6. December 2024

Construction Timelapse

In this project, a stunning timelapse video was created from a stock of over 3,000 photos of a construction site, tracking the progress of a new residential building from breaking ground to completion, a process lasting more than three years. The images were taken without a tripod, so the camera position and angle varied considerably from shot to shot. To correct for this, Computer Vision techniques were used to predict key points in the images, which could then be used to straighten them and produce the final, steady timelapse.
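As an illustration of how matched key points can be turned into an alignment, here is a minimal sketch that fits an affine transform by least squares with NumPy. The teaser does not say which transform or library the project actually used, so the affine model and the helper names `estimate_affine` and `apply_affine` are assumptions made for this example.

```python
import numpy as np


def estimate_affine(src, dst):
    """Fit a 2x3 affine matrix M so that dst ~= src @ M[:, :2].T + M[:, 2]."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    # Append a column of ones so the translation is solved with the rest.
    design = np.hstack([src, np.ones((len(src), 1))])      # shape (n, 3)
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)  # shape (3, 2)
    return params.T                                        # shape (2, 3)


def apply_affine(matrix, points):
    """Map (n, 2) points through the 2x3 affine matrix."""
    points = np.asarray(points, dtype=float)
    return points @ matrix[:, :2].T + matrix[:, 2]


if __name__ == "__main__":
    # Key points in a reference frame and the same points in a shifted photo.
    reference = [[0, 0], [1, 0], [0, 1], [1, 1]]
    shifted = [[2, 3], [3, 3], [2, 4], [3, 4]]
    print(estimate_affine(shifted, reference))
```

At least three non-collinear point pairs are needed for a unique affine fit; using more pairs averages out errors in the predicted key points.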

4. October 2020

Atari Pong

This is a short post to describe my practical introduction to Reinforcement Learning (RL), where I trained a simple agent to play the classic Atari game Pong via a Deep Q-Network.

In plain English, this means we teach a novice computer to play the classic paddle game by letting it observe what happens when it performs various movements at different times and stages of gameplay (against the same, fairly strong opponent). After making a sequence of movement choices, our agent either gains a point (reward of +1) or loses one (reward of -1). After a lot of trial and error, the agent will have observed enough situations to learn what a good move is at any given moment in the game.
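The trial-and-error loop described above rests on two small ideas that can be sketched without any neural network: picking a move that is usually the best known one but occasionally random, and computing the value the agent should learn towards. This is a simplified illustration, not the code from the post; the function names, the discount factor `gamma` and the exploration rate `epsilon` are assumptions, and in a real Deep Q-Network the Q-values would come from the network.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Learning target for the chosen action: reward plus discounted future value."""
    if done:
        # The episode ended, so there is no future value to add.
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    q_now = [0.1, 0.9, 0.3]  # hypothetical values for three possible moves
    print("chosen action:", epsilon_greedy(q_now, epsilon=0.1))
    print("target:", td_target(reward=0.0, next_q_values=[0.5, 2.0], done=False))
```

The random moves are what let the agent stumble on situations it has not seen before; the target is what gradually pushes its value estimates towards the rewards it actually receives.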

28. April 2019

Find Tune

The objective of this project is to create a program that listens to a continuous stream of sound and identifies when a particular song - the target track - is playing. This is similar to how home assistants such as Amazon’s ‘Alexa’ function, except that they listen for a different sound (their name). Ultimately, this project will be used to replay the detected sound through a speaker, serving as a doorbell amplifier.
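One simple way to spot a known snippet inside a longer recording is normalized cross-correlation: slide the target over the stream and score how well each window matches. The sketch below uses NumPy and is an assumption about how such detection could work, not the method used in the project; the name `best_match` is hypothetical, and practical systems often compare spectrogram features rather than raw samples.

```python
import numpy as np


def best_match(stream, target):
    """Return (start_index, score) of the window that best matches the target.

    The score is a normalized cross-correlation in [-1, 1]; values near 1
    mean the window has nearly the same shape as the target.
    """
    stream = np.asarray(stream, dtype=float)
    target = np.asarray(target, dtype=float)
    centred_target = target - target.mean()
    target_norm = np.linalg.norm(centred_target)
    best_index, best_score = -1, -1.0
    for start in range(len(stream) - len(target) + 1):
        window = stream[start:start + len(target)]
        window = window - window.mean()
        denom = np.linalg.norm(window) * target_norm
        # A silent (constant) window has zero norm; score it as no match.
        score = float(window @ centred_target / denom) if denom > 0 else 0.0
        if score > best_score:
            best_index, best_score = start, score
    return best_index, best_score


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    snippet = rng.standard_normal(50)
    recording = np.zeros(300)
    recording[120:170] = snippet  # hide the snippet inside silence
    print(best_match(recording, snippet))
```

In a live setting the same scoring would run on a rolling buffer of recent audio, with a threshold on the score deciding when the target counts as detected.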