Below you will find pages that contain the keyword “MachineLearning”:
Let it Segment: A Gift from SAM
When Meta AI Research released the Segment Anything Model (SAM) last year, the lie of the land in Computer Vision changed quite substantially: images could now be segmented easily, with great results even zero-shot. With the release of SAM2 earlier this year, I wanted to get hands-on and experiment with these models myself.
This post walks you through how SAM2 can be used in practice, provides a mini analysis of segmentation results, and will be released with code so that you can explore further if you want to. The approach could be extended to interesting use cases, such as facilitating object grasping in robotic systems, adding or removing branded products in marketing images, or mapping changes in forested areas from satellite imagery over time for environmental monitoring.
Construction Timelapse
In this project, a stunning timelapse video was created from a stock of over 3,000 photos of a construction site, tracking the progress of a new residential building from breaking ground to completion, a process lasting more than three years. The images were taken without a tripod, so camera position and angle varied considerably from shot to shot. To correct for this, Computer Vision techniques were used to predict key points in the images, which could then be used to straighten them and produce the final, steady timelapse.
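To make the straightening step concrete, here is a minimal sketch of how matched key points can be turned into a correction. It assumes a similarity transform (rotation, uniform scale and shift) is enough, which is a simplification; the project itself may well have used a richer model. The function names `fit_similarity`, `rotation_degrees` and `straighten_point` are hypothetical and chosen only for illustration.

```python
import cmath
import math

def fit_similarity(src, dst):
    """Least-squares fit of dst ~ a * src + b, treating 2D points as complex numbers.

    The complex number a encodes rotation and scale, b encodes the shift.
    """
    zs = [complex(x, y) for x, y in src]
    ws = [complex(x, y) for x, y in dst]
    n = len(zs)
    z_mean = sum(zs) / n
    w_mean = sum(ws) / n
    num = sum((w - w_mean) * (z - z_mean).conjugate() for z, w in zip(zs, ws))
    den = sum(abs(z - z_mean) ** 2 for z in zs)
    a = num / den
    b = w_mean - a * z_mean
    return a, b

def rotation_degrees(a):
    """Rotation angle encoded in the fitted complex factor a."""
    return math.degrees(cmath.phase(a))

def straighten_point(point, a, b):
    """Map a point from the skewed photo back into the reference frame."""
    z = (complex(*point) - b) / a
    return z.real, z.imag
```

Here `src` would hold key point coordinates in a reference photo and `dst` the same key points as predicted in a hand-held photo. Applying `straighten_point` to every pixel coordinate (or handing the equivalent matrix to an image library) would undo the camera movement for that frame.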
Atari Pong
This short post describes my practical introduction to Reinforcement Learning (RL), in which I trained a simple agent to play the classic Atari game Pong via a Deep Q-Network.
In plain English, this means we teach a novice computer to play the classic paddle game by letting it observe what happens when it performs various movements at different times and stages of gameplay (against the same, fairly strong opponent). After making a sequence of movement choices, our agent either gets a point (reward of +1) or loses one (reward of -1). After a lot of trial and error, the agent will have observed enough situations to learn what a good move is at any given moment in the game.
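The trial-and-error loop described above can be sketched in a few lines. This is not the post's Deep Q-Network: it swaps the neural network for a plain list of action values so that the core idea stays visible, and the names `epsilon_greedy`, `td_target` and `q_update`, as well as the default values for `gamma` and `lr`, are assumptions made for illustration.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def td_target(reward, next_q_values, done, gamma=0.99):
    """Learning target: the reward alone if the episode ended,
    otherwise the reward plus the discounted best value of the next state."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)

def q_update(q_values, action, target, lr=0.1):
    """Nudge the value of the chosen action a fraction lr toward the target."""
    updated = list(q_values)
    updated[action] += lr * (target - updated[action])
    return updated
```

In a Deep Q-Network the list of values is replaced by the output of a network that takes game frames as input, and `q_update` becomes a gradient step on the squared difference between the predicted value and `td_target`. The rewards of +1 and -1 mentioned above enter through the `reward` argument.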
Find Tune
The objective of this project is to create a program that listens to a continuous stream of sound and identifies when a particular song (the target track) is playing. This is similar to how home assistants such as Amazon’s ‘Alexa’ function, except that they listen for a different sound (their name). Ultimately, this project will be used to replay the detected sound through a speaker, serving as a doorbell amplifier.
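One simple way to picture the detection step is to slide a short template of the target track along the incoming samples and flag positions where the two line up closely. The sketch below uses normalized cross-correlation on raw samples; this is an assumed, deliberately simple technique and not necessarily what the project uses, and the names `normalized_correlation` and `find_template` and the `threshold` value are hypothetical.

```python
import math

def normalized_correlation(a, b):
    """Correlation of two equal-length sample windows, in the range [-1, 1]."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        # A silent (constant) window carries no shape to compare against.
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom

def find_template(stream, template, threshold=0.9):
    """Return the start indices where the template matches the stream."""
    m = len(template)
    hits = []
    for start in range(len(stream) - m + 1):
        window = stream[start:start + m]
        if normalized_correlation(window, template) >= threshold:
            hits.append(start)
    return hits
```

Because the correlation is normalized, a louder or quieter copy of the template still scores close to 1. Real audio adds noise, echo and speed variation, so a practical detector would more likely compare spectral features than raw samples.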