TIL How to Take Hundreds of Images Through Google Colab
Aisha
Posted on April 8, 2024
I recently embarked on a fascinating journey inspired by Nicholas Renotte's tutorial on building a Facial Recognition Model from a research paper to code. Despite its complexity, the tutorial offered clear guidance, yet I encountered a few hurdles, particularly with Google Colab's integration with webcams.
In the eight-part series, we had to create sample images of our faces using the webcam. Initially, I struggled with accessing my webcam in Google Colab. Nicholas was using OpenCV, and that did not work in Google Colab: the notebook runs on a remote machine, so OpenCV cannot reach the webcam on my laptop. It wasn't until I discovered the hidden gem of code snippets that I realized there was a solution at my fingertips.
Lesson #1: Google Colab offers invaluable resources through its code snippets feature.
Forgive me if you already knew this but I didn't so that counts as a lesson.
However, my challenges didn't end there. I needed to capture hundreds of facial images, which proved to be a tedious task with the default camera capture snippet. Enter
Lesson #2. Google Colab's ability to execute JavaScript within the notebook.
This newfound knowledge revolutionized my approach. I customized the camera capture process: instead of closing the camera after each picture, I switched to a button-driven capture that keeps the camera open until I close it on command. Harnessing JavaScript's power, I kept the camera connection alive until my task was complete.
All my focus has been on data science lately, and not once until now did I realize that you can run JavaScript within the notebook. Honestly, I never had to, but that's not the point.
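For anyone who hasn't tried it either, here is a minimal sketch of running JavaScript from a Python cell. The `js_call` helper is a hypothetical name, not part of Colab or IPython; it only builds the call string, JSON-encoding each argument so that Python values arrive in JavaScript with the right types. The `display(Javascript(...))` part assumes an IPython environment such as Colab.

```python
import json


def js_call(function_name, *args):
    """Hypothetical helper: build a JavaScript call with JSON-encoded arguments."""
    encoded = ", ".join(json.dumps(arg) for arg in args)
    return f"{function_name}({encoded})"


snippet = js_call("console.log", "Hello from Python")
print(snippet)

try:
    # Only available inside IPython-based notebooks such as Colab.
    from IPython.display import Javascript, display
    display(Javascript(snippet))
except ImportError:
    pass  # Outside a notebook, the printed string is all there is to see.
```

The message lands in the browser's developer console rather than in the cell output.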
Yet, even armed with JavaScript, I encountered my final obstacle: integrating Python functions with JavaScript callbacks. It took some troubleshooting and a return to the Google Colab documentation to pinpoint the missing link. Which brings me to
Lesson #3. Orchestrating a seamless interaction between JavaScript and Python.
Turns out not only does JavaScript have to call the Python function:
// Send the base64-encoded image data to Python
const is_saved = await google.colab.kernel.invokeFunction('notebook.save_image', [imageData, full_file_path], {});
But you also have to register the callback on the Python side:
output.register_callback('notebook.save_image', save_image)
I mean like, duh. Be patient with me I'm being vulnerable here.
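To see both halves side by side, here is a stripped-down sketch of the round trip. The `add_numbers` function and the `notebook.add_numbers` name are made up for illustration; only `output.register_callback` and `google.colab.kernel.invokeFunction` come from the snippets above. The Colab-specific part is wrapped in a `try` so the rest still runs elsewhere.

```python
def add_numbers(a, b):
    """Hypothetical Python callback that JavaScript invokes by name."""
    return a + b


CALLBACK_NAME = "notebook.add_numbers"  # hypothetical name, any unique string should do

# JavaScript half: call the registered Python function and log whatever comes back.
JS_SNIPPET = f"""
(async () => {{
  const result = await google.colab.kernel.invokeFunction({CALLBACK_NAME!r}, [2, 3], {{}});
  console.log(result.data);
}})();
"""

try:
    # Python half: without this registration, the JavaScript call has nothing to reach.
    from google.colab import output
    from IPython.display import Javascript, display

    output.register_callback(CALLBACK_NAME, add_numbers)
    display(Javascript(JS_SNIPPET))
except ImportError:
    print("Not running in Colab; skipping the JavaScript half.")
```

Registering first and invoking second is the order that matters here.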
The culmination of these lessons resulted in a robust workflow, enabling the capture and processing of over 400 facial images for model training. Witnessing the model in action was immensely gratifying, and I'm eager to share my journey with others, hoping to spare them similar trials. For those interested in exploring my progress or delving into Nicholas Renotte's invaluable tutorials, I've shared my work on GitHub and encourage following Nicholas for further inspiration.
Check out my progress on GitHub: GitHub
And don't forget to follow Nicholas Renotte: Nicholas Renotte's YouTube Channel
Oh, and here are my code snippets.
crops the images
import cv2

def crop_square(img, size, interpolation=cv2.INTER_AREA):
    """Center-crop img to a square, then resize it to size x size."""
    h, w = img.shape[:2]
    min_size = min(h, w)
    # Centralize and crop
    center_x, center_y = w // 2, h // 2
    half_size = min_size // 2
    crop_img = img[center_y - half_size:center_y + half_size, center_x - half_size:center_x + half_size]
    resized = cv2.resize(crop_img, (size, size), interpolation=interpolation)
    return resized
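The slicing arithmetic in `crop_square` is easier to check with plain numbers. The sketch below reproduces only the index math, with no OpenCV involved; `center_square_box` is a hypothetical helper written for this illustration.

```python
def center_square_box(height, width):
    """Return (top, bottom, left, right) of the centered square that crop_square slices out."""
    half = min(height, width) // 2
    center_y, center_x = height // 2, width // 2
    return (center_y - half, center_y + half, center_x - half, center_x + half)


# A 640x480 landscape frame keeps every row and trims 80 columns from each side.
print(center_square_box(480, 640))  # (0, 480, 80, 560)

# With an odd short side, integer division drops one row: 481 rows become 480.
print(center_square_box(481, 640))  # (0, 480, 80, 560)
```

Losing a single row hardly matters here, since the crop is resized to 250 x 250 right afterwards.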
saves the images
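import base64

import cv2
import numpy as np
from google.colab import output

def save_image(image_data, full_file_path):
    """Decode a base64 image sent from JavaScript, crop it, and write it to disk."""
    try:
        # Decode the base64-encoded image data
        binary = base64.b64decode(image_data)
        # Load the image directly as a NumPy array
        image_array = cv2.imdecode(np.frombuffer(binary, dtype=np.uint8), cv2.IMREAD_COLOR)
        # Process the image
        processed_image = crop_square(image_array, 250)
        # Save the processed image to a file;
        # cv2.imwrite returns False instead of raising when the write fails
        return bool(cv2.imwrite(full_file_path, processed_image))
    except Exception as e:
        print(f'Error saving image: {e}')
        return False

# Make save_image callable from JavaScript under this name
output.register_callback('notebook.save_image', save_image)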
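What `save_image` receives is the part of the canvas data URL after the comma. A small standard-library sketch of that step, using made-up bytes in place of a real JPEG (`split_data_url` is a hypothetical helper, not something from the notebook):

```python
import base64


def split_data_url(data_url):
    """Split a data URL into its header and the decoded binary payload."""
    header, payload = data_url.split(",", 1)
    return header, base64.b64decode(payload)


# Stand-in bytes; a real capture would be a full JPEG file.
sample_bytes = b"\xff\xd8\xff\xe0 not a real image"
sample_url = "data:image/jpeg;base64," + base64.b64encode(sample_bytes).decode("ascii")

header, decoded = split_data_url(sample_url)
print(header)                   # data:image/jpeg;base64
print(decoded == sample_bytes)  # True
```

In the notebook the split happens on the JavaScript side, so only the base64 payload crosses over to Python.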
opens the webcam for some clickety click action
import json
import os
import uuid

from IPython.display import Javascript, display

def take_photo(dir, generate_file_name=True, quality=0.8, verification_callback=None, capture_btn_name='Capture'):
    if generate_file_name:
        # name the new file with unique identifier
        # (generated once per call, so every capture in this session writes to the same path)
        full_file_path = os.path.join(dir, f"{uuid.uuid1()}.jpg")
    else:
        full_file_path = dir
    # create javascript function
    js = Javascript('''
        async function takePhoto(full_file_path, quality, verification_callback, capture_btn_name) {
            // create shell with div and capture button
            const div = document.createElement('div');
            div.style.padding = '10px';
            // create capture button
            const capture = document.createElement('button');
            capture.textContent = capture_btn_name;
            capture.style.marginRight = '10px';
            div.appendChild(capture);
            // create close button
            const close = document.createElement('button');
            close.textContent = 'Close';
            div.appendChild(close);
            // create video space
            const video = document.createElement('video');
            video.style.display = 'block';
            // add video block to div
            div.appendChild(video);
            // request access to the webcam stream
            const stream = await navigator.mediaDevices.getUserMedia({video: true});
            // add div to the body
            document.body.appendChild(div);
            // set the source and play the media
            video.srcObject = stream;
            await video.play();
            // resize the output to fit the video element.
            google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);
            capture.onclick = async () => {
                // create a canvas element to draw the video frame
                const canvas = document.createElement('canvas');
                // set the canvas width and height to match the video dimensions
                canvas.width = video.videoWidth;
                canvas.height = video.videoHeight;
                // draw the current video frame onto the canvas
                canvas.getContext('2d').drawImage(video, 0, 0);
                // Convert the canvas content to a base64-encoded JPEG image data URL
                // with the specified quality
                const dataUrl = canvas.toDataURL('image/jpeg', quality);
                // Split the data URL to extract the base64-encoded image data
                const imageData = dataUrl.split(',')[1];
                // Send the base64-encoded image data to Python
                const is_saved = await google.colab.kernel.invokeFunction('notebook.save_image', [imageData, full_file_path], {});
                // If a verification callback name is provided (not null or empty), invoke it
                if (is_saved.data && verification_callback) {
                    const result = await google.colab.kernel.invokeFunction(verification_callback, [], {});
                    console.log(result);
                }
                return is_saved.data;
            };
            close.onclick = () => {
                // stop the video stream to release resources
                stream.getVideoTracks()[0].stop();
                // remove the video element and its containing div from the DOM
                div.remove();
            };
        }
    ''')
    # create and display the javascript function
    display(js)
    # JSON-encode the arguments so quality reaches JavaScript as a number,
    # None becomes null, and quotes or backslashes in strings are escaped
    args = ', '.join(json.dumps(arg) for arg in (full_file_path, quality, verification_callback, capture_btn_name))
    # call the javascript function
    display(Javascript(f'takePhoto({args})'))
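The `verification_callback` parameter takes the name of a registered Python callback, which the JavaScript invokes with no arguments after a capture. One possible use, sketched here under assumptions, is a progress report while working toward a few hundred images. The helper names, the `data/anchor` folder, and the target of 400 are all illustrative.

```python
import os
import tempfile


def count_images(directory, extension=".jpg"):
    """Count files in directory whose names end with extension (case-insensitive)."""
    return sum(1 for name in os.listdir(directory) if name.lower().endswith(extension))


def make_progress_callback(directory, target):
    """Build a zero-argument callback, since takePhoto invokes it with no arguments."""
    def report_progress():
        return f"{count_images(directory)}/{target} images captured"
    return report_progress


# Local demonstration with a throwaway directory and empty placeholder files.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("a.jpg", "b.jpg", "c.JPG", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    demo_report = make_progress_callback(demo_dir, 400)()
print(demo_report)  # 3/400 images captured

# In Colab, the wiring would look roughly like this (hypothetical names and paths):
# output.register_callback('notebook.report_progress', make_progress_callback('data/anchor', 400))
# take_photo('data/anchor', verification_callback='notebook.report_progress')
```

The returned string shows up in the browser console, because `takePhoto` logs the callback result with `console.log`.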