🦄 Memoire: Create Narrated Videos with AI in Minutes!

omzi

Omezibe Obioha

Posted on October 14, 2024


This is a submission for The Pinata Challenge

✒️ Introduction

Creating captivating videos with engaging narration can be time-consuming and complex, and the result may still look unprofessional. Ever tried outsourcing a narration or voiceover to someone? Get ready to cough up a good amount of money for that. What if there was a way to simplify this process using AI, and make it cheaper too?

Meet Memoire, an AI-powered tool designed to create narrated videos in minutes. Whether you're a content creator, a marketer, or just someone who loves sharing stories, Memoire is here to transform your ideas into stunning videos effortlessly.

In this article, I'll walk you through Memoire, showcasing its features, the challenges faced during development, and the exciting possibilities it offers.

Memoire Landing Page

Key Features

1/ Full-Featured Authentication: Memoire's authentication system is powered by NextAuth, so only verified users can access the app. The system also sends beautifully designed emails for account verification and password resets.

Memoire Feature Screenshot, #1

2/ Upload Media and Generate Descriptions: You can upload your photos, and Memoire will generate accurate and engaging descriptions for them. If the description is missing important context, you can easily add your input and regenerate a more fitting description.

Memoire Feature Screenshot, #2

3/ Media Transitions: Elevate your video storytelling with Memoire's diverse media transitions, offering options like "fade," "wipeleft," "slideup," and more. These transitions provide a professional touch, ensuring smooth and visually appealing scene changes in your videos.

Memoire Feature Screenshot, #3

4/ Sortable Media List: Uploading photos in batches can sometimes lead to an unpredictable order of completion. With Memoire, you can easily drag and drop media boxes to arrange them in the order you prefer.

Memoire Feature Screenshot, #4

5/ AI Script Generation: Memoire uses Google's Gemini 1.5 Pro model to generate scripts for your videos. This ensures high-quality, contextually relevant scripts that enhance your video narratives.

Memoire Feature Screenshot, #5

6/ AI Audio Generation with Selectable Voices: Powered by OpenAI's TTS-1 model, Memoire offers customizable voices for your narrations. Choose from Echo, Alloy, Fable, Onyx, Nova, and Shimmer to find the perfect voice for your project.

Memoire Feature Screenshot, #6

7/ Project Settings: Customize your project by adding a description, which helps the AI generate better scripts. You can also change your project's aspect ratio and frame rate to suit your needs.

Memoire Feature Screenshot, #7

8/ In-Browser Output Generation: Memoire uses Remotion to generate video previews directly in your browser. The preview currently differs slightly from the final output, and fixes are underway.

Memoire Feature Screenshot, #8

9/ AI Music Generation: Memoire leverages Meta's MusicGen model to generate background music for your videos. This feature is still a work in progress and is not available for public testing yet.

10/ AI-Powered Subtitle Generation: Using OpenAI's Whisper model, Memoire can generate subtitles for your videos. This feature is also in development and will be available soon.

🛠️ Tech Stack

  • Frontend: TypeScript, Next.js, DND Kit

  • Backend: Next.js API Routes, Server Actions, Prisma

  • Styling: Tailwind CSS, shadcn/ui components

  • File Storage: Pinata

  • Rate Limit: Upstash

  • Authentication: NextAuth

  • AI Models: Google's Gemini 1.5 Pro, OpenAI's TTS-1, Meta's MusicGen, OpenAI's Whisper

  • In-Browser Preview: Remotion

🦄 How I Used Pinata

I had fun trying out a couple of things with Pinata! Here they are:

1/ Multi-File Upload Component (w/ Progress Tracking) (MediaPane.tsx):
I used Pinata's raw API endpoint to build a multi-file upload component with real-time progress tracking. This approach offers more control and a better user experience than the SDK, which has no built-in way to track upload progress.

Key Features:

  • Direct upload to Pinata using axios
  • JWT-based authentication for secure uploads
  • Real-time upload progress tracking

Here's how it works:

a. Fetch JWT for authentication:

const keyRequest = await fetch('/api/key');
const keyData = await keyRequest.json() as { JWT: string };

b. Prepare and send the upload request:

// Pinata's raw upload endpoint, called directly instead of going through the SDK
const UPLOAD_ENDPOINT = `https://uploads.pinata.cloud/v3/files`;
const formData = new FormData();
formData.append(`file`, addedFileState.file);

// axios resolves to { data: <response body> }; the body itself nests the file info under `data`
const { data: uploadResponse }: AxiosResponse<{ data: PinataUploadResponse }> = await axios.post(UPLOAD_ENDPOINT, formData, {
    headers: {
        Authorization: `Bearer ${keyData.JWT}`
    },
    // Report progress as a percentage; `total` is undefined when the size is unknown
    onUploadProgress: (progressEvent) => {
        if (progressEvent.total) {
            const percentComplete = (progressEvent.loaded / progressEvent.total) * 100;
            updateFileProgress(addedFileState.key, percentComplete);
        }
    }
});

c. Track upload progress:

onUploadProgress: (progressEvent) => {
    if (progressEvent.total) {
        const percentComplete = (progressEvent.loaded / progressEvent.total) * 100;
        updateFileProgress(addedFileState.key, percentComplete);
    }
}

d. Handle the upload response and prepare metadata:

await new Promise(resolve => setTimeout(resolve, 1000));
updateFileProgress(addedFileState.key, 'COMPLETE');

const data = addedFileState.type === 'PHOTO'
    ? await getPhotoDimensions(addedFileState.preview)
    : await getVideoDimensions(addedFileState.preview);

const metadata = { ...data, cid: uploadResponse.data.cid, type: addedFileState.type };

This implementation allows for a seamless upload experience with visual feedback, enhancing user interaction during the potentially time-consuming process of uploading media files.
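
The snippets above call `updateFileProgress` without showing how progress for several files is stored. Here is a minimal sketch of one way to do it, not Memoire's actual implementation: `createProgressTracker`, `toPercent`, and `overallProgress` are hypothetical names, and the in-memory `Map` stands in for whatever state store the component really uses.

```typescript
// Hypothetical sketch: only the name `updateFileProgress` comes from the article.
type FileProgress = number | 'COMPLETE';

interface ProgressTracker {
    updateFileProgress(key: string, progress: FileProgress): void;
    overallProgress(): number;
}

// Convert an axios-style progress event into a percentage.
// Returns undefined when `total` is missing, because the ratio cannot be computed.
function toPercent(event: { loaded: number; total?: number }): number | undefined {
    return event.total ? (event.loaded / event.total) * 100 : undefined;
}

// Create an isolated tracker so each upload batch keeps its own state.
function createProgressTracker(): ProgressTracker {
    const progressByFile = new Map<string, FileProgress>();

    return {
        // Store the latest value for one file, clamping percentages to 0-100.
        updateFileProgress(key, progress) {
            const value = progress === 'COMPLETE'
                ? progress
                : Math.min(100, Math.max(0, progress));
            progressByFile.set(key, value);
        },
        // Average across all tracked files, counting 'COMPLETE' as 100.
        overallProgress() {
            if (progressByFile.size === 0) return 0;
            let sum = 0;
            progressByFile.forEach((value) => {
                sum += value === 'COMPLETE' ? 100 : value;
            });
            return sum / progressByFile.size;
        }
    };
}
```

With a tracker like this, the `onUploadProgress` callback would pass the result of `toPercent` to `updateFileProgress`, and a batch-level progress bar could read `overallProgress()`.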

2/ Custom Image Component (PinataImage.tsx):

A custom PinataImage component was created to efficiently handle image retrieval, caching, and display. This component optimizes performance by reducing unnecessary network requests and leveraging browser storage.

Key Features:

  • Local caching using IndexedDB
  • Signed URL generation for secure access
  • Lazy loading and skeleton placeholders

Here's a breakdown of its functionality:

a. Check for cached images:

const cachedImage = await db.images.where({ cid, width, height }).first();
if (cachedImage) {
    setImageUrl(URL.createObjectURL(cachedImage.blob));
    return;
}

b. Generate signed URL for secure access:

const params = new URLSearchParams({
    cid,
    width: width?.toString() || '',
    height: height?.toString() || '',
    expires
});

const response = await fetch(`/api/getSignedUrl?${params}`);
if (!response.ok) {
    throw new Error('Failed to fetch signed URL');
}

const data = await response.json() as { url: string };
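
The query above sends `width` and `height` as empty strings when they are not set. As a hedged alternative, the sketch below leaves unset dimensions out of the query entirely. `buildSignedUrlQuery` and `SignedUrlOptions` are hypothetical names, `expires` is typed as a number here purely for illustration, and the sketch assumes the `/api/getSignedUrl` route treats a missing parameter the same way as an empty one.

```typescript
// Hypothetical helper, not part of Memoire's code.
interface SignedUrlOptions {
    cid: string;
    expires: number; // unit is whatever the API route expects (assumption)
    width?: number;
    height?: number;
}

// Build the query string, adding width/height only when they are defined.
function buildSignedUrlQuery({ cid, expires, width, height }: SignedUrlOptions): string {
    const params = new URLSearchParams({ cid, expires: String(expires) });
    if (width !== undefined) params.set('width', String(width));
    if (height !== undefined) params.set('height', String(height));
    return params.toString();
}
```

The result can be appended to the route in the same way as before, for example `/api/getSignedUrl?${buildSignedUrlQuery({ cid, expires: 60, width: 200 })}`.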

c. Fetch and cache the image:

const imageResponse = await fetch(`/api/getImage?url=${encodeURIComponent(data.url)}`);
if (!imageResponse.ok) {
    throw new Error('Failed to fetch image');
}

const blob = await imageResponse.blob();
const objectUrl = URL.createObjectURL(blob);
setImageUrl(objectUrl);

await db.images.put({ cid, width: Number(width), height: Number(height), blob });

d. Render the image or a skeleton placeholder:

const renderedImage = useMemo(() => {
    if (imageUrl) {
        return (
            <Image
                src={imageUrl}
                unoptimized={!!src}
                width={Number(width)}
                height={Number(height)}
                alt={alt}
                className={className}
                crossOrigin='anonymous'
                {...props}
            />
        );
    } else {
        return (
            <Skeleton className={className} />
        );
    }
}, [imageUrl, width, height, src, alt, className, props]);

This component ensures efficient loading and display of images stored on Pinata, improving the overall performance and user experience of Memoire.
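
A cache-first lookup like the one above can still download the same image twice if two requests for it start before the first download has been written to the cache. The sketch below shows one possible guard, assuming nothing about Memoire's real code: it memoizes the in-flight promise per `cid`/`width`/`height` combination. `createImageLoader`, `cacheKey`, and `ImageKey` are hypothetical names, and the in-memory `Map` only covers the current page session, so it would complement IndexedDB rather than replace it.

```typescript
// Hypothetical key type mirroring the fields used in the IndexedDB lookup.
interface ImageKey {
    cid: string;
    width?: number;
    height?: number;
}

// One string per unique image variant, e.g. "abc:100x50".
function cacheKey({ cid, width, height }: ImageKey): string {
    return `${cid}:${width ?? 'auto'}x${height ?? 'auto'}`;
}

// Wrap an async loader so concurrent calls for the same key share one promise.
function createImageLoader<T>(load: (key: ImageKey) => Promise<T>) {
    const pending = new Map<string, Promise<T>>();

    return function getImage(key: ImageKey): Promise<T> {
        const id = cacheKey(key);
        const existing = pending.get(id);
        if (existing) return existing;

        const promise = load(key).catch((error) => {
            // Forget failed loads so a later call can retry.
            pending.delete(id);
            throw error;
        });
        pending.set(id, promise);
        return promise;
    };
}
```

Here `load` would be the existing "check IndexedDB, then fetch and store" routine, so every caller receives the same promise and the network request happens at most once per image variant.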

3/ Media Management and Retrieval (VideoPreview.tsx):

In addition to uploading and displaying images, Pinata is used for storing and retrieving various types of media, including audio and video files. This is evident in the VideoPreview component:

a. Retrieve media files using their CIDs:

const getMediaUrl = useCallback(async (cid: string, projectId: string, type: 'media' | 'audio'): Promise<string> => {
    try {
        if (typeof window === 'undefined') {
            return '';
        }

        const table = type === 'media' ? db.media : db.audio;
        let item = await table.where({ cid }).first();
        if (item) {
            return URL.createObjectURL(item.file);
        }

        const response = await fetch(`/api/getFile?cid=${encodeURIComponent(cid)}`);
        if (!response.ok) {
            throw new Error(`HTTP error! status: ${response.status}`);
        }

        const blob = await response.blob();

        await table.put({
            cid,
            file: blob,
            projectId
        });

        return URL.createObjectURL(blob);
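    } catch (error) {
        // Log the failure instead of swallowing it; callers still receive an empty URL
        console.error('Error loading media file :>>', error);
        return '';
    }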
}, []);

b. Load audio files for narration:

const loadAudio = useCallback(async () => {
    if (narration?.audioCid) {
        const audioUrl = await getMediaUrl(narration.audioCid, project.id, 'audio');
        setLoadedAudioUrl(audioUrl);
        setNarration({ audioUrl });
    }
    // eslint-disable-next-line react-hooks/exhaustive-deps
}, [narration?.audioCid, project.id, getMediaUrl]);

c. Load and sort media items:

const loadMediaItems = useCallback(async () => {
    try {
        const loadedItems = await Promise.all(
            mediaItems.map(async (media) => ({
                ...media,
                url: await getMediaUrl(media.cid, project.id, 'media')
            }))
        );

        const sortedMediaItems = [...loadedItems].sort((first, next) =>
            project.mediaOrder.indexOf(first.id) - project.mediaOrder.indexOf(next.id)
        );

        // Compare sortedMediaItems with loadedMediaItems
        const hasChanged = loadedMediaItems.length === 0 ||
            sortedMediaItems.length !== loadedMediaItems.length ||
            sortedMediaItems.some((item, index) => {
                const loadedItem = loadedMediaItems[index];
                return !loadedItem ||
                    item.duration !== loadedItem.duration ||
                    item.transition !== loadedItem.transition;
            });

        if (hasChanged) {
            setLoadedMediaItems(sortedMediaItems);
        }

        await loadAudio();
    } catch (error) {
        console.error('Error loading media items :>>', error);
    }
}, [mediaItems, loadedMediaItems, getMediaUrl, project.id, project.mediaOrder, loadAudio]);

This comprehensive approach to media management allows for efficient storage, retrieval, and playback of various media types within Memoire.
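
The sort in step c calls `indexOf` twice per comparison, which rescans `project.mediaOrder` every time. For long media lists, a lookup table built once is a common alternative. The sketch below is an illustration under assumptions, not the project's code: `sortByOrder` is a hypothetical helper, and unlike the `indexOf` version (where a missing id yields -1 and sorts first) it deliberately places items missing from the order list last.

```typescript
// Hypothetical helper: sort items to match a list of ids.
function sortByOrder<T extends { id: string }>(items: T[], order: string[]): T[] {
    // Map each id to its first position in `order` for constant-time lookups.
    const position = new Map<string, number>();
    order.forEach((id, index) => {
        if (!position.has(id)) position.set(id, index);
    });

    // Items that do not appear in `order` rank after everything else.
    const rank = (id: string): number => position.get(id) ?? order.length;

    // Copy before sorting so the input array is left untouched.
    return [...items].sort((first, next) => rank(first.id) - rank(next.id));
}
```

In the component this would be called as `sortByOrder(loadedItems, project.mediaOrder)`.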

💪 Challenges Faced

1/ Pinata Integration: Working with Pinata was an intriguing experience. Their JavaScript SDK for uploading files presented a challenge: it lacked a built-in method for tracking upload progress, which my project needed to give users real-time feedback. Determined to find a solution, I dove into their documentation and discovered that I could use the API directly to achieve this.

Also, instead of following the conventional approach of prefetching signed URLs, I made API calls directly from the front end and cached the responses in IndexedDB. This way each file is loaded only once, which significantly reduces the number of API calls to Pinata and ultimately saves on credits 😬. It was a rewarding challenge that pushed me to think creatively and efficiently!

2/ AI Integration: Integrating AI services for narration and script generation was a significant challenge. Getting high-quality output from the AI required extensive testing and fine-tuning. I also ran into rate limits while testing aggressively.

3/ User Experience: Creating an intuitive and user-friendly interface was crucial. I spent a considerable amount of time designing and iterating on the UI to make sure it met users' needs while being aesthetically pleasing. This was a lot tougher for me because I didn't have the time to bring in a designer to work with me ;(.

📸 Screenshots

Memoire Screenshots

🔗 Project Link

Link: https://dub.sh/MemoireDemo

💻 Code Repository

Link: https://git.new/MemoireRepo

⚠ Known Issues

1/ Narration audio not syncing up with video.
2/ Video preview component flickers unnecessarily on first load.

✨ Conclusion

Memoire is designed to simplify video creation. By harnessing the power of AI, I've made it possible to produce high-quality narrated videos in minutes for dirt cheap. Whether you're looking to create content for social media, marketing campaigns, or personal projects, Memoire has you covered.

I'm excited to see what you'll create with Memoire. Feel free to share your feedback and let me know how I can improve. Stay tuned for more updates and features!
