RESTful API for managing and deploying YOLO machine learning models. Features model upload/download, real-time predictions on images/videos, session management, and shared media handling. Deployed on Render with live demo available.
This FastAPI application provides a platform for managing and interacting with YOLO machine learning models for computer vision tasks. It was built during a paid ML training course (~30 students learning computer vision through hands-on project work), where I volunteered to build the REST API and deployment infrastructure as an additional contribution, leveraging my prior full-stack development experience while most students focused on data annotation and model training tasks.
The platform features session-based model selection, video processing at 2-second intervals with frame extraction and GIF generation, and shared media management. It was successfully deployed on Render as a demo for the course's client project: detecting objects in satellite imagery of Russia/Eastern Europe regions. The models were trained on this specific geographical landscape and perform well in that context, though performance may vary for other regions. Users can test it with screenshots of similar terrain from Yandex Maps or Google Maps satellite views.
**Key Learning Project**: This was my first ML deployment project, built rapidly to demonstrate the trained models for the course deliverable. While functional for the demo use case, it has several architectural shortcomings that reflect my growth since then:
• **Architecture**: The entire application lives in a 700+ line main.py file with global state for model management. Today I'd structure this with proper separation of concerns (routes, services, models) and use dependency injection instead of global variables.
• **Security**: No authentication, no input validation on file uploads, potential path traversal vulnerabilities, and unsafe model loading without torch.load(weights_only=True). A production system would need JWT/API keys, rate limiting, and proper validation.
• **Code Quality**: Uses print() statements instead of proper logging, contains commented-out code blocks, and has inconsistent type hints. The deprecated PIL ImageDraw.textsize() method needs updating to textbbox().
• **API Design**: Lacks versioning, pagination, and standardized error responses. The /project_structure endpoint is an information disclosure risk that should be removed.
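The upload-validation point above can be sketched concretely. This is a minimal sketch, not code from the project: `safe_upload_path` and `UPLOAD_DIR` are hypothetical names, and the checkpoint-loading note assumes a PyTorch version that supports `weights_only`.

```python
from pathlib import Path

UPLOAD_DIR = Path("uploads")  # hypothetical storage root for uploaded files

def safe_upload_path(filename: str) -> Path:
    """Reject path traversal in a user-supplied filename before saving.

    Keeps only the final path component, then double-checks that the
    resolved target still lives under UPLOAD_DIR.
    """
    name = Path(filename).name  # strips any directory components ("../" etc.)
    if not name or name.startswith("."):
        raise ValueError(f"invalid filename: {filename!r}")
    target = (UPLOAD_DIR / name).resolve()
    if UPLOAD_DIR.resolve() not in target.parents:
        raise ValueError(f"path escapes upload dir: {filename!r}")
    return target

# Loading the checkpoint itself would then refuse arbitrary pickled objects:
#     model = torch.load(safe_upload_path(filename), weights_only=True)
```

Validation like this, plus authenticated routes and rate limiting, would close the gaps listed above without restructuring the rest of the app.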
Despite these issues, the project successfully delivered a working demo to the client and taught me core ML deployment concepts: handling video processing pipelines, managing PyTorch memory in production, implementing async file uploads, and deploying containerized ML services. Taking initiative to build this API infrastructure while learning alongside the ML team taught me both technical skills and the value of leveraging existing experience to contribute in new domains.
Optimizing video processing for real-time inference
Implemented frame sampling at 2-second intervals using OpenCV and PyTorch for inference. Got it working, but in retrospect, should have used a proper async task queue (Celery/Redis) instead of processing synchronously in the request handler. Current approach blocks the request for long videos.
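The 2-second sampling arithmetic can be sketched as a pure helper (`frame_indices` is a hypothetical name, not from the project); in the OpenCV decode loop, frame `i` is kept for inference when `i % step == 0`.

```python
def frame_indices(fps: float, total_frames: int, interval_s: float = 2.0) -> list:
    """Indices of the frames to run inference on, one per `interval_s` seconds.

    With OpenCV this pairs with a plain decode loop:
        cap = cv2.VideoCapture(path)
        fps = cap.get(cv2.CAP_PROP_FPS)
        keep = set(frame_indices(fps, int(cap.get(cv2.CAP_PROP_FRAME_COUNT))))
        # ...read frames sequentially, run YOLO only on indices in `keep`...
    """
    step = max(1, round(fps * interval_s))  # frames between samples
    return list(range(0, total_frames, step))
```

Moving the loop itself into a background worker (the Celery/Redis approach mentioned above) would keep the request handler from blocking on long videos.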
Managing multiple model versions with global state
Built a model registry using global dictionaries (loaded_model, model_info_dict) with file-based storage. Works for single-instance deployment on Render, but this approach breaks with horizontal scaling. Should have used dependency injection with proper state management or external model storage (S3 + database).
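A minimal sketch of the dependency-injection alternative, assuming hypothetical names (`ModelRegistry`, `loader`): the loader callable is injected so the registry can be tested without PyTorch, and in the app it would wrap `torch.load(path, weights_only=True)`.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

@dataclass
class ModelRegistry:
    """Replaces module-level globals like `loaded_model` / `model_info_dict`.

    Models are loaded lazily and cached by name; the injected `loader`
    isolates the registry from how checkpoints are actually read.
    """
    loader: Callable[[str], Any]
    _models: Dict[str, Any] = field(default_factory=dict)

    def get(self, name: str, path: str) -> Any:
        if name not in self._models:  # load once, then serve from cache
            self._models[name] = self.loader(path)
        return self._models[name]
```

In FastAPI, one registry instance would be created at startup and handed to route handlers via `Depends`, so nothing reaches for global state; external storage (S3 + a database) could then back the same interface for horizontal scaling.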
Implementing session management for concurrent users
Used Starlette's session middleware with in-memory storage. Each session stores the selected model reference. This works for the demo but doesn't persist across restarts and won't scale horizontally. Production version would need Redis or database-backed sessions.
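The Redis-backed variant described above can be sketched with a dict standing in for Redis; `SessionStore` and its methods are hypothetical names. A production version would swap the dict for Redis `SETEX`/`GET` calls so sessions survive restarts and are shared across instances.

```python
import time
from typing import Dict, Optional, Tuple

class SessionStore:
    """Maps a session id to (selected model name, expiry timestamp)."""

    def __init__(self, ttl_s: float = 3600.0):
        self.ttl_s = ttl_s
        self._data: Dict[str, Tuple[str, float]] = {}

    def set_model(self, session_id: str, model_name: str) -> None:
        # Store the selection with a sliding expiry, like Redis SETEX.
        self._data[session_id] = (model_name, time.monotonic() + self.ttl_s)

    def get_model(self, session_id: str) -> Optional[str]:
        entry = self._data.get(session_id)
        if entry is None:
            return None
        model_name, expires = entry
        if time.monotonic() > expires:
            del self._data[session_id]  # expired: evict and treat as missing
            return None
        return model_name
```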
Building API infrastructure as an additional contribution during ML course
Volunteered to build the REST API and deployment infrastructure while most of the ~30 enrolled students focused on assigned data annotation and model training tasks. Built the API with memory management in mind (garbage collection, psutil monitoring) and deployed it to Render for the course deliverable. The rushed timeline and learning environment meant sacrificing code organization (the 700-line main.py) and proper error handling, but the demo worked well for its specific use case (Russia/Eastern Europe satellite imagery). The experience taught me the importance of balancing speed-to-demo with maintainable architecture, and the value of taking initiative beyond minimum requirements.
Successfully deployed working demo API for course project deliverable on Render
Delivered functional REST API infrastructure beyond assigned coursework
Models perform well for intended use case (Russia/Eastern Europe satellite imagery)
Learned production ML deployment challenges and architectural trade-offs
Have similar requirements? Let's discuss how I can help.
Get in Touch