2025 · Founder + Product Engineer
Backlab
Web scraping API and TypeScript SDK — describe what you want to extract in plain English, get back structured data.
- Python
- FastAPI
- Crawl4AI
- Playwright
- PostgreSQL
- Redis
- Celery
What it was
Traditional web scraping breaks constantly: a site updates its markup, and the CSS selectors or XPath queries tied to the old structure stop working. Backlab flipped the model: instead of describing the page structure, you describe what you want. Pass a URL and a plain-English description of each field, and the API returns structured data. Under the hood there are two pieces: a Python backend on FastAPI that handles the AI-powered extraction, and a TypeScript SDK backed by AWS serverless infrastructure, with Lambda for headless Chrome rendering, DynamoDB for job tracking, SQS for batch processing, and proxy rotation to avoid rate limits.
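To make the request model concrete, here is a minimal sketch of how a URL and a set of plain-English field descriptions might be packaged into a request body. The names `ExtractRequest`, `buildExtractRequest`, and the `render` flag are hypothetical, chosen for illustration; they are not the actual Backlab schema.

```typescript
// Hypothetical request shape; names are illustrative, not Backlab's real schema.
interface ExtractRequest {
  url: string;
  fields: { name: string; description: string }[];
  render: boolean; // assumption: the caller can opt in to headless rendering
}

function buildExtractRequest(
  url: string,
  fields: Record<string, string>,
  render: boolean = false,
): ExtractRequest {
  // Only accept http(s) URLs; anything else is rejected up front.
  if (!/^https?:\/\//i.test(url)) {
    throw new Error("Unsupported URL: " + url);
  }
  const names = Object.keys(fields);
  if (names.length === 0) {
    throw new Error("At least one field is required");
  }
  const entries = names.map((name) => {
    const description = fields[name].trim();
    if (description.length === 0) {
      throw new Error('Field "' + name + '" needs a description');
    }
    return { name, description };
  });
  return { url, fields: entries, render };
}

// Usage: describe the fields you want, not where they live in the markup.
const exampleRequest = buildExtractRequest("https://example.com/product/42", {
  title: "The product name as shown to shoppers",
  price: "The current price, including currency",
});
```

The point of the shape is that nothing in it refers to selectors or page structure, so a markup change on the target site does not invalidate the request.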
My role
Designed and built the full stack across two phases: the Python API with AI extraction first, then a TypeScript SDK published as @backlab/sdk with the AWS infrastructure underneath. Also built the web app and handled DevOps — Docker for local dev, CloudWatch for observability in production.
What I learned
Dynamic pages were the hard constraint the entire architecture had to be designed around. Static HTTP requests are fast and cheap; JavaScript-rendered content requires headless Chrome, which means Lambda cold starts, memory limits, and a completely different cost profile. Every product decision downstream — batching, job tracking, proxy rotation — existed because of that single constraint.
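One way to keep the expensive path rare is to try a plain HTTP fetch first and escalate to headless Chrome only when the response looks like an empty JavaScript shell. The sketch below shows such a heuristic; the function names, the marker list, and the 200-character threshold are assumptions for illustration, not how Backlab actually made the decision.

```typescript
// Heuristic sketch: does a statically fetched page need headless rendering?
// The marker list and threshold are illustrative assumptions.
const SPA_ROOT_MARKERS = ['id="root"', 'id="app"', 'id="__next"'];

// Approximate the amount of human-visible text by stripping scripts,
// styles, and tags, then collapsing whitespace.
function visibleTextLength(html: string): number {
  return html
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<style[\s\S]*?<\/style>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim().length;
}

function needsHeadlessRender(html: string, minTextChars: number = 200): boolean {
  const sparse = visibleTextLength(html) < minTextChars;
  const hasSpaRoot = SPA_ROOT_MARKERS.some((marker) => html.indexOf(marker) !== -1);
  // Escalate only when the page is both nearly empty and shaped like an app shell.
  return sparse && hasSpaRoot;
}
```

A check like this is cheap enough to run on every static response, which matters because each false positive pays for a Lambda invocation with a full browser.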
The other lesson: an API-first product lives or dies by its SDK. The Python API worked well, but adoption only became realistic once there was a typed TypeScript client that felt native to the codebases people were actually building in.
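As an illustration of what "typed" buys the caller, here is a small sketch in which the result type is derived from the field descriptions passed in, so the compiler knows which keys exist. `FieldSpec`, `Extracted`, and `shapeResult` are hypothetical names; this is not the actual `@backlab/sdk` surface.

```typescript
// Hypothetical typing sketch; not the actual @backlab/sdk API.
type FieldSpec = Record<string, string>;

// Every requested field comes back as a string, or null if it was not found.
type Extracted<F extends FieldSpec> = { [K in keyof F]: string | null };

// Narrow an untyped API response to exactly the requested fields.
function shapeResult<F extends FieldSpec>(
  fields: F,
  raw: Record<string, unknown>,
): Extracted<F> {
  const out: Record<string, string | null> = {};
  for (const key of Object.keys(fields)) {
    const value = raw[key];
    out[key] = typeof value === "string" ? value : null;
  }
  return out as unknown as Extracted<F>;
}

// Usage: `product.title` and `product.price` are known to the compiler;
// a typo such as `product.titel` would be a compile-time error.
const product = shapeResult(
  { title: "The product name", price: "The current price" },
  { title: "Widget", price: 12, unrelated: "ignored" },
);
```

Deriving the result type from the request keeps the two from drifting apart, which is the kind of ergonomics a raw HTTP API cannot offer on its own.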