GenAI Evals, Monitoring, and Automated Prompt Engineering with MLflow and DSPy
Event description
Date: TBA
Location: TBA, Melbourne
Presented by: MLAI
Join us for an evening diving deep into the practical side of productionizing GenAI — the unglamorous but critical work that separates proof-of-concept demos from reliable, production-grade AI systems.
We'll explore how to evaluate LLM outputs at scale, monitor model behavior in the wild, and automatically optimize prompts using cutting-edge open-source tools like MLflow and DSPy. Whether you're struggling with inconsistent responses, trying to catch regressions before users do, or tired of manually tweaking prompts, this session is for you.
Expect practical demos, real-world case studies, and honest conversations about what actually works (and what doesn't) when taking GenAI from experimentation to production. This is for builders who want to ship AI systems they can trust — and sleep soundly at night.
There'll be talks, hands-on demonstrations, and plenty of time to network over drinks and dinner with Melbourne's AI engineering community.
🎯 What You'll Learn
How to systematically evaluate LLM quality (beyond vibes)
Setting up monitoring pipelines to catch model drift and failures
Automated prompt optimization with DSPy
Integrating evals and monitoring into your MLOps workflow with MLflow (minimal sketches of both workflows follow this list)
Real patterns and anti-patterns from production GenAI systems
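To give a flavour of the tooling, here's a minimal sketch of the evals side: scoring a batch of pre-generated answers with MLflow's `evaluate` API. It assumes a recent MLflow 2.x install; the questions, outputs, and reference answers are invented for illustration and are not part of the event material.

```python
# Minimal sketch: scoring pre-generated LLM outputs with MLflow's evaluate API.
# Assumes MLflow 2.x; all data below is made up for illustration.
import mlflow
import pandas as pd

eval_data = pd.DataFrame(
    {
        "inputs": [
            "What is prompt injection?",
            "What does model drift mean?",
        ],
        "outputs": [
            "Prompt injection is when untrusted input overrides the model's instructions.",
            "Model drift is when model behaviour degrades as real-world data shifts.",
        ],
        "ground_truth": [
            "An attack where untrusted input overrides the model's instructions.",
            "A degradation in model behaviour as real-world data shifts over time.",
        ],
    }
)

with mlflow.start_run():
    results = mlflow.evaluate(
        data=eval_data,
        predictions="outputs",            # score a static column of generations (no live model call)
        targets="ground_truth",
        model_type="question-answering",  # enables built-in metrics such as exact_match and toxicity
    )
    print(results.metrics)
```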
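And a similarly small sketch of the DSPy side: declare the task as a signature, then let an optimizer (here `BootstrapFewShot`) search for demonstrations that improve a metric, instead of hand-tweaking prompt strings. The model id, training examples, and metric are placeholders, not anything specific to this event.

```python
# Minimal sketch: automated prompt optimization with DSPy.
# Model id, examples, and metric are illustrative placeholders.
import dspy

# Configure the LM DSPy will drive (assumed OpenAI-style model id).
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Declare *what* the program should do; DSPy handles the prompting.
class SupportAnswer(dspy.Signature):
    """Answer a customer question concisely and accurately."""
    question = dspy.InputField()
    answer = dspy.OutputField()

program = dspy.ChainOfThought(SupportAnswer)

# A tiny labelled set; in practice you'd use real traces from production.
trainset = [
    dspy.Example(question="How do I reset my password?",
                 answer="Use the 'Forgot password' link on the sign-in page.").with_inputs("question"),
    dspy.Example(question="Where can I download my invoice?",
                 answer="Invoices are under Billing > History in your account.").with_inputs("question"),
]

# Simple string-overlap metric; swap in something task-appropriate.
def answer_matches(example, prediction, trace=None):
    return example.answer.lower() in prediction.answer.lower()

# BootstrapFewShot searches for demonstrations that improve the metric,
# producing an optimized program rather than a hand-edited prompt.
optimizer = dspy.teleprompt.BootstrapFewShot(metric=answer_matches, max_bootstrapped_demos=2)
optimized_program = optimizer.compile(program, trainset=trainset)

print(optimized_program(question="How do I change my billing email?").answer)
```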
👥 Who Should Attend
AI Engineers and Data Scientists working with LLMs
Product teams building GenAI features
Anyone trying to make their AI systems more reliable and maintainable
Curious developers who want to level up their GenAI engineering game
Come ready to learn, ask questions, and connect with others solving the same hard problems. See you there! 🚀