Santiago Vargas
Fresh grad into data engineering, AI, and building things that actually ship.
About Me
I got into software through game dev and web projects as a kid. I liked making things, and that never really changed. I recently graduated from Curtin with a Software Development degree and spent my final months as a Data Integration & Analytics intern at Mineral Resources.
During my internship I worked on SQL transformations, dimensional modelling, and building tools that made other people's work easier. The part I enjoyed most was shipping something that actually got used, whether that was a data model that cleaned up reporting for an entire department or a small app that saved someone an hour of manual formatting.
I'm always tinkering with something. An AI pipeline one week, a web app the next. I get an idea and I can't not build it. Most of my personal projects come from wanting to solve a real problem or just wanting to see if I can pull it off.
Role
Data Engineering Intern (prev.)
Company
Mineral Resources
University
Curtin University
Degree
Bachelor of Science, Software Development
Graduation
Graduated 2026
Location
Perth, Australia
Experience
Data Engineering Intern
Mineral Resources
Built production data models and analytics tooling for iron ore sales, shipping, and operations data across the Data Integration & Analytics (DIA) program.
- Built bronze-to-silver data transformations in dbt, modelling dimension tables with full column-level documentation
- Authored 8 gold-layer models, then iterated to V2 after reverse-engineering Power BI reports, eliminating 17+ DAX measures by pre-computing them at the SQL layer
- Solved real data quality issues: prevented join fanout via deduplication, resolved GUIDs to names, fixed NULLs caused by mismatched keys, and handled timezones
- Built a Databricks cost dashboard from scratch using system tables and Lakeview, then presented it to the DIA team
- Developed a custom MCP server on Databricks giving developers AI-assisted access to Unity Catalog schemas and business logic
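To illustrate the join fanout issue mentioned above, here is a minimal Python sketch of the general idea: deduplicate the dimension side to one row per key before joining, so each fact row matches at most once. The table names, column names, and the "keep the latest record" rule are all hypothetical and chosen for illustration; the production work was done in SQL with dbt, not with this code.

```python
def dedupe_latest(rows, key, order_by):
    """Keep one row per key: the one with the greatest order_by value."""
    latest = {}
    for row in rows:
        k = row[key]
        if k not in latest or row[order_by] > latest[k][order_by]:
            latest[k] = row
    return list(latest.values())


def left_join(left, right, key):
    """Naive left join: emits one output row per matching right-side row."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    out = []
    for l in left:
        # No match still yields one row (left join semantics).
        for match in index.get(l[key], [None]):
            out.append({**l, **(match or {})})
    return out


# Hypothetical data: one shipment, two versions of the same customer record.
shipments = [{"shipment_id": 1, "customer_id": "C1", "tonnes": 100}]
customers = [
    {"customer_id": "C1", "name": "Acme", "updated_at": "2024-01-01"},
    {"customer_id": "C1", "name": "Acme Pty Ltd", "updated_at": "2024-06-01"},
]

# Joining against the raw dimension duplicates the shipment (fanout),
# which would double-count tonnes in any downstream aggregate.
fanned_out = left_join(shipments, customers, "customer_id")

# Deduplicating first keeps the join at one row per shipment.
clean = left_join(
    shipments,
    dedupe_latest(customers, "customer_id", "updated_at"),
    "customer_id",
)
```

In SQL the same effect is commonly achieved with a `ROW_NUMBER()` window partitioned by the join key and filtered to the first row, applied before the join.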
Projects
Databricks Custom MCP Integration
Gave 6+ developers AI-powered access to a massive Unity Catalog, including schemas, table relationships, and business logic, without needing to know the data model by heart. Deployed as a Databricks App using Asset Bundles for one-command deploys, with a lightweight auth flow via personal access tokens. No local repo cloning needed.
Vision2Summary (V2S)
Drop in scanned documents, get structured data out. Uses Gemini Vision for high-res OCR with tiled processing, auto-classifies document types, extracts key-value pairs and tables, and validates output with confidence scoring. Includes a Streamlit UI with batch processing, a chat interface for querying results, and export to six formats including DOCX and Excel.
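As a rough illustration of what tiled processing can look like, the sketch below computes crop boxes that cover a large page image with fixed-size tiles. It is only one possible approach, not the V2S implementation: the function name `tile_boxes`, the tile size, and the use of overlapping tiles (so text on a tile border appears whole in at least one tile) are assumptions made for the example.

```python
def tile_boxes(width, height, tile=1024, overlap=64):
    """Return (left, top, right, bottom) crop boxes covering a width x height image.

    Neighbouring tiles share `overlap` pixels; edge tiles are clipped
    to the image bounds and may be smaller than `tile`.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap
    boxes = []
    # A new tile is needed only while the previous one has not reached the edge.
    for top in range(0, max(height - overlap, 1), step):
        for left in range(0, max(width - overlap, 1), step):
            boxes.append(
                (left, top, min(left + tile, width), min(top + tile, height))
            )
    return boxes


# Hypothetical 2000 x 1000 pixel scan.
boxes = tile_boxes(2000, 1000)
```

Each box could then be cropped from the image and sent to the OCR model separately, with the per-tile results merged afterwards.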
Format King
Kills repetitive formatting work. Handles JSON parsing, text cleanup, and document formatting that was eating hours of manual effort. Simple, fast, and still in daily use.