Airtree

Founding Applied AI Engineer (Eval-Driven)

Reposted Yesterday
In-Office
Sydney, New South Wales
Mid level
Build and ship AI-powered workflows for construction cost management. Responsibilities include evaluation problem definition, maintenance of evaluation harnesses, and workflow implementation in collaboration with stakeholders.
The summary above was generated by AI

This role is for one of our portfolio companies, not internally at Airtree. Your application will be reviewed by the founder, not an Airtree employee.

This early stage company, still operating in stealth, is on a mission to eradicate cost overruns in construction, a $3 trillion problem that slows down cities, destroys margins, and erodes trust between builders and clients.

Over 90% of construction projects go over budget, eroding builder margins and stalling progress across the industry. Our platform takes a new approach to helping builders stay on budget by detecting and preventing costly variations before they spiral. Our AI-powered early warning system gives builders control over project costs — protecting time, margin, and reputation on every build.

We are at the start of something big, and we’re looking for an Applied AI Engineer (Eval-Driven) to build and ship design-audit workflows that consistently meet measurable quality bars. This role blends ML engineering and data science, with a heavy emphasis on problem definition, evaluation, and reliability in real customer workflows.

Responsibilities
  • Define evaluation problems: success criteria, failure modes, datasets, labelling guidelines, and score functions.
  • Build and maintain an evaluation harness: regression tests, edge-case suites, and quality dashboards to prevent backsliding.
  • Implement workflow systems end-to-end (data → model/LLM components → post-processing → acceptance testing) until they pass eval thresholds.
  • Partner with product and domain stakeholders to translate messy real-world requirements into testable specs.

Requirements
  • Strong Python skills and practical experience shipping ML/AI systems (not just experimentation).
  • Demonstrated experience designing evals for ML/LLM systems (offline metrics, gold sets, error analysis, regression testing, monitoring).
  • Comfort working across data science + engineering tasks: data wrangling, feature/label design, model/LLM iteration, and productionization.
  • High ownership and intensity: persistence in closing the loop from “fails eval” to “passes consistently.”

Nice to have
  • Experience with document understanding (OCR, parsing, classification/extraction) and structured outputs (schemas, validators).
  • Familiarity with AEC/construction workflows (design coordination, QA/compliance, BIM concepts like IFC/Revit).
  • Experience building human-in-the-loop review systems and adjudication processes to improve training/eval data.

Top Skills

AI
LLM
ML
OCR
Python

HQ

Airtree The Hills, New South Wales, AUS Office

131 Devonshire St, The Hills, New South Wales, Australia, 2010
