Pluralis Research

Machine Learning Engineer - ML Training Platform

Reposted 6 Days Ago
In-Office
Sydney, New South Wales, AUS
Senior level
Design and implement robust, large-scale distributed ML training systems optimized for low-bandwidth, high-latency environments. Build model-parallel training strategies, checkpointing and recovery, GPU and memory optimizations, P2P networking, NAT traversal, and monitoring to ensure resilient, efficient multi-participant training.
The summary above was generated by AI
Overview

Pluralis Research carries out foundational research on Protocol Learning: multi-participant training of foundation models where no single participant has, or can ever obtain, a full copy of the model. The purpose of Protocol Learning is to facilitate the creation of community-trained and community-owned frontier models with self-sustaining economics.

We're looking for Senior/Staff engineers with 5+ years of experience in distributed systems and large-scale ML training. You'll be implementing a novel substrate for training distributed ML models over consumer-grade internet connections.

Responsibilities

Distributed Training Architecture & Optimization
  • Design and implement large-scale distributed training systems optimized for heterogeneous hardware operating under low-bandwidth, high-latency conditions.

  • Develop and optimize parallel training strategies (data, tensor, and pipeline parallelism) with custom sharding techniques that minimize communication overhead.

  • Optimize GPU utilization, memory efficiency, and compute performance across distributed nodes.

  • Implement robust checkpointing, state synchronization, and recovery mechanisms for long-running, fault-prone training jobs.

  • Build monitoring and metrics systems to track training progress, model quality, and system bottlenecks.

Decentralized Networking & Resilience
  • Architect resilient training systems where nodes can fail, networks can partition, and participants can dynamically join or leave.

  • Design and optimize peer-to-peer topologies for decentralized coordination across non-co-located nodes.

  • Implement NAT traversal, peer discovery, dynamic routing, and connection lifecycle management.

  • Profile and optimize communication patterns to reduce latency and bandwidth overhead in multi-participant environments.

What You’ll Bring
  • Strong experience building and operating distributed systems in production.

  • Hands-on expertise with distributed training frameworks (FSDP, DeepSpeed, Megatron, or similar).

  • Deep understanding of parallelism strategies (data, tensor, and pipeline parallelism).

  • Expert-level Python with production experience (concurrency, error handling, retry logic, clean architecture).

  • Strong networking fundamentals: P2P systems, gRPC, routing, NAT traversal, distributed coordination.

  • Experience optimizing GPU workloads, memory management, and large-scale compute efficiency.

What We Offer
  • Equity-heavy compensation with meaningful ownership in a mission-driven company

  • Competitive base salary for senior engineering roles in Australia

  • Visa sponsorship available for exceptional candidates

  • Remote-first with optional access to our Melbourne hub

  • World-class team: teammates were previously at Google, Amazon, Microsoft, and leading startups

Backed by Union Square Ventures and other tier-1 investors, we're a world-class, deeply technical team of ML researchers and engineers. Pluralis is unapologetically ideological. We believe the world will be a better place if we succeed at what we are attempting, and we see Protocol Learning as the only plausible way to prevent a handful of massive corporations from monopolising model development, access, and release, and from achieving massive economic capture. If this resonates, please apply.

Top Skills

DeepSpeed
FSDP
gRPC
Megatron
NAT Traversal
P2P
Python
