At Contoro Robotics, we're on a mission to solve labor challenges through advanced robotic solutions. Headquartered in Austin, TX, our fast-growing startup is transforming the supply chain industry with our flagship warehouse automation technology. Our team is made up of top-tier experts in robotics, AI, and logistics, working together to push the boundaries of automation.
We're looking for talented and ambitious individuals to join us on this journey, helping shape the future of robotics while growing alongside a world-class team. If you're passionate about innovation, problem-solving, and making a real-world impact, we want to hear from you!
Contoro Robotics is an Austin-based startup focused on warehouse automation. We design a state-of-the-art autonomous truck unloading system capable of lifting boxes over 60 lbs.
We are hiring a robotics engineer to maximize the accuracy of our box detection and singulation. You will own the machine learning pipeline that turns raw sensor data into reliable box detections: model training, dataset curation, evaluation, and edge deployment. You will work alongside our perception engineers to push detection and singulation accuracy across the full range of box sizes and container conditions we see in production. These models run in production across a fleet of active robots, and their accuracy directly drives unloading throughput.
Train, evaluate, and deploy instance segmentation and detection models that improve box detection and singulation accuracy, including for small, occluded, deformed, and tightly packed boxes
Build and maintain automated dataset curation and ground-truth generation pipelines, including foundation-model-assisted labeling (e.g., SAM) to scale training data
Own a deterministic benchmarking and regression framework that evaluates model performance across real and simulated datasets, stratified by box size, container type, and failure mode
Optimize models for real-time inference on edge hardware using TensorRT and quantization, balancing accuracy against latency and memory budgets
Debug and resolve production detection failures through log analysis, failure-case review, and targeted retraining
Collaborate with perception engineers on calibration, localization, and the interface between detections and downstream planning
Participate in design reviews and contribute to module-level technical decisions
B.S. or M.S. in Computer Science, Robotics, Electrical Engineering, or a related field
3+ years of professional experience developing and deploying computer vision / ML models for real-world systems
Proficiency in Python and PyTorch in a production environment; working knowledge of C++
Hands-on experience with instance segmentation or object detection models (e.g., Mask R-CNN, Detectron2, YOLO, SAM)
Experience building dataset curation, labeling, or evaluation pipelines
Experience deploying models to edge hardware (NVIDIA Jetson or similar) with TensorRT or comparable inference optimization
Strong debugging skills and the ability to diagnose model and pipeline failures in production
Familiarity with Linux-based development environments and ROS / ROS2
Experience with 3D perception and point cloud processing (PCL, Open3D) alongside 2D detection
Experience with multi-sensor (camera + LiDAR) calibration and synchronized data pipelines
Experience with stratified model evaluation and regression testing for ML systems
Familiarity with Docker-based deployment and cloud-based logging/monitoring
Prior work in warehouse automation, logistics, or pick-and-place applications
Based on 2,418 disclosed Hardware salaries on RoleSuite, the role pays a median of $136K/year, with most offers between $110K and $171K (10th–90th percentile: $92K–$206K).