Role Overview
In this role, you will build the technical foundation for the company’s AI transformation, designing and building the company-wide knowledge infrastructure and context layer that will power future AI applications. The role is highly hands-on and requires strong backend engineering skills, LLM application experience, product sense, and the ability to operate independently in a fast-moving, ambiguous environment.
Key Responsibilities
- Design and build the company-wide AI knowledge infrastructure, including the company wiki, internal knowledge base, retrieval layer, and context management system.
- Develop scalable LLM application architecture, including RAG pipelines, vector database integration, prompt workflows, API services, monitoring, and deployment.
- Own the end-to-end technical delivery of internal AI tools, from backend architecture and basic frontend integration to deployment, testing, and monitoring.
- Work closely with business, brand, PR, IR, and leadership stakeholders to translate ambiguous business needs into practical AI systems and technical roadmaps.
- Optimize system performance, including token efficiency, latency, caching strategy, retrieval quality, data architecture, and model inference flow.
- Evaluate and integrate AI coding tools, LLM frameworks, vector databases, and third-party APIs to improve development efficiency and product quality.
- Mentor junior engineers or interns when needed, and help establish technical standards, documentation practices, and reusable engineering workflows.
Requirements
- 4–7 years of backend engineering experience, including at least 2 years of hands-on LLM application development.
- Strong backend development skills in Python; experience with Node.js or Go is a plus.
- Solid computer science fundamentals, including algorithms, system design, database design, API architecture, distributed systems, caching, and performance optimization.
- Production-level LLM application experience, not limited to demos or prototypes. Experience should include prompt engineering at scale, model selection, inference pipeline design, or RAG architecture.
- Hands-on experience with RAG and vector databases such as Pinecone, Weaviate, or Chroma.
- Experience owning full engineering delivery, including backend services, basic frontend integration, API deployment, monitoring, and troubleshooting.
- Heavy day-to-day use of AI coding tools such as Cursor, Claude Code, or GitHub Copilot.
- Fluency in Mandarin and working proficiency in English are both required.
- Able to work independently from ambiguous instructions and make sound technical decisions without waiting for detailed specifications.