About Tres Vista
Tres Vista is a global enterprise whose business model is built to deliver enduring value. The firm combines best practices, technology-enabled execution, and industry-leading talent to drive meaningful results. By integrating advisory capabilities with scalable delivery, Tres Vista helps clients operate smarter and grow stronger. Its services include investment diligence, industry research, valuation, fund administration, accounting, and data analytics.
About the Department:
The Operational and Digital Excellence (ODEX) team is a newly established, cross-functional task force within Tres Vista, focused on driving firm-wide transformation. Our mandate is to reimagine and elevate the way we operate across internal support functions and client-facing teams through process optimization, technology enablement, and digital innovation.
ODEX sits at the intersection of strategy, operations, and technology. The team leads high-impact initiatives that improve efficiency, scalability, and client delivery. This includes applying principles of process re-engineering, automation, and data-driven decision-making, as well as exploring the responsible and practical use of AI.
It is a highly selective and agile team made up of high-performing individuals who bring a deep understanding of our business, high integrity, willingness to learn, and a strong appetite for change. Joining ODEX means stepping into a visible, high-impact role with the opportunity to shape the future of our operations and contribute to firm-wide excellence.
Role Overview:
We are looking for a Lead / Staff AI Engineer to own the architecture, standards, and long-term direction of our generative AI systems. This role sits above day-to-day feature delivery and below pure management.
You will design systems that scale across teams, ensure technical excellence, and turn generative AI from experimentation into a reliable, reusable capability for the organization. You will be responsible for building intelligent agents from the ground up, including prompt design, retrieval pipelines, and model fine-tuning, and for deploying them in a secure, scalable cloud environment. You’ll also implement caching strategies, handle backend integration, and prototype user interfaces for internal and client testing.
This role requires deep technical skills, autonomy, and a passion for bringing applied AI solutions into real-world use. This is a high-impact individual contributor role with significant technical authority.
Key Role Deliverables:
Define and own the end-to-end architecture for generative AI systems across multiple use cases and teams
Establish and enforce standards for RAG, agent architectures, prompt and version management, evaluation, observability, and deployment
Decide when to build, buy, fine-tune, or replace models, tools, and frameworks based on technical and business constraints
Design, evolve, and govern shared AI platforms, including reusable RAG pipelines, agent orchestration frameworks, prompt management systems, and evaluation/monitoring infrastructure
Drive reuse and standardization, eliminating one-off AI solutions and reducing long-term technical debt
Architect complex AI workflows, including multi-agent systems, tool orchestration, and long-running or asynchronous tasks
Design AI systems resilient to hallucinations, noisy inputs, partial failures, and model degradation
Optimize AI systems for latency, cost, reliability, scalability, and explainability at production scale
Lead technical design reviews, act as a technical authority, and unblock complex architectural and implementation challenges
Mentor and raise the technical bar for senior and junior engineers across the generative AI stack
Define and enforce guardrails for data security, privacy, compliance, and responsible AI usage
Proactively identify model risks, operational failure modes, and scaling bottlenecks
Translate long-term business and product goals into concrete, extensible AI platform capabilities
Design, build, and optimize retrieval-augmented generation (RAG) pipelines using vector databases (e.g., Qdrant, Pinecone, FAISS) to power semantic search and intelligent document workflows
Fine-tune and adapt LLMs using Hugging Face Transformers, LoRA/PEFT, DeepSpeed, or Accelerate where domain adaptation is required
Integrate knowledge graphs (e.g., Neo4j, AWS Neptune) into agent pipelines for enhanced context, reasoning, and relationship modeling
Implement cache-augmented generation strategies (semantic caching, Redis, vector similarity) to reduce latency, cost, and output inconsistency
Build and maintain scalable backend services using FastAPI or Flask, and support lightweight user interfaces or prototypes using Streamlit, Gradio, or React when needed
Monitor and evaluate model and agent performance using prompt testing, benchmarks, human-in-the-loop feedback, observability tools, and safe AI practices
Stay current with advancements in cloud platforms (AWS/GCP/Azure), LLMs, agentic frameworks, and AI infrastructure, and incorporate improvements where appropriate
Prerequisites:
Expert-level Python skills and strong software engineering fundamentals, including API development and service integration
Proven track record of designing and scaling AI systems used by real teams or clients
Deep, hands-on expertise with LLM APIs and open-source models; RAG architecture and vector search strategies; agent-based systems and tool calling; and prompt engineering at scale
Experience with model fine-tuning, adapters, or hybrid architectures
Strong background in distributed systems and API design; Docker, CI/CD, and cloud infrastructure; and async workflows, queues, and background processing
Experience implementing observability for AI systems (metrics, logs, tracing, cost monitoring)
Experience:
6–10 years of experience in AI/ML, with at least 2 years focused on large language models, applied NLP, or agent-based systems
Demonstrated ability to build and ship real-world AI-powered applications or platforms, preferably involving agents or LLM-centric workflows
Strong analytical, problem-solving, and communication skills
Ability to work independently in a fast-moving, collaborative, and cross-functional environment
Prior experience in startups, innovation labs, or consulting firms is a plus
Experience with AI governance, model audits, and compliance frameworks is a plus
Education: