When
May 8th, 2026
Where
Mercer Island Community & Events Center
Format
In Person
Event Highlights
Diverse range of technical & career development talks
2 Panel Discussions
3 Interactive Workshops (Space Limited)
Networking Activities
Topic highlights
Agentic AI in Practice
Applied Data Science Across Industries
Career Growth, Leadership, Layoffs and Navigating Uncertainty
Data Science for Social Good
For more details on speakers, talks, panels, mentoring sessions, and workshops:
2026 Conference Agenda
Thanks to our 2026 Sponsors!
2026 Conference Volunteers
Overall Leads
Content Team
Marketing Team
Sponsorship Team
Events Team
Ayesha Darekar - CoLead
Kamala Jagannathan - CoLead
Workshops Team
2026 Speakers
Akriti Chadda
Akriti Chadda is an applied machine learning scientist specializing in search, relevance and generative AI systems deployed at scale. Her work focuses on the full lifecycle of AI, from modeling and experimentation to production deployment, monitoring and long-term system reliability. In recent years, she has been deeply involved in agentic and generative AI systems, where non-determinism and autonomy introduce new technical and leadership challenges.
Beyond technical execution, Akriti is passionate about communication, mentorship and helping data professionals grow into thoughtful leaders. She frequently speaks about operating complex AI systems responsibly, aligning stakeholders around uncertainty and translating advanced ML concepts into practical, real-world impact.
From Models to Teammates: Operating, Monitoring and Trusting Agentic AI in Production
Anjali Viramgama
Building a side hustle with AI
Anjali Viramgama is a software engineer at Microsoft and the world's fifth-largest female tech creator, with over 500,000 followers across LinkedIn and Instagram. She has been featured on Forbes, Times Square, LinkedIn News and Adobe Live. She hosts women-in-tech events in Seattle for a 1000+ member community, has helped 600+ students get jobs, and has spoken at universities such as Stanford, Berkeley, and UT Austin. Her work sits at the intersection of engineering, education, and community, with a focus on making tech accessible for underrepresented women and first-generation students.
Catherine Nelson
Every LLM Call Counts: The Environmental Cost of AI, and How Data Scientists Can Reduce It
Catherine Nelson is an experienced data scientist and ML engineer, and the author of two O'Reilly books: Software Engineering for Data Scientists (2024) and Building Machine Learning Pipelines (2020). Previously, she was a Principal Data Scientist at SAP Concur, where she deployed NLP models to production and created innovative features including ML-powered carbon emissions analytics. She is currently consulting for startups on AI evaluation and developer relations. Catherine holds a PhD in Geophysics from Durham University and a Master's in Earth Sciences from Oxford University.
Shaili Guru
How to Work with Your PM (When They Don't Speak AI)
Shaili Guru is an AI product leader and educator with 10+ years of experience building AI products at Amazon, Disney, Nike, and T-Mobile. She currently teaches AI Product Management at the University of Washington's Global Innovation Exchange and runs Bluenox.ai, helping organizations and product teams adopt AI effectively. Her Substack newsletter, AI Product Management Guru, is read by over 4,000 PMs worldwide. Shaili holds a Technology Management MBA from the UW Foster School of Business and a BS in Biology from Baldwin-Wallace University.
Sridevi Wagle
Leveraging AI to Support Evidence-Based Wildlife and Permit Management
Sridevi Wagle is a Machine Learning Engineer at Pacific Northwest National Laboratory with a master’s degree in Computational Science. She has experience developing AI and machine learning tools for extracting and analyzing information from large-scale, multimodal scientific data. Her work includes building systems for knowledge retrieval and semantic search using advanced language models and data integration techniques. Sridevi’s research interests include explainable AI, uncertainty quantification, and visualization methods to support data-driven decision-making in scientific domains.
Hoda Soltani
When Time Tells: Using Sequence Modeling to Understand Transfer Student Retention
I am a civil engineer and data scientist with six years of professional engineering experience, three years of Data Science, and eight years of academic research focused on predictive modeling of complex dynamic systems. I hold a PhD in Civil Engineering, where my research applied system identification, time-series analysis, and state-space modeling to large-scale experimental data to study the seismic response of foundations and support infrastructure resilience. My work has been published in peer-reviewed journals and presented at international conferences and workshops.
Following my doctorate, I worked at Shannon & Wilson, a leading geotechnical consulting firm in Seattle, contributing to high-impact projects in the Pacific Northwest and San Francisco Bay Area, including seismic resilience analyses and large-scale numerical simulations for critical infrastructure. I later transitioned into data science, completing advanced training in computer science, machine learning, deep learning, and AI. I currently work as a data scientist in higher education, applying predictive modeling to student success and retention initiatives.
Erin Zionce and Sandy Rech
A Data Science Approach to Quantifying Fish Passage Through Dams, Assessing Fish Injury, and Advancing Fisheries Research
Erin Zionce is a Data Scientist at the Pacific Northwest National Laboratory with a background in fisheries ecology. Her research contributes to juvenile and adult fish passage studies by integrating ecological expertise with data science through statistical modeling, machine learning, and computational tools to support environmental science and hydropower systems management.
Sandy Rech is an Earth Scientist at Pacific Northwest National Laboratory, where she contributes to fish telemetry projects to study salmonid migration through dams. She has a background in computer science, mathematics, and oceanography, with previous work in mathematical modeling, oyster restoration, and ecological data management. She is passionate about integrating ecology and data science to address complex environmental challenges.
Rachel Wagner-Kaiser
AI Beyond English: Building Multi-Lingual and Non-English AI Solutions
Rachel Wagner-Kaiser has 15 years of experience in data and AI, entering the data science field after completing her PhD in astronomy. She specializes in building NLP and AI solutions for real-world problems constrained by limited or messy data. Rachel leads technical teams to design, build, deploy, and maintain NLP solutions, and her expertise has helped companies organize and decode their unstructured data to solve a variety of business problems and drive value through automation. Rachel is also the author of the recent book "Teaching Computers to Read" (http://amazon.com/dp/1032484357) and corresponding code companion.
Emma Rosenthal & Stephanie Chen
From Bots to Bookings: Agentic AI in the Real World @ Expedia
Emma Rosenthal is a Data Scientist at Expedia Group, where she works on the Checkout team, focusing on AI-driven insights, AI integrations into the checkout flow with ChatGPT, A/B testing, and data-driven product optimization. Prior to Expedia, Emma received her Master’s in Computer Science and Bachelor’s in Economics from the University of Chicago.
Stephanie Chen is a product analytics leader with over a decade of experience in financial services, payments, and technology. She spent seven years at JPMorgan Chase in credit card and digital payments, followed by four years at PayPal leading product analytics for P2P, Venmo, and charitable giving. At Expedia Group, Stephanie focuses on applying agentic AI to consumer travel products, embedding AI-assisted decisioning into experimentation and personalization. Her work advances analytics as an active decision partner and shapes how travelers discover options, evaluate trade-offs, and complete bookings with greater clarity and confidence.
Shikha Verma
From Individual Contributor to Data Leader: How to Unblock your team & Influence Strategy
Shikha Verma holds a Ph.D. in Machine Learning and has 5+ years of industry experience on high-performance, worldwide-scale projects in fraud detection, warehouse management, and promotion targeting across fintech, e-commerce, and healthcare. She is skilled in supervised and unsupervised machine learning algorithms, building end-to-end ML pipelines, applied statistics, Python, and SQL.
She has presented her research at various academic and practitioner conferences such as Grace Hopper Celebrations (India), the Women in Machine Learning workshop at NeurIPS & ICML, and the ACM Conference on Machine Learning and Human-Computer Interaction (2020). She has served as visiting faculty for courses on AI, ML, and business analytics across management institutes in India.
Anastasia Bernat
GeoAI for the Built Environment: Siting and Permitting
I am a Senior Data Scientist at the Pacific Northwest National Laboratory (PNNL) specialized in processing and modeling energy and Earth system data for impactful decision-making science. At PNNL, I am the lead architect of novel GeoAI data pipelines and manage several AI-driven and/or cloud-native applications for U.S. energy and environmental mission areas. This includes six research, agentic, and generative AI applications to streamline federal permitting reviews, a techno-economic simulator for advanced geothermal systems (GeoCLUSTER), and a U.S. energy feasibility mapper (GRIDCERF). Combining data science, computational modeling, and environmental science, I am also deft in geographic information systems (GIS) and statistical modeling used to better enable intelligent mapping and environmental monitoring analyses. My leadership has guided data product teams to deliver impact and value to sponsors across the Department of Energy, including earning project-level recognition by the White House in the “AI for Good” space and in “America’s AI Action Plan”.
Swapnil Agrawal
Soft Skills Are Not Optional: Why Early-Career Data Professionals Need Them Most
Hello, I’m Swapnil. I was born and raised in India and moved to the United States in 2018. I earned my BTech from the Indian Institute of Technology, Delhi, and my master’s degree from Carnegie Mellon University in Pittsburgh.
I’ve built my career as a Data Scientist across diverse organizations. I began at a startup in Pittsburgh, then spent nearly three years at Lubrizol Corporation in Houston, Texas. Currently, I work as a Data Scientist at Microsoft, specializing in product data science.
Outside of work, I enjoy painting, reading, and cooking. I’m very outdoorsy and love cycling, kayaking, hiking, and camping. I also have a five-year-old German Shepherd who keeps life busy, active, and joyful.
Nandita Krishnan
AI As Your Personal Data Science Intern
Nandita Krishnan is a Consultant-turned-Data Scientist who brings a unique blend of strategic thinking and technical expertise to her work. Currently part of Adobe's team, she focuses on enhancing user experience for flagship products such as Premiere Pro by uncovering user needs hidden within complex data.
Beyond her day-to-day role, Nandita is deeply curious about the evolving tech landscape and is constantly exploring and experimenting with the latest tools and technologies, expanding her skill-set and staying at the forefront of data science innovation.
Passionate about creating pathways for others, Nandita is also an active advocate for women in STEM. She regularly mentors aspiring Data Scientists and participates in speaking engagements to inspire the next generation in tech.
Ayushi Das
Taxonomy-Agnostic Hybrid Recommendation System for Procurement Classification
I am from Kolkata and have a strong academic foundation in mathematics and applied sciences. I completed my undergraduate and master’s degrees in Mathematics from Banaras Hindu University, followed by an M.Tech in Cryptology and Security from the Indian Statistical Institute, Kolkata. In 2023, I was selected for a six-month internship at Amazon, where I worked on applied machine learning problems at scale. In January 2024, I joined Amazon as a full-time Data Scientist in the AFT organization and subsequently transitioned to the FinAuto team. My current work focuses on building production-grade data science and Generative AI systems, including taxonomy-agnostic classification and supplier-aware recommendation solutions for enterprise procurement. Besides work, I love to cook, dance, and spend time with animals.
Livestream Exclusives!
These talks will be played on the livestream feed during breaks in the in-person conference. In-person attendees will be able to view them online after the event.
Ojasvi Khanna
Forecasting You: How Data Science Powers Personalized Marketing
I have been doing AI and ML modeling for Xbox for the past 4 years as a Data Scientist. My work helps create better marketing outputs, helping gamers play their next game faster! I went to UC Berkeley and enjoy skiing, tennis, and biking when the weather allows.
Sneha Sivakumar
Beyond the Prompt: Building Autonomous AI Agents for High-Stakes Adversarial Environments Such as Finance, Fraud & Abuse
Sneha Sivakumar is a Product leader at Amazon, where she leads Content Risk Moderation (CRM) for the self-publishing books business, Kindle Direct Publishing. Prior to Amazon, she worked at KPMG where she led the Technology Risk practice advising large public companies on mechanisms to quantify and mitigate technology, financial and social risk. She is experienced in building consumer and enterprise products that help organizations manage risk through the use of ML, automation and human inputs. Sneha holds a BS in Engineering from Anna University, India, an MS in Industrial and Systems Engineering from USC and an MBA from Kellogg School of Management.
Erin Wilson
Biomanufacturing for a better world
I am a data scientist pursuing a career at the intersection of computing, biology, and sustainability. My experience includes working at biotech companies like Amyris (engineering yeast to convert sugar into alternatives to petroleum-based products) and LanzaTech (converting carbon emissions to ethanol with bacteria), and completing a PhD in the Computer Science program at UW (using ML techniques to model DNA patterns in methane-eating bacteria). When I'm not nerding out about climate biotech, you can find me enjoying fresh air on PNW trails or rolling dice to explore D&D fantasy realms.
Harsheeta Venkoba Rao
Designing Reliable Agentic AI Systems: Design Patterns for Production
Harsheeta Venkoba Rao is a Founding software engineer at Gone.com with extensive experience in agentic AI, machine learning, and building reliable end-to-end software systems. She holds a master’s degree in Electrical and Computer Engineering, specializing in machine learning and data science.
Neelam Koshiya
Model Context Protocol (MCP): The Next Frontier of Generative AI
I'm a Principal Applied AI Architect at AWS with 17+ years of experience in architecture, including 10+ years focused on cloud and AI. As a thought leader in AI-driven solutions, I regularly speak at global tech conferences including AWS re:Invent, AWS re:Inforce, AWS Summits, NRF, and Grace Hopper Celebration (GHC), where I share insights on bridging the gap between AI and real-world business applications.
My expertise spans cloud architecture, generative AI, and retail innovation, making me a recognized voice in the industry. I'm the author of the published book AWS Solutions Architect Associate Certification Guide and a contributor to the Responsible AI Lens for the AWS Well-Architected Framework.
My work has been recognized with several prestigious awards, including the Advancing Women in Technology (AWT) 2023 Rising Stars Award, Success Quarterly 2025, the Globee Award for thought leadership in artificial intelligence, and the Global Recognition Award as a standout leader in the industry.
I'm passionate about helping organizations unlock the transformative potential of AI through practical, scalable solutions—from property inspection and document processing to customer experience enhancement and workforce productivity. I've successfully identified and prioritized AI use cases across diverse industries, including finance and real estate.
Aashreen Raorane
More Than a Retrain: How to Monitor, Diagnose, and Explain Drift in Production ML Models
Aashreen Raorane is a Senior Data Scientist specializing in analytics, machine learning, and cross-functional decision support. She holds a Master’s in Computer Science from the University of Southern California, where she focused on data science and applied ML. Her work centers on bringing clarity to complex problems and influencing strategic direction through practical, well-framed questions. She is passionate about sharing skills that help others think more clearly and lead more effectively in their roles.
Alisha Gala & Booma S Balasubramani & Jingyi Du
When (and When Not) to Leverage Agentic AI: Practical Lessons from Building Projects and Autonomous Data Workflows
Jingyi Du is a Principal Data Science Manager with over a decade of experience spanning software engineering and applied data science. Jingyi leads Windows engagement strategy and conducts large-scale experimentation at Microsoft, driving growth through data-driven decision frameworks. With a background in Computer Science and Decision Science, Jingyi has authored research presented at the Microsoft ML & Data Science Conference and delivered 50+ talks to audiences from technical teams to senior leadership.
Alisha Gala is a Senior Data Scientist at Microsoft with over eight years of experience across software engineering, applied data science, and AI-driven experimentation. Her work focuses on driving engagement and growth across Windows through metric design, large-scale experimentation, personalization and recommendation systems, and cohort-based behavioral analysis. Alisha partners closely with product and engineering teams to translate complex data into clear, decision-ready insights. She is a frequent speaker in technical and cross-functional forums, known for turning analysis into narratives that directly inform product strategy.
Booma Sowkarthiga Balasubramani is a Senior Data Scientist at Microsoft with deep expertise in AI and Data Science, built over a decade across academia, research, and large-scale industry applications. Holding a Ph.D. in Computer Science (Data Science) from the University of Illinois at Chicago, Booma has developed frameworks for ontology engineering, ontology matching, predictive modeling, and geospatial data analytics. At Microsoft, Booma leads work involving Copilot on Taskbar, predictive modeling, metric design, experimentation, and data-driven growth strategies for Windows. A seasoned speaker and educator, Booma has delivered talks at global conferences, featured sessions at developer events as well as academic events, and taught university courses, consistently making complex AI and Data Science concepts accessible to diverse audiences.
2026 Panels
From HIPAA rules to illegible clinic notes, the obstacles to data-driven innovation in the medical field are nearly as abundant as the data. These challenges are strewn across every subset of the health care and medical research industries. We’re looking to fill a panel with diverse speakers who can share insights into different aspects of the industry, the impediments they encounter, and how they adapt to deliver data solutions.
Kristin Mussar
Associate Director
Kristin is an associate director at Pfizer, where she leads a team of programmers building data pipelines to automate biomarker data quality control. This work improves data accessibility and accelerates decision‑making in clinical trials. She is passionate about data standardization and about preparing messy data for meaningful analysis. Kristin aims to create an environment where data‑driven decision making can thrive and to bridge the gap between raw data and the scientists eager to translate it into breakthroughs. Outside of work, she enjoys gardening, ceramics, painting, reading, and board games.
Vaishnavi Subramanian
Data Engineer
I am a Data Engineer and Machine Learning professional with over 8 years of experience building high-impact data solutions for industry leaders such as Microsoft, Fred Hutch Cancer Center, and T-Mobile. Currently, I am pursuing my Master's in Data Science and Analytics at Georgia Tech, where I specialize in architecting unified data environments and productionizing ML models for complex fields ranging from quantum computing to immunotherapy research.
Beyond my work with high-dimensional datasets, I am a passionate advocate for "Data Science for Social Good." I have volunteered my expertise as an NLP data scientist for the investigative news organization WhoWhatWhy and have conducted research for the Art of Living to scientifically validate the impact of meditation on stress reduction.
As a certified yoga instructor and a public school art volunteer, I pride myself on my ability to translate complex technical concepts into accessible, human-centric narratives. My goal is to use data not just for optimization, but as a tool to inspire action and drive meaningful social change.
Estelle Giraud
CEO & Co-Founder Trellis Health
Estelle Giraud is the founder and CEO of Trellis Health, a generational health AI company building the private health intelligence layer for families. She holds a PhD with distinction in population genomics, published dozens of scientific papers, and spent nearly a decade as a commercial operator at the forefront of precision medicine at Illumina, where she built a $400M business and led products used by more than 20 million people across clinical genomics and health technology. She left to fix a problem she saw from inside the system: no one is building health infrastructure that treats the family as the unit of care. Trellis reconstructs a decade-plus of medical history from 50,000+ provider sites nationwide in under a minute, and uses proprietary AI to deliver longevity guidance, at-home diagnostics, and ongoing support across generations. The company has grown 50% month-on-month since launch, is live across 49 states, and is backed by Palette Ventures, Swizzle Ventures, NextBlue, and the founders of Flo Health and Care.com.
Layoffs are a looming reality that we cannot afford to ignore. We aim to bring together voices from across the data community who have experienced, navigated, and overcome being laid off. We invite panelists who want to share their honest stories, practical strategies, and lessons learned about resilience, reinvention, and growth after a difficult setback.
Giselle Doolan
Chief of Staff for Data & AI
Strategic leader with 17+ years of experience in product and tech operations and strategy. Currently the Chief of Staff for Data & AI at Expedia Group. Previous experience includes Google, Vivint Smart Home, Ancestry.com and The New York Times.
Jessica Marx
Senior Data Scientist
Sr Data Scientist, Data Engineer, and ML Engineer with 8+ years of experience building production ML systems and data infrastructure.
At Nordstrom, co-launched Smart Markdown, the company's first dynamic pricing optimization model, and designed Nordycast, an open source internal ML Ops platform (subject of a talk at WiDS Puget Sound 2022).
At Textio, built core production NLP systems including BERT-based discrimination detection, led cross-functional data migrations, and launched the company’s first experimentation framework. Currently an AI/LLM Engineer at a stealth-mode startup.
Irina Virnik
Data Engineer
Data engineering leader with 15+ years of experience building data platforms and helping teams make better decisions. Currently a Principal Data Engineer at JumpCloud. Previously worked at OfferUp, Disney, and Visa, leading large-scale data and analytics initiatives across cloud environments.
2026 Workshops
Registered attendees will receive information on how to sign up for the workshops closer to the day of the conference. Keep an eye out for an email!
When AI “Works” but Still Fails: The Safety Problem
AI safety isn't abstract -- it's a judgment call that data scientists make every day, often without realizing it. In this interactive workshop, you'll work through five real-world scenarios involving AI systems that failed, behaved unexpectedly, or created harm despite working exactly as designed. Through small-group discussion, we'll surface why these situations are genuinely hard -- and why the disagreements matter. We'll close by connecting those practical tensions to CSA's Trusted AI Safety Expert (TAISE) certificate developed in collaboration with Northeastern University. No prior AI safety knowledge required.
Participation requirements: No laptop or software needed. Attendees should be prepared to discuss real scenarios in small groups.
Larry Hughes
Vice President of Research and Development at the Cloud Security Alliance (CSA)
Larry Hughes is the Vice President of Research and Development at the Cloud Security Alliance (CSA), where he leads global research programs focused on cloud and AI security. In this role, Larry oversees CSA's portfolio of research initiatives spanning AI Safety, the AI Controls Matrix (AICM), the Security Trust Assurance and Risk (STAR) program, including STAR for AI, and several dozen more.
With 30 years of experience in the security industry, including 10 years in security Governance, Risk, and Compliance (GRC), Larry brings deep practitioner expertise to CSA's mission of developing vendor-neutral guidance that helps organizations navigate the rapidly evolving landscape of cloud and AI risk. His work spans the full research lifecycle -- from working group formation and publication development to compliance automation and control mapping.
Larry holds the CCSK, CCSP, and CISSP certifications.
Anna Campbell McKee
Director of Training Programs at the Cloud Security Alliance (CSA)
Anna Campbell McKee is the Director of Training Programs at the Cloud Security Alliance (CSA), where she leads global education programs focused on cybersecurity best practices. She oversees CSA’s flagship certifications, the Certificate of Cloud Security Knowledge (CCSK) and the Certificate of Competence in Zero Trust (CCZT), which have received top industry honors, including recognition for Best Cloud Security Certification and Cutting-Edge Cybersecurity Training.
Anna is currently leading CSA’s newest initiative: the development of training and certificate programs, such as Trusted AI Safety Expert (TAISE), focused on AI safety and security to address the emerging governance and risk challenges of AI-enabled systems.
With over a decade of experience in cybersecurity, governance, data integrity, and public sector collaboration, Anna is known for translating complex technical topics into accessible, standards-driven education. She contributes to cybersecurity research, speaks on workforce development and emerging technology risks, and supports mentorship and civic leadership through her service on her local city Planning and Utilities Commission.
Anna holds an MBA from Western Washington University and undergraduate degrees in Biology and Integrated Science from the University of Washington.
Build an AI Quiz Generator: From Local Dev to SageMaker Endpoint with Kiro
In this 50-minute workshop, you'll build an AI-powered tool that generates multiple-choice quiz questions from machine learning research papers. Starting with Kiro — an AI-assisted development environment — you'll develop and debug the application locally, then move to AWS where you'll run the pipeline using Amazon Bedrock, evaluate output quality with an LLM-as-a-judge approach, and deploy it as a live API endpoint. A provisioned AWS environment is provided so you can follow along and continue experimenting for 24–48 hours after the session. Basic Python experience assumed.
Mingwei Shen
Senior Manager, Applied Science at AWS Training and Certification
Mingwei Shen leads Applied Science at AWS Training and Certification, driven by one question: as AI transforms jobs, what should people learn? A 10-year Amazon veteran who led global teams delivering >$100M/year in ML automation savings, he now focuses on upskilling over automation. He is an avid GenAI tinkerer.
Building your own Developer Advocate with Deep Agents and Elasticsearch
Learn how to implement a deep-thinking research assistant in Python using LangChain’s Deep Agents and Elasticsearch. This session walks through building a sub-agent in a prebuilt agentic harness application with tools for vetted research and comparative analysis, perfect for anyone exploring AI-powered workflows.
Justin Castilla
Senior Developer Advocate @ Elastic
Justin Castilla started his Software Engineering career as a Web Development Boot Camp Instructor where he developed a passion for exciting others with new concepts and empowering individuals with the tools needed to excel in their own right. As an Advocate at Redis, Justin created numerous videos breaking down Data Structures into easy-to-understand, relatable examples with real-world use cases. Now at Elastic, he has expanded into the realm of enhanced search, monitoring, and observability capabilities.
Using AI to Improve Data Science Workflows
This workshop explains the shift from AI as a line-by-line coding assistant to autonomous data science agents that take a high-level goal, search data assets, plan and run multi-step notebook workflows, retry and fix errors, and produce summaries and visuals—inside Databricks’ governed lakehouse. We contrast the old manual orchestration loop with agents that combine an LLM, tools (SQL, notebooks, UC, serving), and orchestration frameworks, grounded by UC semantics and lineage, session memory, and MLflow / AI Gateway observability. We describe how to reuse curated prior notebooks and pipelines, encode standards and domain logic (instructions, UC functions, RAG), and treat evaluation as a first-class system. We demo an example (enterprise churn) where humans set constraints and validate while agents do the heavy analysis, arguing this raises productivity and governance and elevates data scientists to designers and stewards rather than replacing them.
Ginger Holt & Grace Yang
2026 Career Mentorship Sessions
Tiffany Dedeaux
Leading When the System Is Changing: Human Skills for Technical Leaders in Uncertain Times
Data science is practiced inside systems that are constantly evolving — reorganizations, new technologies, shifting expectations, and accelerating timelines. While tools and models change quickly, the human demands placed on professionals often go unnamed: leading without authority, navigating ambiguity, and making ethical decisions when certainty is unavailable.
This session centers the human skills required to lead well when the system itself is changing.
Drawing from leadership development, organizational change work, and lived experience supporting professionals in high-impact environments, this session explores how data practitioners can cultivate discernment, clarity, and steadiness amid ongoing transformation. This session offers practical leadership lenses that help individuals remain effective during periods of uncertainty.
Participants will learn how unspoken expectations create invisible pressure, and explore how narrative, boundaries, and ethical self-trust function as stabilizing forces in rapidly shifting environments. The emphasis is not on “doing more,” but on leading with greater intentionality and sustainability.
Learning Objectives:
Language to name what is actually difficult about leading through change
A simple framework for navigating uncertainty without burning out
Greater clarity about how their presence and decisions influence systems, even without formal authority
This session is designed for women and gender-diverse professionals in data and adjacent technical fields who are stepping into influence — whether or not their role formally reflects it. It complements technical learning by strengthening the human foundations that allow professionals to adapt, communicate, and lead with integrity as the landscape continues to evolve.
About our Mentor:
Tiffany Dedeaux is a Master Certified Coach (MCC) and leadership development practitioner working at the intersection of organizational change, professional identity, and career transition. She is the founder of **[Sacred Time](https://sacred-time.com/)**, where she partners with professionals navigating complexity, ambiguity, and evolving systems — including leaders and practitioners in technical and data-adjacent fields.
With over 15 years of experience, Tiffany supports individuals and groups as they step into influence, clarify their professional narratives, and lead through change with integrity. Her work prioritizes discernment, clear thinking, and sustainable leadership presence over performance for performance’s sake.
Tiffany holds a Master of Arts in Ecopsychology and Cultural Transformation, grounding her work in a systems-level understanding of how people, roles, and environments shape one another. In addition to her coaching practice, she has held senior volunteer and governance leadership roles within professional associations, leading through restructuring, crisis response, and strategic realignment.
She is a frequent speaker and facilitator for conferences and professional communities, including Women in Data Science events, PyData chapters, and career-focused organizations. Tiffany’s work resonates especially with women and gender-diverse professionals navigating transition, visibility, and influence in data-driven and technical environments.
Rethinking Career Planning and Technical Resumes in the Age of AI
AI has changed both the technology job market and how resumes are screened, with employers relying on a mix of applicant tracking systems, human reviewers, and AI‑assisted tools—while many candidates now use AI to generate generic, look‑alike resumes.
This interactive session helps data professionals rethink how they present their work. Instead of long lists of tools and duties, participants will learn how to communicate outcomes, real‑world impact, and capabilities in ways that make sense to both technical and non‑technical reviewers.
Attendees will practice articulating their unique capabilities—the combination of skills, talents, interests, and knowledge that drive their best work—so they can craft resumes and LinkedIn profiles that stand out in an AI‑influenced hiring environment.
Learning Objectives:
In a competitive job market, explore practical career paths that help them build strong data skills over time and move confidently toward a data science role.
Identify the specific capabilities that differentiate them as data professionals and express those clearly on their resumes.
Tailor resume content for different data‑oriented roles (data science, analytics, data engineering, data management, AI/ML, etc.) by providing context for projects and results.
About our Mentor:
Jennifer Hay is a career coach and resume writer specializing in technology, data, and analytics careers. As the founder of Tech Career Services and IT Resume Service, she combines technical expertise with career development experience to help clients define goals, create actionable plans, and present their strengths with confidence. She uses a proprietary career assessment and planning methodology (STIK) to guide students, recent graduates, and mid-career professionals in navigating the tech job market. Jennifer is certified in IT Resume Writing (CRS+IT), Student Career Coaching (CSCC), and holds CBIP credentials in Data Analysis and Business Analytics.
Jennifer Hay
Sally Revell
Speak Up, Lead Boldly: The Confidence Toolkit for Women in Tech
Confidence is not something you’re born with. It’s a skill you can build. In this interactive 45-minute session, you’ll learn practical strategies to strengthen confidence in your career by developing emotional intelligence, recognizing your inner saboteur, anchoring to your values, and taking small, brave actions before you feel fully ready. Through reflection and hands-on exercises, you’ll leave with tools you can use immediately to speak up, navigate pressure, and lead with more clarity and self-trust.
About our Mentor:
Sally Revell coaches women in tech on confidence, clarity, and courage under pressure. A former executive at AWS, Google Cloud, and Intuit, she's led global teams, navigated reorgs, and launched AI products at scale. Sally is Co-Active trained and combines 25 years of lived leadership experience with executive coaching rigor. She founded Generative Human to help leaders build self-trust and lead without burning out. Based in Kirkland, Washington.
2026 Abstracts
Keynote
From Models to Teammates: Operating, Monitoring and Trusting Agentic AI in Production
Akriti Chadda - Senior Applied Scientist
Agentic AI systems are changing not just how machine learning works, but how teams think, communicate and make decisions. Unlike traditional models, agents plan, act, and adapt over time and often in non-deterministic ways. This creates a new leadership challenge: how do you build trust in systems that don't behave predictably and how do you explain their risks, limitations and failures to non-technical stakeholders?
This talk focuses on the human and organizational skills required to operate agentic AI in production. It explores how to communicate uncertainty, set realistic expectations and influence decision-making when metrics are incomplete and failures are subtle. Attendees will learn practical frameworks for framing agent behavior, aligning cross-functional teams and pushing back on over-automation when agents are the wrong abstraction.
Grounded in real-world production experience, this session helps data professionals grow beyond technical execution into thoughtful leadership, equipping them to guide teams, stakeholders and organizations through the complexities of deploying agentic AI responsibly and effectively.
Invited Speakers
Every LLM Call Counts: The environmental cost of AI, and how data scientists can reduce it
Catherine Nelson - Data Scientist, ML Engineer, Author, Consultant
AI comes with a big environmental cost. Training and serving AI models consume vast amounts of electricity, water, and raw materials. Already, AI accounts for around 15% of data center energy usage, and the energy demands of AI are projected to double by 2030. But as data scientists using AI, there are some things we can do to reduce our environmental footprint.
In this talk, I'll summarize the latest data on AI's environmental impact, looking in particular at OpenAI, Google, and Anthropic. I'll also highlight what data the providers aren't disclosing, and why that lack of transparency makes it harder for us to make good choices.
In the second half of the talk, I'll give you actionable steps you can take to reduce the impact when you're using an AI model. I'll show you techniques including prompt optimization, model selection strategies, and caching that reduce both environmental impact and costs. I'll also talk about how good evaluation data is essential for these. You'll learn which models are most efficient, how model choices affect emissions, and you'll gain practical knowledge to make more sustainable choices.
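As a rough illustration of the caching idea mentioned above, here is a minimal sketch of an exact-match response cache. It is not taken from the talk: `CachedModel`, `fake_model`, and the whitespace/case normalization are all assumptions made for the example, and a real setup would swap `fake_model` for an actual model client.

```python
import hashlib


class CachedModel:
    """Wraps a hypothetical prompt -> response function with an exact-match cache."""

    def __init__(self, call_model):
        self._call_model = call_model  # any function taking a prompt string
        self._cache = {}
        self.model_calls = 0  # calls that actually reached the model

    def ask(self, prompt: str) -> str:
        # Assumption: prompts differing only in whitespace or case may share an answer.
        normalized = " ".join(prompt.lower().split())
        key = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self.model_calls += 1
            self._cache[key] = self._call_model(prompt)
        return self._cache[key]


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call, used only to keep the sketch runnable.
    return f"answer to: {prompt.strip()}"


model = CachedModel(fake_model)
first = model.ask("What is model drift?")
second = model.ask("  what is   MODEL drift? ")
print(model.model_calls)  # only the first question reached the model
```

Every cache hit is a model call that never runs, so the saving scales with how often users repeat questions. Whether case-insensitive matching is safe depends on the application.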
How to Work with Your PM (When They Don't Speak AI)
Shaili Guru - AI Product Leader and Educator
You've built something promising. You understand the technical tradeoffs. But your PM keeps asking for timelines you can't commit to, scope that doesn't make sense, or success metrics that miss the point.
After more than a decade on the PM side, I can tell you they're not being difficult on purpose. Data science projects just don't follow the rules they learned managing traditional software. That mismatch causes real problems.
I keep hearing the same frustrations from data scientists. Being treated like a request machine. PMs who have no clue what's easy versus hard in ML. Agile processes that try to cram research into two-week sprints. Work that stays invisible until it ships.
But here's what most data scientists don't see. Your PM is getting squeezed too. They're being asked to produce roadmaps, defend your project against competing priorities, and translate your work for leadership (usually without the context they need to do it well).
This talk is about bridging that gap. I'll share what PMs are actually worried about when they push for certainty, why they frame things the way they do, and what helps them advocate for your work when you're not in the room.
We'll cover how to reframe uncertainty as risk, how to make exploration-phase work visible, and how to build a real partnership with your PM. Not just a transactional one.
You'll leave with approaches you can use in your next sprint planning: language that lands, ways to build trust, and how to educate without being condescending.
Leveraging AI to Support Evidence-Based Wildlife and Permit Management
Sridevi Narayana Wagle - Machine Learning Engineer, Pacific Northwest National Laboratory
The Hanford Site played a central role in the Manhattan Project, producing an extensive corpus of scientific, engineering, and operational records spanning multiple decades. These materials, ranging from technical reports and engineering drawings to photographic documentation, are essential for contemporary nuclear research, environmental remediation, and historical analysis. While this corpus is publicly accessible through the DOE Declassified Document Retrieval System (DDRS), its analytical value is significantly constrained by inconsistent metadata, limited document-level indexing, heterogeneous file formats, and the lack of full-text search capabilities.
We present a scalable AI-based framework for multimodal archival exploration that integrates semantic search, automated metadata enrichment, and interactive large language model (LLM) interfaces. Using AWS Bedrock embeddings in combination with the Claude 3.5 Sonnet model, our system extracts structured entities, infers relationships, generates technical summaries, and supports conversational querying over text and image-based content. The pipeline processed approximately 1.5 TB of legacy data, including 4 million TIF files, over 70,000 images, and 1,300 PDF documents. Automated deduplication, document reconstruction, and page-level segmentation enabled fine-grained indexing and embedding of previously inaccessible technical details.
The resulting multimodal search platform supports fuzzy matching, retrieval, and contextual filtering, allowing users to locate specific chemical compounds, process descriptions, construction specifications, or equipment references embedded deep within scanned reports or imagery. The AI-driven interface dynamically generates follow-on research questions and interactive knowledge graphs that expose cross-document linkages, enabling new forms of exploratory analysis across historical nuclear workflows and environmental impact data.
This work demonstrates a methodology for transforming complex, low-accessibility scientific archives into AI-ready knowledge systems. Beyond Hanford, the approach establishes a technical foundation for applying advanced AI-driven discovery to other unique DOE collections, accelerating research, improving archival usability, and supporting future innovation.
When Time Tells: Using Sequence Modeling to Understand Transfer Student Retention
Hoda Soltani - Civil Engineer and Data Scientist, University of Oklahoma
This talk describes an end-to-end predictive analytics framework for modeling student dropout, with a focus on transfer students at a four-year university. Student dropout remains one of the most multifaceted and pressing challenges in higher education, arising from a complex interplay of academic, social, economic, and institutional factors that limit both individual potential and broader social mobility.
This study conducts school-level retention prediction using university-specific administrative datasets that are not publicly available. Transfer students--those who have previously earned academic credit at another postsecondary institution--represent a large and academically diverse population whose successful integration into four-year institutions requires timely, evidence-based support informed by both historical academic pathways and early university performance.
The session presents a comprehensive predictive framework examining the academic histories and first-term outcomes of transfer students admitted to Engineering, Business, and Arts and Sciences over a three-year observation period. Drawing on principles from educational data mining, the analysis incorporates multidimensional features including sociodemographic attributes, pre-transfer coursework, enrollment intensity, academic load, financial aid, campus employment, and early indicators of academic engagement.
The modeling pipeline integrates supervised learning for binary classification (retained vs. non-retained), clustering methods to identify latent student subpopulations, and model-interpretation tools to support transparency. Central to the framework is the use of sequence modeling techniques--such as recurrent neural networks, gated recurrent units, and attention-based architectures--to capture temporal dependencies in students' academic trajectories. Rather than relying on static or summary-based features, these models learn patterns across semester-by-semester course enrollments and performance, enabling more accurate and earlier identification of dropout risk by modeling the order, timing, and evolution of academic behaviors. Methodological challenges, including class imbalance, overfitting, and domain-informed feature engineering, are explicitly addressed.
A Data Science Approach to Quantifying Fish Passage Through Dams, Assessing Fish Injury, and Advancing Fisheries Research
Erin Zionce - Data Scientist, Pacific Northwest National Laboratory
Sandy Rech - Earth Scientist, Pacific Northwest National Laboratory
Dams disrupt the natural life cycles of migratory riverine fish, posing significant challenges to their survival. Addressing these connectivity issues requires interdisciplinary collaboration between biologists and data scientists. At Pacific Northwest National Laboratory (PNNL), researchers integrate ecological expertise with data science to study fish passage and survival. PNNL’s fish passage projects focus on anadromous fish species such as salmonids, using various tagging methods (e.g., radio telemetry (RT), balloon-tagging) and injury assessments to analyze migration through dams. Two studies conducted at U.S. Army Corps of Engineers operated dams – Mud Mountain Dam (MMD) in Washington State and Foster Dam in Oregon – used RT to evaluate fish passage and survival. At MMD, adult Chinook salmon (Oncorhynchus tshawytscha) implanted with RT tags were tracked as they returned to spawning grounds from the ocean via a Fish Passage Facility. At Foster Dam, RT-tagged juveniles were monitored to evaluate survival rates and travel times during ocean-bound migration. Efforts to automate fish tracking and streamline data analysis aimed to reduce manual data processing. However, challenges like the noisy nature of RT data and unpredictable fish behavior required tailored algorithms to ensure accurate results. A third study conducted at Howard A. Hanson Dam (HAHD) in Washington State used balloon-tagging with complementary injury assessments to evaluate the biological consequences of dam passage through specific routes. Traditional injury assessment methods at HAHD rely on intensive fish handling and manual assessment, introducing potential human bias, variability, and stress to fish. As an extension of this study, a proof-of-concept approach leveraging AI-driven image analysis was developed to automate and standardize injury assessments, while reducing human bias and minimizing stress to fish. 
Together, these three projects demonstrate the importance of interdisciplinary research to improve the evaluation of fish passage and survival, and to support the conservation of salmonid populations.
AI Beyond English: Building Multi-Lingual and Non-English AI Solutions
Rachel Wagner-Kaiser - Director, NLP Data Scientist
We will address the core challenges technical teams face when dealing with non-English languages in building effective AI solutions, reinforced by real-life examples. We will outline the complexity of non-English data, from tackling non-Latin character sets and low-resource languages to the practical hurdles of transforming unstructured data (like images and audio) into usable text. We will also go into the options for different technical approaches, including topics such as the complexity of language detection and cross-language processing techniques. The session will also analyze the current role and limitations of LLMs across diverse languages. We will conclude with best practices for designing and deploying high-performance, multilingual NLP systems that deliver value for practical business use cases.
Developing hybrid KG-LLM solutions for reliable information extraction
Anahita Pakiman - Senior Knowledge Graph Engineer & Semantic Architect, Amazon
Qualification processes in industrial settings require accurate, equipment-specific inspection criteria from technical documentation and perfect execution to ensure deliverable quality and minimize post-launch downtime and claims. This presents a challenging science problem: how to extract and generate reliable inspection recommendations from heterogeneous data sources, without hallucinations.
We developed a hybrid knowledge graph (KG)-LLM solution that addresses fundamental limitations of LLM-only approaches. Our initial LLM-only attempts exhibited significant hallucinations, generating unreliable inspection values and recommendations that couldn't be validated against domain constraints.
Our methodology employs a domain-specific KG that captures semantic relationships between equipment types, failures and inspection requirements. This domain-specific KG extracts and links entities from diverse historical data sources, including failures and unstructured technical documentation, creating a comprehensive semantic network for constrained generation.
By using graph patterns to constrain LLM inputs, we transformed the task from open generation to structured information insertion, significantly reducing hallucinations. Results demonstrate substantial improvement in inspection recommendation accuracy and consistency, while maintaining extraction efficiency. The methodology offers generalizable findings for bridging structured and unstructured data in domains requiring high precision in AI outputs.
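To make the idea of constraining LLM inputs with graph patterns more concrete, here is a toy sketch. It is not the speakers' system: the triples, the relation names (`has_failure_mode`, `requires_inspection`), and the helper functions are all hypothetical, and the example stops at building the prompt rather than calling any model.

```python
# Toy knowledge graph as (subject, relation, object) triples. All values are made up.
TRIPLES = [
    ("pump_A", "has_failure_mode", "seal_leak"),
    ("seal_leak", "requires_inspection", "check seal pressure"),
    ("pump_A", "has_failure_mode", "bearing_wear"),
    ("bearing_wear", "requires_inspection", "measure vibration"),
    ("conveyor_B", "has_failure_mode", "belt_slip"),
    ("belt_slip", "requires_inspection", "check belt tension"),
]


def inspections_for(equipment):
    """Match the pattern: equipment -> failure mode -> required inspection."""
    failures = [o for s, r, o in TRIPLES if s == equipment and r == "has_failure_mode"]
    return [
        (failure, o)
        for failure in failures
        for s, r, o in TRIPLES
        if s == failure and r == "requires_inspection"
    ]


def build_prompt(equipment):
    """Build a prompt that exposes only facts retrieved from the graph."""
    facts = [f"- failure: {f}; inspection: {i}" for f, i in inspections_for(equipment)]
    return (
        f"Write an inspection checklist for {equipment}.\n"
        "Use ONLY the facts listed below; do not add values or steps.\n"
        + "\n".join(facts)
    )


print(build_prompt("pump_A"))
```

Because the model only ever sees facts that matched the graph pattern, its job shifts from recalling inspection values to arranging values that were already validated.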
From Bots to Bookings: Agentic AI in the Real World @ Expedia
Emma Rosenthal - Data Scientist, Expedia Group
Stephanie Chen - Senior Manager, Data Science, Expedia Group
Agentic AI is reshaping how we build and interact with data systems - and at Expedia, we're harnessing its power to redefine both customer experiences and internal workflows. In this session, we'll share a two-fold perspective on practical implementations of agentic AI in industry.
First, we'll explore how Expedia is integrating conversational AI into our checkout experience. By enabling users to connect Expedia with ChatGPT, travelers can browse products and book trips directly through AI-driven workflows. We'll discuss the architecture behind this integration, the strategies for measuring AI-driven user behavior, and the challenges and opportunities of embedding agentic AI into a high-stakes e-commerce environment.
Second, we'll turn inward to examine how AI is transforming the way data scientists work. From intelligent agents that automate repetitive tasks to workflow optimizations and productivity "hacks", we'll showcase how AI tools are accelerating analytics, improving decision-making, and freeing teams to focus on high-value insights. Attendees will gain practical ideas for leveraging AI in their own organizations--whether to enhance customer-facing products or to streamline internal processes.
Join us for a candid look at the promise and limitations of agentic AI in real-world applications, and learn how Expedia is navigating this rapidly evolving landscape to deliver smarter experiences for travelers and data professionals alike.
From Individual Contributor to Data Leader: How to Unblock Your Team & Influence Strategy
Shikha Verma - Senior Manager, Analytics, Toast
The transition from individual contributor to manager is one of the most challenging career shifts in data science--particularly for women, who represent only 26% of data science roles and an even smaller fraction of technical leadership in the US. This gap widens at the management level, where many talented women ICs hesitate to pursue leadership or struggle with the transition because the playbook is unclear.
This session shares hard-won lessons from my journey as a PhD-trained data scientist turned manager of a team of 5. I'll address the identity crisis many of us face: "If I'm not the best technically anymore, what's my value?" and provide a concrete roadmap for becoming a leader who is technical enough to unblock your team and strategic enough to influence the roadmap.
You'll walk away with clear frameworks to apply immediately:
-- The 70-20-10 rule: How to evolve your time allocation across technical work, enablement, and strategy
-- The "Am I the Bottleneck?" test: Weekly self-assessment to identify where you're helping vs. hindering
-- The "Strategic Value" filter: Prioritization framework for ruthless decision-making
-- The "Technical Enough" checklist: Know when to dive deep vs. delegate
Actionable insights on:
-- What to unlearn from the IC mindset (value = output → value = team's multiplied impact)
-- Where to stay technical for high leverage (design reviews, unblocking) vs. where to let go (being the fastest coder)
-- How to build strategic influence through stakeholder mapping, translating analytics to business language, and saying no effectively
This is for any woman in analytics considering leadership, newly managing, or struggling with the IC-manager balance. Leave with a clear mental model and practical tools to accelerate your transition and lead with confidence.
GeoAI for the Built Environment: Siting and Permitting
Anastasia Bernat - Senior Data Scientist, Pacific Northwest National Laboratory
How do we make sure AI doesn't get lost in space, especially when agencies need to make coordinated decisions on a myriad of environmental reviews for projects planned on U.S. lands? Too often these projects are delayed or over budget due to poor coordination with a variety of federal, state, and local laws dependent on the geographic location of the project site. However, geospatial artificial intelligence (GeoAI) has the potential to transform the pace and precision of permitting. PermitAI is a multimodal large language model testbed led by the Pacific Northwest National Laboratory that uses GeoAI to streamline the environmental permitting review process by turning millions of permitting maps into structured geointelligence. Digitization efforts focus on turning vast document and map repositories amassed through the National Environmental Policy Act (NEPA) into a spatially coherent Geographic Information System (GIS) dataset that charts decades of environmental review across agencies, scales, and formats. This includes capturing key geospatial data that agencies fundamentally rely on to scope baseline environmental conditions, communicate alternatives, weigh footprint constraints, and track mitigations. By then enriching, automating, and generating geospatial data from vast and heterogeneous government georegistries, GeoAI can rapidly build cohesive spatial reasoning for streamlined interagency coordination. This presentation will highlight a GeoAI data pipeline that is transforming how agencies geovisualize and analyze permitting data, reducing time spent navigating static documents and setting the foundations for integrating historic NEPA GIS layers into modern, information-rich digital permitting platforms and decision-support systems.
Soft Skills Are Not Optional: Why Early-Career Data Professionals Need Them Most
Swapnil Agrawal - Data Scientist, Microsoft
A common belief among early-career professionals is that soft skills are something to worry about later, once you become a manager or a leader. Early on, the focus is often placed solely on technical excellence: writing better code, building better models, and delivering accurate results. While technical skills are essential, they are only the starting point.
In reality, soft skills matter more than ever at the beginning of a career. Early-career data professionals frequently work in ambiguous environments, collaborate across teams, and translate complex insights to non-technical stakeholders. Without strong communication, collaboration, and storytelling skills, even the best analysis can fail to influence decisions or create impact.
This talk challenges the myth that soft skills are only relevant for managers and leaders. Drawing from real experiences transitioning from entry-level to mid-level roles, the session demonstrates how early investment in communication, influence, and collaboration accelerates career growth, increases visibility, and builds trust with stakeholders.
Attendees will learn practical techniques to structure compelling data stories, align analysis with business goals, handle pushback on insights, and influence decisions without formal authority. The talk also explores how to demonstrate leadership behaviors--such as ownership, clarity, and empathy--regardless of title.
By reframing soft skills as career accelerators rather than optional extras, this session equips early-career data professionals to maximize their impact, navigate complex organizations, and grow faster and more intentionally in their careers.
AI As Your Personal Data Science Intern
Nandita Krishnan - Data Scientist
Many data professionals find themselves spending more time debugging AI-generated code than they would have spent writing it themselves, defeating the entire purpose. And then there are those who have abandoned AI tools entirely after repeated, frustrating experiences. This gap between AI’s potential and its practical application in real-world data science work remains frustratingly wide.
This talk bridges that gap by providing practical strategies for leveraging agentic AI, such as Cursor and Claude Code, as your ‘personal intern’: one that can actually excel when you give the right amount of supervision and guidance. Drawing from hands-on experience implementing AI tools for building machine learning models, creating automated pipelines, and generating analysis visualizations, I’ll share concrete strategies that separate productive AI use from time-wasting rabbit holes.
You’ll learn how to identify which tasks benefit most from AI assistance and which are better done manually. I’ll demonstrate prompt and context engineering techniques, including dos and don’ts, to help you avoid common pitfalls. We’ll explore how to establish verification workflows that catch errors early, implement guardrails that prevent catastrophic mistakes, and create feedback loops that improve AI output over time. I’ll also talk about how to leverage built-in memory features to maintain context across sessions, configure custom rules that enforce your coding standards automatically, and use context files to give AI the proper background knowledge for your specific projects. This is about making agentic AI a reliable partner in your daily work, not just another technology to manage.
Forecasting You: How Data Science Powers Personalized Marketing
Ojasvi Khanna - Data Scientist
From the ad-sponsored content we scroll past, to the products we are shown, to the emails that land in our inboxes, personalized ads are quietly influencing countless everyday customer buying decisions. Personalization is no longer a niche application of data science--it is the backbone of modern digital marketing. Today, the $650 billion industry of personalized marketing is being rapidly reshaped by AI and is projected to grow beyond $1.5 trillion by 2035.
While personalized systems often feel intuitive or even magical, the reality is more nuanced. At its core, personalization is a forecasting problem: making informed but uncertain predictions about what future you will do. Understanding this framing helps demystify why personalized marketing works when it does--and why it sometimes fails.
This talk breaks down personalized marketing from the ground up. It explains what data science models power these systems, and how recent advances in AI are accelerating both their scale and their impact. The session covers foundational ideas, modeling approaches, and emerging AI-driven use cases, offering a high-level understanding of how data scientists and AI professionals model, execute, monitor, and evaluate such personalized marketing campaigns in industry.
Lastly, this talk also tackles an important but often overlooked question: why better models and more powerful AI do not always lead to better outcomes. Such challenges reveal a critical truth for data professionals--performance improvements on paper do not always translate into healthier, more meaningful real-world impact.
Although the examples focus on personalized marketing, the lessons extend far beyond it. This session is designed to equip data professionals with a clearer, more grounded understanding of how AI-driven personalization works--and how to build predictive systems that are not just smarter, but more thoughtful and responsible.
More Than a Retrain: How to Monitor, Diagnose, and Explain Drift in Production ML Models
Aashreen Raorane - Senior Data Scientist
Model degradation in production rarely comes from a single failure--it emerges through subtle shifts in data, upstream pipelines, or behavioral patterns. In real-world environments, a retrain alone doesn't fix these issues. What teams need is a systematic way to detect, diagnose, and explain drift.
This session presents a practical, tool-agnostic framework for understanding model drift based on lessons learned from validating and comparing production models. We will cover:
Detection: identifying feature drift, prediction distribution changes, and version-to-version inconsistencies through input/output checks
Diagnosis: tracing issues to upstream data shifts, schema changes, data quality problems, or model logic mismatches
Explanation: translating technical findings into clear narratives for stakeholders to support retraining, rollback, or remediation decisions
Attendees will gain actionable techniques for monitoring model health and ensuring ML systems remain accurate, stable, and trustworthy over time--without requiring advanced ML Ops infrastructure.
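As one possible illustration of the detection step, the sketch below computes a Population Stability Index (PSI) between a baseline sample of a feature and a recent sample, using only the Python standard library. PSI is a common drift statistic but is not named in the abstract, so treat the choice of metric, the function name `psi`, the equal-width binning, and the synthetic data as assumptions.

```python
import math


def psi(expected, actual, bins=10):
    """Population Stability Index of `actual` relative to `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp values outside the baseline range
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data: the "production" sample is the baseline shifted upward.
baseline = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in baseline]

print(psi(baseline, baseline))  # identical samples give no drift signal
print(psi(baseline, shifted))   # a large shift gives a large PSI
```

A frequently quoted rule of thumb treats PSI above roughly 0.25 as a significant shift, but thresholds should be calibrated per feature rather than taken as fixed.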
Beyond the Prompt: Building Autonomous AI Agents for High-Stakes Adversarial Environments Such as Finance, Fraud & Abuse
Sneha Sivakumar - Product Leader
This talk provides practical, proven methods for building AI agents for real-world, high-stakes business processes such as combating fraud. It moves past simple chatbots to explore how autonomous agents can reason, pivot, and act to stop bad actors in real time, how to integrate agents with existing systems, and how to prepare your data for a successful launch. The talk will cover (1) how to choose a business problem ready for the agentic revolution, (2) data preparation and labeling, (3) achieving high (99%) precision, and (4) launching and learning from the agent's outcomes.
Biomanufacturing for a better world
Erin Wilson - Data Scientist
Industrial biomanufacturing. The phrase doesn’t quite evoke “social good” imagery, but with a little more context, I hope it will next time you hear it! Biomanufacturing is a sector of industry that aims to produce valuable materials by harnessing the vast catalog of molecules made by Nature. While these molecules already exist in natural forms, many require painstaking, unsustainable, extractive processes to isolate key ingredients at scale. We can be more clever: by taking genetic instructions from organisms that naturally produce a useful molecule and installing those instructions in a microbe, we can grow tanks of microbes that “brew” such molecules instead. An ecosystem of biomanufacturing companies is already growing: many feed microbes with renewable inputs, like sugar, while others capitalize on waste streams from other industries, such as gaseous carbon emissions.
To make a dent in climate change, biomanufacturing needs to get big. Industrial scale! While steel pipe networks wrapping around building-size bioreactors may contrast with the typical leafy green motifs of environmental sustainability work, industrial biomanufacturing is poised for social impact. We can reduce environmental harms caused by agricultural land use and pollution, provide alternatives that displace fossil carbon-based products, and even capture and repurpose carbon emissions before they enter the atmosphere. Biomanufacturing beautifully blends the mechanical and microbial for sustainability.
Many biomanufacturing approaches are maturing, but most are critically held back by underdeveloped data practices. We need better measurement equipment that can detect small-but-critical changes in biological systems; software that can predict and alert when such changes will trigger upsets to operations; and deep domain experts who can ALSO troubleshoot with effective data analysis and visualization. Properly applied data science and engineering can help create a clearer window into the complexities of industrial biomanufacturing, and accelerate the field’s progress towards a healthier, more sustainable planet.
Designing Reliable Agentic AI Systems: Design Patterns for Production
Harsheeta Venkoba Rao - Founding Software Engineer
Agentic AI systems promise autonomy, adaptability, and powerful multi-step reasoning, but deploying them in production introduces challenges that traditional machine learning systems were never designed to handle. As systems move beyond single prompts to stateful, tool-using workflows, teams often encounter unpredictable behavior, silent quality degradation, and growing concerns around reliability, cost, and trust. This talk focuses on why these challenges emerge and how data professionals can think more systematically about building agentic AI systems that are reliable, observable, and safe.
The session begins by establishing a clear and accessible understanding of what makes a system “agentic,” contrasting prompt-based LLM pipelines with systems that reason over time, interact with external tools, and maintain context across steps. Using real-world scenarios, the talk highlights why agentic systems behave differently from traditional models and why familiar evaluation and monitoring approaches are often insufficient.
The core of the talk introduces design patterns for production agentic systems, emphasizing principles rather than tools. It explores how thoughtful system architecture, intentional monitoring, and well-placed guardrails can improve reliability without limiting usefulness. Monitoring is framed as a design choice rather than an afterthought, helping teams detect issues early, understand system behavior, and maintain confidence as systems evolve.
The talk concludes by examining the tradeoffs between autonomy and control and offering guidance on when agentic architectures add meaningful value and when simpler approaches may be more effective. Attendees will leave with a clear mental model for agentic AI systems, an understanding of why reliability is difficult but achievable, and practical principles they can apply when designing, evaluating, or deploying agentic systems in production environments.
Model Context Protocol (MCP): The Next Frontier of Generative AI
Neelam Koshiya - Principal Applied AI Architect
As generative AI moves from experimentation to enterprise-scale adoption, the need for structure, control, and context becomes critical. Enter the Model Context Protocol (MCP), a new paradigm that standardizes how applications communicate with foundation models using modular, context-rich instructions. MCP enables safer, more interpretable, and reusable GenAI workflows by separating business logic from prompts and embedding policy and governance into interactions.
MCP has evolved into the universal "USB-C for AI," solving the critical "disconnected models" problem by standardizing how Large Language Models (LLMs) securely access diverse enterprise data and tools. This session explores why organizations are replacing brittle, custom-coded connectors with this model-agnostic layer to eliminate vendor lock-in, and how its client-server architecture, built on JSON-RPC, enables seamless integration with platforms like Claude, Bedrock, and Microsoft Copilot. Attendees will learn what capabilities can be unlocked through MCP's core primitives (Resources, Tools, and Prompts) and when to leverage the 10,000+ public integrations already available in the ecosystem to move from pilot projects to full-scale, agentic production deployments.
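To make the JSON-RPC layer concrete, the sketch below builds two request messages of the kind an MCP client sends to a server: one listing the server's Tools and one invoking a Tool. The method names `tools/list` and `tools/call` come from the MCP specification; the helper `jsonrpc_request`, the tool name `lookup_invoice`, and its arguments are hypothetical. A real session also begins with an `initialize` handshake and would normally go through an SDK, both omitted here.

```python
import json


def jsonrpc_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request envelope (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which Tools it exposes.
list_tools = jsonrpc_request(1, "tools/list")

# Invoke one Tool by name; the tool and its arguments are made up for illustration.
call_tool = jsonrpc_request(
    2,
    "tools/call",
    {"name": "lookup_invoice", "arguments": {"invoice_id": "INV-001"}},
)
```

Because every server accepts the same envelope, a client can switch data sources without rewriting its integration code, which is the "model-agnostic layer" idea in the abstract.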
Taxonomy-Agnostic Hybrid Recommendation System for Procurement Classification
Ayushi Das - Data Scientist
Spend classification is a core capability in procurement management, enabling strategic sourcing, supplier negotiation, cost optimization, and accurate financial reporting. However, classifying purchase orders (POs) and invoices remains challenging due to noisy and unstructured inputs, limited labeled data, evolving taxonomies, and ambiguous category definitions. Traditional supervised approaches struggle to generalize across such complex procurement environments.
This paper presents a taxonomy-agnostic, supplier-aware dual-expert recommendation architecture that combines LLMs with trained embedding-based semantic retrieval for robust procurement classification. The system leverages hierarchically grounded taxonomy descriptions, automatically generated and refined using LLMs, to improve semantic alignment between items and category scopes. Domain-specific embedding models are trained to enhance semantic search accuracy across noisy item descriptions, invoice text, and taxonomy metadata.
The dual-expert design consists of: (1) a retrieval expert that performs hybrid semantic search over taxonomy data, historical procurement records, and supplier intelligence, including normalized supplier descriptions and LLM-generated supplier tags; and (2) a fine-tuned LLM-based reranking expert that performs item-centric classification using structured reasoning, forced decision logic, and supplier-based validation signals. Prompt optimization of the reranking expert improves ranking precision and decision consistency without requiring model fine-tuning or retraining.
The system is evaluated across multiple taxonomies and achieves 85% top-5 accuracy on POs and 90-95% on invoices, outperforming the strongest baseline by approximately 20% and 51%, respectively. Recent enhancements yield 75-80% top-1 accuracy in real-world invoice classification. Error analysis indicates that remaining failures primarily arise from taxonomy design limitations, such as overlapping categories and insufficiently defined scopes.
Beyond accuracy gains, this work contributes automated taxonomy scope generation, supplier-aware classification via LLM-derived metadata, and a scalable, production-ready framework that adapts to evolving taxonomies and unstructured data without retraining, demonstrating strong applicability across enterprise procurement environments.
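The top-5 and top-1 figures quoted above are instances of top-k accuracy: a prediction counts as correct when the true category appears among the first k ranked candidates. A minimal sketch of the metric follows; the category names and rankings are invented for illustration and are not results from the system.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of items whose true label is among the first k ranked candidates."""
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical ranked category candidates for three line items.
ranked_candidates = [
    ["IT Hardware", "Office Supplies", "Software"],
    ["Travel", "Catering", "Facilities"],
    ["Software", "IT Hardware", "Consulting"],
]
true_categories = ["Office Supplies", "Travel", "Legal"]

top1 = top_k_accuracy(ranked_candidates, true_categories, k=1)
top3 = top_k_accuracy(ranked_candidates, true_categories, k=3)
```

Top-k accuracy can only stay flat or rise as k grows, which is why a top-5 figure is typically higher than the corresponding top-1 figure.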
When (and When Not) to Leverage Agentic AI: Practical Lessons from Building Projects and Autonomous Data Workflows
Alisha Gala - Senior Data Scientist
Booma Sowkarthiga Balasubramani - Senior Data Scientist
Jingyi Du - Principal Data Science Manager
Agentic AI is increasingly promoted as the default paradigm for intelligent systems, promising autonomy, flexibility, and productivity. Yet as agent-based designs move from demos into real-world data and decision workflows, teams encounter under-discussed challenges: unclear evaluation metrics, hidden operational costs, reliability risks, and ambiguous boundaries between human and machine control.
This talk takes a pragmatic view of Agentic AI through three concrete case vignettes:
Creative matching pipeline (image→theme→verse): A deterministic workflow outperforms agentic orchestration on latency, predictability, and explainability—illustrating when agentic behavior is not needed.
Experimentation analysis agent: Reads specifications, aggregates key metrics, and produces rollout recommendations, but doesn’t execute rollout decisions. Instead, the system surfaces tradeoffs, confidence signals, and guardrail checks for human review. This shows how Agentic AI adds value as analytical decision support without crossing into autonomous control in high-risk contexts.
Autonomous anomaly investigation agent: Plans queries, revises hypotheses, and proposes remediation. It demonstrates genuine agentic properties—and failure modes that grow as control is delegated: error amplification from early wrong assumptions, persuasive but false confidence, observability gaps, and evaluation blindness when teams track completion rather than decision quality, stability, and human override cost.
Across these examples, we analyze Agentic AI as a spectrum of architectural decisions—from simple pipelines to systems that adapt their level of autonomy under uncertainty. We share a decision framework for when autonomy earns its place (including stakes, reversibility, observability) and an evaluation toolkit that operationalizes success beyond task completion (correct-action rate, steps/time saved, latency inflation, etc.).
Attendees will leave with actionable criteria to decide when Agentic AI is justified, how to evaluate its real impact, and how to avoid common pitfalls, including a clear rubric for defending “no agent” designs when they are the safer, faster, and more reliable choice.
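The evaluation metrics named in the abstract (correct-action rate, steps saved, latency inflation) could be computed from logged runs roughly as follows. The definitions below are assumptions made for illustration, since the abstract does not define them: the `AgentRun` record, its field names, and the comparison against a non-agent baseline are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AgentRun:
    """One logged task, compared with a non-agent baseline (hypothetical schema)."""
    action_correct: bool        # did a reviewer judge the final action correct?
    agent_steps: int            # manual steps still needed with the agent
    baseline_steps: int         # manual steps needed without the agent
    agent_latency_s: float
    baseline_latency_s: float


def evaluate(runs):
    """Aggregate assumed definitions of the three metrics over a list of runs."""
    n = len(runs)
    return {
        "correct_action_rate": sum(r.action_correct for r in runs) / n,
        "mean_steps_saved": sum(r.baseline_steps - r.agent_steps for r in runs) / n,
        # A ratio above 1.0 means the agent path is slower than the baseline.
        "latency_inflation": sum(r.agent_latency_s for r in runs)
        / sum(r.baseline_latency_s for r in runs),
    }


# Made-up runs for illustration only.
report = evaluate([
    AgentRun(True, 3, 5, 12.0, 10.0),
    AgentRun(False, 4, 5, 18.0, 10.0),
])
```

Reporting these figures together makes it harder for a high completion count to hide poor decisions or a slower workflow.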