 

When

May 8th, 2026

8:30 AM - 6:30 PM

Where

Mercer Island Community & Events Center

Format

In Person


Event Highlights

  • Diverse range of technical & career development talks

  • 2 Panel Discussions

  • 3 Interactive Workshops (Space Limited)

  • Networking Activities


Topic Highlights

  • Agentic AI in Practice

  • Applied Data Science Across Industries

  • Career Growth, Leadership, Layoffs and Navigating Uncertainty

  • Data Science for Social Good

    For more details on speakers, talks, panels, mentoring sessions, and workshops:

 
 

Tickets: $125

Your ticket includes:

  • Your choice among 20+ sessions, including
    technical talks, professional development, panels, and workshops

  • Networking activities

  • Catered lunch and refreshments

  • Free parking at venue

  • Sponsor booths & resume bank


Get Tickets

Our events are open to all, regardless of gender.

We have a limited number of tickets available at a reduced rate for students and unemployed individuals. Contact team@widspugetsound.org to inquire.

The conference highlights data-related work in our region, with an emphasis on the important role of women in data-related fields. We aim to bring together data professionals across the Puget Sound area and provide an opportunity to learn about data science applications and research in our community.

 

2026 Conference Volunteers

Overall Leads

 

I joined the data science field 7 years ago after spending my early career working with environmental nonprofits. Since my passion for volunteering still needs an outlet, I leapt at the opportunity to get involved with Data Circles and the Women in Data Science Puget Sound conference. After DS boot camp and a year of consulting, I joined the small but mighty team of data scientists at Trupanion. My dog greatly appreciates his insurance benefits. I'm excited to experience another year of knowledge sharing with WiDS!

Hi everyone! I am an Analytics Consultant at Calligo. I moved to Seattle to complete an MS in Data Science, fell in love with the PNW, and decided to stay! This will be my 2nd year volunteering with WiDS. I am a big proponent of community and am grateful to be a part of the WiDS community. Being around so many great people in this field is really motivating, and I’m constantly learning. Outside of work I love to cook, do yoga, read mystery novels, and take my dog for hikes.

Hi! I’m Yuqi Zhang, a Data Scientist at Amazon based in the Seattle area. I focus on using data and modeling to improve supply chain decisions and customer experience. I grew up in China and earned my M.S. in Statistics from UIUC before moving to Seattle for work.

Outside of work, I love staying active — especially playing soccer. I play regularly and organize beginner-friendly soccer training and pick-up games for women, which has become another way I enjoy building community. I love the community-driven spirit of WiDS, am grateful to be part of it, and am excited to build and learn together!

 
 

Content Team

 

Cecilia is passionate about diversity and gender inclusion in the technology field. She is the co-founder of the Mozambican Women in Technology Association, a co-organizer of Django Girls, and an active member of PyLadies. Currently pursuing a PhD in Big Data and Artificial Intelligence at Gaston Berger University, her research focuses on Artificial Intelligence and Education, with a special interest in Educational Data Mining and intelligent systems in education.

Cecilia Tivir - CoLead

Hi everyone, my name is Amrapali Samanta. I am originally from India and came here for my Master's a few years back. I graduated from Seattle University with an MS in Business Analytics and currently work as a Business Analyst at Amazon. I have previously worked in various domains, including banking, airlines, and a bit of retail. My experience has mostly been in data and analytics; in my current job, however, I apply a fair amount of statistical analysis and ML algorithms for advanced predictive analytics. There is a lot to learn in data science as the field is evolving quickly, and I would love to learn from this community.

I volunteered on the content team last year while I was in grad school, and it was so much fun. This year, as a content team lead, I look forward to bringing my expertise and more ideas to the table to make this event an even bigger success.

Amrapali Samanta - CoLead

Tongxin (Joyce) Cai is a Ph.D. candidate at the University of Washington. She studies Physical Oceanography. She earned master’s degrees from the University of Washington and Stanford University. Her research looks at ocean movements and the impacts of climate change. She uses big data and machine learning to build better climate models. She is an expert in Python, SQL, and MATLAB. Joyce also focuses on community service and teaching. She has led groups that focus on environmental protection and sustainable food.

 
 

I am currently a student in the MS in Data Science program at the University of Washington. Prior to returning to school, I worked as a Data Analyst for about three and a half years. I enjoy using data to uncover insights and solve meaningful problems, and I am always excited to continue learning and growing in the field of data science. In my free time, I enjoy playing soccer, hiking, doing pottery, and playing with my puppy!

Sridevi Wagle is a machine learning engineer at the Pacific Northwest National Laboratory, experienced in interpretable machine learning and developing data-driven solutions for research applications. She likes mentoring and supporting learning and collaboration within the data science community.

Richa is a Senior Data Scientist. She currently works for CVS Health. She is passionate about using data science skills for solving interesting and impactful challenges. She is preparing to run a half marathon in June and loves playing tennis.

Richa Gupta

I am a Data Scientist and a Civil Engineer. I hold a PhD in Civil Engineering, with extensive research experience in time-series analysis, system identification, and state-space modeling. My doctoral and postdoctoral work focused on developing and validating predictive models for complex dynamic systems using sparse and noisy temporal data, expertise that directly translates to modeling student academic trajectories over time. Prior to my current role, I worked for three years as a geotechnical engineer at Shannon & Wilson, a consulting firm based in Seattle, where I developed predictive and numerical models for high-visibility, high-impact infrastructure projects across the Puget Sound and San Francisco Bay Area.

To complement this foundation, I have completed advanced coursework and professional training in data structures, machine learning, deep learning, and artificial intelligence, including certifications from MIT, Purdue University, and the University of Washington. My recent work incorporates supervised learning, clustering, sequence modeling, and attention-based architectures applied to longitudinal student records.

I currently serve as a Data Scientist at the University of Oklahoma, where I routinely analyze academic and institutional data to support institutional decision-making related to student retention, identification of at-risk populations, and technology related operations. I have authored more than ten peer-reviewed journal and conference publications centered on predictive modeling, model validation, and interpretability.

 
 

Marketing Team

 

Angela is an Analytics Engineer at Shields Health Solutions and a co-founder and board member of Women Who Do Data (W2D2), based in Boston, MA. She previously worked as a Data Scientist at Memorial Hermann Health System in Houston, TX, where she spent nearly three years leveraging data to drive meaningful insights and impact. Angela holds a Master’s degree in Data Science from Rice University and a dual Bachelor of Science in Computer Science and Mathematics from The University of Texas at Austin. She is passionate about using data and analytics to generate actionable insights and create real-world impact, both through her professional work and her volunteer leadership efforts.

Angela volunteered with the WiDS Puget Sound 2025 Conference as part of the Experience team and is excited to return as a Marketing Team Lead for the 2026 Conference. In her free time, she enjoys going to the gym, playing sports such as golf and pickleball, exploring around the city, traveling, and watching football, especially cheering on the Philadelphia Eagles and the Los Angeles Chargers.

Angela Cao - CoLead

Carina Chen is a Business Intelligence Engineer at Amazon, where she builds scalable analytics and reporting solutions to support data-driven decision-making across the Devices organization. Prior to Amazon, she conducted public health research at the University of Washington, developing a strong foundation in analytical research and applied data methods. Her work bridges industry and academic data practice, with a focus on metric design, automation, data quality, and delivering actionable insights at scale. She is passionate about using data to solve real-world problems and support better product and operational decisions.

I am a computational biologist with experience working at the intersection of immunology and data science, focusing on identifying molecular signals associated with disease outcomes. I hold an M.S. in Bioinformatics and previously worked as a research consultant at the University of Washington and as a computational biologist at Ozette Technologies. I have found the WiDS community to be a great source of encouragement and practical advice for navigating the industry, and I am happy to volunteer this year.

Malisa Smith - CoLead

Teresa is an environmental data analyst for Alta Science and Engineering, Inc. in Boise, Idaho. Her interest in data science was sparked by rubbing shoulders with post-docs at a research forest station in Olympia, Washington, when she was working as a humble forestry technician. Now Teresa gets to work with scientists and engineers to shape the next best course of action for remediation. When not working, she enjoys taking meandering walks with her dog, attending her friends’ readings, and volunteering for WiDS for the third year.

Sandy is an Earth Scientist at Pacific Northwest National Laboratory, where she contributes to telemetry projects studying fish migration through dams. She earned a B.S. in interdisciplinary computing with a biology focus from the University of Kansas and developed an interest in marine science during a study abroad in Australia. Sandy completed her M.S. in biological oceanography at the Florida Institute of Technology, researching water quality and benthic settlement on oyster restoration mats. This is her first year volunteering with WiDS, and she is thrilled to join the community.

Jaspreet Bhamipuri is an MSIM graduate student at the University of Washington, specializing in Data Science, Business Intelligence, and Program/Product Management & Consulting. She earned her B.S. in Real Estate from UW, complemented by minors in Architecture, Informatics, and Data Science. It was during her undergraduate journey, moving between design, technology, and analysis‑focused work, that she discovered her love for data and the power it has to shape stories, insights, and meaningful change. Driven by curiosity, she is a lifelong learner with more hobbies than she can count and is committed to making data more accessible while creating inclusive spaces where people can learn, connect, and grow.

 
 

Sponsorship Team

 

Prachi is an Applied Scientist at Microsoft, where she has built large-scale ML and AI systems powering real-world products. She is a passionate advocate for community building in tech. She loves volunteering, mentoring, and creating spaces where women and underrepresented groups in data science can thrive. In the past, she has volunteered with and helped organize conferences and events through Women in Big Data, PyLadies, and DataCon LA, supporting inclusive learning and networking opportunities in the data community.

Prachi Agrawal - CoLead

Swapnil Agrawal

Fidan is a quantitative researcher and data scientist with a PhD in Psychology and a background in Mathematics. She has worked across public health and research institutions, generating data-driven insights in health and human behavior. She is passionate about responsible data science and building inclusive communities.

Fidan Howell - CoLead

Holly Simosen

I'm Vedica Bafna, a Master's student in Information Management at the University of Washington, specializing in AI and Data Science. With a background in Computer Engineering, I’ve worked on projects ranging from machine learning models to real-time translation engines. I’m passionate about creating tech solutions that improve lives and leading teams that foster innovation and growth. I believe in the power of emotional intelligence in leadership and am committed to equity and inclusion in tech. When I’m not working on AI or data-driven projects, you’ll find me baking, listening to music, stargazing, or hiking!

 
 

Events Team

 

Ayesha is a data scientist at Boeing where she leverages machine learning and generative AI in the Defense & Space division. Prior to her current role, she evaluated large language models on NLP tasks as a Boeing intern. Ayesha is passionate about supporting women and minorities in STEM fields, particularly in data science and physics. She holds two undergraduate degrees in statistics and physics.

Ayesha Darekar - CoLead

Kamala is a Data Scientist at Nordstrom with 4-5 years of experience across startups, clean tech, and e-commerce. Her work spans a diverse range of data science - from deep learning and statistical analysis to forecasting and optimization. She thrives on untangling complex problems and making things happen. Having benefited greatly from the generosity of mentors and strangers alike throughout her own journey, she is deeply committed to paying that forward. Outside work, she's chasing sunsets, live music, and adventures with her dog.

Hi! I'm Samridhi Vats, currently working as a Business Analyst at Amazon, where I focus on generating insights to drive operational excellence and customer-centric innovations. I hold a Master’s in Business Analytics from Purdue University and have prior experience in advanced analytics and strategy consulting at ZS Associates, primarily in the healthcare domain.

I’m passionate about using data to tell meaningful stories, solve real-world problems, and create impact—especially in spaces that empower women and underrepresented communities in tech. Outside of work, I enjoy writing, reading (mostly non-fiction) and going for hikes.

I am currently completing my Bachelor of Science in Data Science at North Seattle College. This past summer I had the opportunity to work as a Data Science Analytics Intern at Expedia, where I developed an interest in data engineering. I am excited to volunteer with WiDS to collaborate with other data enthusiasts! While waiting for my models to train, I like drawing and running with my dog.

Data-driven professional with an MS in Computer Science (Data Science) and hands-on experience designing scalable data pipelines, building machine learning models, and delivering intelligent, real-world software solutions. I specialize in transforming raw data into actionable insights through applied analytics, ML engineering, and data engineering.

Shachi is a Business Intelligence Engineer at Amazon with over 4 years of experience in data science, analytics, and visualization, solving business problems with data-driven insights. She is passionate about learning new technologies and expanding her knowledge in machine learning.

 
 

I am currently in the Master of Science in Data Science program at the University of Washington and will have graduated at the time of the conference.

I began my data science journey at the start of this graduate program, transitioning from a background in legal editing and digital marketing. I am passionate about storytelling and data visualization. My vision is to leverage data to tell compelling stories that convey powerful insights and drive positive change.

 

Workshops Team

 

Sarita Singh is an Associate Teaching Professor at the Khoury College of Computer Sciences at Northeastern University, USA. She holds a PhD in Computer Science as well as a Doctor of Education (EdD) degree. Her research interests include the Internet of Things, cybersecurity, and computer science education.

Sarita Singh - CoLead

Somang Han

Karen Dsouza - CoLead

I am a master’s student in Data Science at Harvard Extension School. After earning my Bachelor of Science in Nursing, I worked as a registered nurse specializing in surgery. Through my experience in the hospital, I recognized numerous opportunities where technology could bridge gaps in healthcare, enhancing patient care and improving the efficiency of healthcare delivery. Working in that environment sparked my interest in the technical side, which led me to data science. I love my job on the clinical side, and I hope to continue serving patients outside the operating room. In my free time, I love to travel and read books.

Sarah is a recent graduate of the University of Washington MS in Data Science program. She currently works at Seattle City Light, where she helps leverage data to improve customer experience. She loves continuous learning and the women in technology community and is excited to return to the WiDS conference this year.

Simran Goindani

 

 

2026 Speakers

 

 
 

Akriti Chadda

Akriti Chadda is an applied machine learning scientist specializing in search, relevance and generative AI systems deployed at scale. Her work focuses on the full lifecycle of AI, from modeling and experimentation to production deployment, monitoring and long-term system reliability. In recent years, she has been deeply involved in agentic and generative AI systems, where non-determinism and autonomy introduce new technical and leadership challenges.

Beyond technical execution, Akriti is passionate about communication, mentorship and helping data professionals grow into thoughtful leaders. She frequently speaks about operating complex AI systems responsibly, aligning stakeholders around uncertainty and translating advanced ML concepts into practical, real-world impact.

From Models to Teammates: Operating, Monitoring and Trusting Agentic AI in Production

 

 

Catherine Nelson

Every LLM Call Counts: The Environmental Cost of AI, and How Data Scientists Can Reduce It

Catherine Nelson is an experienced data scientist and ML engineer, and the author of two O'Reilly books: Software Engineering for Data Scientists (2024) and Building Machine Learning Pipelines (2020). Previously, she was a Principal Data Scientist at SAP Concur, where she deployed NLP models to production and created innovative features including ML-powered carbon emissions analytics. She is currently consulting for startups on AI evaluation and developer relations. Catherine holds a PhD in Geophysics from Durham University and a Masters in Earth Sciences from Oxford University.

 

 

Shaili Guru

How to Work with Your PM (When They Don't Speak AI)

Shaili Guru is an AI product leader and educator with 10+ years of experience building AI products at Amazon, Disney, Nike, and T-Mobile. She currently teaches AI Product Management at the University of Washington's Global Innovation Exchange and runs Bluenox.ai, helping organizations and product teams adopt AI effectively. Her Substack newsletter, AI Product Management Guru, is read by over 4,000 PMs worldwide. Shaili holds a Technology Management MBA from the UW Foster School of Business and a BS in Biology from Baldwin-Wallace University.

 

 

Sridevi Wagle

Leveraging AI to Support Evidence-Based Wildlife and Permit Management

Sridevi Wagle is a Machine Learning Engineer at Pacific Northwest National Laboratory with a master’s degree in Computational Science. She has experience developing AI and machine learning tools for extracting and analyzing information from large-scale, multimodal scientific data. Her work includes building systems for knowledge retrieval and semantic search using advanced language models and data integration techniques. Sridevi’s research interests include explainable AI, uncertainty quantification, and visualization methods to support data-driven decision-making in scientific domains.

 

 

Hoda Soltani

When Time Tells: Using Sequence Modeling to Understand Transfer Student Retention

I am a civil engineer and data scientist with six years of professional engineering experience, three years of Data Science, and eight years of academic research focused on predictive modeling of complex dynamic systems. I hold a PhD in Civil Engineering, where my research applied system identification, time-series analysis, and state-space modeling to large-scale experimental data to study the seismic response of foundations and support infrastructure resilience. My work has been published in peer-reviewed journals and presented at international conferences and workshops.

Following my doctorate, I worked at Shannon & Wilson, a leading geotechnical consulting firm in Seattle, contributing to high-impact projects in the Pacific Northwest and San Francisco Bay Area, including seismic resilience analyses and large-scale numerical simulations for critical infrastructure. I later transitioned into data science, completing advanced training in computer science, machine learning, deep learning, and AI. I currently work as a data scientist in higher education, applying predictive modeling to student success and retention initiatives.

 

 

Erin Zionce and Sandy Rech

A Data Science Approach to Quantifying Fish Passage Through Dams, Assessing Fish Injury, and Advancing Fisheries Research

Erin Zionce is a Data Scientist at the Pacific Northwest National Laboratory with a background in fisheries ecology. Her research contributes to juvenile and adult fish passage studies by integrating ecological expertise with data science through statistical modeling, machine learning, and computational tools to support environmental science and hydropower systems management.

Sandy Rech is an Earth Scientist at Pacific Northwest National Laboratory, where she contributes to fish telemetry projects to study salmonid migration through dams. She has a background in computer science, mathematics, and oceanography, with previous work in mathematical modeling, oyster restoration, and ecological data management. She is passionate about integrating ecology and data science to address complex environmental challenges.

 

 

Rachel Wagner-Kaiser

AI Beyond English: Building Multi-Lingual and Non-English AI Solutions

Rachel Wagner-Kaiser has 15 years of experience in data and AI, entering the data science field after completing her PhD in astronomy. She specializes in building NLP and AI solutions for real-world problems constrained by limited or messy data. Rachel leads technical teams to design, build, deploy, and maintain NLP solutions, and her expertise has helped companies organize and decode their unstructured data to solve a variety of business problems and drive value through automation. Rachel is also the author of the recent book "Teaching Computers to Read" (http://amazon.com/dp/1032484357) and corresponding code companion.

 

 

Riya Joshi

Agentic AI as Your Personal Wellness Coach

As a Data and Applied Scientist at Microsoft AI division, with seven years of industry experience (previously Data Engineer) across multiple geographies, I focus on building machine learning systems that directly improve user experience in the Microsoft Edge browser. My work spans developing on-device ML models, building personalization and content-understanding systems, and designing reliable experimentation and measurement pipelines that help teams make data-informed product decisions. With a Master’s degree in Computer Science and Artificial Intelligence from the University of Massachusetts Amherst, I bring a balance of strong engineering fundamentals and applied research experience.

In my current role, I work end-to-end across the ML lifecycle—from framing product problems, designing lightweight models that run efficiently at the edge, and integrating LLM-driven features, to evaluating performance and shipping improvements at scale. My focus is always on creating practical, efficient, and user-centric ML solutions, which has become especially important as the industry moves toward more agentic and intelligent browser experiences.

My career has been defined by a dual passion: advancing AI innovation and fostering an inclusive tech community. I've had the honor of sharing my knowledge as a speaker at premier conferences including PyData Global (2024), PyLadies Con (2024), and Women in Data Science (2023 and 2024), and as a featured guest on prominent podcasts like Women in Data and Women in STEM. These platforms have allowed me to advocate for greater diversity in our field while demonstrating AI's transformative potential.

My commitment to mentorship runs deep. As a Career Advisor in the prestigious Women in Data Science Career Catalysts program, I've guided aspiring technologists from over 12 countries, helping shape the next generation of data leaders. This work, which earned me recognition as a top advisor on the platform, reflects my belief that technology advances furthest when we lift others as we climb. Whether through technical innovation or community building, I remain dedicated to creating AI solutions that are as impactful as they are inclusive.

 

 

Emma Rosenthal & Stephanie Chen

From Bots to Bookings: Agentic AI in the Real World @ Expedia

Emma Rosenthal is a Data Scientist at Expedia Group, where she works on the Checkout team, focusing on AI-driven insights, AI integrations into the checkout flow with ChatGPT, A/B testing, and data-driven product optimization. Prior to Expedia, Emma received her Master’s in Computer Science and Bachelor’s in Economics from the University of Chicago.

 

 

Shikha Verma

From Individual Contributor to Data Leader: How to Unblock Your Team & Influence Strategy

Shikha holds a Ph.D. in Machine Learning and has 5+ years of industry experience working on high-performance, worldwide-scale projects in fraud detection, warehouse management & promotion targeting across fintech, e-commerce & healthcare. She is skilled in supervised & unsupervised machine learning algorithms, building end-to-end ML pipelines, applied statistics, Python & SQL.

She has presented her research at various academic and practitioner conferences like Grace Hopper Celebrations (India), the Women in Machine Learning workshop at NeurIPS & ICML, and ACM Conference on Machine Learning and Human-Computer Interaction (2020). She has served as a visiting faculty for courses on AI, ML, and business analytics across management institutes in India.

 

 

Anastasia Bernat

GeoAI for the Built Environment: Siting and Permitting

I am a Senior Data Scientist at the Pacific Northwest National Laboratory (PNNL) specialized in processing and modeling energy and Earth system data for impactful decision-making science. At PNNL, I am the lead architect of novel GeoAI data pipelines and manage several AI-driven and/or cloud-native applications for U.S. energy and environmental mission areas. This includes six research, agentic, and generative AI applications to streamline federal permitting reviews, a techno-economic simulator for advanced geothermal systems (GeoCLUSTER), and a U.S. energy feasibility mapper (GRIDCERF). Combining data science, computational modeling, and environmental science, I am also deft in geographic information systems (GIS) and statistical modeling used to better enable intelligent mapping and environmental monitoring analyses. My leadership has guided data product teams to deliver impact and value to sponsors across the Department of Energy, including earning project-level recognition by the White House in the “AI for Good” space and in “America’s AI Action Plan”.

 

 

Swapnil Agrawal

Soft Skills Are Not Optional: Why Early-Career Data Professionals Need Them Most

Hello, I’m Swapnil. I was born and raised in India and moved to the United States in 2018. I earned my BTech from the Indian Institute of Technology, Delhi, and my master’s degree from Carnegie Mellon University in Pittsburgh.

I’ve built my career as a Data Scientist across diverse organizations. I began at a startup in Pittsburgh, then spent nearly three years at Lubrizol Corporation in Houston, Texas. Currently, I work as a Data Scientist at Microsoft, specializing in product data science.

Outside of work, I enjoy painting, reading, and cooking. I’m very outdoorsy and love cycling, kayaking, hiking, and camping. I also have a five-year-old German Shepherd who keeps life busy, active, and joyful.

 

 

Nandita Krishnan

AI As Your Personal Data Science Intern

Nandita Krishnan is a Consultant-turned-Data Scientist who brings a unique blend of strategic thinking and technical expertise to her work. Currently part of Adobe's team, she focuses on enhancing user experience for flagship products such as Premiere Pro by uncovering user needs hidden within complex data.

Beyond her day-to-day role, Nandita is deeply curious about the evolving tech landscape and is constantly exploring and experimenting with the latest tools and technologies, expanding her skill-set and staying at the forefront of data science innovation.

Passionate about creating pathways for others, Nandita is also an active advocate for women in STEM. She regularly mentors aspiring Data Scientists and participates in speaking engagements to inspire the next generation in tech.

Ayushi Das

Taxonomy-Agnostic Hybrid Recommendation System for Procurement Classification

I am from Kolkata and have a strong academic foundation in mathematics and applied sciences. I completed my undergraduate and master’s degrees in Mathematics from Banaras Hindu University, followed by an M.Tech in Cryptology and Security from the Indian Statistical Institute, Kolkata. In 2023, I was selected for a six-month internship at Amazon, where I worked on applied machine learning problems at scale. In January 2024, I joined Amazon as a full-time Data Scientist in the AFT organization and subsequently transitioned to the FinAuto team. My current work focuses on building production-grade data science and Generative AI systems, including taxonomy-agnostic classification and supplier-aware recommendation solutions for enterprise procurement. Besides work, I love to cook, dance, and spend time with animals.

 

Livestream Exclusives!

These talks will play on the livestream feed during breaks in the in-person conference. In-person attendees will be able to view them online after the event.

 

Ojasvi Khanna

Forecasting You: How Data Science Powers Personalized Marketing

I have been doing AI and ML modeling for Xbox for the past 4 years as a Data Scientist. My work helps create better marketing outputs, helping gamers play their next game faster! I went to UC Berkeley and enjoy skiing, tennis, and biking when the weather allows.

 

 

Sneha Sivakumar

Beyond the Prompt: Building Autonomous AI Agents for High-Stakes Adversarial Environments, Such as Finance, Fraud & Abuse

Sneha Sivakumar is a Product leader at Amazon, where she leads Content Risk Moderation (CRM) for the self-publishing books business, Kindle Direct Publishing. Prior to Amazon, she worked at KPMG where she led the Technology Risk practice advising large public companies on mechanisms to quantify and mitigate technology, financial and social risk. She is experienced in building consumer and enterprise products that help organizations manage risk through the use of ML, automation and human inputs. Sneha holds a BS in Engineering from Anna University, India, an MS in Industrial and Systems Engineering from USC and an MBA from Kellogg School of Management.

 

 

Erin Wilson

Biomanufacturing for a better world

I am a data scientist pursuing a career at the intersection of computing, biology, and sustainability. My experience includes working at biotech companies like Amyris (engineering yeast to convert sugar into alternatives to petroleum-based products) and LanzaTech (converting carbon emissions to ethanol with bacteria), and completing a PhD in the Computer Science program at UW (using ML techniques to model DNA patterns in methane-eating bacteria). When I'm not nerding out about climate biotech, you can find me enjoying fresh air on PNW trails or rolling dice to explore D&D fantasy realms.

 

 

Harsheeta Venkoba Rao

Designing Reliable Agentic AI Systems: Design Patterns for Production

Harsheeta Venkoba Rao is a Founding software engineer at Gone.com with extensive experience in agentic AI, machine learning, and building reliable end-to-end software systems. She holds a master’s degree in Electrical and Computer Engineering, specializing in machine learning and data science.

 

 

Neelam Koshiya

Model Context Protocol (MCP): The Next Frontier of Generative AI

I'm a Principal Applied AI Architect at AWS with 17+ years of experience in architecture, including 10+ years focused on cloud and AI. As a thought leader in AI-driven solutions, I regularly speak at global tech conferences including AWS re:Invent, AWS re:Inforce, AWS Summits, NRF, and Grace Hopper Celebration (GHC), where I share insights on bridging the gap between AI and real-world business applications.

My expertise spans cloud architecture, generative AI, and retail innovation, making me a recognized voice in the industry. I'm the author of the published book AWS Solutions Architect Associate Certification Guide and a contributor to the Responsible AI Lens for the AWS Well-Architected Framework.

My work has been recognized with several prestigious awards, including the Advancing Women in Technology (AWT) 2023 Rising Stars Award, Success Quarterly 2025, the Globee Award for thought leadership in artificial intelligence, and the Global Recognition Award as a standout leader in the industry.

I'm passionate about helping organizations unlock the transformative potential of AI through practical, scalable solutions—from property inspection and document processing to customer experience enhancement and workforce productivity. I've successfully identified and prioritized AI use cases across diverse industries, including finance and real estate.

 

 

Somang (So) Han

Surfacing Hidden Potential: ML-Driven Selection and Causal Inference for Rare Event Prediction in Partner Ecosystems

I am a Data Scientist with over six years of experience building production machine learning models at Amazon, spanning partner prioritization strategy, marketing channel attribution, and advertising measurement. I hold a Master's in Data Science from the University of Pennsylvania and a Bachelor's in Mathematics and Statistics from St. Olaf College. Beyond data science, I was a member of the South Korea Junior National Alpine Skiing Team and now pursue amateur baking in my spare time.

 

 


Alisha Gala & Booma S Balasubramani & Jingyi Du

When (and When Not) to Leverage Agentic AI: Practical Lessons from Building Projects and Autonomous Data Workflows

Jingyi Du is a Principal Data Science Manager with over a decade of experience spanning software engineering and applied data science. Jingyi leads Windows engagement strategy and conducts large-scale experimentation at Microsoft, driving growth through data-driven decision frameworks. With a background in Computer Science and Decision Science, Jingyi has authored research presented at Microsoft ML & Data Science Conference and delivered 50+ talks to audiences from technical teams to senior leadership.

Alisha Gala is a Senior Data Scientist at Microsoft with over eight years of experience across software engineering, applied data science, and AI-driven experimentation. Her work focuses on driving engagement and growth across Windows through metric design, large-scale experimentation, personalization and recommendation systems, and cohort-based behavioral analysis. Alisha partners closely with product and engineering teams to translate complex data into clear, decision-ready insights. She is a frequent speaker in technical and cross-functional forums, known for turning analysis into narratives that directly inform product strategy.

Booma Sowkarthiga Balasubramani is a Senior Data Scientist at Microsoft with deep expertise in AI and Data Science, built over a decade across academia, research, and large-scale industry applications. Holding a Ph.D. in Computer Science (Data Science) from the University of Illinois at Chicago, Booma has developed frameworks for ontology engineering, ontology matching, predictive modeling, and geospatial data analytics. At Microsoft, Booma leads work involving Copilot on Taskbar, predictive modeling, metric design, experimentation, and data-driven growth strategies for Windows. A seasoned speaker and educator, Booma has delivered talks at global conferences, featured sessions at developer events as well as academic events, and taught university courses, consistently making complex AI and Data Science concepts accessible to diverse audiences.

 
 
 

2026 Panels


 

Layoffs are a looming reality that we cannot afford to ignore. We aim to bring together voices from across the data community who have experienced, navigated, and overcome being laid off. We invite panelists who want to share their honest stories, practical strategies, and lessons learned about resilience, reinvention, and growth after a difficult setback.

 
 
 

Giselle Doolan

Strategic leader with 17+ years of experience in product and tech operations and strategy. Currently the Chief of Staff for Data & AI at Expedia Group. Previous experience includes Google, Vivint Smart Home, Ancestry.com and The New York Times.

Tao Tao

Jessica Marx

Senior Data Scientist

Sr Data Scientist, Data Engineer, and ML Engineer with 8+ years of experience building production ML systems and data infrastructure. 

At Nordstrom, co-launched Smart Markdown, the company's first dynamic pricing optimization model and designed Nordycast, an open source internal ML Ops platform (subject of talk at WiDS Puget Sound 2022). 

At Textio, built core production NLP systems including BERT-based discrimination detection, led cross-functional data migrations, and launched the company’s first experimentation framework. Currently an AI/LLM Engineer at a stealth-mode startup.

Irina Virnik

Data Engineer

Data engineering leader with 15+ years of experience building data platforms and helping teams make better decisions. Currently a Principal Data Engineer at JumpCloud. Previously worked at OfferUp, Disney, and Visa, leading large-scale data and analytics initiatives across cloud environments.

 
 

Data Challenges in Health and Medicine Panel

More details to come!


 
 
 
 

2026 Workshops


Registered attendees will receive information on how to sign up for the workshops closer to the day of the conference. Keep an eye out for an email!

 

When AI “Works” but Still Fails: The Safety Problem

AI safety isn't abstract -- it's a judgment call that data scientists make every day, often without realizing it. In this interactive workshop, you'll work through five real-world scenarios involving AI systems that failed, behaved unexpectedly, or created harm despite working exactly as designed. Through small-group discussion, we'll surface why these situations are genuinely hard -- and why the disagreements matter. We'll close by connecting those practical tensions to CSA's Trusted AI Safety Expert (TAISE) certificate developed in collaboration with Northeastern University. No prior AI safety knowledge required.

Participation requirements: No laptop or software needed. Attendees should be prepared to discuss real scenarios in small groups.

Anna Campbell McKee

Director of Training Programs at the Cloud Security Alliance (CSA)

Anna Campbell McKee is the Director of Training Programs at the Cloud Security Alliance (CSA), where she leads global education programs focused on cybersecurity best practices. She oversees CSA’s flagship certifications, the Certificate of Cloud Security Knowledge (CCSK) and the Certificate of Competence in Zero Trust (CCZT), which have received top industry honors, including recognition for Best Cloud Security Certification and Cutting-Edge Cybersecurity Training.

Anna is currently leading CSA’s newest initiative: the development of training and certificate programs, such as Trusted AI Safety Expert (TAISE), focused on AI safety and security to address the emerging governance and risk challenges of AI-enabled systems.

 
 

With over a decade of experience in cybersecurity, governance, data integrity, and public sector collaboration, Anna is known for translating complex technical topics into accessible, standards-driven education. She contributes to cybersecurity research, speaks on workforce development and emerging technology risks, and supports mentorship and civic leadership through her service on her local city Planning and Utilities Commission.

 

Anna holds an MBA from Western Washington University and undergraduate degrees in Biology and Integrated Science from the University of Washington.

 

 

Build an AI Quiz Generator: From Local Dev to SageMaker Endpoint with Kiro

In this 50-minute workshop, you'll build an AI-powered tool that generates multiple-choice quiz questions from machine learning research papers. Starting with Kiro — an AI-assisted development environment — you'll develop and debug the application locally, then move to AWS where you'll run the pipeline using Amazon Bedrock, evaluate output quality with an LLM-as-a-judge approach, and deploy it as a live API endpoint. A provisioned AWS environment is provided so you can follow along and continue experimenting for 24–48 hours after the session. Basic Python experience assumed.

Mingwei Shen

Senior Manager, Applied Science at AWS Training and Certification

Mingwei Shen leads Applied Science at AWS Training and Certification, driven by one question: as AI transforms jobs, what should people learn? A 10-year Amazon veteran who led global teams delivering >$100M/year in ML automation savings, he now focuses on upskilling over automation. He is an avid GenAI tinkerer.


(graphic/headshot of speaker will be coming soon)

Building your own Developer Advocate with Deep Agents and Elasticsearch

Learn how to implement a deep-thinking research assistant in Python using LangChain’s Deep Agents and Elasticsearch. This session walks through building a sub-agent in a prebuilt agentic harness application with tools for vetted research and comparative analysis, perfect for anyone exploring AI-powered workflows.

Justin Castilla

Senior Developer Advocate @ Elastic

Justin Castilla started his Software Engineering career as a Web Development Boot Camp Instructor where he developed a passion for exciting others with new concepts and empowering individuals with the tools needed to excel in their own right. As an Advocate at Redis, Justin created numerous videos breaking down Data Structures into easy-to-understand, relatable examples with real-world use cases. Now at Elastic, he has expanded into the realm of enhanced search, monitoring, and observability capabilities.

 

⚙️More to be announced!🔨🔧


 
 
 
 
 

2026 Career Mentorship Sessions


 
 

Leading When the System Is Changing: Human Skills for Technical Leaders in Uncertain Times


Data science is practiced inside systems that are constantly evolving — reorganizations, new technologies, shifting expectations, and accelerating timelines. While tools and models change quickly, the human demands placed on professionals often go unnamed: leading without authority, navigating ambiguity, and making ethical decisions when certainty is unavailable.

This session centers the human skills required to lead well when the system itself is changing.

Drawing from leadership development, organizational change work, and lived experience supporting professionals in high-impact environments, this session explores how data practitioners can cultivate discernment, clarity, and steadiness amid ongoing transformation. This session offers practical leadership lenses that help individuals remain effective during periods of uncertainty.

Participants will learn how unspoken expectations create invisible pressure, and explore how narrative, boundaries, and ethical self-trust function as stabilizing forces in rapidly shifting environments. The emphasis is not on “doing more,” but on leading with greater intentionality and sustainability.

Learning Objectives:

  • Language to name what is actually difficult about leading through change

  • A simple framework for navigating uncertainty without burning out

  • Greater clarity about how their presence and decisions influence systems, even without formal authority

This session is designed for women and gender-diverse professionals in data and adjacent technical fields who are stepping into influence — whether or not their role formally reflects it. It complements technical learning by strengthening the human foundations that allow professionals to adapt, communicate, and lead with integrity as the landscape continues to evolve.

About our Mentor:


Tiffany Dedeaux is a Master Certified Coach (MCC) and leadership development practitioner working at the intersection of organizational change, professional identity, and career transition. She is the founder of **[Sacred Time](https://sacred-time.com/)**, where she partners with professionals navigating complexity, ambiguity, and evolving systems — including leaders and practitioners in technical and data-adjacent fields.

With over 15 years of experience, Tiffany supports individuals and groups as they step into influence, clarify their professional narratives, and lead through change with integrity. Her work prioritizes discernment, clear thinking, and sustainable leadership presence over performance for performance’s sake.

Tiffany holds a Master of Arts in Ecopsychology and Cultural Transformation, grounding her work in a systems-level understanding of how people, roles, and environments shape one another. In addition to her coaching practice, she has held senior volunteer and governance leadership roles within professional associations, leading through restructuring, crisis response, and strategic realignment.

She is a frequent speaker and facilitator for conferences and professional communities, including Women in Data Science events, PyData chapters, and career-focused organizations. Tiffany’s work resonates especially with women and gender-diverse professionals navigating transition, visibility, and influence in data-driven and technical environments.

 
 
 

Rethinking Career Planning and Technical Resumes in the Age of AI

AI has changed both the technology job market and how resumes are screened, with employers relying on a mix of applicant tracking systems, human reviewers, and AI‑assisted tools—while many candidates now use AI to generate generic, look‑alike resumes.

This interactive session helps data professionals rethink how they present their work. Instead of long lists of tools and duties, participants will learn how to communicate outcomes, real‑world impact, and capabilities in ways that make sense to both technical and non‑technical reviewers.

Attendees will practice articulating their unique capabilities—the combination of skills, talents, interests, and knowledge that drive their best work—so they can craft resumes and LinkedIn profiles that stand out in an AI‑influenced hiring environment.

Learning Objectives:

  • Explore practical career paths that help participants build strong data skills over time and move confidently toward a data science role in a competitive job market.

  • Identify the specific capabilities that differentiate them as data professionals and express those clearly on their resumes.

  • Tailor resume content for different data‑oriented roles (data science, analytics, data engineering, data management, AI/ML, etc.) by providing context for projects and results.

About our Mentor:


Jennifer Hay is a career coach and resume writer specializing in technology, data, and analytics careers. As the founder of Tech Career Services and IT Resume Service, she combines technical expertise with career development experience to help clients define goals, create actionable plans, and present their strengths with confidence. She uses a proprietary career assessment and planning methodology (STIK) to guide students, recent graduates, and mid-career professionals in navigating the tech job market. Jennifer is certified in IT Resume Writing (CRS+IT), Student Career Coaching (CSCC), and holds CBIP credentials in Data Analysis and Business Analytics.

 
 
 
 


2026 Abstracts

 
 
 
 
 

Keynote


From Models to Teammates: Operating, Monitoring and Trusting Agentic AI in Production

Akriti Chadda - Senior Applied Scientist

Agentic AI systems are changing not just how machine learning works, but how teams think, communicate and make decisions. Unlike traditional models, agents plan, act, and adapt over time and often in non-deterministic ways. This creates a new leadership challenge: how do you build trust in systems that don't behave predictably and how do you explain their risks, limitations and failures to non-technical stakeholders?

This talk focuses on the human and organizational skills required to operate agentic AI in production. It explores how to communicate uncertainty, set realistic expectations and influence decision-making when metrics are incomplete and failures are subtle. Attendees will learn practical frameworks for framing agent behavior, aligning cross-functional teams and pushing back on over-automation when agents are the wrong abstraction.

Grounded in real-world production experience, this session helps data professionals grow beyond technical execution into thoughtful leadership, equipping them to guide teams, stakeholders and organizations through the complexities of deploying agentic AI responsibly and effectively.

 
 
 
 

Invited Speakers


Every LLM Call Counts: The environmental cost of AI, and how data scientists can reduce it

Catherine Nelson - Data Scientist, ML Engineer, Author, Consultant

AI comes with a big environmental cost. Training and serving AI models consume vast amounts of electricity, water, and raw materials. Already, AI accounts for around 15% of data center energy usage, and the energy demands of AI are projected to double by 2030. But as data scientists using AI, there are some things we can do to reduce our environmental footprint.

In this talk, I'll summarize the latest data on AI's environmental impact, looking in particular at OpenAI, Google, and Anthropic. I'll also highlight what data the providers aren't disclosing, and why that lack of transparency makes it harder for us to make good choices.

In the second half of the talk, I'll give you actionable steps you can take to reduce the impact when you're using an AI model. I'll show you techniques including prompt optimization, model selection strategies, and caching that reduce both environmental impact and costs. I'll also talk about how good evaluation data is essential for these. You'll learn which models are most efficient, how model choices affect emissions, and you'll gain practical knowledge to make more sustainable choices.

 

How to Work with Your PM (When They Don't Speak AI)

Shaili Guru - AI product leader and educator

You've built something promising. You understand the technical tradeoffs. But your PM keeps asking for timelines you can't commit to, scope that doesn't make sense, or success metrics that miss the point.

After more than a decade on the PM side, I can tell you they're not being difficult on purpose. Data science projects just don't follow the rules they learned managing traditional software. That mismatch causes real problems.

I keep hearing the same frustrations from data scientists. Being treated like a request machine. PMs who have no clue what's easy versus hard in ML. Agile processes that try to cram research into two-week sprints. Work that stays invisible until it ships.

But here's what most data scientists don't see. Your PM is getting squeezed too. They're being asked for roadmaps, to defend your project against competing priorities, and to translate your work for leadership (usually without the context they need to do it well).

This talk is about bridging that gap. I'll share what PMs are actually worried about when they push for certainty, why they frame things the way they do, and what helps them advocate for your work when you're not in the room.

We'll cover how to reframe uncertainty as risk, how to make the exploration phase work visible, and how to build a real partnership with your PM. Not just a transactional one.

You'll leave with approaches you can use in your next sprint planning: language that lands, ways to build trust, and how to educate without being condescending.

 

Leveraging AI to Support Evidence-Based Wildlife and Permit Management

Sridevi Narayana Wagle - Machine Learning Engineer, Pacific Northwest National Laboratory

The Hanford Site played a central role in the Manhattan Project, producing an extensive corpus of scientific, engineering, and operational records spanning multiple decades. These materials, ranging from technical reports and engineering drawings to photographic documentation, are essential for contemporary nuclear research, environmental remediation, and historical analysis. While this corpus is publicly accessible through the DOE Declassified Document Retrieval System (DDRS), its analytical value is significantly constrained by inconsistent metadata, limited document-level indexing, heterogeneous file formats, and the lack of full-text search capabilities.

We present a scalable AI-based framework for multimodal archival exploration that integrates semantic search, automated metadata enrichment, and interactive large language model (LLM) interfaces. Using Amazon Bedrock embeddings in combination with the Claude 3.5 Sonnet model, our system extracts structured entities, infers relationships, generates technical summaries, and supports conversational querying over text and image-based content. The pipeline processed approximately 1.5 TB of legacy data, including 4 million TIF files, over 70,000 images, and 1,300 PDF documents. Automated deduplication, document reconstruction, and page-level segmentation enabled fine-grained indexing and embedding of previously inaccessible technical details.

The resulting multimodal search platform supports fuzzy matching, retrieval, and contextual filtering, allowing users to locate specific chemical compounds, process descriptions, construction specifications, or equipment references embedded deep within scanned reports or imagery. The AI-driven interface dynamically generates follow-on research questions and interactive knowledge graphs that expose cross-document linkages, enabling new forms of exploratory analysis across historical nuclear workflows and environmental impact data.

This work demonstrates a methodology for transforming complex, low-accessibility scientific archives into AI-ready knowledge systems. Beyond Hanford, the approach establishes a technical foundation for applying advanced AI-driven discovery to other unique DOE collections, accelerating research, improving archival usability, and supporting future innovation.

 
 
 
 

 

When Time Tells: Using Sequence Modeling to Understand Transfer Student Retention

Hoda Soltani - Civil Engineer and Data Scientist, University of Oklahoma

This talk presents an end-to-end predictive analytics framework for modeling student dropout, with a focus on transfer students at a four-year university. Student dropout remains one of the most multifaceted and pressing challenges in higher education, arising from a complex interplay of academic, social, economic, and institutional factors that limit both individual potential and broader social mobility.

This study conducts school-level retention prediction using university-specific administrative datasets that are not publicly available. Transfer students--those who have previously earned academic credit at another postsecondary institution--represent a large and academically diverse population whose successful integration into four-year institutions requires timely, evidence-based support informed by both historical academic pathways and early university performance.

The session presents a comprehensive predictive framework examining the academic histories and first-term outcomes of transfer students admitted to Engineering, Business, and Arts and Sciences over a three-year observation period. Drawing on principles from educational data mining, the analysis incorporates multidimensional features including sociodemographic attributes, pre-transfer coursework, enrollment intensity, academic load, financial aid, campus employment, and early indicators of academic engagement.

The modeling pipeline integrates supervised learning for binary classification (retained vs. non-retained), clustering methods to identify latent student subpopulations, and model-interpretation tools to support transparency. Central to the framework is the use of sequence modeling techniques--such as recurrent neural networks, gated recurrent units, and attention-based architectures--to capture temporal dependencies in students' academic trajectories. Rather than relying on static or summary-based features, these models learn patterns across semester-by-semester course enrollments and performance, enabling more accurate and earlier identification of dropout risk by modeling the order, timing, and evolution of academic behaviors. Methodological challenges, including class imbalance, overfitting, and domain-informed feature engineering, are explicitly addressed.

 

A Data Science Approach to Quantifying Fish Passage Through Dams, Assessing Fish Injury, and Advancing Fisheries Research

Erin Zionce - Data Scientist, Pacific Northwest National Laboratory

Sandy Rech - Earth Scientist, Pacific Northwest National Laboratory

Dams disrupt the natural life cycles of migratory riverine fish, posing significant challenges to their survival. Addressing these connectivity issues requires interdisciplinary collaboration between biologists and data scientists. At Pacific Northwest National Laboratory (PNNL), researchers integrate ecological expertise with data science to study fish passage and survival. PNNL’s fish passage projects focus on anadromous fish species such as salmonids, using various tagging methods (e.g., radio telemetry (RT), balloon-tagging) and injury assessments to analyze migration through dams. Two studies conducted at U.S. Army Corps of Engineers operated dams – Mud Mountain Dam (MMD) in Washington State and Foster Dam in Oregon – used RT to evaluate fish passage and survival. At MMD, adult Chinook salmon (Oncorhynchus tshawytscha) implanted with RT tags were tracked as they returned to spawning grounds from the ocean via a Fish Passage Facility. At Foster Dam, RT-tagged juveniles were monitored to evaluate survival rates and travel times during ocean-bound migration. Efforts to automate fish tracking and streamline data analysis aimed to reduce manual data processing. However, challenges like the noisy nature of RT data and unpredictable fish behavior required tailored algorithms to ensure accurate results. A third study conducted at Howard A. Hanson Dam (HAHD) in Washington State used balloon-tagging with complementary injury assessments to evaluate the biological consequences of dam passage through specific routes. Traditional injury assessment methods at HAHD rely on intensive fish handling and manual assessment, introducing potential human bias, variability, and stress to fish. As an extension of this study, a proof-of-concept approach leveraging AI-driven image analysis was developed to automate and standardize injury assessments, while reducing human bias and minimizing stress to fish. 
Together, these three projects demonstrate the importance of interdisciplinary research to improve the evaluation of fish passage and survival, and to support the conservation of salmonid populations.

 

AI Beyond English: Building Multi-Lingual and Non-English AI Solutions

Rachel Wagner-Kaiser - Director, NLP Data Scientist

We will address the core challenges technical teams face when dealing with non-English languages in building effective AI solutions, reinforced by real-life examples. We will outline the complexity of non-English data, from tackling non-Latin character sets and low-resource languages to the practical hurdles of transforming unstructured data (like images and audio) into usable text. We will also go into the options for different technical approaches, including topics such as the complexity of language detection and cross-language processing techniques. The session will also analyze the current role and limitations of LLMs across diverse languages. We will conclude with best practices for designing and deploying high-performance, multilingual NLP systems that deliver value for practical business use cases.

 

Agentic AI as Your Personal Wellness Coach

Riya Joshi - Data and Applied Scientist, Microsoft AI

In today's fast-paced world, maintaining healthy daily routines is challenging, yet critical for overall well-being. Traditional wellness applications largely offer passive tracking and generic recommendations, leaving users to make decisions without actionable guidance. This talk introduces an **agentic AI framework** designed to proactively optimize daily habits by integrating wearable data, predictive modeling, and real-time decision-making.

Our system collects physiological and activity data from wearable devices such as the Apple Watch via HealthKit APIs. A lightweight iOS frontend captures and transmits data to a Python-based backend, where the agentic AI resides. The AI models the user's current state--including sleep quality, fatigue, heart rate variability, and activity trends--and predicts near-future wellness metrics. Using a combination of rule-based policies and reinforcement learning, the agent recommends and delivers personalized interventions, such as exercise prompts, diet adjustments, or sleep optimization strategies. Notifications and actionable guidance are pushed back to the user in real time, creating a closed-loop feedback system that continuously adapts to the user's behavior and goals.

This agentic approach transforms wellness applications from passive trackers into proactive personal coaches. By demonstrating a real-time prototype integrating wearable data, predictive modeling, and dynamic intervention, this work highlights the potential of agentic AI to improve habit adherence, enhance physical and mental health, and empower users with personalized, adaptive decision support. This talk will cover the system architecture, agentic decision logic, and potential avenues for future research in AI-driven wellness, making it accessible to both technical and non-technical audiences.

 

Developing hybrid KG-LLM solutions for reliable information extraction

Anahita Pakiman - Senior Knowledge Graph Engineer & Semantic Architect, Amazon

Qualification processes in industrial settings require accurate, equipment-specific inspection criteria from technical documentation and perfect execution to ensure deliverable quality and minimize post-launch downtime and claims. This presents a challenging science problem: how to extract and generate reliable inspection recommendations from heterogeneous data sources, without hallucinations.

We developed a hybrid Knowledge Graph - LLM solution that addresses fundamental limitations of LLM-only approaches. Our initial LLM-only attempts exhibited significant hallucinations, generating unreliable inspection values and recommendations that couldn't be validated against domain constraints.

Our methodology employs a domain-specific KG that captures semantic relationships between equipment types, failures and inspection requirements. This domain-specific KG extracts and links entities from diverse historical data sources, including failures and unstructured technical documentation, creating a comprehensive semantic network for constrained generation.

By using graph patterns to constrain LLM inputs, we transformed the task from open generation to structured information insertion, significantly reducing hallucinations. Results demonstrate substantial improvement in inspection recommendation accuracy and consistency, while maintaining extraction efficiency. The methodology offers generalizable findings for bridging structured and unstructured data in domains requiring high precision in AI outputs.
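The abstract does not describe the implementation, so the following is only a minimal sketch of the general idea of "structured information insertion": a graph pattern defines the slots a model may fill, and anything outside those slots or their allowed ranges is rejected. The graph contents, slot names, ranges, and the `llm_output` dictionary are all hypothetical and stand in for a real knowledge graph and a real model call.

```python
# Hypothetical mini knowledge graph: equipment type -> failure mode -> inspection slots.
# Each slot carries an allowed numeric range taken from (imaginary) documentation.
KG = {
    "conveyor": {
        "belt_misalignment": {"belt_tracking_offset_mm": (0.0, 5.0)},
        "motor_overheating": {"motor_temp_c": (10.0, 80.0)},
    },
}


def build_slots(equipment_type):
    """Turn the graph pattern for one equipment type into the only fields a model may fill."""
    slots = {}
    for failure, params in KG[equipment_type].items():
        for name, bounds in params.items():
            slots[name] = {"failure_mode": failure, "allowed_range": bounds}
    return slots


def validate(filled, slots):
    """Accept values that match a known slot and fall inside its range; reject the rest."""
    accepted, rejected = {}, {}
    for name, value in filled.items():
        slot = slots.get(name)
        if slot and slot["allowed_range"][0] <= value <= slot["allowed_range"][1]:
            accepted[name] = value
        else:
            rejected[name] = value
    return accepted, rejected


slots = build_slots("conveyor")
# Stand-in for a model response: one valid value, one out of range, one invented field.
llm_output = {"belt_tracking_offset_mm": 3.0, "motor_temp_c": 250.0, "bearing_gap_mm": 1.2}
accepted, rejected = validate(llm_output, slots)
print("accepted:", accepted)
print("rejected:", rejected)
```

In this toy setup the out-of-range temperature and the field that does not exist in the graph are both rejected, which is the kind of check an unconstrained LLM-only pipeline has no way to perform.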

 

From Bots to Bookings: Agentic AI in the Real World @ Expedia

Emma Rosenthal - Data Scientist, Expedia Group

Stephanie Chen - Senior Manager, Data Science, Expedia Group

Agentic AI is reshaping how we build and interact with data systems - and at Expedia, we're harnessing its power to redefine both customer experiences and internal workflows. In this session, we'll share a two-fold perspective on practical implementations of agentic AI in industry.

First, we'll explore how Expedia is integrating conversational AI into our checkout experience. By enabling users to connect Expedia with ChatGPT, travelers can browse products and book trips directly through AI-driven workflows. We'll discuss the architecture behind this integration, the strategies for measuring AI-driven user behavior, and the challenges and opportunities of embedding agentic AI into a high-stakes e-commerce environment.

Second, we'll turn inward to examine how AI is transforming the way data scientists work. From intelligent agents that automate repetitive tasks to workflow optimizations and productivity "hacks", we'll showcase how AI tools are accelerating analytics, improving decision-making, and freeing teams to focus on high-value insights. Attendees will gain practical ideas for leveraging AI in their own organizations--whether to enhance customer-facing products or to streamline internal processes.

Join us for a candid look at the promise and limitations of agentic AI in real-world applications, and learn how Expedia is navigating this rapidly evolving landscape to deliver smarter experiences for travelers and data professionals alike.

 

From Individual Contributor to Data Leader: How to Unblock your team & Influence Strategy

Shikha Verma - Senior Manager, Analytics, Toast

The transition from individual contributor to manager is one of the most challenging career shifts in data science--particularly for women, who represent only 26% of data science roles and an even smaller fraction of technical leadership in the US. This gap widens at the management level, where many talented women ICs hesitate to pursue leadership or struggle with the transition because the playbook is unclear.

This session shares hard-won lessons from my journey as a PhD-trained data scientist turned manager of a team of 5. I'll address the identity crisis many of us face: "If I'm not the best technically anymore, what's my value?" and provide a concrete roadmap for becoming a leader who is technical enough to unblock your team and strategic enough to influence the roadmap.

You'll walk away with clear frameworks to apply immediately:

-- The 70-20-10 rule: How to evolve your time allocation across technical work, enablement, and strategy
-- The "Am I the Bottleneck?" test: Weekly self-assessment to identify where you're helping vs. hindering
-- The "Strategic Value" filter: Prioritization framework for ruthless decision-making
-- The "Technical Enough" checklist: Know when to dive deep vs. delegate

Actionable insights on:

-- What to unlearn from the IC mindset (value = output → value = team's multiplied impact)
-- Where to stay technical for high leverage (design reviews, unblocking) vs. where to let go (being the fastest coder)
-- How to build strategic influence through stakeholder mapping, translating analytics to business language, and saying no effectively

This is for any woman in analytics considering leadership, newly managing, or struggling with the IC-manager balance. Leave with a clear mental model and practical tools to accelerate your transition and lead with confidence.

 

GeoAI for the Built Environment: Siting and Permitting

Anastasia Bernat - Senior Data Scientist, Pacific Northwest National Laboratory

How do we make sure AI doesn't get lost in space, especially when agencies need to make coordinated decisions on a myriad of environmental reviews for projects planned on U.S. lands? Too often these projects are delayed or over budget due to poor coordination with a variety of federal, state, and local laws dependent on the geographic location of the project site. However, geospatial artificial intelligence (GeoAI) has the potential to transform the pace and precision of permitting. PermitAI is a multimodal large language model testbed led by the Pacific Northwest National Laboratory that uses GeoAI to streamline the environmental permitting review process by turning millions of permitting maps into structured geointelligence. Digitization efforts focus on turning vast document and map repositories amassed through the National Environmental Policy Act (NEPA) into a spatially coherent Geographic Information System (GIS) dataset that charts decades of environmental review across agencies, scales, and formats. This includes capturing key geospatial data that agencies fundamentally rely on to scope baseline environmental conditions, communicate alternatives, weigh footprint constraints, and track mitigations. By then enriching, automating, and generating geospatial data from vast and heterogeneous government georegistries, GeoAI can rapidly build cohesive spatial reasoning for streamlined interagency coordination. This presentation will highlight a GeoAI data pipeline that is transforming how agencies geovisualize and analyze permitting data, reducing time spent navigating static documents and setting the foundations for integrating historic NEPA GIS layers into modern, information-rich digital permitting platforms and decision-support systems.

 

Soft Skills Are Not Optional: Why Early-Career Data Professionals Need Them Most

Swapnil Agrawal - Data Scientist, Microsoft

A common belief among early-career professionals is that soft skills are something to worry about later, once you become a manager or a leader. Early on, the focus is often placed solely on technical excellence: writing better code, building better models, and delivering accurate results. While technical skills are essential, they are only the starting point.

In reality, soft skills matter more than ever at the beginning of a career. Early-career data professionals frequently work in ambiguous environments, collaborate across teams, and translate complex insights to non-technical stakeholders. Without strong communication, collaboration, and storytelling skills, even the best analysis can fail to influence decisions or create impact.

This talk challenges the myth that soft skills are only relevant for managers and leaders. Drawing from real experiences transitioning from entry-level to mid-level roles, the session demonstrates how early investment in communication, influence, and collaboration accelerates career growth, increases visibility, and builds trust with stakeholders.

Attendees will learn practical techniques to structure compelling data stories, align analysis with business goals, handle pushback on insights, and influence decisions without formal authority. The talk also explores how to demonstrate leadership behaviors--such as ownership, clarity, and empathy--regardless of title.

By reframing soft skills as career accelerators rather than optional extras, this session equips early-career data professionals to maximize their impact, navigate complex organizations, and grow faster and more intentionally in their careers.

AI As Your Personal Data Science Intern

Nandita Krishnan - Data Scientist

Many data professionals find themselves spending more time debugging AI-generated code than they would have spent writing it themselves, defeating the entire purpose. And then there are those who have abandoned AI tools entirely after repeated, frustrating experiences. This gap between AI’s potential and its practical application in real-world data science work remains frustratingly wide. 

This talk bridges that gap by providing practical strategies for leveraging agentic AI, such as Cursor and Claude Code, as your ‘personal intern’: one that can actually excel when you give the right amount of supervision and guidance. Drawing from hands-on experience implementing AI tools for building machine learning models, creating automated pipelines, and generating analysis visualizations, I’ll share concrete strategies that separate productive AI use from time-wasting rabbit holes. 

You’ll learn how to identify which tasks benefit most from AI assistance and which are better done manually. I’ll demonstrate prompt and context engineering techniques, including dos and don’ts, to help you avoid common pitfalls. We’ll explore how to establish verification workflows that catch errors early, implement guardrails that prevent catastrophic mistakes, and create feedback loops that improve AI output over time. I’ll also talk about how to leverage built-in memory features to maintain context across sessions, configure custom rules that enforce your coding standards automatically, and use context files to give AI the proper background knowledge for your specific projects. This is about making agentic AI a reliable partner in your daily work, not just another technology to manage.


 

Forecasting You: How Data Science Powers Personalized Marketing

Ojasvi Khanna - Data Scientist

From the ad-sponsored content we scroll past, to the products we are shown, to the emails that land in our inboxes, personalized ads are quietly influencing countless everyday customer buying decisions. Personalization is no longer a niche application of data science--it is the backbone of modern digital marketing. Today, the $650 billion industry of personalized marketing is being rapidly reshaped by AI and is projected to grow beyond $1.5 trillion by 2035.

While personalized systems often feel intuitive or even magical, the reality is more nuanced. At its core, personalization is a forecasting problem: making informed but uncertain predictions about what future you will do. Understanding this framing helps demystify why personalized marketing works when it does--and why it sometimes fails.

This talk breaks down personalized marketing from the ground up. It explains which data science models power these systems, and how recent advances in AI are accelerating both their scale and their impact. The session covers foundational ideas, modeling approaches, and emerging AI-driven use cases, offering a high-level understanding of how data scientists and AI professionals model, execute, monitor, and evaluate such personalized marketing campaigns in industry.

Lastly, this talk also tackles an important but often overlooked question: why better models and more powerful AI do not always lead to better outcomes. Such challenges reveal a critical truth for data professionals--performance improvements on paper do not always translate into healthier, more meaningful real-world impact.

Although the examples focus on personalized marketing, the lessons extend far beyond it. This session is designed to equip data professionals with a clearer, more grounded understanding of how AI-driven personalization works--and how to build predictive systems that are not just smarter, but more thoughtful and responsible.

 

More Than a Retrain: How to Monitor, Diagnose, and Explain Drift in Production ML Models

Aashreen Raorane - Senior Data Scientist

Model degradation in production rarely comes from a single failure--it emerges through subtle shifts in data, upstream pipelines, or behavioral patterns. In real-world environments, a retrain alone doesn't fix these issues. What teams need is a systematic way to detect, diagnose, and explain drift.

This session presents a practical, tool-agnostic framework for understanding model drift based on lessons learned from validating and comparing production models. We will cover:

  1. Detection: identifying feature drift, prediction distribution changes, and version-to-version inconsistencies through input/output checks

  2. Diagnosis: tracing issues to upstream data shifts, schema changes, data quality problems, or model logic mismatches

  3. Explanation: translating technical findings into clear narratives for stakeholders to support retraining, rollback, or remediation decisions

Attendees will gain actionable techniques for monitoring model health and ensuring ML systems remain accurate, stable, and trustworthy over time--without requiring advanced ML Ops infrastructure.
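As an illustration of the Detection step, here is a minimal sketch of one common input check, the Population Stability Index (PSI), computed between a baseline sample and a recent sample of a single numeric feature. The abstract does not name PSI or any specific metric, so this is an assumed example rather than the speaker's method; the bin count, the floor for empty buckets, and the sample data are arbitrary choices.

```python
import math


def population_stability_index(expected, actual, bins=10):
    """PSI between a baseline sample and a recent sample of one numeric feature."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bucket_shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values into the edge bins
            counts[idx] += 1
        # A small floor avoids log(0) when a bucket is empty.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = bucket_shares(expected), bucket_shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic data: the "recent" sample is the baseline shifted upward.
baseline = [i / 100 for i in range(1000)]
shifted = [v + 3 for v in baseline]

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)
print(f"PSI, identical samples: {psi_same:.3f}")
print(f"PSI, shifted sample:    {psi_shifted:.3f}")
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift, but those cutoffs are conventions, not guarantees, and are worth calibrating per feature. The same function can be applied to model scores to check prediction distribution changes.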


 

Beyond the Prompt: Building Autonomous AI Agents for High-Stakes Adversarial Environments Such as Finance, Fraud & Abuse

Sneha Sivakumar - Product Leader

This talk provides practical, proven methods for building AI agents for real-world, high-stakes business processes such as combating fraud. It moves past simple chatbots to explore how autonomous agents can reason, pivot, and act to stop bad actors in real time, how to integrate them with existing systems, and how to prepare your data for a successful launch. The talk will cover (1) how to choose a business problem ready for the agentic revolution, (2) data preparation and labelling, (3) achieving high (99%) precision, and (4) launching and learning from the agent outcomes.

 

Biomanufacturing for a better world

Erin Wilson - Data Scientist

Industrial biomanufacturing. The phrase doesn’t quite evoke “social good” imagery, but with a little more context, I hope it will next time you hear it! Biomanufacturing is a sector of industry that aims to produce valuable materials by harnessing the vast catalog of molecules made by Nature. While molecules exist already in natural forms, many require painstaking, unsustainable, extractive processes to isolate key ingredients at scale. We can be more clever: by taking genetic instructions from organisms that naturally produce a useful molecule and installing those instructions in a microbe, we can grow tanks of microbes that “brew” such molecules instead. An ecosystem of biomanufacturing companies is already growing: many feed microbes with renewable inputs, like sugar, while others capitalize on waste streams from other industries, such as gaseous carbon emissions. 

To make a dent in climate change, biomanufacturing needs to get big. Industrial scale! While steel pipe networks wrapping around building-size bioreactors may contrast with the typical leafy green motifs of environmental sustainability work, industrial biomanufacturing is poised for social impact. We can reduce environmental harms caused by agricultural land use and pollution, provide alternatives that displace fossil carbon-based products, and even capture and repurpose carbon emissions before they enter the atmosphere. Biomanufacturing beautifully blends the mechanical and microbial for sustainability.

Many biomanufacturing approaches are maturing, but most are critically held back by underdeveloped data practices. We need better measurement equipment that can detect small-but-critical changes in biological systems; software that can predict and alert when such changes will trigger upsets to operations; deep domain experts that can ALSO troubleshoot with effective data analysis and visualization. Properly applied data science and engineering can help create a clearer window into the complexities of industrial biomanufacturing, and accelerate the field’s progress towards a healthier, more sustainable planet.

 

Designing Reliable Agentic AI Systems: Design Patterns for Production

Harsheeta Venkoba Rao - Founding Software Engineer

Agentic AI systems promise autonomy, adaptability, and powerful multi-step reasoning but deploying them in production introduces challenges that traditional machine learning systems were never designed to handle. As systems move beyond single prompts to stateful, tool-using workflows, teams often encounter unpredictable behavior, silent quality degradation, and growing concerns around reliability, cost, and trust. This talk focuses on why these challenges emerge and how data professionals can think more systematically about building agentic AI systems that are reliable, observable, and safe.

The session begins by establishing a clear and accessible understanding of what makes a system “agentic,” contrasting prompt-based LLM pipelines with systems that reason over time, interact with external tools, and maintain context across steps. Using real-world scenarios, the talk highlights why agentic systems behave differently from traditional models and why familiar evaluation and monitoring approaches are often insufficient.

The core of the talk introduces design patterns for production agentic systems, emphasizing principles rather than tools. It explores how thoughtful system architecture, intentional monitoring, and well-placed guardrails can improve reliability without limiting usefulness. Monitoring is framed as a design choice rather than an afterthought, helping teams detect issues early, understand system behavior, and maintain confidence as systems evolve.

The talk concludes by examining the tradeoffs between autonomy and control and offering guidance on when agentic architectures add meaningful value and when simpler approaches may be more effective. Attendees will leave with a clear mental model for agentic AI systems, an understanding of why reliability is difficult but achievable, and practical principles they can apply when designing, evaluating, or deploying agentic systems in production environments.

 

Model Context Protocol (MCP): The Next Frontier of Generative AI

Neelam Koshiya - Principal Applied AI Architect

As generative AI moves from experimentation to enterprise-scale adoption, the need for structure, control, and context becomes critical. Enter the Model Context Protocol (MCP)--a new paradigm that standardizes how applications communicate with foundation models using modular, context-rich instructions. MCP enables safer, more interpretable, and reusable GenAI workflows by separating business logic from prompts and embedding policy and governance into interactions. Model Context Protocol (MCP) has evolved into the universal "USB-C for AI," solving the critical "disconnected models" problem by standardizing how Large Language Models (LLMs) securely access diverse enterprise data and tools. This session explores why organizations are replacing brittle, custom-coded connectors with this model-agnostic layer to eliminate vendor lock-in, and how its client-server architecture--utilizing JSON-RPC--enables seamless integration with platforms like Claude, Bedrock, and Microsoft Copilot. Attendees will learn what capabilities can be unlocked through MCP's core primitives--Resources, Tools, and Prompts--and when to leverage the 10,000+ public integrations already available in the ecosystem to move from pilot projects to full-scale, agentic production deployments.
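To make the JSON-RPC layer concrete, here is a small sketch of the request messages a client might send for each of the three primitives. The method names and parameter shapes follow the public MCP specification as best understood at the time of writing and should be checked against the current version; the tool name, resource URI, prompt name, and all arguments are hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry an id so responses can be matched to them


def jsonrpc_request(method, params=None):
    """Build a JSON-RPC 2.0 request object as a plain dictionary."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


# Tools: discover what the server offers, then invoke one (hypothetical tool name).
list_tools = jsonrpc_request("tools/list")
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "lookup_order", "arguments": {"order_id": "A-123"}},
)

# Resources: read data the server exposes (hypothetical URI).
read_resource = jsonrpc_request("resources/read", {"uri": "file:///reports/q1.csv"})

# Prompts: fetch a reusable prompt template (hypothetical prompt name).
get_prompt = jsonrpc_request(
    "prompts/get",
    {"name": "summarize_report", "arguments": {"tone": "brief"}},
)

for message in (list_tools, call_tool, read_resource, get_prompt):
    print(json.dumps(message))
```

Because every request has the same envelope regardless of which model or server is on the other end, the connector logic lives in the server rather than in per-model glue code, which is the decoupling the abstract describes.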

 

Surfacing Hidden Potential: ML-Driven Selection and Causal Inference for Rare Event Prediction in Partner Ecosystems

Somang (So) Han - Data Scientist

We present a machine learning framework for prioritizing high-potential partners in a large-scale ecosystem where meaningful business events are extremely rare.

To validate model-driven selection, we design a hybrid experimental framework combining randomized controlled trials for ML-selected partners with Synthetic Difference-in-Differences (SDID) for heuristic-selected partners where randomization is infeasible. A key challenge is statistical power under rare events and anticipated 50% non-compliance. To enable decision-making under these constraints, we adopt Bayesian inference with conjugate Beta-Binomial updating and Monte Carlo sampling for compliance-adjusted causal estimands (ITT, LATE). Rather than frequentist significance thresholds, we apply posterior probability thresholds calibrated to internal experimentation standards, enabling principled decisions when classical power is unattainable.
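The Bayesian step can be sketched in a few lines. The example below assumes uniform Beta(1, 1) priors, made-up counts, one-sided non-compliance (control partners cannot receive the treatment), and a compliance posterior treated as independent of the outcome posteriors. None of these choices come from the abstract; they are simplifications for illustration.

```python
import random


def posterior_samples(successes, trials, n_draws, rng, prior_a=1.0, prior_b=1.0):
    """Conjugate update: Beta(a, b) prior + binomial data -> Beta(a + s, b + n - s)."""
    a = prior_a + successes
    b = prior_b + (trials - successes)
    return [rng.betavariate(a, b) for _ in range(n_draws)]


def itt_and_late_draws(treat, control, compliers, n_draws=20000, seed=0):
    """Monte Carlo draws of ITT and LATE from (events, partners) counts per arm."""
    rng = random.Random(seed)
    p_treat = posterior_samples(treat[0], treat[1], n_draws, rng)
    p_control = posterior_samples(control[0], control[1], n_draws, rng)
    compliance = posterior_samples(compliers, treat[1], n_draws, rng)
    # ITT: difference in event rates by assignment, ignoring compliance.
    itt = [t - c for t, c in zip(p_treat, p_control)]
    # LATE under one-sided non-compliance: ITT scaled by the compliance rate.
    late = [d / k for d, k in zip(itt, compliance)]
    return itt, late


# Hypothetical counts: rare events, and half of the treated partners comply.
itt, late = itt_and_late_draws(treat=(18, 2000), control=(8, 2000), compliers=1000)
prob_positive = sum(d > 0 for d in itt) / len(itt)
mean_itt = sum(itt) / len(itt)
mean_late = sum(late) / len(late)
print(f"P(ITT > 0 | data) = {prob_positive:.3f}")
print(f"mean ITT = {mean_itt:.4f}, mean LATE = {mean_late:.4f}")
```

A decision rule of the kind the abstract describes would compare `prob_positive` with a pre-agreed posterior probability threshold instead of testing against a frequentist significance level.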

This work contributes methodologies for (1) rare event prediction under extreme class imbalance, (2) hybrid causal inference combining RCT and quasi-experimental approaches, and (3) Bayesian decision frameworks for resource-constrained experiments.

 

Taxonomy-Agnostic Hybrid Recommendation System for Procurement Classification

Ayushi Das - Data Scientist

Spend classification is a core capability in procurement management, enabling strategic sourcing, supplier negotiation, cost optimization, and accurate financial reporting. However, classifying purchase orders (POs) and invoices remains challenging due to noisy and unstructured inputs, limited labeled data, evolving taxonomies, and ambiguous category definitions. Traditional supervised approaches struggle to generalize across such complex procurement environments.

This paper presents a taxonomy-agnostic, supplier-aware dual-expert recommendation architecture that combines LLMs with trained embedding-based semantic retrieval for robust procurement classification. The system leverages hierarchically grounded taxonomy descriptions, automatically generated and refined using LLMs, to improve semantic alignment between items and category scopes. Domain-specific embedding models are trained to enhance semantic search accuracy across noisy item descriptions, invoice text, and taxonomy metadata.

The dual-expert design consists of: (1) a retrieval expert that performs hybrid semantic search over taxonomy data, historical procurement records, and supplier intelligence, including normalized supplier descriptions and LLM-generated supplier tags; and (2) a fine-tuned LLM-based reranking expert that performs item-centric classification using structured reasoning, forced decision logic, and supplier-based validation signals. Prompt optimization of the reranking expert improves ranking precision and decision consistency without requiring model fine-tuning or retraining.

The system is evaluated across multiple taxonomies and achieves 85% top-5 accuracy on POs and 90 - 95% on invoices, outperforming the strongest baseline by approximately 20% and 51%, respectively. Recent enhancements yield 75 - 80% top-1 accuracy in real-world invoice classification. Error analysis indicates that remaining failures primarily arise from taxonomy design limitations, such as overlapping categories and insufficiently defined scopes.
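For readers less familiar with the metric, top-k accuracy counts a recommendation as correct when the true category appears anywhere in the first k ranked suggestions. The sketch below shows the calculation on a tiny made-up sample; the category names and rankings are hypothetical and unrelated to the reported results.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Share of items whose true label appears among the first k ranked predictions."""
    hits = sum(
        truth in ranked[:k]
        for ranked, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical ranked category suggestions for three line items.
ranked = [
    ["IT Hardware", "Office Supplies", "Software"],
    ["Travel", "Consulting", "Software"],
    ["Facilities", "Travel", "IT Hardware"],
]
truth = ["IT Hardware", "Software", "Consulting"]

top1 = top_k_accuracy(ranked, truth, k=1)
top3 = top_k_accuracy(ranked, truth, k=3)
print(f"top-1 accuracy: {top1:.2f}")
print(f"top-3 accuracy: {top3:.2f}")
```

Top-k with k greater than 1 suits a recommendation workflow where a person picks from a short list, while top-1 reflects fully automatic classification, which is why the two figures can differ substantially.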

Beyond accuracy gains, this work contributes automated taxonomy scope generation, supplier-aware classification via LLM-derived metadata, and a scalable, production-ready framework that adapts to evolving taxonomies and unstructured data without retraining, demonstrating strong applicability across enterprise procurement environments.

 

When (and When Not) to Leverage Agentic AI: Practical Lessons from Building Projects and Autonomous Data Workflows

Alisha Gala - Senior Data Scientist

Booma Sowkarthiga Balasubramani - Senior Data Scientist

Jingyi Du - Principal Data Science Manager

Agentic AI is increasingly promoted as the default paradigm for intelligent systems, promising autonomy, flexibility, and productivity. Yet as agent-based designs move from demos into real-world data and decision workflows, teams encounter under-discussed challenges: unclear evaluation metrics, hidden operational costs, reliability risks, and ambiguous boundaries between human and machine control.

This talk takes a pragmatic view of Agentic AI through three concrete case vignettes:

  1. Creative matching pipeline (image→theme→verse): A deterministic workflow outperforms agentic orchestration on latency, predictability, and explainability—illustrating when agentic behavior is not needed.

  2. Experimentation analysis agent: Reads specifications, aggregates key metrics, and produces rollout recommendations, but doesn’t execute rollout decisions. Instead, the system surfaces tradeoffs, confidence signals, and guardrail checks for human review. This shows how Agentic AI adds value as analytical decision support without crossing into autonomous control in high-risk contexts.

  3. Autonomous anomaly investigation agent: Plans queries, revises hypotheses, and proposes remediation. It demonstrates genuine agentic properties—and failure modes that grow as control is delegated: error amplification from early wrong assumptions, persuasive but false confidence, observability gaps, and evaluation blindness when teams track completion rather than decision quality, stability, and human override cost.

Across these examples, we analyze Agentic AI as a spectrum of architectural decisions—from simple pipelines to systems that adapt their level of autonomy under uncertainty. We share a decision framework for when autonomy earns its place (including stakes, reversibility, observability) and an evaluation toolkit that operationalizes success beyond task completion (correct-action rate, steps/time saved, latency inflation, etc.).

Attendees will leave with actionable criteria for deciding when Agentic AI is justified, how to evaluate its real impact, and how to avoid common pitfalls, including a clear rubric for defending “no agent” designs when they are the safer, faster, and more reliable choice.