🧪

Knowledge Challenge

A friend thinks you can answer this question about the Model Evaluation Framework.

Your AI feature scored 91% on a benchmark. After deployment, customer complaints suggest accuracy is closer to 70%. What's the most likely explanation?