stackoverflow-survey-2025-analysis
Statistical ML analysis of AI trust and compensation in the 2025 developer survey.
AI trust analysis from Stack Overflow's 2025 survey.
// overview
This academic analysis studies two questions from the Stack Overflow 2025 Developer Survey: binary prediction of developer AI trust and linear modeling of EUR compensation from professional experience.
The workflow is script-driven. It includes the local survey dataset, preprocessing pipelines, EDA generation, classification modeling, regression diagnostics, PDFs for context, and generated plot artifacts.
// how it works
EDA, classification, and regression each have their own runnable script. Shared preprocessing handles validation, plausibility filters, IQR outlier handling, category reduction, and ordinal mappings.
Classification uses sklearn pipelines with ColumnTransformer so encoding and scaling stay inside the training flow. Regression emits global and per-country diagnostics with residual and QQ plots.
// capabilities
AI trust classification
Compares Logistic Regression with linear, polynomial, and RBF SVM candidates.
Compensation regression
Models EUR compensation against work experience with train/test evaluation.
Exploratory analysis
Generates plots for compensation, experience, AI trust, employment mix, and correlations.
Statistical validation
Uses repeated runs, cross-validation, confidence intervals, confusion matrices, and residual diagnostics.