stackoverflow-survey-2025-analysis

Statistical ML analysis of AI trust and compensation in the 2025 developer survey.

AI trust analysis from Stack Overflow's 2025 survey.

// overview

This academic analysis studies two questions from the Stack Overflow 2025 Developer Survey: binary prediction of developer AI trust and linear modeling of EUR compensation from professional experience.

The workflow is script-driven. It includes the local survey dataset, preprocessing pipelines, EDA generation, classification modeling, regression diagnostics, PDFs for context, and generated plot artifacts.

// how it works

EDA, classification, and regression each have their own runnable script. Shared preprocessing handles validation, plausibility filters, IQR outlier handling, category reduction, and ordinal mappings.

Classification uses sklearn pipelines with ColumnTransformer so encoding and scaling stay inside the training flow. Regression emits global and per-country diagnostics with residual and QQ plots.

// capabilities

AI trust classification

Compares Logistic Regression with linear, polynomial, and RBF SVM candidates.

Compensation regression

Models EUR compensation against work experience with train/test evaluation.

Exploratory analysis

Generates plots for compensation, experience, AI trust, employment mix, and correlations.

Statistical validation

Uses repeated runs, cross-validation, confidence intervals, confusion matrices, and residual diagnostics.