Abstract
<jats:p>Somatic cell count (SCC) and differential somatic cell count (DSCC) are widely recognized as reliable indicators of udder health and inflammation in dairy cows. Recent advances in statistical modeling and biochemical analysis have enabled more refined predictions of these indicators from milk composition traits. The objective of this study was to investigate the associations between key milk biochemical parameters (fat, protein, casein, lactose, solids non-fat, total dry matter, pH, urea, acetone, and β-hydroxybutyrate (BHB)) and udder health indicators, and to identify the most robust predictive model for SCC. A total of 272 milk samples were collected from cows enrolled in the Official Performance Recording Milk Production program in Arad County, Romania, and analyzed using the CombiFoss™ FT+ system. After outlier removal, 251 samples were retained for analysis. Data preprocessing included standardization and log-transformation of SCC to better satisfy model assumptions. Statistical modeling involved stepwise regression with interaction terms, as well as Ridge and Lasso regularization techniques. The Lasso model for log(SCC) achieved the highest predictive accuracy (R² = 0.655) and selected biologically relevant predictors such as protein, BHB, casein, lactose, and DSCC. The model's predictions aligned closely with measured values and confirmed known correlations, such as the negative association between lactose and SCC and the positive association of BHB and protein with SCC. Although the Lasso model for DSCC showed lower predictive power (R² = 0.175), it selected key predictors (protein, acetone, and BHB) similar to those selected by the other models, underlining the complex physiological basis of DSCC. Ridge regression confirmed these trends, supporting the robustness of the selected variables.
These findings emphasize the utility of regularized regression, particularly Lasso, in developing practical tools for early mastitis screening based on routinely collected milk composition data, with potential applications in herd health monitoring and decision-making on commercial dairy farms.</jats:p>
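For readers unfamiliar with the regularized models named above, the Lasso fit for log(SCC) can be sketched in its standard textbook form. This is an illustrative formulation, not one stated in the abstract: the 1/(2n) scaling, the base of the logarithm, and the way the penalty weight λ was tuned are assumptions. Here x_i denotes the vector of standardized milk composition predictors for sample i, p the number of predictors, and n = 251 the number of retained samples.

```latex
% Assumed standard Lasso objective; scaling and notation are illustrative.
(\hat{\beta}_0, \hat{\beta}) =
  \arg\min_{\beta_0,\,\beta}
  \left\{
    \frac{1}{2n} \sum_{i=1}^{n}
      \bigl( y_i - \beta_0 - x_i^{\top}\beta \bigr)^{2}
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
  \right\},
\qquad y_i = \log(\mathrm{SCC}_i)
```

Ridge regression replaces the penalty term with \lambda \sum_{j} \beta_j^{2}. The absolute-value penalty can shrink some coefficients exactly to zero, which is why Lasso yields a selected subset of predictors, whereas Ridge shrinks all coefficients without removing any.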