Why do young workers quit their first job? Identification of the risk factors using the Cox model and survival trees

Wioletta Grzenda; Agnieszka Marszałek

doi:https://doi.org/10.59139/ws.2024.03.2

Wioletta Grzenda, SGH Warsaw School of Economics, Collegium of Economic Analysis, Institute of Statistics and Demography, Poland. ORCID: https://orcid.org/0000-0002-2226-4563

Agnieszka Marszałek, independent researcher. ORCID: https://orcid.org/0000-0003-4906-6484

Wiadomości Statystyczne. The Polish Statistician, vol. 69, 2024, no. 3, pp. 18–37

Published online: 2 April 2024

DOI: https://doi.org/10.59139/ws.2024.03.2

How to cite: Grzenda, W., Marszałek, A. (2024). Why do young workers quit their first job? Identification of the risk factors using the Cox model and survival trees. Wiadomości Statystyczne. The Polish Statistician, 69(3), 18–37. https://doi.org/10.59139/ws.2024.03.2.


ABSTRACT

According to data from Statistics Poland, the situation of young people in the Polish labour market has improved significantly in recent years. As a result, young people entering the labour market find jobs more easily, while employers find it increasingly difficult to retain them. The aim of this study is to identify and assess the individual characteristics of young workers and the work-related factors that affect how long they stay in their first job. The study is based on 2019 and 2020 data from Statistics Poland’s Labour Force Survey. Because the characteristics of young people change considerably over time, research on their professional activity needs to account for this volatility in modelling. We therefore used the Cox model with time-varying covariates to identify the risk factors for quitting a first job. One finding was that people with higher education were more likely to quit their jobs than people with lower levels of education. As regards work-related factors, the type of employment contract, the weekly working time and whether the employee held a managerial position were all important for the decision to stay or quit. Furthermore, survival trees were used to identify groups of employees that were homogeneous in terms of the duration of their first job. We found that employees with fixed-term contracts were less likely to quit their jobs than employees with permanent contracts who worked part-time.
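
To illustrate the general idea behind a Cox model with time-varying covariates, the following minimal sketch evaluates and maximises the partial log-likelihood for data in counting-process (start, stop] format. It is not the authors' implementation and does not use the Labour Force Survey data: the rows, the single covariate `part_time`, and the function names `partial_log_likelihood` and `fit_beta` are all hypothetical, and ties are handled with the Breslow convention as a simplifying assumption.

```python
import math

# Hypothetical toy data, NOT from the Labour Force Survey.
# Each row: (worker_id, start_month, stop_month, quit, part_time).
# A worker whose covariate changes over time is split into several rows.
ROWS = [
    (1, 0, 6, 0, 0),
    (1, 6, 10, 1, 1),   # worker 1 moves to part-time at month 6, quits at month 10
    (2, 0, 8, 1, 1),
    (3, 0, 12, 0, 0),   # still employed at month 12 (censored)
    (4, 0, 5, 1, 0),
    (5, 0, 3, 0, 1),
    (5, 3, 12, 0, 0),   # worker 5 moves to full-time at month 3 (censored)
]


def partial_log_likelihood(beta, rows):
    """Cox partial log-likelihood for one covariate in (start, stop] format."""
    total = 0.0
    for _, _, stop, event, x in rows:
        if not event:
            continue
        # Risk set: every interval that covers the event time.
        risk = [xj for _, s, e, _, xj in rows if s < stop <= e]
        total += beta * x - math.log(sum(math.exp(beta * xj) for xj in risk))
    return total


def fit_beta(rows, lo=-5.0, hi=5.0, iters=100):
    """Maximise the (concave) partial log-likelihood by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if partial_log_likelihood(m1, rows) < partial_log_likelihood(m2, rows):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2


if __name__ == "__main__":
    beta_hat = fit_beta(ROWS)
    print("estimated coefficient:", beta_hat)
    print("estimated hazard ratio:", math.exp(beta_hat))
```

In this sketch, a hazard ratio above 1 would mean that the toy part-time intervals carry a higher risk of quitting than the full-time ones. A real analysis would use an established survival-analysis library, several covariates and standard errors.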

KEYWORDS

labour market, first job, young employees, Cox model, survival trees

JEL

J62, J64, C14, C41

