{"id":10036,"date":"2026-04-21T10:26:15","date_gmt":"2026-04-21T10:26:15","guid":{"rendered":"https:\/\/www.europesays.com\/ai\/10036\/"},"modified":"2026-04-21T10:26:15","modified_gmt":"2026-04-21T10:26:15","slug":"value-systems-of-artificial-intelligence-and-university-students-theoretical-dominance-in-large-language-models-and-religious-priority-in-humans","status":"publish","type":"post","link":"https:\/\/www.europesays.com\/ai\/10036\/","title":{"rendered":"Value systems of artificial intelligence and university students: theoretical dominance in large language models and religious priority in humans"},"content":{"rendered":"<p>Abstract<\/p>\n<p>The rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), raises critical questions about the value system these systems appear to reflect in comparison with human values. This study aimed to examine Spranger\u2019s six value types (religious, social, theoretical, economic, political, and aesthetic) as manifested in three LLMs (OpenAI-o1, Gemini-2.0, and DeepSeek-V3), and to compare them with the value system of a sample of students at King Khalid University. A descriptive\u2013comparative design was employed, administering the Study of Values to both groups: 214 students (male and female across academic levels) and the three LLMs, with repeated administrations to the latter to ensure test\u2013retest reliability. Results indicated statistically significant differences in both the prominence and ranking of values across groups. Theoretical values consistently dominated in the LLMs, followed by social, aesthetic, and political values, with religious values ranking lowest. In contrast, students prioritized religious values, followed by theoretical values, while aesthetic values occupied the lowest ranks. 
Further, significant effects of gender and academic level were observed among students: religious values were more salient among females, theoretical values among males, and aesthetic values among undergraduates. These findings suggest that LLMs project value systems shaped by their training data rather than by human cultural or moral frameworks. The study highlights the importance of integrating culturally diverse value dimensions into AI development and calls for raising students\u2019 awareness so that they use AI tools in ways aligned with human values. Effect-size estimates further indicated very large human\u2013AI discrepancies, particularly in the religious (d\u202f=\u202f2.21) and theoretical (d\u202f=\u202f1.22) domains.<\/p>\n<p>1 Introduction<\/p>\n<p>1.1 Background and significance<\/p>\n<p>The rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), has raised fundamental questions regarding the implicit value orientations embedded in AI-generated outputs.<\/p>\n<p>In recent years, AI has shifted from a theoretical concept to a transformative force reshaping psychology, education, and industry. Central to this transformation are large language models, designed to simulate linguistic and cognitive processes in producing text and engaging in complex interactions (<a class=\"ArticleReference\" href=\"#ref20\" id=\"ref20a\" data-event=\"articleReference-a-ref20\">Haase and Hanel, 2023<\/a>). Since its release in November 2022, OpenAI\u2019s ChatGPT has demonstrated human-like linguistic performance, prompting the rapid development of competing systems such as Google\u2019s Gemini and China\u2019s DeepSeek.<\/p>\n<p>This growing reliance on AI raises ethical and societal concerns, particularly regarding the values embedded in its outputs (<a class=\"ArticleReference\" href=\"#ref28\" id=\"ref28a\" data-event=\"articleReference-a-ref28\">Kaya et al., 2024<\/a>). 
In psychology, values are viewed as fundamental motives guiding behaviour and moral orientations (<a class=\"ArticleReference\" href=\"#ref55\" id=\"ref55a\" data-event=\"articleReference-a-ref55\">Zahran, 1984<\/a>). Although LLMs lack self-awareness or subjective experience, their outputs may reflect implicit value patterns derived from training data and algorithms (<a class=\"ArticleReference\" href=\"#ref10\" id=\"ref10a\" data-event=\"articleReference-a-ref10\">Bodro\u017ea et al., 2023<\/a>; <a class=\"ArticleReference\" href=\"#ref19\" id=\"ref19a\" data-event=\"articleReference-a-ref19\">Guo et al., 2023<\/a>; <a class=\"ArticleReference\" href=\"#ref54\" id=\"ref54a\" data-event=\"articleReference-a-ref54\">Ye et al., 2024<\/a>).<\/p>\n<p>1.2 Research gap<\/p>\n<p>A growing body of research has begun to benchmark large language models directly against humans using instruments originally developed to assess cognitive, social, and emotional functioning. Recent evidence suggests that models such as ChatGPT-4 can outperform psychology students and trainees on standardized measures of social intelligence and can exceed human norms on performance-based assessments of emotional awareness, achieving near-ceiling scores with strong expert-rated contextual appropriateness (<a class=\"ArticleReference\" href=\"#ref50\" id=\"ref50a\" data-event=\"articleReference-a-ref50\">Sufyan et al., 2024<\/a>; <a class=\"ArticleReference\" href=\"#ref16\" id=\"ref16a\" data-event=\"articleReference-a-ref16\">Elyoseph et al., 2023<\/a>). 
Parallel work in high-stakes academic and professional domains, including medical licensing examinations, university mathematics admissions tests, and histology assessments, has likewise shown that GPT-based systems may match or surpass average student performance when the item is treated as the unit of analysis (<a class=\"ArticleReference\" href=\"#ref39\" id=\"ref39a\" data-event=\"articleReference-a-ref39\">Meyer et al., 2024<\/a>; <a class=\"ArticleReference\" href=\"#ref53\" id=\"ref53a\" data-event=\"articleReference-a-ref53\">Udias et al., 2024<\/a>; <a class=\"ArticleReference\" href=\"#ref37\" id=\"ref37a\" data-event=\"articleReference-a-ref37\">Mavrych et al., 2025<\/a>).<\/p>\n<p>Despite these advances, the available literature has remained focused primarily on cognitive performance, emotional recognition, or accuracy in domain-specific testing. Considerably less attention has been devoted to the question of whether large language models exhibit stable value-related output patterns that can be meaningfully compared with human value priorities. This gap is consequential, because values are not peripheral psychological attributes; rather, they shape judgment, preference, decision-making, and social interpretation across educational, cultural, and interpersonal contexts. As AI systems are increasingly used in settings involving guidance, evaluation, and meaning-making, understanding the value-related tendencies reflected in their outputs becomes both theoretically important and practically necessary.<\/p>\n<p>This gap is particularly evident in relation to classical psychological frameworks of human values. Although value theory has long occupied a central place in personality, social, and cultural psychology, relatively few studies have applied established value models to contemporary AI systems in a way that permits structured comparison with human respondents. 
Moreover, when such comparisons are attempted, they often risk anthropomorphic overinterpretation by implying that AI systems possess internal beliefs, intentions, or enduring value structures analogous to those of humans. A more conceptually disciplined approach is therefore needed\u2014one that treats AI responses as patterned outputs generated under standardized prompting conditions, while still allowing psychologically meaningful comparison with human value priorities.<\/p>\n<p>The present study addresses this gap by examining value systems in three prominent large language models\u2014OpenAI-o1, Gemini-2.0, and DeepSeek-V3\u2014and comparing their output patterns with the value priorities of students at King Khalid University. Specifically, the study draws on Spranger\u2019s six value types\u2014religious, theoretical, social, political, economic, and aesthetic\u2014to explore areas of convergence and divergence between AI-generated and human value priorities. By doing so, the study aims to clarify the extent to which AI aligns with, or diverges from, human value orientations, and to contribute to ongoing discussions on culturally informed AI development, interpretive caution, and the responsible use of AI in psychological and social contexts.<\/p>\n<p>Taken together, the foregoing review and identified research gap provide the foundation for the present investigation. 
Accordingly, the study was designed to examine value-related patterns in selected large language models and to compare them with the value priorities of university students within a structured psychological framework.<\/p>\n<p>1.3 Objectives<\/p>\n<p>This study aims to:<\/p>\n<p>Compare value systems of three LLMs (ChatGPT-o1, Gemini-2.0, DeepSeek-V3).<\/p>\n<p>Compare AI value profiles with those of human students.<\/p>\n<p>Examine gender and academic level differences in human values.<\/p>\n<p>Analyze convergence\/divergence in value ranking patterns.<\/p>\n<p>1.4 Research hypotheses<\/p>\n<p>There are statistically significant differences in value systems among university students according to gender and academic level.<\/p>\n<p>There are statistically significant differences in value systems among large language models (ChatGPT, Gemini, and DeepSeek) depending on model type.<\/p>\n<p>There are statistically significant differences in value systems between students and large language models (ChatGPT, Gemini, and DeepSeek).<\/p>\n<p>Theoretical and social values are expected to rank highest in large language models, while religious and aesthetic values are expected to appear in the lowest ranks. In contrast, religious and social values are expected to rank highest among university students overall, with aesthetic values occupying the lowest ranks. Economic and political values are expected to fall in the middle range for both groups. Theoretical values are anticipated to vary among students, ranking differently between undergraduate and postgraduate levels.<\/p>\n<p>1.5 Research delimitations<\/p>\n<p>This study is delimited to the examination of values according to <a class=\"ArticleReference\" href=\"#ref48\" id=\"ref48a\" data-event=\"articleReference-a-ref48\">Spranger\u2019s (1928)<\/a> classification, conceptualized as interests, preferences, and judgements, and restricted to six value types: political, social, theoretical, religious, economic, and aesthetic. 
The scope is further limited to three large language models\u2014ChatGPT-o1, Gemini-2.0, and DeepSeek-V3\u2014tested during the period from 1 December 2024 to 30 January 2025. On the human side, the study is confined to male and female students enrolled in the College of Education at King Khalid University, across academic levels ranging from undergraduate to doctoral studies, during the 2024\u20132025 academic year.<\/p>\n<p>Boundary conditions and transferability. Because the human sample is drawn from one university context (College of Education, KKU), the human value hierarchy should be interpreted as context-bounded and not assumed to represent all university students or cultures. In addition, administering the instrument in Arabic improves comparability with the human sample but may introduce cross-lingual effects for models whose training data are unevenly distributed across languages. Finally, while the reference base prioritizes peer-reviewed sources, a limited number of recent preprints are cited to reflect fast-evolving evidence in LLM research; these should be re-evaluated as peer-reviewed versions become available. Moreover, because commercial LLM providers routinely update or retire model versions, the exact model snapshots used during the data-collection window may no longer be accessible, which can limit exact reruns on identical versions.<\/p>\n<p>1.6 Challenges and mitigation strategies<\/p>\n<p>One of the main challenges in this study concerned the stability of AI responses, which risked either inconsistency or exaggerated uniformity. This issue was addressed by administering the measures to the models repeatedly, at different times and across separate sessions using the same standardized procedure, to enhance test\u2013retest reliability. Another challenge was the relatively small sample size of human participants. 
While this limitation may reduce the generalizability of the findings, it does not appear to undermine the accuracy or validity of the results obtained. We also recognise prompt sensitivity (i.e., shifts in model outputs due to small wording changes) as a methodological threat; this was mitigated by using a fixed prompt template, enforcing an A\/B-only format, and repeating administrations across separate sessions.<\/p>\n<p>2 Methods<\/p>\n<p>2.1 Research design<\/p>\n<p>The present study employed a descriptive\u2013comparative design to examine the level and ranking of values among participants and to compare these across human and AI groups. This design was deemed appropriate for studies requiring the description of data, group comparisons, and analysis of potential differences between heterogeneous populations.<\/p>\n<p>2.2 Population and participants<\/p>\n<p>2.2.1 Human participants: university students<\/p>\n<p>Human participants. The human population consisted of students from the College of Education at King Khalid University, including both male and female students enrolled in closely related educational disciplines (psychology, curriculum and instruction, and educational administration). All academic levels were represented (undergraduate, master\u2019s, and doctoral). A stratified random sampling method was used, yielding a total of 214 participants (134 males, 80 females). Given their shared educational focus, participants were considered comparable in disciplinary background. 
<a class=\"ArticleReference\" href=\"#tab1\" id=\"tab1a\" data-event=\"articleReference-a-tab1\">Table 1<\/a> presents the distribution of the human sample by gender and academic level.<\/p>\n<table>\n<tr><th>Gender \/ academic level<\/th><th>Undergraduate<\/th><th>Master\u2019s<\/th><th>Doctoral<\/th><th>Total<\/th><\/tr>\n<tr><td>Male<\/td><td>56<\/td><td>29<\/td><td>49<\/td><td>134<\/td><\/tr>\n<tr><td>Female<\/td><td>11<\/td><td>37<\/td><td>32<\/td><td>80<\/td><\/tr>\n<tr><td>Total<\/td><td>67<\/td><td>66<\/td><td>81<\/td><td>214<\/td><\/tr>\n<\/table>\n<p>Distribution of the human sample by gender and academic level.<\/p>\n<p>2.2.2 Artificial \u201cparticipants\u201d (LLMs)<\/p>\n<p>The artificial \u201cparticipants\u201d in this study were three large language models (LLMs): ChatGPT-o1 (OpenAI), Gemini-2.0 (Google), and DeepSeek-V3 (DeepSeek). All models were accessed through their official web-based interfaces using the most recent publicly available versions during the data collection period (1 December 2024\u201330 January 2025).<\/p>\n<p>To maximise comparability between human and AI data, the Study of Values (SOV) was administered to each LLM. The Arabic version of the SOV\u2014the same version administered to the students\u2014was presented item by item to the models, with minimal adaptations to fit a text-only interaction and to enforce the forced-choice response format.<\/p>\n<p>For each administration (\u201crun\u201d), a new conversation was initiated with the model to avoid contamination by prior context. At the beginning of each run, the model received a brief instruction describing the task and specifying that it must select only one of the response options per item (see <a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 1<\/a> for the exact text of the prompts). Each SOV item was then presented with its two alternatives labelled (A) and (B), and the model was explicitly instructed to respond using a single letter (A or B) without explanation. 
When a model produced an output that did not conform to this response format (e.g., full sentences, multiple options), the item was immediately re-presented with a reminder of the response rule; if non-conformity persisted, the response was coded as missing for that item and excluded from scoring.<\/p>\n<p>The models were administered the SOV multiple times in order to estimate the stability of their responses and to derive more robust model-level estimates. Specifically, the SOV was administered seven times to ChatGPT-o1, five times to Gemini-2.0, and five times to DeepSeek-V3. Each run was conducted in a separate session, at different times of day, to reduce the influence of transient system-level fluctuations and to approximate repeated measurements of a single system (i.e., system-level repeated measurements).<\/p>\n<p>All prompts and SOV items were presented in Arabic to maintain linguistic consistency with the human sample and to approximate the same semantic content. No manual editing of model responses was performed apart from enforcing the A\/B format. The resulting A\/B responses were then scored using the same key and scoring procedures applied to the human participants, yielding six domain scores (theoretical, religious, social, political, economic, aesthetic) for each run of each model.<\/p>\n<p>Prompt sensitivity and robustness. Because LLM outputs can shift with minor changes in instruction framing, we kept the prompt template and item wording strictly constant across runs and sessions. The inter-run correlations reported above (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Supplementary Table S1<\/a>) provide an initial stability check under this standardized protocol. 
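<\/p>\n<p>As an illustration only, the enforcement rule described above (accept a single letter, re-present the item once after a malformed reply, then code the response as missing) can be sketched as follows; ask_model is a hypothetical stand-in for whichever chat interface is used and is not part of the study\u2019s materials:<\/p>

```python
import re

def parse_ab(text):
    """Return 'A' or 'B' if the reply is a single-letter choice, else None."""
    cleaned = text.strip().upper()
    # Accept bare letters and lightly decorated forms such as "(A)" or "B."
    match = re.fullmatch(r"[\(\[]?([AB])[\)\]\.]?", cleaned)
    return match.group(1) if match else None

def administer_item(ask_model, item_text, reminder):
    """One SOV item: re-present once after a malformed reply, then code as missing."""
    choice = parse_ab(ask_model(item_text))
    if choice is None:
        choice = parse_ab(ask_model(reminder + "\n" + item_text))
    return choice  # None means the item is excluded from scoring
```
<p>Responses collected this way are then scored with the same key and procedures applied to the human protocol.<\/p>
<p>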
Nevertheless, future replications should include prompt-perturbation analyses using minimally varied instruction templates and should report rank-order stability (e.g., Spearman correlations of value rankings) to quantify sensitivity explicitly.<\/p>\n<p>2.3 Instruments<\/p>\n<p>2.3.1 Study of Values (SOV)<\/p>\n<p>The primary instrument employed in this study was the Study of Values, originally developed by <a class=\"ArticleReference\" href=\"#ref9001\" id=\"ref9001a\" data-event=\"articleReference-a-ref9001\">Allport and Vernon (1931)<\/a>. The measure was translated into Arabic by <a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa (1959)<\/a> and subsequently adapted to the local cultural context by <a class=\"ArticleReference\" href=\"#ref9005\" id=\"ref9005a\" data-event=\"articleReference-a-ref9005\">Sufyan (1995)<\/a>. The SOV has been widely used to assess value systems and their hierarchical ranking, and its construct validity and reliability have been confirmed in multiple previous studies.<\/p>\n<p>Content. The test consists of 45 forced-choice items, requiring respondents (human or AI) to prioritize one value while rejecting another simultaneously, thereby enhancing the precision of value differentiation (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>). 
The SOV is grounded in Spranger\u2019s classification, which classifies values into six domains:<\/p>\n<p>Social value: concern for others, viewing people as ends in themselves, characterized by empathy and compassion (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>).<\/p>\n<p>Theoretical value: interest in knowledge and the discovery of truth, independent of practical or aesthetic considerations (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>).<\/p>\n<p>Economic value: focus on utility, practicality, and evaluating objects and individuals according to their functional benefit (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>).<\/p>\n<p>Aesthetic value: appreciation of beauty, harmony, and form, with evaluation of the world based on its structural composition (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>).<\/p>\n<p>Political value: interest in power, leadership, control, influence, prestige, and engagement with public affairs.<\/p>\n<p>Religious value: Commitment to absolute spiritual or metaphysical standards and concern with transcendent or divine matters.<\/p>\n<p>2.3.1.1 Validity and reliability<\/p>\n<p>Multiple lines of evidence support the validity of the SOV. In earlier local applications, the face validity of the Arabic version had already been examined among university students. For the current study, validity was reassessed with a pilot sample of 60 students (balanced by gender and academic level). 
Item-total correlations (Pearson) were computed, yielding coefficients ranging from 0.69 to 0.88 (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 2<\/a>), confirming significant item-domain consistency across all six value dimensions. All items were retained to ensure comprehensive coverage of value constructs.<\/p>\n<p>Test\u2013retest reliability was examined with the same subsample over a two-month interval. Coefficients were high across domains: theoretical (0.962), religious (0.945), political (0.903), economic (0.883), social (0.892), and aesthetic (0.860), indicating strong stability among human participants.<\/p>\n<p>2.3.2 AI-specific adaptation of the SOV<\/p>\n<p>For administration to the LLMs, the Arabic SOV was converted into a structured text-based format compatible with conversational interfaces. Each forced-choice item was reformatted as a direct question followed by two clearly demarcated options (A and B). The substantive wording of the items and alternatives was preserved, and no changes were made to the content, polarity, or scoring key. The only adaptation consisted of adding explicit labels (e.g., \u201cOption A,\u201d \u201cOption B\u201d) and an instruction requesting the model to choose the option most consistent with the response pattern generated under the task instructions. This minimal adaptation was designed to preserve the psychometric properties of the SOV while making it executable in a human-AI interaction setting.<\/p>\n<p>To verify the stability of AI responses, the instrument was administered repeatedly to each LLM. Within-model rank-order stability across the six value domains was strong for ChatGPT-o1 (Kendall\u2019s W\u202f=\u202f0.802, p\u202f&lt;\u202f0.001) and DeepSeek-V3 (W\u202f=\u202f0.840, p\u202f&lt;\u202f0.001), and moderate for Gemini-2.0 (W\u202f=\u202f0.448, p\u202f=\u202f0.048). 
Dispersion indices (Mean, SD, CV, and 95% CIs) for each value domain are provided in <a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Supplementary Tables S1, S2<\/a>.<\/p>\n<p>2.4 Data analysis<\/p>\n<p>Data were analysed using IBM SPSS Statistics, Version 28. All tests were two-tailed with a nominal significance level of \u03b1\u202f=\u202f0.05. In line with contemporary recommendations in psychological research, the emphasis was placed on effect sizes and confidence intervals, with p-values used as supplementary indicators rather than the sole basis for interpretation. Different analytic strategies were adopted for the human sample, the LLM outputs, and the human\u2013AI comparisons to reflect the distinct nature of these data sources.<\/p>\n<p>2.4.1 Human sample<\/p>\n<p>For the human participants, descriptive statistics (means, standard deviations, minimum and maximum scores, skewness, and kurtosis) were computed for each of the six value domains. The Shapiro\u2013Wilk test was used to assess the normality of the value distributions in the student sample, supplemented by inspection of skewness, kurtosis, and histograms. Given the relatively large sample size, minor deviations from normality were treated as acceptable, and parametric procedures were retained with appropriate caution in interpretation.<\/p>\n<p>To test Hypothesis 1 (gender and academic level effects on value orientations), separate two-way analyses of variance (ANOVAs) were conducted for each value domain, with gender (male, female) and academic level (undergraduate, master\u2019s, doctoral) entered as between-subjects factors. For each ANOVA, we examined the main effects of gender and academic level as well as their interaction. Where omnibus tests were statistically significant, post hoc Tukey tests were performed to identify pairwise differences between academic levels. 
For all ANOVA results, we reported the F statistic, degrees of freedom, p-values, and partial eta squared (\u03b72\u209a) as an index of effect size.<\/p>\n<p>For bivariate comparisons involving only two groups (e.g., specific follow-up contrasts where appropriate), independent-samples t-tests were used. In such cases, Cohen\u2019s d was calculated as a standardized measure of the magnitude of group differences and, when relevant, used as an effect-size metric in preference to relying solely on statistical significance.<\/p>\n<p>2.4.2 LLM outputs<\/p>\n<p>Because repeated administrations to the same LLM (e.g., seven runs for ChatGPT, five for Gemini, and five for DeepSeek) represent repeated measurements on a single system rather than observations from independent individuals, the LLM data were treated primarily within a descriptive and reliability-oriented framework. For each model and each value domain, we computed: (a) the mean score across runs, (b) the standard deviation and range (minimum\u2013maximum), (c) the coefficient of variation as an index of relative dispersion, and (d) 95% confidence intervals around the mean, using the t distribution appropriate for the number of runs per model.<\/p>\n<p>To evaluate the stability of each model\u2019s outputs across repeated administrations, we quantified rank-order agreement across runs using Kendall\u2019s coefficient of concordance (W) over the six value domains, and we additionally report the mean and worst-case pairwise Spearman rank correlations (\u03c1) between runs as complementary stability indicators. Dispersion of run-level scores was summarized using the mean, SD, coefficient of variation (CV), and 95% confidence intervals (t-based) for each value domain. 
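<\/p>\n<p>These run-level stability indices follow standard formulas. As a minimal self-contained sketch (pure Python, using the no-tie rank formulas for brevity; the data passed in would be one model\u2019s k\u202f\u00d7\u202f6 matrix of run-by-domain scores, not reproduced here):<\/p>

```python
from itertools import combinations

def ranks(scores):
    """1-based ranks of a score vector (assumes no tied scores, for brevity)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    r = [0] * len(scores)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

def kendalls_w(runs):
    """Kendall's W over k runs ranking n value domains (no tie correction)."""
    k, n = len(runs), len(runs[0])
    rks = [ranks(run) for run in runs]
    sums = [sum(r[j] for r in rks) for j in range(n)]
    mean = sum(sums) / n
    s = sum((x - mean) ** 2 for x in sums)
    return 12 * s / (k ** 2 * (n ** 3 - n))

def spearman(a, b):
    """Spearman rho via the no-ties formula 1 - 6*sum(d^2)/(n(n^2-1))."""
    ra, rb = ranks(a), ranks(b)
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def stability(runs):
    """Mean and worst-case pairwise Spearman rho across runs."""
    rhos = [spearman(a, b) for a, b in combinations(runs, 2)]
    return sum(rhos) / len(rhos), min(rhos)
```
<p>Perfectly concordant runs yield W\u202f=\u202f1 and pairwise \u03c1\u202f=\u202f1; the values reported in Section 3.2 were obtained with this family of indices, computed in SPSS.<\/p>
<p>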
Full stability and dispersion results are reported in <a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Supplementary Tables S1, S2<\/a>.<\/p>\n<p>To address Hypothesis 2 (differences across AI models), we conducted exploratory nonparametric comparisons using the Kruskal\u2013Wallis test on the run-level scores for each value domain, treating runs as units of analysis within each model. Given that runs from the same model are not fully independent observations and that the number of runs per model is small, the Kruskal\u2013Wallis results are reported and discussed as exploratory, sensitivity-type analyses. Substantive interpretation of differences among AI models relies primarily on the descriptive patterns and stability indicators (means, confidence intervals, and reliability estimates), rather than on formal hypothesis testing alone.<\/p>\n<p>2.4.3 Human\u2013AI comparisons<\/p>\n<p>For Hypothesis 3, which concerned differences between human and AI value profiles, the student sample was treated as a reference distribution against which the LLMs\u2019 scores were compared. For each value domain and each LLM (and, where appropriate, for the pooled AI scores), we computed standardized mean differences by expressing the AI mean in relation to the student mean and standard deviation. Specifically, Cohen\u2019s d was calculated using the student standard deviation as the denominator, thereby quantifying how many standard deviations the AI model\u2019s mean lay above or below the human mean for each value domain. These effect sizes were interpreted as the primary indicators of the substantive magnitude of human\u2013AI discrepancies.<\/p>\n<p>In addition, independent-samples t-tests were conducted, comparing the human participants (N\u202f=\u202f214) with the set of AI runs (e.g., the 17 scored LLM administrations combined) for each value domain. 
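<\/p>\n<p>Concretely, the human-referenced standardization described above reduces to a one-line computation; plugging in the reported student SDs and model means reproduces the Table 3 entries:<\/p>

```python
def cohens_d_ref(mean_a, mean_b, ref_sd):
    """Standardized mean difference scaled by the human (reference) SD."""
    return (mean_a - mean_b) / ref_sd

# Theoretical domain: GPT mean 57.00, Gemini mean 40.20, student SD 7.45
d_theoretical = cohens_d_ref(57.00, 40.20, 7.45)  # ~2.26, as in Table 3
# Social domain: GPT mean 47.00, Gemini mean 29.00, student SD 6.184
d_social = cohens_d_ref(47.00, 29.00, 6.184)      # ~2.91, as in Table 3
```
<p>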
In these tests, the group variable contrasted students versus AI responses, and separate t-tests were run at the value-domain level. However, because the AI \u201ccases\u201d are repeated outputs from a small number of models rather than independent individuals, these inferential tests were treated as secondary, exploratory analyses aimed at assessing the robustness of the observed differences. Interpretation of human\u2013AI contrasts therefore focused primarily on the direction and magnitude of Cohen\u2019s d and the associated confidence intervals, with p-values from t-tests serving only as supplementary evidence.<\/p>\n<p>Finally, to address Hypothesis 4 concerning value rankings, we examined the hierarchical ordering of the six value domains for students and for each AI model based on their mean scores. Rankings were summarized in tables and visualized in figures. We also computed descriptive indices of convergence and divergence in rank order (e.g., overlaps in the top-ranked and bottom-ranked values) between humans and AI models. These analyses were intentionally nonparametric and descriptive, reflecting the ordinal nature of rank data and our interest in global pattern similarity rather than fine-grained statistical testing of rank differences.<\/p>\n<p>3 Results<\/p>\n<p>3.1 Normality testing<\/p>\n<p>The Shapiro\u2013Wilk test was applied to examine the distribution of the six value domains among student participants. Descriptive statistics and normality results are presented in <a class=\"ArticleReference\" href=\"#tab2\" id=\"tab2a\" data-event=\"articleReference-a-tab2\">Table 2<\/a>. Most values met the assumption of normality (p\u202f&gt;\u202f0.05), with the exception of the aesthetic value (p\u202f=\u202f0.004). 
Given the adequate sample size, parametric tests were retained, with caution applied in interpreting the aesthetic value results.<\/p>\n<table>\n<tr><th>Value<\/th><th>Mean<\/th><th>SD<\/th><th>Skewness<\/th><th>Kurtosis<\/th><th>Shapiro\u2013Wilk<\/th><th>p-value<\/th><\/tr>\n<tr><td>Theoretical<\/td><td>42.90<\/td><td>7.45<\/td><td>\u22120.101<\/td><td>\u22120.001<\/td><td>0.987<\/td><td>0.06<\/td><\/tr>\n<tr><td>Religious<\/td><td>43.607<\/td><td>8.859<\/td><td>\u22120.289<\/td><td>\u22120.342<\/td><td>0.987<\/td><td>0.06<\/td><\/tr>\n<tr><td>Social<\/td><td>37.748<\/td><td>6.184<\/td><td>0.055<\/td><td>\u22120.398<\/td><td>0.989<\/td><td>0.10<\/td><\/tr>\n<tr><td>Aesthetic<\/td><td>33.393<\/td><td>7.881<\/td><td>0.455<\/td><td>\u22120.037<\/td><td>0.980<\/td><td>0.004<\/td><\/tr>\n<tr><td>Political<\/td><td>41.486<\/td><td>6.599<\/td><td>\u22120.147<\/td><td>\u22120.415<\/td><td>0.991<\/td><td>0.20<\/td><\/tr>\n<tr><td>Economic<\/td><td>40.860<\/td><td>7.125<\/td><td>\u22120.081<\/td><td>\u22120.075<\/td><td>0.992<\/td><td>0.31<\/td><\/tr>\n<\/table>\n<p>Descriptive statistics and Shapiro\u2013Wilk tests of the assumption of normal distribution for the six value domains.<\/p>\n<p>3.2 Within-model stability across repeated LLM runs<\/p>\n<p>Across repeated runs under the standardized protocol, ChatGPT-o1 exhibited strong agreement in the rank ordering of the six value domains (Kendall\u2019s W\u202f=\u202f0.802, \u03c72(5)\u202f=\u202f28.083, p\u202f&lt;\u202f0.001), with a mean pairwise Spearman \u03c1 of 0.769 (worst-case \u03c1\u202f=\u202f0.462). DeepSeek-V3 similarly showed strong rank-order stability (W\u202f=\u202f0.840, \u03c72(5)\u202f=\u202f21.012, p\u202f&lt;\u202f0.001; mean \u03c1\u202f=\u202f0.800; worst-case \u03c1\u202f=\u202f0.585). Gemini-2.0 showed moderate stability (W\u202f=\u202f0.448, \u03c72(5)\u202f=\u202f11.192, p\u202f=\u202f0.048), with lower pairwise agreement across runs (mean \u03c1\u202f=\u202f0.311; worst-case \u03c1\u202f=\u202f\u22120.058), indicating greater sensitivity to transient fluctuations even under fixed instructions. <a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Supplementary Table S2<\/a> summarizes dispersion (Mean, SD, CV, and 95% CIs) for each value domain within each model.<\/p>\n<p>3.3 Hypothesis 1: gender and academic level effects<\/p>\n<p>There are statistically significant differences in value systems among students according to gender and academic level.<\/p>\n<p>Two-way ANOVA was conducted for each value domain. 
Post hoc Tukey comparisons were performed where appropriate. Key results were as follows:<\/p>\n<p>Theoretical value: significant differences by academic level (F\u202f=\u202f7.067, p\u202f=\u202f0.001), with doctoral students scoring higher than undergraduates (Mean Difference\u202f=\u202f4.33, p\u202f=\u202f0.001). Significant gender differences also emerged (F\u202f=\u202f11.534, p\u202f=\u202f0.001), favouring males (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 3<\/a>).<\/p>\n<p>Religious value: Significant differences were observed by gender (F\u202f=\u202f4.721, p\u202f=\u202f0.031), with females scoring higher. No significant differences were found by academic level.<\/p>\n<p>Social, political, and economic values: No significant differences were observed by gender or academic level (p\u202f&gt;\u202f0.05).<\/p>\n<p>Aesthetic value: Significant differences by academic level (F\u202f=\u202f4.723, p\u202f=\u202f0.010), with undergraduates scoring higher than doctoral students (Mean Difference\u202f=\u202f4.24, p\u202f=\u202f0.003).<\/p>\n<p>3.3.1 Effect sizes<\/p>\n<p>Effect sizes were computed for all statistically significant tests. The effect of academic level on theoretical values was of medium magnitude (\u03b72\u209a\u202f=\u202f0.063), and the effect of gender on theoretical values was also medium (\u03b72\u209a\u202f=\u202f0.052). For religious values, the effect of gender was small to medium (\u03b72\u209a\u202f=\u202f0.022). The effect of academic level on aesthetic values was of small-to-medium magnitude (\u03b72\u209a\u202f=\u202f0.043).<\/p>\n<p>According to the Tukey HSD post hoc test, the significant difference was found only between bachelor\u2019s and Ph.D. students, in favor of Ph.D. students, on the theoretical value. 
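<\/p>\n<p>As a check on these effect sizes, \u03b72\u209a can be recovered from the reported F statistics via F\u202f\u00d7\u202fdf1 \/ (F\u202f\u00d7\u202fdf1\u202f+\u202fdf2). Assuming an error df of roughly 208 for this 2\u202f\u00d7\u202f3 design (N\u202f=\u202f214 minus six cells; the exact value depends on missing data), this reproduces the reported values to rounding:<\/p>

```python
def partial_eta_squared(f, df_effect, df_error):
    """Partial eta squared recovered from an ANOVA F statistic."""
    return (f * df_effect) / (f * df_effect + df_error)

# Academic level on theoretical values: F = 7.067 with df = (2, ~208)
eta_level = partial_eta_squared(7.067, 2, 208)    # ~0.06
# Gender on theoretical values: F = 11.534 with df = (1, ~208)
eta_gender = partial_eta_squared(11.534, 1, 208)  # ~0.05
```
<p>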
<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 3<\/a> presents the detailed results.<\/p>\n<p>3.4 Hypothesis 2: differences across AI models<\/p>\n<p>There are statistically significant differences in value systems among AI models (ChatGPT, Gemini, DeepSeek).<\/p>\n<p>The Kruskal\u2013Wallis test revealed significant differences among models in the theoretical (H\u202f=\u202f10.145, p\u202f=\u202f0.006), social (H\u202f=\u202f10.053, p\u202f=\u202f0.007), and aesthetic values (H\u202f=\u202f5.990, p\u202f=\u202f0.050), favouring ChatGPT over the other models. No significant differences were observed in the economic (H\u202f=\u202f2.006, p\u202f=\u202f0.367), political (H\u202f=\u202f5.089, p\u202f=\u202f0.079), or religious values (H\u202f=\u202f1.416, p\u202f=\u202f0.493) (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 4<\/a>).<\/p>\n<p>These tests, however, were treated as exploratory diagnostics: the number of repeated runs within each model was limited, and the run-level observations were not fully independent. Accordingly, they were not used as the primary basis for evaluating Hypothesis 2.<\/p>\n<p>Instead, our main inferential focus for comparing the three AI models relied on human-referenced effect sizes, namely the standardized mean differences between GPT, Gemini, and DeepSeek expressed in units of the student standard deviation for each value domain (see <a class=\"ArticleReference\" href=\"#tab3\" id=\"tab3a\" data-event=\"articleReference-a-tab3\">Table 3<\/a>). 
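These human-referenced effect sizes reduce to a one-line computation: the difference between two model means, divided by the student sample's SD for that domain. A minimal sketch, using the model means from Table 3 and the student SDs from the descriptive statistics table (variable names are illustrative):

```python
# Human-referenced standardized difference: the gap between two model
# means, expressed in units of the human (student) standard deviation.
def human_referenced_d(mean_a: float, mean_b: float, student_sd: float) -> float:
    return (mean_a - mean_b) / student_sd

# Student SDs per domain (from the descriptive statistics table).
student_sd = {"theoretical": 7.45, "social": 6.184}

# Model means per domain (from Table 3).
gpt = {"theoretical": 57.00, "social": 47.00}
gemini = {"theoretical": 40.20, "social": 29.00}

d_theoretical = human_referenced_d(
    gpt["theoretical"], gemini["theoretical"], student_sd["theoretical"]
)
# (57.00 - 40.20) / 7.45, which is approximately 2.26 as reported.
```

Scaling by the human SD expresses each model-to-model gap in units familiar from the student data, which is what makes values such as d\u202f=\u202f2.26 in the theoretical domain comparable across domains.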
These effect sizes provide a more interpretable and psychometrically coherent index of between-model differences: GPT shows very large advantages over both Gemini and DeepSeek in the theoretical and social domains (exceeding 2\u20133 human SD units), medium-to-large advantages in the economic, political, and aesthetic domains, and a clear superiority over DeepSeek\u2014but only a small edge over Gemini\u2014in the religious domain. Because these human-scaled effects directly quantify the magnitude and practical significance of the discrepancies among models, our substantive conclusions regarding differences between LLMs are based primarily on the pattern and size of these standardized effects, with the Kruskal\u2013Wallis test serving only as supporting, exploratory evidence.<\/p>\n<table><thead><tr><th>Value domain<\/th><th>GPT mean<\/th><th>Gemini mean<\/th><th>DeepSeek mean<\/th><th>d (GPT\u2013Gemini)<\/th><th>d (GPT\u2013DeepSeek)<\/th><th>d (Gemini\u2013DeepSeek)<\/th><\/tr><\/thead><tbody><tr><td>Theoretical<\/td><td>57.00<\/td><td>40.20<\/td><td>37.00<\/td><td>2.26<\/td><td>2.68<\/td><td>0.43<\/td><\/tr><tr><td>Social<\/td><td>47.00<\/td><td>29.00<\/td><td>26.00<\/td><td>2.91<\/td><td>3.40<\/td><td>0.49<\/td><\/tr><tr><td>Economic<\/td><td>29.57<\/td><td>25.20<\/td><td>20.80<\/td><td>0.61<\/td><td>1.23<\/td><td>0.62<\/td><\/tr><tr><td>Political<\/td><td>35.14<\/td><td>28.80<\/td><td>27.40<\/td><td>0.96<\/td><td>1.17<\/td><td>0.21<\/td><\/tr><tr><td>Aesthetic<\/td><td>33.86<\/td><td>27.20<\/td><td>27.80<\/td><td>0.84<\/td><td>0.77<\/td><td>\u22120.08<\/td><\/tr><tr><td>Religious<\/td><td>26.00<\/td><td>23.40<\/td><td>11.80<\/td><td>0.29<\/td><td>1.60<\/td><td>1.31<\/td><\/tr><\/tbody><\/table>\n<p>Human-referenced standardized differences between AI models across value domains.<\/p>\n<p>Standardized differences (d) are expressed in units of the student sample\u2019s standard deviation for each value domain.<\/p>\n<p>3.5 Hypothesis 3: differences between humans and AI models<\/p>\n<p>There are statistically significant differences in value systems between students and AI models (ChatGPT, Gemini, DeepSeek).<\/p>\n<p>Independent-samples t tests indicated significant differences between human and AI mean scores across most values (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 5<\/a>). 
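The standardized mean differences used to gauge these human-AI gaps follow the usual pooled-SD form of Cohen's d; a minimal sketch with hypothetical scores (not the study data):

```python
from math import sqrt

# Cohen's d with pooled standard deviation for two independent samples.
def cohens_d(x, y):
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)   # sample variances
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    pooled_sd = sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / pooled_sd

# Hypothetical religious-value scores for students vs. repeated model runs.
students = [46, 44, 43, 45, 42, 47, 41, 44]
model_runs = [26, 24, 25, 23, 27]
d = cohens_d(students, model_runs)   # positive: students score higher
```

By convention, |d| around 0.2 is small, 0.5 medium, and 0.8 large, which is the scale behind the very large religious-domain gap reported below.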
The most salient results were:<\/p>\n<p>Religious value: students scored substantially higher than all three AI models.<\/p>\n<p>Theoretical value: AI models scored relatively higher than students, with differences varying by model.<\/p>\n<p>Social value: students demonstrated higher scores than certain AI models, with differences depending on the specific comparison.<\/p>\n<p>3.5.1 Effect sizes<\/p>\n<p>For human\u2013AI comparisons, standardized mean differences revealed very large effects in the religious domain (d\u202f=\u202f2.21), large effects in the theoretical domain (d\u202f=\u202f1.22), and medium effects in social values (d\u202f\u2248\u202f0.55). Effect sizes in economic and political domains were small to medium (d\u202f=\u202f0.25\u20130.35), while the aesthetic domain showed a large effect (d\u202f\u2248\u202f0.85). These effect size estimates provide a more informative interpretation of group differences than reliance on p-values alone.<\/p>\n<p>3.6 Hypothesis 4: ranking patterns<\/p>\n<p>Theoretical and social values are expected to rank highest among AI models, while religious and aesthetic values rank lowest. In contrast, religious and social values are expected to rank highest among university students, with aesthetic values occupying the lowest ranks. 
Economic and political values are expected to remain in the middle range for both groups, while theoretical values vary among students according to academic level.<\/p>\n<p>Ranking analyses were conducted based on mean scores for each value domain (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 6<\/a>; <a class=\"ArticleReference\" href=\"#fig1\" id=\"fig1a\" data-event=\"articleReference-a-fig1\">Figures 1<\/a>\u2013<a class=\"ArticleReference\" href=\"#fig4\" id=\"fig4a\" data-event=\"articleReference-a-fig4\">4<\/a>).<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/04\/fpsyg-17-1755145-g001.webp\" class=\"is-inside-mask\" alt=\"Bar chart comparing mean scores for six values: Religious at forty-three point sixty-one, Social at thirty-seven point seventy-five, Aesthetic at thirty-three point thirty-nine, Political at forty-one point forty-nine, Economic at forty point eighty-six, and Theoretical at forty-two point ninety-one.\" loading=\"lazy\"\/><\/p>\n<p>Ranking of the six value domains among university students overall.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/04\/fpsyg-17-1755145-g002.webp\" class=\"is-inside-mask\" alt=\"Grouped bar chart showing mean scores across six value types\u2014Religious, Social, Aesthetic, Political, Economic, Theoretical\u2014split by academic level (Bachelor's, Master's, PhD) and gender (Males top row, Females bottom row). Males and females at all academic levels tend to rate Religious values highest, with females scoring higher than males in Religious values at every academic level. Theoretical values increase with education level, particularly for PhD males. 
Aesthetic values are consistently the lowest across all groups.\" loading=\"lazy\"\/><\/p>\n<p>Ranking of the six value domains among university students by gender and academic level.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/04\/fpsyg-17-1755145-g003.webp\" class=\"is-inside-mask\" alt=\"Bar chart comparing mean values for theoretical, social, economic, political, aesthetic, and religious categories. Theoretical is highest at forty-five, followed by social at thirty-five, then political and aesthetic at thirty, economic at twenty-six, and religious at twenty-one.\" loading=\"lazy\"\/><\/p>\n<p>Ranking of the six value domains across large language models (LLMs) overall.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/www.europesays.com\/ai\/wp-content\/uploads\/2026\/04\/fpsyg-17-1755145-g004.webp\" class=\"is-inside-mask\" alt=\"Grouped bar chart comparing mean scores for six categories\u2014Theoretical, Social, Economic, Political, Aesthetic, and Religious\u2014across three LLMs: chatGPT, DeepSeek, and Gemini. chatGPT has the highest scores in all categories, especially Theoretical at fifty-four, while Gemini consistently has the lowest scores, with Religious being the lowest at eighteen. Data values are labeled on each bar.\" loading=\"lazy\"\/><\/p>\n<p>Ranking of the six value domains for each large language model (LLM) separately.<\/p>\n<p>AI models. Across all three LLMs, the theoretical value consistently occupied the highest rank, followed by social, aesthetic, and political values (with minor variation among models). Religious values consistently ranked lowest.<\/p>\n<p>University students. Religious values ranked highest for the majority of students, with the exception of some doctoral participants who prioritized theoretical values. 
Aesthetic values ranked lowest overall, while political and economic values occupied middle positions.<\/p>\n<p>Summary of rankings (<a class=\"ArticleReference\" href=\"#SM1\" id=\"SM1a\" data-event=\"articleReference-a-sm1\">Appendix 6<\/a>):<\/p>\n<p>AI models (e.g., ChatGPT): 1. Theoretical, 2. Social, 3. Aesthetic, 4. Political, 5. Economic, 6. Religious.<\/p>\n<p>Students: 1. Religious, 2. Theoretical, 3. Political, 4. Economic, 5. Social, 6. Aesthetic.<\/p>\n<p><a class=\"ArticleReference\" href=\"#fig1\" id=\"fig1a\" data-event=\"articleReference-a-fig1\">Figures 1<\/a>\u2013<a class=\"ArticleReference\" href=\"#fig4\" id=\"fig4a\" data-event=\"articleReference-a-fig4\">4<\/a> illustrate these ranking patterns for students and AI models overall, as well as disaggregated by gender, academic level, and model type.<\/p>\n<p>Overall, the distribution of values appears ordered as follows.<\/p>\n<p>Religious values occupied the highest position, followed by theoretical, political, economic, social, and finally aesthetic values, which ranked lowest.<\/p>\n<p>Across groups, female students tended to prioritize religious values, whereas male students scored higher on theoretical values. Differences across academic levels also emerged, with doctoral students placing greater emphasis on theoretical values, whereas undergraduates showed relatively greater emphasis on religious and aesthetic values.<\/p>\n<p><a class=\"ArticleReference\" href=\"#fig3\" id=\"fig3a\" data-event=\"articleReference-a-fig3\">Figure 3<\/a> illustrates the distribution of the values. Theoretical values consistently occupied the highest rank, followed by social, aesthetic, and political values, with minor variation across models. 
Economic values appeared in the mid-range, whereas religious values were consistently ranked lowest.<\/p>\n<p><a class=\"ArticleReference\" href=\"#fig4\" id=\"fig4a\" data-event=\"articleReference-a-fig4\">Figure 4<\/a> clearly indicates that ChatGPT placed theoretical and social values at the top, with religious values at the bottom. Gemini showed a similar pattern but with slightly higher emphasis on political values. DeepSeek prioritised theoretical values most strongly, followed by aesthetic and social values, while again ranking religious values last. These variations highlight both the consistency of certain patterns (e.g., dominance of theoretical values) and model-specific differences in secondary value preferences.<\/p>\n<p>Taken together, these findings provide the empirical basis for a broader interpretation of how value-related patterns differ between human participants and large language models. The following discussion situates these results within the literature on cultural values, AI alignment, and context-sensitive psychological interpretation.<\/p>\n<p>4 Discussion<\/p>\n<p>The findings reveal a clear value gap between large language models (LLMs) and university students, a gap that warrants close attention given the rapid advancement of AI and its growing influence across multiple domains of life. By systematically analysing the outputs of leading LLMs (ChatGPT, Gemini, DeepSeek) and comparing them with students\u2019 value system at different academic levels, this study provides novel insights into how \u201cvalues\u201d may manifest in AI systems and raises critical questions about their alignment with human values. 
The results also underscore the importance of demographic, cultural, and social factors in shaping value system among individuals and point to the need for embedding cultural filters into AI systems to enhance their compatibility with prevailing societal values.<\/p>\n<p>4.1 Human value differences<\/p>\n<p>The results indicated significant differences in theoretical and aesthetic values across academic levels: doctoral students scored higher on theoretical values, whereas undergraduates scored higher on aesthetic values. This finding suggests developmental shifts in students\u2019 value system as they progress through higher education, with greater emphasis on scientific inquiry and critical thinking at advanced stages. These results are consistent with previous research indicating that value rankings may change during university years (<a class=\"ArticleReference\" href=\"#ref8\" id=\"ref8a\" data-event=\"articleReference-a-ref8\">Bakr, 1975<\/a>; <a class=\"ArticleReference\" href=\"#ref12\" id=\"ref12a\" data-event=\"articleReference-a-ref12\">Cox, 1989<\/a>; <a class=\"ArticleReference\" href=\"#ref13\" id=\"ref13a\" data-event=\"articleReference-a-ref13\">Dobashi, 1976<\/a>).<\/p>\n<p>Gender differences also emerged, with males scoring higher on theoretical values and females scoring higher on religious values. 
These findings align with earlier studies highlighting the role of demographic and sociocultural factors\u2014such as gender and educational level\u2014in shaping value system (<a class=\"ArticleReference\" href=\"#ref5\" id=\"ref5a\" data-event=\"articleReference-a-ref5\">Al-Batsh and Abd al-Rahman, 1990<\/a>; <a class=\"ArticleReference\" href=\"#ref6\" id=\"ref6a\" data-event=\"articleReference-a-ref6\">Allen, 1981<\/a>; <a class=\"ArticleReference\" href=\"#ref3\" id=\"ref3a\" data-event=\"articleReference-a-ref3\">Abu al-Nil, 1985<\/a>; <a class=\"ArticleReference\" href=\"#ref56\" id=\"ref56a\" data-event=\"articleReference-a-ref56\">Zahran and Serry, 1985<\/a>; <a class=\"ArticleReference\" href=\"#ref1\" id=\"ref1a\" data-event=\"articleReference-a-ref1\">Abd al-Fattah, 1992<\/a>; <a class=\"ArticleReference\" href=\"#ref38\" id=\"ref38a\" data-event=\"articleReference-a-ref38\">McGuiness-Biewitt, 1985<\/a>). Specifically, prior research (<a class=\"ArticleReference\" href=\"#ref49\" id=\"ref49a\" data-event=\"articleReference-a-ref49\">Sufyan, 2002<\/a>; <a class=\"ArticleReference\" href=\"#ref7\" id=\"ref7a\" data-event=\"articleReference-a-ref7\">Al-Suwwad and Al-Azirjawi, 1987<\/a>; <a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>) consistently reported higher theoretical values among males, while political values tended not to show significant gender-related differences (<a class=\"ArticleReference\" href=\"#ref5\" id=\"ref5a\" data-event=\"articleReference-a-ref5\">Al-Batsh and Abd al-Rahman, 1990<\/a>; <a class=\"ArticleReference\" href=\"#ref6\" id=\"ref6a\" data-event=\"articleReference-a-ref6\">Allen, 1981<\/a>). 
Similarly, economic values were previously found to favor males (<a class=\"ArticleReference\" href=\"#ref7\" id=\"ref7a\" data-event=\"articleReference-a-ref7\">Al-Suwwad and Al-Azirjawi, 1987<\/a>).<\/p>\n<p>In contrast, the present study found no significant gender differences in social values, with male and female students scoring similarly. This result diverges from much of the earlier literature, which typically found higher social values among females (<a class=\"ArticleReference\" href=\"#ref56\" id=\"ref56a\" data-event=\"articleReference-a-ref56\">Zahran and Serry, 1985<\/a>). However, <a class=\"ArticleReference\" href=\"#ref3\" id=\"ref3a\" data-event=\"articleReference-a-ref3\">Abu Al-Nil (1985)<\/a> reported contrary findings, suggesting that local cultural factors may shape gender patterns in social values. In the Saudi context, recent societal transformations may have contributed to the convergence of male and female scores on this dimension.<\/p>\n<p>With regard to aesthetic values, the current findings showed higher scores among undergraduates compared with doctoral students. This may reflect age-related interests, as younger students tend to show greater concern with beauty and artistic appreciation, whereas doctoral students are more heavily engaged in research and professional pursuits. 
Previous research often reported higher aesthetic values among females (<a class=\"ArticleReference\" href=\"#ref49\" id=\"ref49a\" data-event=\"articleReference-a-ref49\">Sufyan, 2002<\/a>; <a class=\"ArticleReference\" href=\"#ref7\" id=\"ref7a\" data-event=\"articleReference-a-ref7\">Al-Suwwad and Al-Azirjawi, 1987<\/a>; <a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>; <a class=\"ArticleReference\" href=\"#ref38\" id=\"ref38a\" data-event=\"articleReference-a-ref38\">McGuiness-Biewitt, 1985<\/a>), though some studies (<a class=\"ArticleReference\" href=\"#ref6\" id=\"ref6a\" data-event=\"articleReference-a-ref6\">Allen, 1981<\/a>) failed to detect gender effects.<\/p>\n<p>Finally, religious values showed gender-based differences in favor of females in the present study. This result is consistent with some previous studies (<a class=\"ArticleReference\" href=\"#ref25\" id=\"ref25a\" data-event=\"articleReference-a-ref25\">Hunt, 1980<\/a>; <a class=\"ArticleReference\" href=\"#ref13\" id=\"ref13a\" data-event=\"articleReference-a-ref13\">Dobashi, 1976<\/a>), though others reported the opposite pattern, with higher religious values among males (<a class=\"ArticleReference\" href=\"#ref1\" id=\"ref1a\" data-event=\"articleReference-a-ref1\">Abd al-Fattah, 1992<\/a>; <a class=\"ArticleReference\" href=\"#ref7\" id=\"ref7a\" data-event=\"articleReference-a-ref7\">Al-Suwwad and Al-Azirjawi, 1987<\/a>; <a class=\"ArticleReference\" href=\"#ref46\" id=\"ref46a\" data-event=\"articleReference-a-ref46\">Sab\u1e25\u0101n, 1975<\/a>). Such inconsistencies across studies likely reflect differences in religious traditions, societal contexts, and historical periods.<\/p>\n<p>4.2 Differences across LLMs<\/p>\n<p>The results revealed statistically significant differences among LLMs in theoretical, social, and aesthetic values, favouring ChatGPT over Gemini and DeepSeek. 
This finding suggests that AI \u201cvalues\u201d are essentially reflections of the training data and the algorithms employed, and that variation in both the nature of the data and model architectures contributes to observable differences in outputs. As <a class=\"ArticleReference\" href=\"#ref9\" id=\"ref9a\" data-event=\"articleReference-a-ref9\">Bender et al. (2021)<\/a> emphasised, LLMs operate on vast corpora of text using machine learning algorithms, without possessing self-awareness or lived experience. Although some models are developed in Western contexts (e.g., ChatGPT, Gemini) and others in Eastern contexts (e.g., DeepSeek), these results do not necessarily reflect cultural differences in training data, given the opacity of data sources. Rather, they may point to differences in design, optimisation, and filtering processes. The relatively stronger alignment of ChatGPT with human-like value structures may suggest a comparative advantage in \u201cvalue balance,\u201d though this remains an artefact of training rather than evidence of genuine value endorsement.<\/p>\n<p>4.3 Human\u2013AI divergence<\/p>\n<p>Comparisons between human and AI groups demonstrated substantive differences in most value domains. The most striking was the religious value, which was consistently higher among students than in all three LLMs. By contrast, theoretical values appeared higher in AI outputs compared with student means, albeit with some variation across models. Social values also showed differences, with students scoring higher in certain subgroups than some AI models. These findings highlight a fundamental divergence between human and AI value systems, reflecting differences in the sources of formation\u2014cultural socialization and lived experience for humans, versus statistical learning from large-scale text corpora for AI. 
Such divergences underscore the importance of critically examining the implicit priorities embedded in LLM outputs, particularly when these systems are used in sensitive psychological, educational, or social contexts (<a class=\"ArticleReference\" href=\"#ref20\" id=\"ref20a\" data-event=\"articleReference-a-ref20\">Haase and Hanel, 2023<\/a>; <a class=\"ArticleReference\" href=\"#ref9002\" id=\"ref9002a\" data-event=\"articleReference-a-ref9002\">Flint et al., 2022<\/a>; <a class=\"ArticleReference\" href=\"#ref10\" id=\"ref10a\" data-event=\"articleReference-a-ref10\">Bodro\u017ea et al., 2023<\/a>).<\/p>\n<p>Recent literature further supports interpreting human\u2013AI differences through agency, evaluation, and contextual responsiveness rather than task accuracy alone. Generative AI can expand exploratory thinking and cognitive diversity, yet its outputs remain conditioned by human design choices, governance structures, and deployment contexts (<a class=\"ArticleReference\" href=\"#ref30\" id=\"ref30a\" data-event=\"articleReference-a-ref30\">Krakowski, 2025<\/a>). In higher education, AI may approximate human evaluative judgments, but it often moderates extreme scores and remains sensitive to task framing and response structure (<a class=\"ArticleReference\" href=\"#ref17\" id=\"ref17a\" data-event=\"articleReference-a-ref17\">Flod\u00e9n, 2025<\/a>). At the user level, AI decisions are frequently perceived as less fair and less comprehensible than human decisions unless explanatory mechanisms are provided (<a class=\"ArticleReference\" href=\"#ref47\" id=\"ref47a\" data-event=\"articleReference-a-ref47\">Shulner-Tal et al., 2025<\/a>). 
Within education and science, these patterns highlight the importance of culturally responsive, ethically guided, and context-aware AI implementation (<a class=\"ArticleReference\" href=\"#ref14\" id=\"ref14a\" data-event=\"articleReference-a-ref14\">Dzogovic et al., 2024<\/a>).<\/p>\n<p>4.4 Value ranking analysis<\/p>\n<p>The results showed that AI models consistently prioritized theoretical values, followed by social, aesthetic, or political values, while religious values ranked lowest. In contrast, religious values occupied the top rank among most students, with aesthetic values consistently ranking last and political and economic values positioned in the middle. This divergence reinforces the notion that values are not merely abstract beliefs but integral to personal identity and culture (<a class=\"ArticleReference\" href=\"#ref56\" id=\"ref56a\" data-event=\"articleReference-a-ref56\">Zahran and Serry, 1985<\/a>). By contrast, AI models appear to reflect orientations closer to those dominant in scientific and technological domains, particularly within Western contexts.<\/p>\n<p>4.5 The problem of \u201cvalues\u201d in AI<\/p>\n<p>A central issue raised by this study is whether the term \u201cvalues\u201d can legitimately be applied to AI outputs. LLMs lack self-awareness and lived experience, yet their textual outputs reveal structured patterns in how they rank and prioritize concepts aligned with <a class=\"ArticleReference\" href=\"#ref48\" id=\"ref48a\" data-event=\"articleReference-a-ref48\">Spranger\u2019s (1928)<\/a> six value domains. Following <a class=\"ArticleReference\" href=\"#ref9003\" id=\"ref9003a\" data-event=\"articleReference-a-ref9003\">Floridi and Chiriatti (2020)<\/a>, the behavioral outputs of AI systems provide a meaningful basis for value analysis, even if such rankings are not consciously held. 
An ongoing debate remains as to whether AI should embody human values or maintain neutrality, raising profound questions about AI ethics and developer responsibility (<a class=\"ArticleReference\" href=\"#ref51\" id=\"ref51a\" data-event=\"articleReference-a-ref51\">Taddeo and Floridi, 2018<\/a>).<\/p>\n<p>In line with critiques that describe LLMs as \u201cstochastic parrots\u201d that reproduce statistical patterns from their training data (<a class=\"ArticleReference\" href=\"#ref9\" id=\"ref9a\" data-event=\"articleReference-a-ref9\">Bender et al., 2021<\/a>), we interpret the observed value rankings as output-level regularities shaped by pretraining corpora and post-training alignment, not as internally held convictions. Accordingly, our claims are confined to what the models express under a controlled psychometric prompt, and we avoid attributing intentionality, lived experience, or moral agency to the systems.<\/p>\n<p>4.6 Cultural variables<\/p>\n<p>The findings also highlight the significance of demographic, cultural, and social variables in shaping human value hierarchies. Results are consistent with prior research documenting the influence of gender and educational level on value system (<a class=\"ArticleReference\" href=\"#ref23\" id=\"ref23a\" data-event=\"articleReference-a-ref23\">Hanaa, 1959<\/a>; <a class=\"ArticleReference\" href=\"#ref4\" id=\"ref4a\" data-event=\"articleReference-a-ref4\">Al-\u2018Umari and Nashwan, 1985<\/a>; <a class=\"ArticleReference\" href=\"#ref8\" id=\"ref8a\" data-event=\"articleReference-a-ref8\">Bakr, 1975<\/a>; <a class=\"ArticleReference\" href=\"#ref12\" id=\"ref12a\" data-event=\"articleReference-a-ref12\">Cox, 1989<\/a>; <a class=\"ArticleReference\" href=\"#ref13\" id=\"ref13a\" data-event=\"articleReference-a-ref13\">Dobashi, 1976<\/a>; <a class=\"ArticleReference\" href=\"#ref49\" id=\"ref49a\" data-event=\"articleReference-a-ref49\">Sufyan, 2002<\/a>). 
The academic climate itself appears to reinforce certain value domains, as suggested by earlier studies (<a class=\"ArticleReference\" href=\"#ref4\" id=\"ref4a\" data-event=\"articleReference-a-ref4\">Al-\u2018Umari and Nashwan, 1985<\/a>; <a class=\"ArticleReference\" href=\"#ref11\" id=\"ref11a\" data-event=\"articleReference-a-ref11\">Cantrell, 1976<\/a>; <a class=\"ArticleReference\" href=\"#ref38\" id=\"ref38a\" data-event=\"articleReference-a-ref38\">McGuiness-Biewitt, 1985<\/a>; <a class=\"ArticleReference\" href=\"#ref9004\" id=\"ref9004a\" data-event=\"articleReference-a-ref9004\">Malkosh, 1996<\/a>).<\/p>\n<p>4.7 Need for cultural filters<\/p>\n<p>This study calls for embedding \u201ccultural filters\u201d into AI systems to ensure greater alignment with the values of diverse user populations, especially in societies with strong religious and cultural identities (<a class=\"ArticleReference\" href=\"#ref33\" id=\"ref33a\" data-event=\"articleReference-a-ref33\">Li et al., 2022<\/a>). Such an approach reflects growing recognition of cultural pluralism in AI ethics and the need to avoid imposing homogenised \u201cuniversal\u201d values. Instead, ethical AI development should account for contextual diversity and local priorities to enhance both acceptance and fairness.<\/p>\n<p>Practical implications in sensitive domains. In applied settings such as psychological counseling, education, and legal guidance, value misalignment may translate into recommendations that inadvertently conflict with users\u2019 religious or cultural priorities (e.g., framing coping strategies, family obligations, or moral norms in ways that are culturally incongruent). 
This underscores the need for culturally informed guardrails, transparent disclosure of model limitations, and user education to prevent over-reliance on AI outputs in value-laden decisions.<\/p>\n<p>4.8 Alignment with hypotheses<\/p>\n<p>The results broadly supported the proposed hypotheses, revealing significant differences in values according to gender, academic level, and AI model type. Clear divergences were observed between humans and AI models, with religious values more salient among students and theoretical values dominating AI rankings. These findings reinforce the theoretical framework of Spranger\u2019s classification and provide empirical evidence for its applicability in both human and AI contexts.<\/p>\n<p>5 Conclusion<\/p>\n<p>The present study contributes to the growing interdisciplinary dialogue on artificial intelligence, human values, and culturally situated meaning by examining value-related output patterns in large language models alongside the value priorities of university students. The findings point to a notable divergence between the dominant theoretical orientation observed in AI-generated responses and the stronger religious priority expressed by the human sample. This contrast is not merely descriptive; rather, it highlights the importance of cultural, linguistic, and educational context in the interpretation of value-related outputs generated by contemporary AI systems.<\/p>\n<p>At a conceptual level, the study supports the cautious use of classical value frameworks, such as Spranger\u2019s typology, in the analysis of AI-generated content. At the same time, the findings underscore the importance of interpretive discipline. What is being observed in large language models is better understood as patterned output under constrained prompting conditions, not as evidence of internal beliefs, intentions, or human-like value structures. 
This distinction is especially important in order to avoid anthropomorphic overreach and to preserve conceptual clarity in emerging psychological and interdisciplinary research on AI.<\/p>\n<p>From an applied perspective, the results raise important considerations for educational, counseling, and other high-stakes contexts in which value-sensitive judgments matter. If AI systems are increasingly used to support reflection, communication, or decision-making in culturally grounded settings, then misalignment between model output patterns and human value priorities may have meaningful practical consequences. These findings therefore reinforce the need for culturally sensitive alignment practices, transparent prompt design, and more context-aware evaluation strategies.<\/p>\n<p>At the same time, the study should be interpreted within its methodological boundaries, particularly the restricted human sampling frame and the model-specific nature of AI outputs. Further work should test the robustness and transferability of the observed patterns across broader samples, languages, and model families.<\/p>\n<p>In view of these findings and their theoretical as well as practical implications, it is useful to move from interpretation to action-oriented guidance. The following recommendations therefore outline implications for users, researchers, and developers working with AI systems in culturally sensitive contexts.<\/p>\n<p>5.1 Recommendations<\/p>\n<p>For users of artificial intelligence systems, the findings of the present study highlight the importance of approaching AI-generated content with informed critical awareness, particularly in domains where cultural values, ethical judgments, and normative perspectives play a central role. While large language models can provide sophisticated and contextually coherent responses, their outputs may reflect patterns derived from training data and alignment procedures rather than culturally grounded human priorities. 
Users in educational, advisory, or decision-support contexts should therefore remain attentive to the interpretive limits of AI-generated responses and treat them as informational resources rather than authoritative value judgments.<\/p>\n<p>For researchers, the results underscore the need for more systematic and cross-contextual investigations into value-related output patterns in artificial intelligence systems. Priority should be given to cross-cultural replication, multilingual validation, and robustness checks that examine whether the observed rankings remain stable under controlled variations in model and prompt conditions. Expanding methodological approaches that integrate psychological theory, cultural analysis, and computational evaluation will be especially important for advancing this emerging line of interdisciplinary inquiry.<\/p>\n<p>For developers and designers of AI systems, the findings suggest the importance of integrating culturally sensitive alignment strategies and evaluation frameworks during model development and deployment. As AI technologies increasingly interact with users in socially meaningful and culturally diverse contexts, greater attention should be devoted to how training data, alignment processes, and prompting structures shape value-related output patterns. Designing AI systems that acknowledge cultural diversity, enhance transparency in response generation, and allow for context-aware adaptation may help mitigate potential mismatches between model outputs and human value expectations.<\/p>\n<p>StatementsData availability statement<\/p>\n<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.<\/p>\n<p>Ethics statement<\/p>\n<p>Ethical approval was not required for the study involving humans in accordance with the local legislation and institutional requirements. 
Written informed consent to participate in this study was not required from the participants in accordance with the national legislation and the institutional requirements.<\/p>\n<p>Author contributions<\/p>\n<p>NS: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Visualization, Writing \u2013 original draft, Writing \u2013 review &amp; editing. SA: Methodology, Resources, Validation, Visualization, Writing \u2013 review &amp; editing. AT: Methodology, Resources, Validation, Visualization, Writing \u2013 review &amp; editing. AdA: Methodology, Validation, Visualization, Writing \u2013 review &amp; editing. AhA: Methodology, Resources, Validation, Visualization, Writing \u2013 review &amp; editing. NE: Methodology, Resources, Validation, Visualization, Writing \u2013 review &amp; editing. MK: Methodology, Resources, Validation, Visualization, Writing \u2013 review &amp; editing.<\/p>\n<p>Funding<\/p>\n<p>The author(s) declared that financial support was received for this work and\/or its publication. The authors express their appreciation to the Deanship of Graduate Studies and Scientific Research at King Khalid University for funding this work through the Large Group Project (Grant No. RGP.2\/671\/46, Academic Year 1447).<\/p>\n<p>Acknowledgments<\/p>\n<p>The authors extend their appreciation to King Khalid University for supporting this project.<\/p>\n<p>Conflict of interest<\/p>\n<p>The author(s) declared that this work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.<\/p>\n<p>Generative AI statement<\/p>\n<p>The author(s) declared that Generative AI was used in the creation of this manuscript. Artificial intelligence tools were used only for limited translation assistance and minor linguistic rephrasing during manuscript preparation. 
All scientific content, study design, data analysis, interpretation, and final manuscript approval remained the sole responsibility of the authors.<\/p>\n<p>Any alternative text (alt text) provided alongside figures in this article has been generated by Frontiers with the support of artificial intelligence and reasonable efforts have been made to ensure accuracy, including review by the authors wherever possible. If you identify any issues, please contact us.<\/p>\n<p>Publisher\u2019s note<\/p>\n<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.<\/p>\n<p>References<\/p>\n<p class=\"notranslate\">Abd al-Fattah, Y. (1992). Dynamics of the relationship between parental care as perceived by children, their adjustment, and values [in Arabic]. J. Psychol. 24, 24\u201340.<\/p>\n<p class=\"notranslate\">Abu al-Nil, M. S. (1985). Psychosomatic Disorders [in Arabic]. Cairo: Al-Khaniji Library.<\/p>\n<p class=\"notranslate\">Al-\u2018Umari, K., and Nashwan, A. M. (1985). The value system of Yarmouk University students: a study of the legal correlations of influencing factors [in Arabic]. Yarmouk University Res. 1, 143\u2013158.<\/p>\n<p class=\"notranslate\">Al-Batsh, M. W., and Abd al-Rahman, H. (1990). The value structure among students of the University of Jordan [in Arabic]. Dirasat Hum. Soc. Sci. 17, 92\u2013136.<\/p>\n<p class=\"notranslate\">Allen, M. P. (1981). A Study of High-School Educational Values: Implications for Counselors (Doctoral Dissertation). Ann Arbor, Michigan: Louisiana State University.<\/p>\n<p class=\"notranslate\">Allport, G. W., and Vernon, P. E. (1931). A Study of Values. Boston, MA: Houghton Mifflin.<\/p>\n<p class=\"notranslate\">Al-Suwwad, A., and Al-Azirjawi, M. F. (1987). 
The value system among students at the University of Mosul [in Arabic]. Al-Mustansiriya J. 15, 591\u2013608.<\/p>\n<p class=\"notranslate\">Bakr, M. I. (1975). A Comparative Study of Values Among University and Secondary Students (Unpublished Master\u2019s thesis). Baghdad: University of Baghdad. [in Arabic]<\/p>\n<p class=\"notranslate\">Bender, E. M., Gebru, T., McMillan-Major, A., and Shmitchell, S. (2021). On the dangers of stochastic parrots: can language models be too big? In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (pp. 610\u2013623). New York, NY, USA: ACM.<\/p>\n<p class=\"notranslate\">Bodro\u017ea, B., Bo\u0161kovi\u0107, M., and Joki\u0107, D. (2023). Chatting with GPT-3: exploring social psychology via AI-generated text. Comput. Human Behav. 139:107652. doi: 10.1016\/j.chb.2022.107652<\/p>\n<p class=\"notranslate\">Cantrell, D. D. (1976). Impact of University Department on Students\u2019 Values (Doctoral Dissertation). Ann Arbor, Michigan: Dissertation Abstracts International.<\/p>\n<p class=\"notranslate\">Cox, S. S. (1989). Values Change as an Index of Development: A Longitudinal Study Using the Study of Values (Doctoral Dissertation). Ann Arbor, Michigan: Berea College.<\/p>\n<p class=\"notranslate\">Dobashi, N. (1976). A Longitudinal Study of Student Values in a Japanese Liberal Arts College (Doctoral Dissertation). Ann Arbor, Michigan: Dissertation Abstracts International, 36, 6502-A.<\/p>\n<p class=\"notranslate\">Dzogovic, S. A., Zdravkovska-Adamova, B., and Serpil, H. (2024). From theory to practice: a holistic study of the application of artificial intelligence methods and techniques in higher education and science. Human Res. Rehab. 14, 293\u2013311. doi: 10.21554\/hrr.092406<\/p>\n<p class=\"notranslate\">Elyoseph, Z., Hadar-Shoval, D., Asraf, K., and Lvovsky, M. (2023). ChatGPT outperforms humans in emotional awareness evaluations. Front. Psychol. 14:1199058. doi: 10.3389\/fpsyg.2023.1199058<\/p>\n<p class=\"notranslate\">Flod\u00e9n, J. (2025). 
Grading exams using large language models: a comparison between human and AI grading of exams in higher education using ChatGPT. Br. Educ. Res. J. 51, 201\u2013224. doi: 10.1002\/berj.4069<\/p>\n<p class=\"notranslate\">Flint, S. W., Piotrkowicz, A., and Watts, K. (2022). Use of artificial intelligence to understand adults\u2019 thoughts and behaviours relating to COVID-19. Perspect. Public Health 142, 167\u2013174. doi: 10.1177\/1757913920979332<\/p>\n<p class=\"notranslate\">Floridi, L., and Chiriatti, M. (2020). GPT-3: Its nature, scope, limits, and consequences. Minds Mach. 30, 681\u2013694. doi: 10.1007\/s11023-020-09548-1<\/p>\n<p class=\"notranslate\">Guo, B., Zhang, X., Wang, Z., Jiang, M., Nie, J., Ding, Y., et al. (2023). How close is ChatGPT to human experts? Comparison corpus, evaluation, and detection. arXiv. Available online at: <a href=\"https:\/\/arxiv.org\/abs\/2301.07597\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/arxiv.org\/abs\/2301.07597<\/a> (Accessed June 15, 2025).<\/p>\n<p class=\"notranslate\">Haase, J., and Hanel, P. H. P. (2023). Artificial muses: generative artificial intelligence chatbots have risen to human-level creativity. J. Creativity 33:100066. doi: 10.1016\/j.yjoc.2023.100066<\/p>\n<p class=\"notranslate\">Hanaa, A. M. (1959). Educational and Vocational Guidance [in Arabic]. Cairo: Dar Al-Nahda Al-Misriyya.<\/p>\n<p class=\"notranslate\">Hunt, C. (1980). Terminal and Instrumental Values of Seniors in Public High Schools in Louisiana (Doctoral Dissertation). Ann Arbor, Michigan: Dissertation Abstracts International, 40.<\/p>\n<p class=\"notranslate\">Kaya, F., Aydin, F., Schepman, A., Rodway, P., Yeti\u015fensoy, O., and Demir Kaya, M. (2024). The roles of personality traits, AI anxiety, and demographic factors in attitudes toward AI. Int. J. Hum.-Comput. Interact. 40, 497\u2013514. doi: 10.1080\/10447318.2022.2151730<\/p>\n<p class=\"notranslate\">Krakowski, S. (2025). Human-AI agency in the age of generative AI. Inf. Organ. 35:100560. 
doi: 10.1016\/j.infoandorg.2025.100560<\/p>\n<p class=\"notranslate\">Li, J., Du, J., and Liu, S. (2022). Dark personalities in large language models. Pers. Individ. Differ. 185:111284. doi: 10.1016\/j.paid.2021.111284<\/p>\n<p class=\"notranslate\">Malkosh, R. (1996). The impact of a communication psychology course on social tendency within Adler\u2019s theory among a sample of University of Jordan students. Dirasat 23, 230\u2013237.<\/p>\n<p class=\"notranslate\">Mavrych, V., Yousef, E. M., Yaqinuddin, A., and Bolgova, O. (2025). Large language models in medical education: a comparative cross-platform evaluation in answering histological questions. Med. Educ. Online 30:2534065. doi: 10.1080\/10872981.2025.2534065<\/p>\n<p class=\"notranslate\">McGuiness-Biewitt, J. (1985). Career Values of College Students (Doctoral Dissertation). Ann Arbor, Michigan: Dissertation Abstracts International, 46, 1195-A.<\/p>\n<p class=\"notranslate\">Meyer, A., Riese, J., and Streichert, T. (2024). Comparison of the performance of GPT-3.5 and GPT-4 with that of medical students on the written German medical licensing examination: observational study. JMIR Med. Educ. 10:e50965. doi: 10.2196\/50965<\/p>\n<p class=\"notranslate\">Sab\u1e25\u0101n, R. (1975). Value Orientations among Students of the University of Baghdad (Master\u2019s thesis). Baghdad: University of Baghdad.<\/p>\n<p class=\"notranslate\">Shulner-Tal, A., Kuflik, T., Kliger, D., and Mancini, A. (2025). Who made that decision and why? Users\u2019 perceptions of human versus AI decision-making and the power of explainable-AI. Int. J. Hum.-Comput. Interact. 41, 4230\u20134247. doi: 10.1080\/10447318.2024.2348843<\/p>\n<p class=\"notranslate\">Spranger, E. (1928). Types of Men: The Psychology and Ethics of Personality. New York: Johnson Reprint.<\/p>\n<p class=\"notranslate\">Sufyan, N. S. (1995). Prevailing values among students of Sana\u2019a University (Taiz Branch) [Unpublished master\u2019s thesis]. College of Education, Al-Mustansiriyah University.<\/p>\n<p class=\"notranslate\">Sufyan, N. S. 
(2002). A cross-cultural study of values among Taiz and Baghdad university students. Egypt. J. Psychol. Stud. 12, 267\u2013295.<\/p>\n<p class=\"notranslate\">Sufyan, N. S., Fadhel, F. H., Alkhathami, S. S., and Mukhadi, J. Y. A. (2024). Artificial intelligence and social intelligence: preliminary comparison study between AI models and psychologists. Front. Psychol. 15:1353022. doi: 10.3389\/fpsyg.2024.1353022<\/p>\n<p class=\"notranslate\">Taddeo, M., and Floridi, L. (2018). Regulate AI to avert a social catastrophe. Nature 562, 438\u2013441. doi: 10.1038\/d41586-018-04602-6<\/p>\n<p class=\"notranslate\">Udias, A., Alonso-Ayuso, A., Alfaro, C., Algar, M. J., Cuesta, M., Fern\u00e1ndez-Isabel, A., et al. (2024). ChatGPT\u2019s performance in university admissions tests in mathematics. Int. Electron. J. Math. Educ. 19:em0795. doi: 10.29333\/iejme\/15517<\/p>\n<p class=\"notranslate\">Ye, H., Xie, Y., Ren, Y., Fang, H., Zhang, X., and Song, G. (2024). Measuring human and AI values using generative psychometrics. arXiv. Available online at: <a href=\"https:\/\/arxiv.org\/abs\/2409.12106\" target=\"_blank\" rel=\"nofollow noopener\">https:\/\/arxiv.org\/abs\/2409.12106<\/a> (Accessed June 15, 2025).<\/p>\n<p class=\"notranslate\">Zahran, H. A. (1984). Social Psychology (5th ed.) [in Arabic]. Cairo: Alam Al-Kutub.<\/p>\n<p class=\"notranslate\">Zahran, H. A., and Serry, J. M. (1985). Prevailing and desired values among youth in Egyptian and Saudi contexts. Proceedings of the First Psychology Conference. Cairo, Egypt: Egyptian Society for Psychological Studies.<\/p>\n<p>Summary<\/p>\n<p class=\"h5\">Keywords<\/p>\n<p>artificial intelligence, large language models, Spranger\u2019s classification, university students, value system<\/p>\n<p class=\"h5\">Citation<\/p>\n<p>Sufyan NS, Alshehri SM, Teleb AA, Abbady A, Awad Asiri AAA, Elamrousy NH and Khasawneh MAS (2026) Value systems of artificial intelligence and university students: theoretical dominance in large language models and religious priority in humans. Front. Psychol. 17:1755145. 
doi: <a class=\"Summary__doi notranslate\" target=\"_blank\" href=\"http:\/\/dx.doi.org\/10.3389\/fpsyg.2026.1755145\" data-event=\"articleSummary-a-doi\" rel=\"nofollow noopener\">10.3389\/fpsyg.2026.1755145<\/a><\/p>\n<p class=\"h5\">Received<\/p>\n<p>26 November 2025<\/p>\n<p class=\"h5\">Revised<\/p>\n<p>12 March 2026<\/p>\n<p class=\"h5\">Accepted<\/p>\n<p>17 March 2026<\/p>\n<p class=\"h5\">Published<\/p>\n<p>21 April 2026<\/p>\n<p class=\"h5\">Volume<\/p>\n<p>17 &#8211; 2026<\/p>\n<p class=\"h5\">Edited by<\/p>\n<p class=\"notranslate\"><a href=\"https:\/\/loop.frontiersin.org\/people\/1779370\/overview\" target=\"_blank\" rel=\"nofollow noopener\">Carlos Enrique George-Reyes<\/a>, Bolivarian University of Ecuador, Ecuador<\/p>\n<p class=\"h5\">Reviewed by<\/p>\n<p class=\"notranslate\"><a href=\"https:\/\/loop.frontiersin.org\/people\/2963136\/overview\" target=\"_blank\" rel=\"nofollow noopener\">Olga Tapalova<\/a>, Abai University, Kazakhstan<\/p>\n<p class=\"notranslate\"><a href=\"https:\/\/loop.frontiersin.org\/people\/3145243\/overview\" target=\"_blank\" rel=\"nofollow noopener\">Harun Serpil<\/a>, Anadolu Universitesi Yunus Emre Kampusu, T\u00fcrkiye<\/p>\n<p class=\"h5\"> Copyright <\/p>\n<p class=\"Summary__copyright__text\">\u00a9 2026 Sufyan, Alshehri, Teleb, Abbady, Awad Asiri, Elamrousy and Khasawneh. <\/p>\n<p>This is an open-access article distributed under the terms of the <a href=\"https:\/\/creativecommons.org\/licenses\/by\/4.0\/\" target=\"_blank\" rel=\"nofollow noopener\">Creative Commons Attribution License (CC BY)<\/a>. 
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.<\/p>\n<p class=\"notranslate correspondence\">*Correspondence: Nabil Saleh Sufyan, <a href=\"mailto:nsofian@kku.edu.sa\" class=\"email-link\" rel=\"nofollow noopener\" target=\"_blank\">nsofian@kku.edu.sa<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"Abstract The rapid advancement of artificial intelligence (AI), particularly large language models (LLMs), raises critical questions about 
the&hellip;\n","protected":false},"author":2,"featured_media":360,"comment_status":"","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[24,25,1642,8557,8556,8558],"class_list":{"0":"post-10036","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-ai","8":"tag-ai","9":"tag-artificial-intelligence","10":"tag-large-language-models","11":"tag-sprangers-classification","12":"tag-university-students","13":"tag-value-system"},"_links":{"self":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/10036","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/comments?post=10036"}],"version-history":[{"count":0,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/posts\/10036\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media\/360"}],"wp:attachment":[{"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/media?parent=10036"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/categories?post=10036"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.europesays.com\/ai\/wp-json\/wp\/v2\/tags?post=10036"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}