Data
This chapter describes the data sources and data structure used in this study. Data is organized across five layers, from the AACSB accreditation database (already secured) through curriculum documents to Phase 2 interview and survey data.
Overview: Data by Research Phase
| Phase | Data Layer(s) | Source | Status |
|---|---|---|---|
| Phase 1 | Layers 1-3 | AACSB Excel + School websites + Open Syllabus | Layer 1 secured; Layers 2-3 to be collected |
| Phase 2a | Layer 4 | Instructor interviews (purposive sample) | Pending Phase 1 completion |
| Phase 2b | Layer 5 | Student surveys | Pending Phase 2a completion |
Layer 1: AACSB Accreditation Database
Status: Secured
The AACSB Excel database (1,077 accredited schools, 69 countries) serves as the sampling frame for the entire study. It provides school-level metadata used both to select the sample and as QCA condition variables.
Country Distribution (Target Countries)
| Country | AACSB Accredited | Target Sample |
|---|---|---|
| US | 556 | ~40 (maintains Hwang et al. comparability) |
| China | 54 | 20-25 |
| Korea | 19 | 15-19 (near-census) |
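The per-country targets above can be read as a stratified draw from the Layer 1 frame. The following is a minimal sketch, not the study's actual selection procedure: it assumes simple random sampling within each country, takes the upper bound of each target range, and uses synthetic placeholder school names. The function name `draw_sample` and the `TARGETS` dictionary are hypothetical.

```python
import random

# Upper bounds of the target ranges in the table above (an assumption;
# the final per-country numbers may differ).
TARGETS = {"US": 40, "China": 25, "Korea": 19}


def draw_sample(schools, targets, seed=0):
    """Draw a simple random sample per country from a list of school records.

    `schools` is a list of dicts with at least "name" and "country" keys.
    If a country has no more schools than its target, every school is kept
    (a census of that stratum).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sample = []
    for country, n in targets.items():
        # Sort first so the result does not depend on input order.
        stratum = sorted(
            (s for s in schools if s["country"] == country),
            key=lambda s: s["name"],
        )
        sample.extend(rng.sample(stratum, min(n, len(stratum))))
    return sample


# Synthetic frame matching the accredited-school counts in the table.
frame = (
    [{"name": f"US-{i:03d}", "country": "US"} for i in range(556)]
    + [{"name": f"CN-{i:03d}", "country": "China"} for i in range(54)]
    + [{"name": f"KR-{i:03d}", "country": "Korea"} for i in range(19)]
)
sample = draw_sample(frame, TARGETS)
```

With these assumed targets, Korea is sampled as a full census (19 of 19), while the US and China strata are subsampled.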
Variables Available in Layer 1
| Variable | Type | Role in Study |
|---|---|---|
| School name | ID | Case identification |
| Country | Categorical | QCA scope condition; also decomposed into an AI policy proxy |
| Public/Private | Binary | QCA condition variable |
| Student enrollment | Continuous | School size variable |
| Program level (UG/MBA/PhD) | Categorical | QCA condition variable |
| Program list | Text | “AI” keyword pre-screening |
| Delivery mode | Categorical | Descriptive statistics |
| Website URL | URL | Entry point for Layer 2 data collection |
| Accreditation tenure (years) | Continuous | AACSB Maturity proxy for QCA |
Key improvement over Hwang et al. (2025): AACSB metadata enables QCA based on institutional characteristics, which was not possible with the US News-based approach used in the prior study.
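One way to carry the Layer 1 variables through the analysis is as a typed record per school. The sketch below is illustrative only: the column headers are assumed to match the variable names in the table, program levels are assumed to be slash-separated, and `SchoolRecord` and `parse_row` are hypothetical names. The actual AACSB spreadsheet layout may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchoolRecord:
    name: str                   # case identification
    country: str                # QCA scope condition
    is_public: bool             # Public/Private as a binary condition
    enrollment: int             # school size
    program_levels: tuple       # e.g. ("UG", "MBA", "PhD")
    program_list: str           # free text, used for keyword pre-screening
    delivery_mode: str          # descriptive statistics only
    website_url: str            # entry point for Layer 2 collection
    accreditation_years: float  # AACSB maturity proxy


def parse_row(row):
    """Convert one spreadsheet row (a dict of strings) into a SchoolRecord.

    The keys used here are assumed column headers, not confirmed ones.
    """
    return SchoolRecord(
        name=row["School name"].strip(),
        country=row["Country"].strip(),
        is_public=row["Public/Private"].strip().lower() == "public",
        enrollment=int(row["Student enrollment"].replace(",", "")),
        program_levels=tuple(
            p.strip() for p in row["Program level"].split("/") if p.strip()
        ),
        program_list=row["Program list"],
        delivery_mode=row["Delivery mode"].strip(),
        website_url=row["Website URL"].strip(),
        accreditation_years=float(row["Accreditation tenure (years)"]),
    )


# Placeholder row for illustration; not a real school.
example = parse_row({
    "School name": " Example University ",
    "Country": "Korea",
    "Public/Private": "Public",
    "Student enrollment": "12,500",
    "Program level": "UG/MBA/PhD",
    "Program list": "MBA; MS in Business Analytics",
    "Delivery mode": "In-person",
    "Website URL": "https://example.edu",
    "Accreditation tenure (years)": "14",
})
```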
Layer 2: Course Catalogs and Program Pages
Status: Additional collection required
Layer 2 provides the “what” of AI curriculum integration: which themes are included and how AI appears in program learning outcomes.
| Target | Source | Codeable Dimension |
|---|---|---|
| AI-related course titles and descriptions | University website course catalog | Curricular Themes (S/E/D/En/L) |
| Program learning outcomes | Program pages | CT Linkage (explicit or not) |
Collection procedure:
- Access school websites via Layer 1 URL list
- Search for AI-related course listings (keywords: “artificial intelligence”, “AI”, “machine learning”, “ChatGPT”, etc.)
- Download course titles, descriptions, and program learning outcomes
- Organize by school and country
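The keyword search and the grouping by school and country could be sketched as follows. This is a hedged illustration rather than the study's collection script: it covers only the four English keywords named above (Chinese and Korean equivalents would need to be added), the record fields `title`, `description`, `school`, and `country` are assumed, and the function names are hypothetical. “AI” is matched case-sensitively as a whole word so that words such as “Retail” do not trigger a hit.

```python
import re
from collections import defaultdict

# English keywords from the collection procedure; "AI" stays case-sensitive.
KEYWORDS = {
    "artificial intelligence": re.compile(r"\bartificial intelligence\b", re.IGNORECASE),
    "AI": re.compile(r"\bAI\b"),
    "machine learning": re.compile(r"\bmachine learning\b", re.IGNORECASE),
    "ChatGPT": re.compile(r"\bChatGPT\b", re.IGNORECASE),
}


def matched_keywords(text):
    """Return the keywords found in `text`, in the order defined above."""
    return [kw for kw, pattern in KEYWORDS.items() if pattern.search(text)]


def screen_courses(courses):
    """Keep AI-related courses and group them by (country, school)."""
    grouped = defaultdict(list)
    for course in courses:
        hits = matched_keywords(f'{course["title"]} {course["description"]}')
        if hits:
            grouped[(course["country"], course["school"])].append(
                {**course, "keywords": hits}
            )
    return dict(grouped)


# Placeholder records for illustration only.
courses = [
    {"country": "US", "school": "School A", "title": "AI for Managers",
     "description": "Uses ChatGPT in case work."},
    {"country": "US", "school": "School A", "title": "Retail Strategy",
     "description": "Maintaining supply chains."},
]
screened = screen_courses(courses)
```

Keyword hits are only a first filter; each retained course would still need manual review before coding.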
Layer 3: Syllabi
Status: Additional collection required
Layer 3 provides the “how” of AI integration: the actual pedagogical approaches, assessment modes, and CT linkage evidence found in syllabus text.
| Target | Source | Codeable Dimension |
|---|---|---|
| Learning objective verbs | Syllabi originals | CT Level (Bloom’s mapping) |
| Class activity descriptions | Syllabi originals | Pedagogical Approaches (C/S/B/P/L) |
| Assessment/assignment items | Syllabi originals | Assessment Modes (A/R/F/Q) |
| Explicit CT mentions | Syllabi originals | CT Linkage (Explicit/Implicit/Absent) |
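As an illustration of the first row, learning objective verbs can be mapped to Bloom levels with a simple lookup. The sketch below assumes the six-level revised taxonomy and uses a short placeholder verb list; it is not the study's codebook, and `bloom_level` is a hypothetical helper. It matches exact word forms only (no lemmatization), so it could at most pre-sort objectives for human coders.

```python
import re

# Placeholder verb lists per level (1 = remember ... 6 = create).
# The real codebook would define and validate these lists.
BLOOM_VERBS = {
    1: {"define", "list", "recall", "identify"},
    2: {"explain", "describe", "summarize", "discuss"},
    3: {"apply", "use", "implement", "demonstrate"},
    4: {"analyze", "compare", "differentiate", "examine"},
    5: {"evaluate", "critique", "justify", "assess"},
    6: {"create", "design", "develop", "formulate"},
}
VERB_TO_LEVEL = {
    verb: level for level, verbs in BLOOM_VERBS.items() for verb in verbs
}


def bloom_level(objective):
    """Return the highest Bloom level among known verbs, or None if none match."""
    words = re.findall(r"[a-z]+", objective.lower())
    levels = [VERB_TO_LEVEL[w] for w in words if w in VERB_TO_LEVEL]
    return max(levels) if levels else None


high = bloom_level("Critique AI-generated market analyses and justify revisions.")
low = bloom_level("Describe common applications of generative AI.")
none = bloom_level("Attend weekly sessions.")
```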
Country-Specific Collection Approaches
| Country | Primary Source | Expected Coverage |
|---|---|---|
| US | Open Syllabus (opensyllabus.org) | ~80% of target schools |
| China | University website + direct email requests | Limited public availability |
| Korea | University website + direct email requests | Moderate availability |
Multilingual Processing Protocol
- Preserve originals in source language
- Parallel English translation (researcher + AI-assisted + cross-validation)
- Coding performed in original language (assign language-proficient coders per country)
- Translation used for secondary verification only
Layer 4: Instructor Interview Data (Phase 2a)
Status: Pending Phase 1 completion
Semi-structured interviews with instructors of AI-integrated courses selected through Phase 1 QCA results.
Sample Design
| Criterion | Details |
|---|---|
| Selection basis | Phase 1 QCA pathways (2-3 typical cases per path; 2-3 deviant cases) |
| Country balance | ~5 per country (US 5, China 5, Korea 5) |
| Total target | 15-30 instructors |
Data Collected
| Domain | Data Type |
|---|---|
| AI integration decision-making | Audio recording + transcript |
| Actual teaching practice | Field notes + artifacts (slides, handouts) where available |
| CT facilitation strategies | Interview transcript + memo |
| Intended-Enacted gap reflection | Interview transcript |
| Assessment methods | Description + examples |
| Contextual factors | Interview transcript |
Analysis Path
- Thematic Analysis (Braun & Clarke, 2006), following its six-phase procedure
- Deductive-inductive hybrid coding against Phase 1 framework
- Member checking for validation
Layer 5: Student Survey Data (Phase 2b)
Status: Pending Phase 2a completion
Online surveys administered to students enrolled in courses taught by Phase 2a interviewees.
Sample Design
| Item | Target |
|---|---|
| Per course | 30-50 students |
| Total estimated | 300-500 students across all three countries |
| Recruitment | Via Phase 2a instructors |
Survey Sections
| Section | Content | Measurement |
|---|---|---|
| A. AI usage experience | Frequency, type, tools used in class | Self-developed scale |
| B. CT self-perception | Perceived impact of AI use on CT | Adapted CT self-efficacy scale |
| C. CT skills (optional) | Indirect CT measurement | Watson-Glaser short form or CCTST subscale |
| D. Pedagogical experience | Teaching methods and assessment types experienced | Student version of Hwang et al. coding framework |
| E. Contextual awareness | AI policy, school support, cultural factor perceptions | Self-developed scale |
Analysis Path
- Descriptive statistics + cross-national comparison (ANOVA / Kruskal-Wallis)
- Test CT perception differences by Phase 1 cluster membership
- Triangulation with Phase 2a instructor interview data
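For the cross-national comparison, the Kruskal-Wallis statistic can be computed directly from ranks. The sketch below is a plain-Python illustration of the standard tie-corrected H statistic, not the study's analysis code; in practice a statistics package would also supply the p-value. The function name `kruskal_wallis_h` is hypothetical. The tie correction matters for Likert-type survey items, which produce many tied values.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic.

    `groups` is a list of samples (lists of numbers), one per country
    or cluster. Under the null hypothesis H is approximately chi-square
    distributed with len(groups) - 1 degrees of freedom.
    """
    pooled = sorted(x for g in groups for x in g)
    n_total = len(pooled)

    # Assign each distinct value the average of the 1-based ranks it occupies.
    rank_of = {}
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_term - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    tie_sizes = Counter(pooled).values()
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n_total**3 - n_total)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction


# Toy scores for three groups (made-up numbers, not study data).
h_no_ties = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_ties = kruskal_wallis_h([[1, 2, 2], [2, 3, 3]])
```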
Data File Structure
docs/05_분석/
├── Data_Raw/
│   ├── layer1_aacsb_schools.xlsx       # Layer 1: AACSB database
│   ├── layer2_course_descriptions/     # Layer 2: Collected by country
│   │   ├── us_courses.csv
│   │   ├── china_courses.csv
│   │   └── korea_courses.csv
│   ├── layer3_syllabi/                 # Layer 3: Syllabus originals
│   │   ├── us/
│   │   ├── china/
│   │   └── korea/
│   └── qca_coded_data.csv              # QCA coding worksheet
├── Qual/                               # Phase 1 qualitative mapping outputs
│   └── comparative_mapping/
└── Quan/                               # Phase 1 quantitative analysis outputs
    ├── stm/                            # STM results
    ├── network/                        # Network analysis results
    └── qca/                            # QCA results
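The directory layout above can be created or checked with a short script. This is a sketch under assumptions: the `LAYOUT` list mirrors the folders in the tree, the helper names are hypothetical, and the demonstration runs in a temporary directory rather than in `docs/05_분석/`. Because `mkdir` is called with `exist_ok=True`, existing folders and their contents are left untouched.

```python
import tempfile
from pathlib import Path

# Folders from the tree above, relative to the analysis root.
LAYOUT = [
    "Data_Raw/layer2_course_descriptions",
    "Data_Raw/layer3_syllabi/us",
    "Data_Raw/layer3_syllabi/china",
    "Data_Raw/layer3_syllabi/korea",
    "Qual/comparative_mapping",
    "Quan/stm",
    "Quan/network",
    "Quan/qca",
]


def missing_dirs(root):
    """Return the expected folders that do not yet exist under `root`."""
    return [d for d in LAYOUT if not (Path(root) / d).is_dir()]


def create_layout(root):
    """Create any missing folders; never removes or replaces anything."""
    for d in LAYOUT:
        (Path(root) / d).mkdir(parents=True, exist_ok=True)


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_dirs(tmp)
    create_layout(tmp)
    after = missing_dirs(tmp)
```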
Data Governance
| Item | Policy |
|---|---|
| Layers 1-3 (documents) | Public sources only; no personal data |
| Layer 4 (interviews) | IRB approval required; audio + consent forms; anonymized transcripts |
| Layer 5 (surveys) | IRB approval required; anonymous/de-identified; multilingual consent |
| Storage | Google Drive (private folder) + local backup |
| Access | Research team only; no external sharing without co-author agreement |
| Existing materials | docs/01_기존자료/ – NEVER delete or overwrite |
| Raw data | docs/05_분석/Data_Raw/ – NEVER delete or overwrite |
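The “never overwrite” rule for the two protected folders could be enforced in analysis scripts with a small write guard. The sketch below is one possible approach, not an existing project utility: `is_protected` and `safe_write_text` are hypothetical helpers, and paths are assumed to be given relative to the project root. It guards against overwriting only; protection against deletion would need separate measures such as file permissions or backups.

```python
from pathlib import Path

# Protected folders from the governance table (relative to the project root).
PROTECTED_DIRS = (
    Path("docs/01_기존자료"),
    Path("docs/05_분석/Data_Raw"),
)


def is_protected(path):
    """Return True if `path` is a protected folder or lies inside one."""
    p = Path(path)
    return any(d == p or d in p.parents for d in PROTECTED_DIRS)


def safe_write_text(path, text):
    """Write `text` to `path`, refusing to overwrite protected files.

    Inside protected folders the file is opened in exclusive-create
    mode ("x"), so an existing file raises FileExistsError instead of
    being replaced. Elsewhere, normal overwrite semantics apply.
    """
    p = Path(path)
    p.parent.mkdir(parents=True, exist_ok=True)
    mode = "x" if is_protected(p) else "w"
    with open(p, mode, encoding="utf-8") as handle:
        handle.write(text)
```

Under this scheme, new raw files can still be added to `Data_Raw/`, but a second write to the same file name fails loudly.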