Data

This chapter describes the data sources and data structure used in this study. Data is organized across five layers, from the AACSB accreditation database (already secured) through curriculum documents to Phase 2 interview and survey data.


Overview: Data by Research Phase

Phase | Data Layer(s) | Source | Status
Phase 1 | Layers 1-3 | AACSB Excel + school websites + Open Syllabus | Layer 1 secured; Layers 2-3 to be collected
Phase 2a | Layer 4 | Instructor interviews (purposive sample) | Pending Phase 1 completion
Phase 2b | Layer 5 | Student surveys | Pending Phase 2a completion

Layer 1: AACSB Accreditation Database

Status: Secured

The AACSB Excel database (1,077 accredited schools, 69 countries) serves as the sampling frame for the entire study. It provides school-level metadata used both to select the sample and as QCA condition variables.

Country Distribution (Target Countries)

Country | AACSB Accredited | Target Sample
US | 556 | ~40 (maintains Hwang et al. comparability)
China | 54 | 20-25
Korea | 19 | 15-19 (near-census)
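
The sampling fractions implied by these targets can be checked with a short calculation. The sketch below uses only the counts listed for the three target countries; the dictionary and function names are illustrative and not part of the study's tooling.

```python
# Accredited counts and target ranges are copied from the country table;
# the names used here are illustrative assumptions.
accredited = {"US": 556, "China": 54, "Korea": 19}
target = {"US": (40, 40), "China": (20, 25), "Korea": (15, 19)}


def sampling_fraction(country):
    """Return the (low, high) share of accredited schools covered by the target sample."""
    low, high = target[country]
    n = accredited[country]
    return low / n, high / n


for country in accredited:
    low, high = sampling_fraction(country)
    print(f"{country}: {low:.0%} to {high:.0%} of accredited schools")
```

Under these figures the US sample covers roughly 7% of accredited schools, China 37% to 46%, and Korea 79% to 100%, which is why the Korean sample is described as near-census.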

Variables Available in Layer 1

Variable | Type | Role in Study
School name | ID | Case identification
Country | Categorical | QCA scope condition / decomposed into AI policy proxy
Public/Private | Binary | QCA condition variable
Student enrollment | Continuous | School size variable
Program level (UG/MBA/PhD) | Categorical | QCA condition variable
Program list | Text | “AI” keyword pre-screening
Delivery mode | Categorical | Descriptive statistics
Website URL | URL | Entry point for Layer 2 data collection
Accreditation tenure (years) | Continuous | AACSB maturity proxy for QCA

Key improvement over Hwang et al. (2025): AACSB metadata enables institution-characteristic-based QCA analysis that was impossible with the US News-based approach used in the prior study.


Layer 2: Course Catalogs and Program Pages

Status: Additional collection required

Layer 2 captures the "what" of AI curriculum integration: which themes are included and how AI appears in program learning outcomes.

Target | Source | Codeable Dimension
AI-related course titles and descriptions | University website course catalog | Curricular Themes (S/E/D/En/L)
Program learning outcomes | Program pages | CT Linkage (explicit or not)

Collection procedure:

  1. Access school websites via Layer 1 URL list
  2. Search for AI-related course listings (keywords: “artificial intelligence”, “AI”, “machine learning”, “ChatGPT”, etc.)
  3. Download course titles, descriptions, and program learning outcomes
  4. Organize by school and country
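
The keyword search in step 2 could be sketched as a simple filter over downloaded course records. This is a minimal illustration, not the study's collection tool: the record fields (`school`, `country`, `title`, `description`) and the sample rows are made up, and the pattern covers only the four English keywords quoted above (the source list ends in "etc.").

```python
import re

# Phrases are matched case-insensitively; the acronym is matched case-sensitively
# so that "ai" inside ordinary words or romanized text does not count as a hit.
PHRASES = re.compile(r"\b(artificial intelligence|machine learning|chatgpt)\b", re.IGNORECASE)
ACRONYM = re.compile(r"\bAI\b")


def is_ai_related(title, description=""):
    """Return True if the title or description contains one of the listed keywords."""
    text = f"{title} {description}"
    return bool(PHRASES.search(text) or ACRONYM.search(text))


def screen_courses(courses):
    """Keep only the course records that match a keyword."""
    return [c for c in courses if is_ai_related(c["title"], c.get("description", ""))]


# Made-up records for illustration only; not real catalog data.
sample = [
    {"school": "School A", "country": "US", "title": "AI for Managers", "description": ""},
    {"school": "School B", "country": "Korea", "title": "Financial Accounting",
     "description": "Maintaining ledgers."},
    {"school": "School C", "country": "China", "title": "Business Analytics",
     "description": "Intro to machine learning."},
]

for course in screen_courses(sample):
    print(course["country"], course["school"], course["title"])
```

Catalogs published in Chinese or Korean would presumably need source-language equivalents added to the pattern before this filter is useful there.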

Layer 3: Syllabi

Status: Additional collection required

Layer 3 captures the "how" of AI integration: the pedagogical approaches, assessment modes, and CT linkage evidence found in syllabus text.

Target | Source | Codeable Dimension
Learning objective verbs | Syllabi originals | CT Level (Bloom’s mapping)
Class activity descriptions | Syllabi originals | Pedagogical Approaches (C/S/B/P/L)
Assessment/assignment items | Syllabi originals | Assessment Modes (A/R/F/Q)
Explicit CT mentions | Syllabi originals | CT Linkage (Explicit/Implicit/Absent)
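
One way to keep coded syllabi consistent with these dimensions is to validate each record against the allowed code sets at entry time. The sketch below is an assumption about how such a record might look, not the study's actual coding worksheet: the class and field names are hypothetical, the letter codes are kept exactly as written above without expansion, and the CT Level (Bloom's mapping) dimension is left out because its levels are not listed here.

```python
from dataclasses import dataclass, field

# Code sets copied from the dimensions listed above.
PEDAGOGY_CODES = {"C", "S", "B", "P", "L"}
ASSESSMENT_CODES = {"A", "R", "F", "Q"}
CT_LINKAGE = {"Explicit", "Implicit", "Absent"}


@dataclass
class SyllabusCoding:
    """Hypothetical record for one coded syllabus."""
    school: str
    country: str
    ct_linkage: str
    pedagogy: set = field(default_factory=set)
    assessment: set = field(default_factory=set)

    def __post_init__(self):
        # Reject values outside the coding scheme so typos surface immediately.
        if self.ct_linkage not in CT_LINKAGE:
            raise ValueError(f"unknown CT linkage: {self.ct_linkage!r}")
        bad = (set(self.pedagogy) - PEDAGOGY_CODES) | (set(self.assessment) - ASSESSMENT_CODES)
        if bad:
            raise ValueError(f"unknown codes: {sorted(bad)}")


# Made-up example record.
record = SyllabusCoding("School A", "US", "Explicit", {"C", "P"}, {"A"})
print(record)
```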

Country-Specific Collection Approaches

Country | Primary Source | Expected Coverage
US | Open Syllabus (opensyllabus.org) | ~80% of target schools
China | University website + direct email requests | Limited public availability
Korea | University website + direct email requests | Moderate availability

Multilingual Processing Protocol

  1. Preserve originals in source language
  2. Parallel English translation (researcher + AI-assisted + cross-validation)
  3. Coding performed in original language (assign language-proficient coders per country)
  4. Translation used for secondary verification only

Layer 4: Instructor Interview Data (Phase 2a)

Status: Pending Phase 1 completion

Semi-structured interviews are conducted with instructors of AI-integrated courses, who are selected on the basis of the Phase 1 QCA results.

Sample Design

Criterion | Details
Selection basis | Phase 1 QCA pathways (typical cases 2-3 per path; deviant cases 2-3)
Country balance | ~5 per country (US 5, China 5, Korea 5)
Total target | 15-30 instructors

Data Collected

Domain | Data Type
AI integration decision-making | Audio recording + transcript
Actual teaching practice | Field notes + artifacts (slides, handouts) where available
CT facilitation strategies | Interview transcript + memo
Intended-Enacted gap reflection | Interview transcript
Assessment methods | Description + examples
Contextual factors | Interview transcript

Analysis Path

  • Thematic Analysis (Braun & Clarke, 2006) – 6-step procedure
  • Deductive-inductive hybrid coding against Phase 1 framework
  • Member checking for validation

Layer 5: Student Survey Data (Phase 2b)

Status: Pending Phase 2a completion

Online surveys are administered to students enrolled in courses taught by Phase 2a interviewees.

Sample Design

Item | Target
Per course | 30-50 students
Total estimated | 300-500 students across all three countries
Recruitment | Via Phase 2a instructors

Survey Sections

Section | Content | Measurement
A. AI usage experience | Frequency, type, tools used in class | Self-developed scale
B. CT self-perception | Perceived impact of AI use on CT | Adapted CT self-efficacy scale
C. CT skills (optional) | Indirect CT measurement | Watson-Glaser short form or CCTST subscale
D. Pedagogical experience | Teaching methods and assessment types experienced | Student version of Hwang et al. coding framework
E. Contextual awareness | AI policy, school support, cultural factor perceptions | Self-developed scale

Analysis Path

  • Descriptive statistics + cross-national comparison (ANOVA / Kruskal-Wallis)
  • Test CT perception differences by Phase 1 cluster membership
  • Triangulation with Phase 2a instructor interview data
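
For the cross-national comparison, the Kruskal-Wallis statistic can be computed from pooled ranks. The following is a minimal standard-library sketch of the textbook H statistic with tie correction, included only to make the test concrete; the function name and the scores are made up, and the actual analysis would more plausibly rely on an established statistics package.

```python
from collections import defaultdict


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H with average ranks for ties and the standard tie correction.

    Raises ZeroDivisionError if every pooled value is identical.
    """
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    # Collect the 1-based positions of each distinct value, then average them.
    positions = defaultdict(list)
    for i, v in enumerate(pooled, start=1):
        positions[v].append(i)
    rank = {v: sum(p) / len(p) for v, p in positions.items()}
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    h = 12 / (n * (n + 1)) * sum(
        sum(rank[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(len(p) ** 3 - len(p) for p in positions.values())
    return h / (1 - ties / (n ** 3 - n))


# Made-up scores for three country groups; illustration only.
scores = {"US": [1, 2, 3], "China": [4, 5, 6], "Korea": [7, 8, 9]}
print(round(kruskal_wallis_h(list(scores.values())), 3))  # 7.2
```

H is then referred to a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups (here, three countries).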

Data File Structure

docs/05_분석/
├── Data_Raw/
│   ├── layer1_aacsb_schools.xlsx          # Layer 1: AACSB database
│   ├── layer2_course_descriptions/         # Layer 2: Collected by country
│   │   ├── us_courses.csv
│   │   ├── china_courses.csv
│   │   └── korea_courses.csv
│   ├── layer3_syllabi/                     # Layer 3: Syllabus originals
│   │   ├── us/
│   │   ├── china/
│   │   └── korea/
│   └── qca_coded_data.csv                  # QCA coding worksheet
├── Qual/                                   # Phase 1 qualitative mapping outputs
│   └── comparative_mapping/
└── Quan/                                   # Phase 1 quantitative analysis outputs
    ├── stm/                                # STM results
    ├── network/                            # Network analysis results
    └── qca/                                # QCA results
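
The directory layout above could be created, or checked on a new machine, with a small helper. This is a sketch under the assumption that the tree is complete as shown; `LAYOUT` and `ensure_layout` are hypothetical names, and the helper only creates missing directories, never touching existing files.

```python
from pathlib import Path

# Directory names copied from the tree above, relative to the analysis root.
LAYOUT = [
    "Data_Raw/layer2_course_descriptions",
    "Data_Raw/layer3_syllabi/us",
    "Data_Raw/layer3_syllabi/china",
    "Data_Raw/layer3_syllabi/korea",
    "Qual/comparative_mapping",
    "Quan/stm",
    "Quan/network",
    "Quan/qca",
]


def ensure_layout(root):
    """Create any missing directories under root and return the ones created."""
    root = Path(root)
    created = []
    for rel in LAYOUT:
        path = root / rel
        if not path.exists():
            path.mkdir(parents=True)
            created.append(rel)
    return created
```

Calling `ensure_layout("docs/05_분석")` a second time returns an empty list, since every directory already exists.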

Data Governance

Item | Policy
Layers 1-3 (documents) | Public sources only; no personal data
Layer 4 (interviews) | IRB approval required; audio + consent forms; anonymized transcripts
Layer 5 (surveys) | IRB approval required; anonymous/de-identified; multilingual consent
Storage | Google Drive (private folder) + local backup
Access | Research team only; no external sharing without co-author agreement
Existing materials | docs/01_기존자료/ – NEVER delete or overwrite
Raw data | docs/05_분석/Data_Raw/ – NEVER delete or overwrite
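
The no-overwrite rule for raw data can also be enforced in scripts rather than left to convention. The sketch below shows one possible guard, assuming collection scripts write files through a single helper; `write_new_file` is a hypothetical name, not an existing project function.

```python
from pathlib import Path


def write_new_file(path, data):
    """Write bytes to path only if no file exists there yet."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Mode "xb" raises FileExistsError instead of truncating an existing file.
    with open(path, "xb") as handle:
        handle.write(data)
    return path
```

A second call with the same path raises `FileExistsError`, so an accidental re-run of a collection script leaves the original file in place.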