Data

This chapter describes the data sources and data structures used in this study. Data are organized in five layers: the AACSB accreditation database (already secured), course catalogs and program pages, syllabi, and the Phase 2 instructor interview and student survey data.

Overview: Data by Research Phase

Phase	Data Layer(s)	Source	Status
Phase 1	Layers 1-3	AACSB Excel + School websites + Open Syllabus	Layer 1 secured; Layers 2-3 to be collected
Phase 2a	Layer 4	Instructor interviews (purposive sample)	Pending Phase 1 completion
Phase 2b	Layer 5	Student surveys	Pending Phase 2a completion

Layer 1: AACSB Accreditation Database

Status: Secured

The AACSB Excel database (1,077 accredited schools, 69 countries) serves as the sampling frame for the entire study. It provides school-level metadata used both to select the sample and as QCA condition variables.

Country Distribution (Target Countries)

Country	AACSB Accredited	Target Sample
US	556	~40 (maintains Hwang et al. comparability)
China	54	20-25
Korea	19	15-19 (near-census)
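
If schools within each country are drawn at random (the selection rule is not specified above, so this is an assumption), the target sizes in the table can be reproduced with a seeded stratified draw. The sketch below uses a synthetic sampling frame, takes the upper bound of each target range as an illustrative quota, and introduces the hypothetical helper `draw_stratified_sample`; none of these names come from the study's own code.

```python
import random
from collections import defaultdict

# Illustrative quotas: upper bounds of the target ranges in the table above.
QUOTAS = {"US": 40, "China": 25, "Korea": 19}

def draw_stratified_sample(schools, quotas, seed=42):
    """Return up to quotas[country] schools per country, chosen at random."""
    rng = random.Random(seed)
    by_country = defaultdict(list)
    for school in schools:
        if school["country"] in quotas:
            by_country[school["country"]].append(school)
    sample = []
    for country, quota in quotas.items():
        pool = by_country[country]
        # Near-census case: take every school when the pool is no larger than the quota.
        k = min(quota, len(pool))
        sample.extend(rng.sample(pool, k))
    return sample

# Synthetic frame with the accredited counts from the table (school names are placeholders).
frame = [{"school": f"{country}-{i}", "country": country}
         for country, n in {"US": 556, "China": 54, "Korea": 19}.items()
         for i in range(n)]
sample = draw_stratified_sample(frame, QUOTAS)
```

Fixing the seed makes the draw repeatable, and the `min` guard lets the Korea stratum fall back to a census when the quota equals the pool size.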

Variables Available in Layer 1

Variable	Type	Role in Study
School name	ID	Case identification
Country	Categorical	QCA scope condition / decomposed into AI policy proxy
Public/Private	Binary	QCA condition variable
Student enrollment	Continuous	School size variable
Program level (UG/MBA/PhD)	Categorical	QCA condition variable
Program list	Text	“AI” keyword pre-screening
Delivery mode	Categorical	Descriptive statistics
Website URL	URL	Entry point for Layer 2 data collection
Accreditation tenure (years)	Continuous	AACSB Maturity proxy for QCA

Key improvement over Hwang et al. (2025): AACSB metadata enables QCA based on institutional characteristics, which was not possible with the US News-based approach used in the prior study.

Layer 2: Course Catalogs and Program Pages

Status: Additional collection required

Layer 2 provides the “what” of AI curriculum integration: which themes are included and how AI appears in program learning outcomes.

Target	Source	Codeable Dimension
AI-related course titles and descriptions	University website course catalog	Curricular Themes (S/E/D/En/L)
Program learning outcomes	Program pages	CT Linkage (explicit or not)

Collection procedure:

Access school websites via Layer 1 URL list
Search for AI-related course listings (keywords: “artificial intelligence”, “AI”, “machine learning”, “ChatGPT”, etc.)
Download course titles, descriptions, and program learning outcomes
Organize by school and country
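
The keyword search in the collection procedure could be applied to downloaded course titles and descriptions with a small matcher like the following. It is a minimal sketch: `AI_KEYWORDS` contains only the four terms named above (the open-ended "etc." is left for the research team to fill), and `matched_keywords` is a hypothetical helper name.

```python
import re

# Only the keywords named in the procedure; extend this list as needed.
AI_KEYWORDS = ["artificial intelligence", "AI", "machine learning", "ChatGPT"]

def matched_keywords(text, keywords=AI_KEYWORDS):
    """Return the keywords that occur in text as whole words or phrases."""
    hits = []
    for kw in keywords:
        # Word boundaries (\b) keep "AI" from matching inside words such as
        # "maintain"; all-caps acronyms are additionally matched case-sensitively.
        flags = 0 if kw.isupper() else re.IGNORECASE
        if re.search(rf"\b{re.escape(kw)}\b", text, flags):
            hits.append(kw)
    return hits

title = "AI and Machine Learning in Marketing"
hits = matched_keywords(title)
```

Returning the matched terms, not just a yes/no flag, keeps a record of why each course was screened in, which helps when the keyword list is revised later.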

Layer 3: Syllabi

Status: Additional collection required

Layer 3 provides the “how” of AI integration: the actual pedagogical approaches, assessment modes, and evidence of CT linkage in the syllabus text.

Target	Source	Codeable Dimension
Learning objective verbs	Syllabi originals	CT Level (Bloom’s mapping)
Class activity descriptions	Syllabi originals	Pedagogical Approaches (C/S/B/P/L)
Assessment/assignment items	Syllabi originals	Assessment Modes (A/R/F/Q)
Explicit CT mentions	Syllabi originals	CT Linkage (Explicit/Implicit/Absent)
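
One way to keep coding consistent across coders is to validate each coded syllabus against the code sets in the table. The following sketch assumes that a syllabus may carry several pedagogy and assessment codes but exactly one CT linkage value. The class `SyllabusCoding` and its field names are hypothetical, and the letter codes are kept as opaque labels because this chapter does not expand them. CT Level is left out because the Bloom's mapping is not specified here.

```python
from dataclasses import dataclass, field

# Code sets copied from the table above.
PEDAGOGY_CODES = {"C", "S", "B", "P", "L"}
ASSESSMENT_CODES = {"A", "R", "F", "Q"}
CT_LINKAGE_CODES = {"Explicit", "Implicit", "Absent"}

@dataclass
class SyllabusCoding:
    school: str
    country: str
    pedagogy: set = field(default_factory=set)
    assessment: set = field(default_factory=set)
    ct_linkage: str = "Absent"

    def __post_init__(self):
        # Reject any code that is not part of the agreed code sets.
        bad = (self.pedagogy - PEDAGOGY_CODES) | (self.assessment - ASSESSMENT_CODES)
        if bad:
            raise ValueError(f"unknown codes: {sorted(bad)}")
        if self.ct_linkage not in CT_LINKAGE_CODES:
            raise ValueError(f"unknown CT linkage: {self.ct_linkage!r}")

# Placeholder record, not real study data.
record = SyllabusCoding("Example University", "Korea", {"C", "P"}, {"R"}, "Implicit")
```

Failing at construction time means a mistyped code is caught when the record is entered, before it reaches the QCA coding worksheet.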

Country-Specific Collection Approaches

Country	Primary Source	Expected Coverage
US	Open Syllabus (opensyllabus.org)	~80% of target schools
China	University website + direct email requests	Limited public availability
Korea	University website + direct email requests	Moderate availability

Multilingual Processing Protocol

Preserve originals in source language
Parallel English translation (researcher + AI-assisted + cross-validation)
Coding performed in original language (assign language-proficient coders per country)
Translation used for secondary verification only

Layer 4: Instructor Interview Data (Phase 2a)

Status: Pending Phase 1 completion

Semi-structured interviews with instructors of AI-integrated courses selected through Phase 1 QCA results.

Sample Design

Criterion	Details
Selection basis	Phase 1 QCA pathways (2-3 typical cases per pathway, plus 2-3 deviant cases)
Country balance	~5 per country (US 5, China 5, Korea 5)
Total target	15-30 instructors

Data Collected

Domain	Data Type
AI integration decision-making	Audio recording + transcript
Actual teaching practice	Field notes + artifacts (slides, handouts) where available
CT facilitation strategies	Interview transcript + memo
Intended-Enacted gap reflection	Interview transcript
Assessment methods	Description + examples
Contextual factors	Interview transcript

Analysis Path

Thematic Analysis (Braun & Clarke, 2006) – 6-step procedure
Deductive-inductive hybrid coding against Phase 1 framework
Member checking for validation

Layer 5: Student Survey Data (Phase 2b)

Status: Pending Phase 2a completion

Online surveys administered to students enrolled in courses taught by Phase 2a interviewees.

Sample Design

Item	Target
Per course	30-50 students
Total estimated	300-500 students across all three countries
Recruitment	Via Phase 2a instructors

Survey Sections

Section	Content	Measurement
A. AI usage experience	Frequency, type, tools used in class	Self-developed scale
B. CT self-perception	Perceived impact of AI use on CT	Adapted CT self-efficacy scale
C. CT skills (optional)	Indirect CT measurement	Watson-Glaser short form or CCTST subscale
D. Pedagogical experience	Teaching methods and assessment types experienced	Student version of Hwang et al. coding framework
E. Contextual awareness	AI policy, school support, cultural factor perceptions	Self-developed scale

Analysis Path

Descriptive statistics + cross-national comparison (ANOVA / Kruskal-Wallis)
Test CT perception differences by Phase 1 cluster membership
Triangulation with Phase 2a instructor interview data
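
For ordinal survey items, the cross-national comparison listed above could use the Kruskal-Wallis test. The sketch below computes the tie-corrected H statistic in plain Python to show what the test measures. The function name `kruskal_wallis_h` and the scores are illustrative placeholders, not study data. In practice a statistics library (for example `scipy.stats.kruskal`) would also supply the p-value.

```python
from collections import Counter

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction for a list of samples.

    Undefined (division by zero) when every observation has the same value.
    """
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Average rank for each distinct value (ties share the mean of their ranks).
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    rank_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(t**3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n**3 - n))

# Made-up scores for three country groups, for illustration only.
scores = {"US": [1, 2, 3], "China": [4, 5, 6], "Korea": [7, 8, 9]}
h = kruskal_wallis_h(list(scores.values()))
```

Because H is computed from ranks, the test does not assume normally distributed responses, which is why it is paired with ANOVA as the nonparametric alternative.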

Data File Structure

docs/05_분석/
├── Data_Raw/
│   ├── layer1_aacsb_schools.xlsx          # Layer 1: AACSB database
│   ├── layer2_course_descriptions/         # Layer 2: Collected by country
│   │   ├── us_courses.csv
│   │   ├── china_courses.csv
│   │   └── korea_courses.csv
│   ├── layer3_syllabi/                     # Layer 3: Syllabus originals
│   │   ├── us/
│   │   ├── china/
│   │   └── korea/
│   └── qca_coded_data.csv                  # QCA coding worksheet
├── Qual/                                   # Phase 1 qualitative mapping outputs
│   └── comparative_mapping/
└── Quan/                                   # Phase 1 quantitative analysis outputs
    ├── stm/                                # STM results
    ├── network/                            # Network analysis results
    └── qca/                                # QCA results
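
The directory layout above could be created with a short script so that every team member starts from the same skeleton. This is a sketch under assumptions: the helper `create_skeleton` is hypothetical, only directories are created (data files are collected later), and the demonstration runs in a temporary directory, not the real `docs/05_분석/` folder.

```python
import tempfile
from pathlib import Path

# Directories taken from the tree above.
SUBDIRS = [
    "Data_Raw/layer2_course_descriptions",
    "Data_Raw/layer3_syllabi/us",
    "Data_Raw/layer3_syllabi/china",
    "Data_Raw/layer3_syllabi/korea",
    "Qual/comparative_mapping",
    "Quan/stm",
    "Quan/network",
    "Quan/qca",
]

def create_skeleton(root):
    """Create any missing directories under root; existing ones are left untouched."""
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    # Report every directory now present, relative to root.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())

with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(tmp)
```

Using `exist_ok=True` makes the script safe to rerun: it never removes or replaces anything that is already in place.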

Data Governance

Item	Policy
Layers 1-3 (documents)	Public sources only; no personal data
Layer 4 (interviews)	IRB approval required; audio + consent forms; anonymized transcripts
Layer 5 (surveys)	IRB approval required; anonymous/de-identified; multilingual consent
Storage	Google Drive (private folder) + local backup
Access	Research team only; no external sharing without co-author agreement
Existing materials	`docs/01_기존자료/` – NEVER delete or overwrite
Raw data	`docs/05_분석/Data_Raw/` – NEVER delete or overwrite
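
The "never delete or overwrite" rule for the two protected folders could be backed by a guard that analysis scripts call before writing output. The sketch below is one possible approach, not an existing project utility: `is_protected` and `safe_output_path` are hypothetical names, paths are assumed to be relative to the repository root, and the comparison is purely lexical (it does not resolve `..` or symlinks).

```python
from pathlib import Path

# Protected folders from the governance table, relative to the repository root.
PROTECTED = [Path("docs/01_기존자료"), Path("docs/05_분석/Data_Raw")]

def is_protected(path):
    """True if path is one of the protected folders or lies inside one."""
    p = Path(path)
    return any(p == root or root in p.parents for root in PROTECTED)

def safe_output_path(path):
    """Return path as a Path, or raise if it points into a protected folder."""
    if is_protected(path):
        raise PermissionError(f"refusing to write inside protected folder: {path}")
    return Path(path)
```

A script would wrap its output location, for example `safe_output_path("docs/05_분석/Quan/qca/results.csv")` (an illustrative file name), so that a mistyped path fails loudly before any raw data is touched.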