Projects
Research Infrastructure & Experiment Systems
Demand-Driven Modularity Analysis Pipeline (UMich EECS)
PyTorch · Distributed Training · Activation Analysis · CKA/SVD Pipeline · Causal Ablation
A full-stack experimental framework for studying when and why modularity emerges in Transformers.
Key Components
- Multi-task fine-tuning system (Math vs Sentiment, Logic vs Emotion, etc.)
- Layer-wise representation logging for activations & hidden states
- ΔW/W gradient-flow tracking across all training steps
- Low-rank structure analyzer: SVD, CKA similarity curves, PCA curvature
- Causal intervention engine: singular-direction ablation vs neuron ablation
- Visualization suite for modularity trajectories & rank evolution
Impact
Supports the core findings behind “Demand-Driven Modularity” and enables reproducible, large-scale mechanistic experiments.
Activation-Subspace Rubric System for Reasoning Evaluation (Fudan NLP/IR)
Sparse Autoencoders · Dictionary Learning · Causal Steering · Representation Geometry
A toolkit for mapping activation subspaces of LLMs into interpretable rubrics for evaluating reasoning quality.
Features
- Dictionary learning & SAE-based latent concept extraction
- Discovery of “reasoning quality” activation directions
- Causal manipulation (activation injection / suppression)
- Per-step evaluation on LongWriter / HelloBench
- Cross-model generalization tests
Impact
Moves LLM evaluation from “outcome-based” assessment to process-aware interpretation.
LongWriter + HelloBench Data Generation & Evaluation Pipeline
LLM Generation · Long-Context Handling · JSON-Structured Evaluation · Cluster Execution
A reproducible system for generating and evaluating long-form outputs across 5 benchmark subsets.
Core Features
- Multi-GPU distributed inference
- Automatic long-context prompt assembly
- Structured scoring for factuality, creativity, reasoning chain quality
- Error detection for hallucination / formatting issues
- Batch monitoring & logging system
Impact
Produces high-quality long-form datasets for evaluating LLM reasoning and creativity.
Engineering & Applied Projects
HCI: MCP-Enhanced Smart Personal Assistant
LLM + Multi-Tool Orchestration + Personalized Workflows
(Screenshots & description extracted from uploaded report) :contentReference[oaicite:1]{index=1}
An intelligent assistant prototype integrating “fast path / slow path” routing, dual-channel memory, and Gaode Map MCP.
Core Features
- Smart task routing (local LLM vs multi-tool workflows)
- Personalized navigation using Gaode MCP
- Lifestyle automation via natural-language “shortcut commands”
Transportation Network Optimization After the Key Bridge Collapse
Python · OpenStreetMap · DBSCAN · Dijkstra
Urban mobility optimization pipeline analyzing real infrastructure failure.
Highlights
- 42% travel distance increase detected post-collapse
- Bus network redesigned via clustering
- 18% commute time reduction & $4.2M annual savings
Multi-Stage Manufacturing Optimization
Genetic Algorithms · Fuzzy Logic · Statistical Quality Control
An adaptive decision-support system under uncertain defect rates.
Highlights
- Hypothesis-test–driven inspection thresholds
- 18% profitability improvement
- Robust to ±2% defect-rate noise
CLIP Zero-Shot Vision & Text Classification
PyTorch · Contrastive Learning · Prompt Engineering
Explored CLIP’s generalization across images & NLP.
Key Contributions
- 63–80% accuracy on CIFAR/Food101/Pets
- Novel “CLIPText” zero-shot NLP classifier
- Improved AGNews from 18% → 43%
- Robustness tested on stylized anime images