Software Engineer · ML Engineer · AI Engineer · M.S. @ Pitt

JOHNSON

JAO

5+ years engineering production systems at scale — from distributed data pipelines and ML models to high-concurrency backend services.

Python · PyTorch · TensorFlow · Scikit-learn · PySpark · SQL · BigQuery · C++
AWS · Docker · Kubernetes · Kubeflow · Airflow · CI/CD · KQL · Linux

About

Engineering ML systems
from prototype to production.

I'm Tsung-Han (Johnson) Jao, currently pursuing my M.S. at the University of Pittsburgh with a perfect 4.0 GPA. I have 5+ years of industry experience building scalable backend systems and production ML pipelines across cybersecurity and e-commerce.

At Trend Micro, I architected distributed data pipelines processing billions of events and built high-concurrency scoring services. At Tagtoo, I developed deep learning models and recommendation systems serving 10M+ users. I bridge the gap between software engineering and machine learning.

5+
Years Experience
10M+
Users Impacted
50%
Conversion Lift
4.0
GPA at Pitt
Cybersecurity
Threat detection, anomaly detection, phishing classification
E-commerce
RecSys, identity resolution, conversion optimization
Backend & Infra
Distributed systems, high-concurrency services, system design
MLOps
Kubeflow, Airflow, Docker, CI/CD, model monitoring

Where ideas become code.

At Work

Experience

Where I've
built & shipped.

Nov 2022 — Jul 2025

Cloud Development Engineer

Data Science
Trend Micro
Taipei, Taiwan

Architected distributed pipelines with PySpark & Airflow processing billions of security events in TB-scale Parquet on AWS S3.

Developed 10 detection filters in one quarter via statistical & sequence analysis on ADX, achieving a 10× increase in alert volume with a <5% false positive rate.

Designed discriminative features for a deep learning phishing detection pipeline from email content.

Mentored interns who raised test coverage from 5% to 100% and built CI/CD pipelines.

Sep 2019 — Nov 2022

Machine Learning Engineer

Tagtoo
Taipei, Taiwan

Built deep learning models (CNN, DNN, Autoencoder) for identity resolution, unifying 10M+ user profiles across devices.

Developed SVD-based recommendation systems that achieved a 50% conversion rate lift, validated through A/B testing.

Orchestrated end-to-end model lifecycle via Docker & Kubeflow — weekly retraining, daily inference, drift monitoring.

Refactored legacy scripts, reducing report generation from 3 days to under 2 hours across 500+ e-commerce sites.

LEARN

Education

Where I
grew roots.

Current
Aug 2025 — Jan 2027

M.S. Telecommunications

University of Pittsburgh
Pittsburgh, PA

GPA 4.0/4.0. TA for Mathematical Foundations of ML. Coursework in AI, Algorithm Design, Cloud Computing, Network Security.

Sep 2016 — Apr 2019

M.S. Computer Science & Information Engineering

National Taiwan University of Science and Technology
Taipei, Taiwan

Thesis: Deep hashing neural network for content-based image retrieval — 4.96% mAP increase on CIFAR-10/100.

Sep 2012 — Jun 2016

B.S. Computer Science & Engineering

Yuan Ze University
Taoyuan, Taiwan

Foundation in algorithms, data structures, and systems programming.

Never stop learning.

Pittsburgh, PA

Selected Work

Projects that
drove results.

01

AI Agent Office

In Progress

A multi-agent system where a CEO agent autonomously recruits and orchestrates specialized sub-agents to collaboratively solve user problems — a virtual office powered by AI.

Agentic AI · Multi-Agent · LLM · GitHub →
02

Gmail Job App Tracker

Full-stack application that integrates Gmail API with LLM-powered parsing to automatically track and organize job applications. Built with a Python backend, frontend UI, and database layer.

LLM · Gmail API · Full-Stack · Python · GitHub →
03

Phishing Detection Pipeline

Designed discriminative feature sets from email content for a deep learning classification model, contributing to a 10× increase in detected threats with a false positive rate under 5%.

Deep Learning · Feature Engineering · Cybersecurity
04

Identity Resolution System

Built CNN/DNN/Autoencoder models to consolidate fragmented browser & device footprints into unified profiles for 10M+ users, enabling precision-targeted advertising.

TensorFlow · PyTorch · BigQuery
05

SVD Recommendation Engine

Developed a matrix factorization-based recommendation system with tuned latent factors, achieving a 50% conversion rate increase validated through A/B testing.

RecSys · SVD · A/B Testing
06

Deep Hashing for Image Retrieval

Designed a CNN with residual layers for content-based image retrieval, achieving a 4.96% mAP improvement on CIFAR-10/100 through feature extraction optimization.

Computer Vision · Keras · Thesis

Speaking

PyData Taipei Meetup

Spoke on leveraging Kaggle competitions to sharpen practical ML engineering skills and bridge the gap between competition and production.

October 2022 · Taipei

Read Article →

"We don't stop until we cross the finish line."

Overheard at Tokyo Marathon, 2025

Race Log

Finish lines
I've crossed.

Chicago Marathon

Upcoming
Chicago, IL
Full Marathon
2026

Challenge Taiwan

Taitung, Taiwan
Triathlon
2025

Tokyo Marathon

Tokyo, Japan
Full Marathon
2025

EVA Air Marathon

Taipei, Taiwan
Full Marathon
2024

Other Races

World Masters Games · 21K · Taipei · 2025
PUMA Night Run · 10K · Taipei · 2025
Taipei Marathon · 21K · Taipei · 2023
Taipei Starlight Marathon · 11K · Taipei · 2023
3
Marathons
1
Triathlon
8
Total Races
3
Countries
"Without data, you're just another person with an opinion."
— W. Edwards Deming

Blog

Writing &
thinking out loud.

Anatomy of the Claude Code Leak: What 512,000 Lines of TypeScript Reveal About Building Production AI Agents

Mar 2026 · Agentic AI

How to Actually Read Your AI Agent's Langfuse Dashboard

Mar 2026 · Langfuse

What I Learned Building AI Agents with LangGraph

Feb 2026 · Agentic AI

Fake News Detection: Comparing 9 Models from Naive Bayes to DistilBERT

Jan 2026 · NLP

How I Used Kaggle to Level Up as an ML Engineer

Oct 2022 · ML Engineering
CONNECT

Let's build
something great.

Open to full-time opportunities in Software Engineering, ML Engineering, AI Engineering, and Backend/Data Infrastructure roles.

© 2026 Tsung-Han (Johnson) Jao · Pittsburgh, PA