John Evan Owen - CTO & AI Engineer Portfolio


"From distilling 70B LLMs on 24 H200 GPUs to scaling teams of 20+ engineers, I build AI systems that ship. Pending LLM patent, 4x inference throughput gains, 15+ production apps, and platforms handling 10,000+ TPS."

10+

Years Experience

20+

Engineers Led

15+

Apps Shipped

1

Pending Patent

John "Evan" Owen

Email

evan@jwo3.io

Phone

+1 803-386-8346

Github

jwo3.io/gh

Linkedin

jwo3.io/in

Skills

LLM Post-Training
Model Evaluation
Artificial Intelligence
Machine Learning
Deep Learning
NLP
GPU Computing
Distributed Systems
Cloud Computing
DevOps
Team Leadership
Strategic Planning
Cybersecurity

Achievements

Blackbelt in Zen Shotokai Karate

Pending Patent

Systems and Methods for Generating an Improved Mixture of Attention Model

QWERKY AI Inc. · 2025

Novel LLM architecture with CUDA implementation

Education

Georgia Institute of Technology

MS Computer Science

Expected May 2028

University of South Carolina

BS Computer Science

Summa Cum Laude

GPA: 3.956 / 4.0

Programming Languages

C++
Go
Python
SQL
Rust
Bash
Swift
Objective-C
Elixir
Java
Mojo
CUDA
Solidity
Nix

Tools, Libraries, Frameworks, Etc.

Postgres
Redis
PyTorch
Hugging Face Transformers
DeepSpeed
vLLM
lm-eval-harness
Weights & Biases
MLflow
TensorFlow
Scikit-learn
ROCm
Metal
Kubernetes
Docker
Helm
ElasticSearch
Celery
RabbitMQ
ZeroMQ
MQTT
Flask
FastAPI
Nginx
Terraform
Ansible
AWS
GCP
GKE
EKS
Temporal
Prometheus
Datadog
Grafana
Kafka
EVM
Socket.IO
WebSockets
JSON
REST
GraphQL
JSON-RPC
Protobuf
Eventlet/Gevent
SNS
FCM
SQS
DynamoDB
MongoDB
VectorDB

Experience

QWERKY AI Inc.

Co-Founder & Chief Technology Officer

2025 - 2026

  • Designed and built QDistill, an end-to-end post-training pipeline distilling 70B parameter teachers into 3B-8B hybrid Mamba-Transformer students via DeepSpeed ZeRO-3 on 24 H200 GPUs across 3 nodes, followed by SFT instruction tuning and RL alignment (GRPO, DPO), achieving 4x inference throughput and 1M token context length
  • Contributed patches to Hugging Face Accelerate/Trainer enabling dual-model DeepSpeed plugins; invented selective gradient checkpointing for hybrid architectures saving 30-40% memory
  • Curated 12M sample training dataset from 9 sources with MinHash LSH deduplication (15-20% dedup rate); generated synthetic data from Llama 3 405B; built Model-as-a-Judge evaluation using Claude and GPT
  • Built comprehensive evaluation suite using lm-eval-harness across MMLU, HellaSwag, PIQA, ARC, Winogrande, plus custom speed/latency benchmarking (TTFT, TPOT, P99) with models served via vLLM; all experiments tracked in W&B and MLflow
  • Developed a pending patent for a novel LLM architecture with CUDA implementation: Systems and Methods for Generating an Improved Mixture of Attention Model
  • Primary contributor of State Space Models and Mamba architectures to Modular's MAX inference framework in Mojo, authoring custom selective scan, causal convolution, and RMSNorm fused kernels
  • Built a 6-person research team and a 2-engineer product department; established a rigorous interview process that more than doubled team effectiveness
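The MinHash LSH deduplication step mentioned above can be illustrated with a stdlib-only sketch (production pipelines typically use a library such as datasketch; the shingle width, permutation count, band layout, and 0.8 threshold below are illustrative choices, not the pipeline's actual parameters):

```python
import hashlib

def shingles(text, k=5):
    """Word k-shingles of a document, as a set."""
    toks = text.split()
    return {" ".join(toks[i:i + k]) for i in range(max(1, len(toks) - k + 1))}

def minhash(sh, num_perm=64):
    # Simulate num_perm independent hash permutations by salting one hash.
    return [min(int(hashlib.md5(f"{i}:{s}".encode()).hexdigest(), 16)
                for s in sh)
            for i in range(num_perm)]

def jaccard_est(a, b):
    # Fraction of matching signature slots estimates Jaccard similarity.
    return sum(x == y for x, y in zip(a, b)) / len(a)

def bands(sig, n_bands=16):
    # LSH banding: split the signature into bands used as bucket keys.
    rows = len(sig) // n_bands
    return [tuple(sig[b * rows:(b + 1) * rows]) for b in range(n_bands)]

def dedup(docs, threshold=0.8):
    kept, index = [], {}
    for doc in docs:
        sig = minhash(shingles(doc))
        candidates = set()
        for b, key in enumerate(bands(sig)):
            candidates |= index.get((b, key), set())
        if any(jaccard_est(sig, kept[i][1]) >= threshold for i in candidates):
            continue                       # near-duplicate of a kept doc
        idx = len(kept)
        kept.append((doc, sig))
        for b, key in enumerate(bands(sig)):
            index.setdefault((b, key), set()).add(idx)
    return [d for d, _ in kept]
```

Banding trades recall for speed: only documents sharing at least one band bucket are ever compared, so the full pairwise Jaccard check is avoided.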

MerkleRoot

Chief Technology Officer

2024 - 2025

  • Managed a team of 20+ international developers and engineers
  • Led teams through the successful development and launch of the company's flagship products
  • Spearheaded the adoption of cutting-edge blockchain technologies and cryptography
  • Key stakeholder in the company's strategic direction and transition away from staff augmentation
  • Managed organizational restructuring after the loss of major clients
  • Developed a Solidity decompiler in Go and Rust that was a key feature in the company's product Hexman
  • Optimized Solidity codebases and EVM instruction sets, driving significant reductions in gas fees and bytecode size
  • Performed a security audit on a client's smart contract, identifying and fixing 5 critical vulnerabilities
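The first stage of a Solidity decompiler like the one behind Hexman is a bytecode disassembler. The actual tool was written in Go and Rust; this Python toy (opcode table heavily truncated) only illustrates the linear-sweep pass and the one wrinkle that makes it non-trivial, PUSH instructions carrying inline immediates:

```python
# Tiny subset of the EVM opcode table, for illustration only.
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x35: "CALLDATALOAD",
           0x51: "MLOAD", 0x52: "MSTORE", 0x56: "JUMP", 0x57: "JUMPI",
           0x5B: "JUMPDEST", 0xF3: "RETURN"}

def disassemble(code: bytes):
    """Linear-sweep disassembly: returns (program_counter, mnemonic) pairs."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 embed 1-32 immediate bytes after the opcode.
            n = op - 0x5F
            imm = code[pc + 1:pc + 1 + n]
            out.append((pc, f"PUSH{n} 0x{imm.hex()}"))
            pc += 1 + n
        else:
            out.append((pc, OPCODES.get(op, f"UNKNOWN(0x{op:02x})")))
            pc += 1
    return out
```

For example, the bytecode `6001600101` (push 1, push 1, add) disassembles to three instructions at program counters 0, 2, and 4.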

VP of Architecture

2022 - 2024

  • Led technical strategy and architecture across all major client projects
  • Increased client allocation and revenue by 300% in 1 year
  • Created the process for technical screenings and mentorship within the company
  • Led hiring, conducted technical interviews, code reviews, and mentored developers
  • Grew team from 9 to 20+ engineers in 1 year
  • Spearheaded the adoption of asdf and Nix for development and deployment, reducing environment setup time
  • Developed white papers and technical design documentation and established best practices for the company
  • Primary architect for the company's main client, Forte

Forte

Architect

2022 - 2024

  • Led 8 teams of 5-7 engineers to deliver a new blockchain platform for the gaming industry supporting 1 million concurrent users
  • Architected Platform X and the Forte Protocol with a focus on developer experience and scalability
  • Designed byte-level metadata standard enabling the storage of 1+ million JSON NFTs on Ethereum with significant gas and storage cost reductions
  • Designed a multi-tenant, multi-region, and multi-layered architecture for Platform 2.0 with core abstractions in the Blockchain Layer and Orchestration Layer
  • Built Go services for Platform 2.0's core, including the Bridge, Orchestration, and API Layers, enabling gaming studios to run live in-game economies at production scale
  • Built Go microservices and programming language SDKs for the platform
  • Conducted interviews for senior leadership and directors of engineering
  • Led initiative to significantly reduce platform technical debt, improving developer velocity and system reliability
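The savings from a byte-level metadata standard come from replacing JSON text with fixed-width binary fields, which are far cheaper to store on-chain. The field names and widths in this sketch are invented for illustration, not the actual Forte layout:

```python
import struct, json

# Hypothetical fixed-width layout: version (1B), kind (1B), rarity (1B),
# power (2B big-endian), name (16B, NUL-padded) -- 21 bytes total.
LAYOUT = struct.Struct(">BBBH16s")

def pack(meta: dict) -> bytes:
    return LAYOUT.pack(meta["version"], meta["kind"], meta["rarity"],
                       meta["power"], meta["name"].encode().ljust(16, b"\0"))

def unpack(blob: bytes) -> dict:
    v, k, r, p, name = LAYOUT.unpack(blob)
    return {"version": v, "kind": k, "rarity": r, "power": p,
            "name": name.rstrip(b"\0").decode()}

meta = {"version": 1, "kind": 3, "rarity": 2, "power": 900, "name": "Sword"}
blob = pack(meta)
assert unpack(blob) == meta
assert len(blob) < len(json.dumps(meta))  # 21 bytes vs ~70 bytes of JSON
```

Since every record has the same width, offsets are computable without parsing, which is exactly what keeps on-chain read and storage costs low.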

First Foundry

Senior Software Engineer

2022

  • Managed handoff of multiple ongoing applications as 52 Inc. underwent acquisition by First Foundry
  • Led a 3-member research team under the CTO; built a high-throughput data bridge in Go handling 1,000+ rps
  • Built microservices in Elixir/Phoenix, Rust, and Python, integrating Ethereum, Solana, and zero-knowledge-proof technologies
  • Created a notification system in Elixir with Erlang/OTP, delivering 10,000+ notifications per second
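The shape of a high-throughput notification dispatcher, producers enqueueing while a worker pool drains the queue concurrently, can be sketched in Python asyncio. The real system used Elixir with Erlang/OTP supervision; here `deliver` is a stand-in for the actual push-provider call:

```python
import asyncio

async def deliver(msg, sent):
    # Stand-in for a real SNS/FCM/APNs delivery call.
    await asyncio.sleep(0)
    sent.append(msg)

async def worker(queue, sent):
    # Each worker loops forever, draining messages off the shared queue.
    while True:
        msg = await queue.get()
        try:
            await deliver(msg, sent)
        finally:
            queue.task_done()

async def dispatch(messages, n_workers=8):
    queue, sent = asyncio.Queue(maxsize=1000), []
    tasks = [asyncio.create_task(worker(queue, sent)) for _ in range(n_workers)]
    for m in messages:
        await queue.put(m)          # bounded queue gives backpressure
    await queue.join()              # block until every message is delivered
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return sent

sent = asyncio.run(dispatch([f"msg-{i}" for i in range(100)]))
assert len(sent) == 100
```

OTP buys per-process isolation and restart semantics that this sketch lacks; the asyncio version only shows the queue-plus-worker-pool throughput pattern.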

52 Inc.

Senior Software Engineer

2019 - 2022

  • Delivered 15+ apps and platforms to clients across healthcare, insurance, marketplace, and live streaming domains; rescued 5+ failing projects
  • Developed mobile iOS apps in Swift and Objective-C
  • Scaled a hospital messaging system from 100 rps to 1,000+ rps, restoring reliable real-time communication for clinical staff during peak hours
  • Maintained and scaled a Kubernetes-based platform tracking 1,000+ devices in real time, enabling an emergency response company to coordinate field operations reliably
  • Created document reading and searching apps for the insurance industry in Swift and Objective-C
  • Built a predictive pricing model for Ventrue Resorts using LSTM and Linear Regression in TensorFlow; discovered that the inductive bias of Linear Regression outperformed the LSTM for this domain

Software Engineer

2015 - 2019

  • Contributed to backend systems and APIs in Python, Java, and C++ under senior engineering guidance, supporting marketplace, healthcare, and live streaming products
  • Worked with the lead engineer to design and build a redundant data pipeline for real-time device location tracking, enabling an emergency response company to monitor field assets with sub-second latency
  • Implemented supervised machine learning models (kNN) for recommendation systems under mentorship, gaining early hands-on ML experience
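A kNN recommender of the kind described reduces to a majority vote among the nearest neighbors in feature space. A minimal sketch, with user feature vectors and labels invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (vector, label); returns majority label of k nearest."""
    by_dist = sorted(train, key=lambda vl: math.dist(vl[0], query))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Hypothetical users embedded by (action affinity, drama affinity).
users = [((0.9, 0.1), "action"), ((0.8, 0.2), "action"),
         ((0.1, 0.9), "drama"),  ((0.2, 0.8), "drama"),
         ((0.85, 0.15), "action")]
assert knn_predict(users, (0.95, 0.05)) == "action"
assert knn_predict(users, (0.05, 0.95)) == "drama"
```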

Publications

Blog Posts

Bringing Blazing Fast State Space Models to the Modular MAX Framework

February 2026

Implementing the Mamba 1 architecture in Modular's MAX inference framework with custom CPU-only selective scan and causal convolution kernels for cross-vendor SSM support.
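At its core, the selective scan is the per-channel recurrence h_t = a_t·h_{t-1} + b_t·x_t with readout y_t = c_t·h_t. The MAX kernels fuse and parallelize this; a scalar reference loop (no discretization step, purely illustrative) makes the semantics clear:

```python
def selective_scan(a, b, c, x):
    """Sequential reference for the SSM recurrence, scalar state."""
    h, ys = 0.0, []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = a_t * h + b_t * x_t     # input-dependent state update
        ys.append(c_t * h)          # readout
    return ys

# A unit impulse decays by a_t each step when a_t = 0.5 throughout.
ys = selective_scan(a=[0.5, 0.5, 0.5], b=[1, 1, 1], c=[1, 1, 1], x=[1, 0, 0])
assert ys == [1.0, 0.5, 0.25]
```

Because a, b, and c vary per timestep (they are functions of the input in Mamba), the scan cannot be precomputed as a fixed convolution, which is why fused kernels matter for performance.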

Mother May AI: An Opinion on Geoffrey Hinton's Mother AI

September 2025

A critical examination of Geoffrey Hinton's "mother AI" proposal, advocating for human-centered AI development that prioritizes sustainability and efficiency over speculative superintelligence concerns.

Attention: The Breakthroughs and the Bottlenecks

June 2025

Analysis of Meta's Ring Attention and DeepSeek-V3's Multi-head Latent Attention, exploring how fundamental memory and hardware constraints continue to limit LLM architectures.

Incidental Non-Determinism: When AI Surprises You (and Why)

May 2025

Exploring hidden sources of unpredictability in LLMs including floating-point arithmetic errors, parallel processing inconsistencies, and hardware variations affecting AI reproducibility.
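One source the post covers, floating-point non-associativity, is reproducible in two lines: regroup the same additions, as parallel reductions on GPUs effectively do, and the rounded result changes:

```python
# IEEE 754 addition is not associative: the same three numbers summed
# in a different grouping round to different results.
lhs = (0.1 + 0.2) + 0.3
rhs = 0.1 + (0.2 + 0.3)
assert lhs != rhs               # grouping changed the rounded sum
```

Since GPU reductions split work across threads whose completion order varies, the effective grouping of additions can differ between runs, which is one way bit-identical reproducibility is lost.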