John Evan Owen - CTO & AI Engineer Portfolio


"From distilling 70B LLMs on 24 H200 GPUs to scaling teams of 20+ engineers, I build AI systems that ship. Pending LLM patent, 4x inference throughput gains, 15+ production apps, and platforms handling 10,000+ TPS."

10+

Years Experience

20+

Engineers Led

15+

Apps Shipped

1

Pending Patent

John "Evan" Owen

Email

evan@jwo3.io

Phone

+1 803-386-8346

Github

jwo3.io/gh

Linkedin

jwo3.io/in

Skills

LLM Post-Training
Model Evaluation
Artificial Intelligence
Machine Learning
Deep Learning
NLP
GPU Computing
Distributed Systems
Cloud Computing
DevOps
Team Leadership
Strategic Planning
Cybersecurity

Achievements

Blackbelt in Zen Shotokai Karate

Pending Patent

Systems and Methods for Generating an Improved Mixture of Attention Model

QWERKY AI Inc. · 2025

Novel LLM architecture with CUDA implementation

Education

Georgia Institute of Technology

MS Computer Science

Expected May 2028

University of South Carolina

BS Computer Science

Summa Cum Laude

GPA: 3.956 / 4.0

Programming Languages

C++
Go
Python
SQL
Rust
Bash
Swift
Objective-C
Elixir
Java
Mojo
CUDA
Solidity
Nix

Tools, Libraries, Frameworks, Etc.

Postgres
Redis
PyTorch
Hugging Face Transformers
DeepSpeed
vLLM
lm-eval-harness
Weights & Biases
MLflow
TensorFlow
Scikit-learn
ROCm
Metal
Kubernetes
Docker
Helm
ElasticSearch
Celery
RabbitMQ
ZeroMQ
MQTT
Flask
FastAPI
Nginx
Terraform
Ansible
AWS
GCP
GKE
EKS
Temporal
Prometheus
Datadog
Grafana
Kafka
EVM
Socket.IO
WebSockets
JSON
REST
GraphQL
JSON-RPC
Protobuf
Eventlet/Gevent
SNS
FCM
SQS
DynamoDB
MongoDB
VectorDB

Experience

QWERKY AI Inc.

Co-Founder & Chief Technology Officer

2025 - 2026

  • Designed and built QDistill, an end-to-end post-training pipeline distilling 70B parameter teachers into 3B-8B hybrid Mamba-Transformer students via DeepSpeed ZeRO-3 on 24 H200 GPUs across 3 nodes, followed by SFT instruction tuning and RL alignment (GRPO, DPO), achieving 4x inference throughput and 1M token context length
  • Contributed patches to Hugging Face Accelerate/Trainer enabling dual-model DeepSpeed plugins; invented selective gradient checkpointing for hybrid architectures saving 30-40% memory
  • Curated 12M sample training dataset from 9 sources with MinHash LSH deduplication (15-20% dedup rate); generated synthetic data from Llama 3 405B; built Model-as-a-Judge evaluation using Claude and GPT
  • Built comprehensive evaluation suite using lm-eval-harness across MMLU, HellaSwag, PIQA, ARC, Winogrande, plus custom speed/latency benchmarking (TTFT, TPOT, P99) with models served via vLLM; all experiments tracked in W&B and MLflow
  • Developed a pending patent for a novel LLM architecture with CUDA implementation: Systems and Methods for Generating an Improved Mixture of Attention Model
  • Primary contributor of State Space Models and Mamba architectures to Modular's MAX inference framework in Mojo, authoring custom selective scan, causal convolution, and RMSNorm fused kernels
  • Built a 6-person research team and a 2-engineer product department; established a rigorous interview process that more than doubled team effectiveness
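The MinHash LSH deduplication step mentioned above can be illustrated with a stdlib-only sketch (production pipelines typically use a library such as datasketch; the shingle width, permutation count, band layout, and 0.8 threshold below are illustrative choices, not the pipeline's actual parameters):

```python
import hashlib

def shingles(text, k=5):
    """Word k-shingles of a document, as a set."""
    toks = text.split()
    return {" ".join(toks[i:i + k]) for i in range(max(1, len(toks) - k + 1))}

def minhash(sh, num_perm=64):
    # Simulate num_perm independent hash permutations by salting one hash.
    return [min(int(hashlib.md5(f"{i}:{s}".encode()).hexdigest(), 16)
                for s in sh)
            for i in range(num_perm)]

def jaccard_est(a, b):
    # Fraction of matching signature slots estimates Jaccard similarity.
    return sum(x == y for x, y in zip(a, b)) / len(a)

def bands(sig, n_bands=16):
    # LSH banding: split the signature into bands used as bucket keys.
    rows = len(sig) // n_bands
    return [tuple(sig[b * rows:(b + 1) * rows]) for b in range(n_bands)]

def dedup(docs, threshold=0.8):
    kept, index = [], {}
    for doc in docs:
        sig = minhash(shingles(doc))
        candidates = set()
        for b, key in enumerate(bands(sig)):
            candidates |= index.get((b, key), set())
        if any(jaccard_est(sig, kept[i][1]) >= threshold for i in candidates):
            continue                       # near-duplicate of a kept doc
        idx = len(kept)
        kept.append((doc, sig))
        for b, key in enumerate(bands(sig)):
            index.setdefault((b, key), set()).add(idx)
    return [d for d, _ in kept]
```

Banding trades recall for speed: only documents sharing at least one band bucket are ever compared, so the full pairwise Jaccard check is avoided.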

MerkleRoot

Chief Technology Officer

2024 - 2025

  • Managed a team of 20+ international developers and engineers
  • Led teams through the successful development and launch of the company's flagship products
  • Spearheaded the adoption of cutting-edge blockchain technologies and cryptography
  • Key stakeholder in the company's strategic direction and transition away from staff augmentation
  • Managed organizational restructuring after the loss of major clients
  • Developed a Solidity decompiler in Go and Rust that was a key feature in the company's product Hexman
  • Optimized Solidity codebases and EVM instruction sets, driving significant reductions in gas fees and bytecode size
  • Performed a security audit on a client's smart contract, identifying and fixing 5 critical vulnerabilities
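The first stage of a Solidity decompiler like the one behind Hexman is a bytecode disassembler. The actual tool was written in Go and Rust; this Python toy (opcode table heavily truncated) only illustrates the linear-sweep pass and the one wrinkle that makes it non-trivial, PUSH instructions carrying inline immediates:

```python
# Tiny subset of the EVM opcode table, for illustration only.
OPCODES = {0x00: "STOP", 0x01: "ADD", 0x02: "MUL", 0x35: "CALLDATALOAD",
           0x51: "MLOAD", 0x52: "MSTORE", 0x56: "JUMP", 0x57: "JUMPI",
           0x5B: "JUMPDEST", 0xF3: "RETURN"}

def disassemble(code: bytes):
    """Linear-sweep disassembly: returns (program_counter, mnemonic) pairs."""
    out, pc = [], 0
    while pc < len(code):
        op = code[pc]
        if 0x60 <= op <= 0x7F:
            # PUSH1..PUSH32 embed 1-32 immediate bytes after the opcode.
            n = op - 0x5F
            imm = code[pc + 1:pc + 1 + n]
            out.append((pc, f"PUSH{n} 0x{imm.hex()}"))
            pc += 1 + n
        else:
            out.append((pc, OPCODES.get(op, f"UNKNOWN(0x{op:02x})")))
            pc += 1
    return out
```

For example, the bytecode `6001600101` (push 1, push 1, add) disassembles to three instructions at program counters 0, 2, and 4.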

VP of Architecture

2022 - 2024

  • Led technical strategy and architecture across all major client projects
  • Increased client allocation and revenue by 300% in 1 year
  • Created the process for technical screenings and mentorship within the company
  • Led hiring, conducted technical interviews, code reviews, and mentored developers
  • Grew team from 9 to 20+ engineers in 1 year
  • Spearheaded the adoption of asdf and Nix for development and deployment, reducing environment setup time
  • Developed white papers and technical design documentation and established best practices for the company
  • Primary architect for the company's main client, Forte

Forte

Architect

2022 - 2024

  • Led 8 teams of 5-7 engineers to deliver a new blockchain platform for the gaming industry supporting 1 million concurrent users
  • Architected Platform X and the Forte Protocol with a focus on developer experience and scalability
  • Designed byte-level metadata standard enabling the storage of 1+ million JSON NFTs on Ethereum with significant gas and storage cost reductions
  • Designed a multi-tenant, multi-region, and multi-layered architecture for Platform 2.0 with core abstractions in the Blockchain Layer and Orchestration Layer
  • Built Go services for Platform 2.0's core, including the Bridge, Orchestration, and API Layers, enabling gaming studios to run live in-game economies at production scale
  • Built Go microservices and programming language SDKs for the platform
  • Conducted interviews for senior leadership and directors of engineering
  • Led initiative to significantly reduce platform technical debt, improving developer velocity and system reliability
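The savings from a byte-level metadata standard come from replacing JSON text with fixed-width binary fields, which are far cheaper to store on-chain. The field names and widths in this sketch are invented for illustration, not the actual Forte layout:

```python
import struct, json

# Hypothetical fixed-width layout: version (1B), kind (1B), rarity (1B),
# power (2B big-endian), name (16B, NUL-padded) -- 21 bytes total.
LAYOUT = struct.Struct(">BBBH16s")

def pack(meta: dict) -> bytes:
    return LAYOUT.pack(meta["version"], meta["kind"], meta["rarity"],
                       meta["power"], meta["name"].encode().ljust(16, b"\0"))

def unpack(blob: bytes) -> dict:
    v, k, r, p, name = LAYOUT.unpack(blob)
    return {"version": v, "kind": k, "rarity": r, "power": p,
            "name": name.rstrip(b"\0").decode()}

meta = {"version": 1, "kind": 3, "rarity": 2, "power": 900, "name": "Sword"}
blob = pack(meta)
assert unpack(blob) == meta
assert len(blob) < len(json.dumps(meta))  # 21 bytes vs ~70 bytes of JSON
```

Since every record has the same width, offsets are computable without parsing, which is exactly what keeps on-chain read and storage costs low.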

First Foundry

Senior Software Engineer

2022

  • Managed handoff of multiple ongoing applications as 52 Inc. underwent acquisition by First Foundry
  • Led a 3-member research team under the CTO; built a high-throughput data bridge in Go handling 1,000+ rps
  • Built microservices in Elixir/Phoenix, Rust, and Python, integrating Ethereum, Solana, and zero-knowledge-proof technologies
  • Created a notification system in Elixir with Erlang/OTP, delivering 10,000+ notifications per second
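The shape of a high-throughput notification dispatcher, producers enqueueing while a worker pool drains the queue concurrently, can be sketched in Python asyncio. The real system used Elixir with Erlang/OTP supervision; here `deliver` is a stand-in for the actual push-provider call:

```python
import asyncio

async def deliver(msg, sent):
    # Stand-in for a real SNS/FCM/APNs delivery call.
    await asyncio.sleep(0)
    sent.append(msg)

async def worker(queue, sent):
    # Each worker loops forever, draining messages off the shared queue.
    while True:
        msg = await queue.get()
        try:
            await deliver(msg, sent)
        finally:
            queue.task_done()

async def dispatch(messages, n_workers=8):
    queue, sent = asyncio.Queue(maxsize=1000), []
    tasks = [asyncio.create_task(worker(queue, sent)) for _ in range(n_workers)]
    for m in messages:
        await queue.put(m)          # bounded queue gives backpressure
    await queue.join()              # block until every message is delivered
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)
    return sent

sent = asyncio.run(dispatch([f"msg-{i}" for i in range(100)]))
assert len(sent) == 100
```

OTP buys per-process isolation and restart semantics that this sketch lacks; the asyncio version only shows the queue-plus-worker-pool throughput pattern.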

52 Inc.

Senior Software Engineer

2019 - 2022

  • Delivered 15+ apps and platforms to clients across healthcare, insurance, marketplace, and live streaming domains; rescued 5+ failing projects
  • Developed mobile iOS apps in Swift and Objective-C
  • Scaled a hospital messaging system from 100 rps to 1,000+ rps, restoring reliable real-time communication for clinical staff during peak hours
  • Maintained and scaled a Kubernetes-based platform tracking 1,000+ devices in real time, enabling an emergency response company to coordinate field operations reliably
  • Created document reading and searching apps for the insurance industry in Swift and Objective-C
  • Built a predictive pricing model for Ventrue Resorts using LSTM and Linear Regression in TensorFlow; discovered that the inductive bias of Linear Regression outperformed the LSTM for this domain

Software Engineer

2015 - 2019

  • Contributed to backend systems and APIs in Python, Java, and C++ under senior engineering guidance, supporting marketplace, healthcare, and live streaming products
  • Worked with the lead engineer to design and build a redundant data pipeline for real-time device location tracking, enabling an emergency response company to monitor field assets with sub-second latency
  • Implemented supervised machine learning models (kNN) for recommendation systems under mentorship, gaining early hands-on ML experience
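A kNN recommender of the kind described reduces to a majority vote among the nearest neighbors in feature space. A minimal sketch, with user feature vectors and labels invented for illustration:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (vector, label); returns majority label of k nearest."""
    by_dist = sorted(train, key=lambda vl: math.dist(vl[0], query))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Hypothetical users embedded by (action affinity, drama affinity).
users = [((0.9, 0.1), "action"), ((0.8, 0.2), "action"),
         ((0.1, 0.9), "drama"),  ((0.2, 0.8), "drama"),
         ((0.85, 0.15), "action")]
assert knn_predict(users, (0.95, 0.05)) == "action"
assert knn_predict(users, (0.05, 0.95)) == "drama"
```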

Publications

Blog Posts

Bringing Blazing Fast State Space Models to the Modular MAX Framework

February 2026

Implementing the Mamba 1 architecture in Modular's MAX inference framework with custom CPU-only selective scan and causal convolution kernels for cross-vendor SSM support.
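At its core, the selective scan is the per-channel recurrence h_t = a_t·h_{t-1} + b_t·x_t with readout y_t = c_t·h_t. The MAX kernels fuse and parallelize this; a scalar reference loop (no discretization step, purely illustrative) makes the semantics clear:

```python
def selective_scan(a, b, c, x):
    """Sequential reference for the SSM recurrence, scalar state."""
    h, ys = 0.0, []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = a_t * h + b_t * x_t     # input-dependent state update
        ys.append(c_t * h)          # readout
    return ys

# A unit impulse decays by a_t each step when a_t = 0.5 throughout.
ys = selective_scan(a=[0.5, 0.5, 0.5], b=[1, 1, 1], c=[1, 1, 1], x=[1, 0, 0])
assert ys == [1.0, 0.5, 0.25]
```

Because a, b, and c vary per timestep (they are functions of the input in Mamba), the scan cannot be precomputed as a fixed convolution, which is why fused kernels matter for performance.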

Mother May AI: An Opinion on Geoffrey Hinton's Mother AI

September 2025

A critical examination of Geoffrey Hinton's "mother AI" proposal, advocating for human-centered AI development that prioritizes sustainability and efficiency over speculative superintelligence concerns.

Attention: The Breakthroughs and the Bottlenecks

June 2025

Analysis of Meta's Ring Attention and DeepSeek-V3's Multi-head Latent Attention, exploring how fundamental memory and hardware constraints continue to limit LLM architectures.

Incidental Non-Determinism: When AI Surprises You (and Why)

May 2025

Exploring hidden sources of unpredictability in LLMs including floating-point arithmetic errors, parallel processing inconsistencies, and hardware variations affecting AI reproducibility.
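One source the post covers, floating-point non-associativity, is reproducible in two lines: regroup the same additions, as parallel reductions on GPUs effectively do, and the rounded result changes:

```python
# IEEE 754 addition is not associative: the same three numbers summed
# in a different grouping round to different results.
lhs = (0.1 + 0.2) + 0.3
rhs = 0.1 + (0.2 + 0.3)
assert lhs != rhs               # grouping changed the rounded sum
```

Since GPU reductions split work across threads whose completion order varies, the effective grouping of additions can differ between runs, which is one way bit-identical reproducibility is lost.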