EduHam — Coding Learning Platform from Zero to Scale

A coding learning platform built from scratch in twelve months. Custom multi-language sandbox, instructor authoring tools, real-time classrooms, and a closed-loop analytics stack — running 5,000 concurrent learners by handover.

The Challenge

A B2C learn-to-code platform with two product modes — self-paced courses and instructor-led cohorts — and one hard requirement: every lesson ends in code the learner actually runs, not a multiple-choice question. Over twelve months we built the platform from an empty repository: identity, courses, lesson authoring, quiz and assignment builders, a multi-language code-execution sandbox, real-time classrooms, billing, and a closed-loop analytics pipeline. By month twelve the platform served 5,000 concurrent learners across ten UI languages and executed 50,000+ code submissions a day.

Four Builder Surfaces, One Platform

Every course is assembled from four kinds of unit. Treating them as separate apps would have given instructors four logins, four content models, and four analytics surfaces. Treating them as one platform required a shared content model and a single submission pipeline — the choice paid back the first time an instructor dropped a code assignment into the middle of a live classroom in three clicks, not a release.

Dimension	Lesson Builder	Quiz Builder	Assignment Builder	Live Classroom
Primary author	Curriculum lead	Instructor	Instructor	Instructor
Learner surface	Self-paced reader	Self-paced or cohort	Self-paced or cohort	Cohort only
Sandbox required	Optional (snippets)	Optional (code questions)	Always	Always
Completion measure	Time-on-page + scroll depth	Score + attempts	Tests passed	Attendance + submissions
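
The shared content model described above can be sketched roughly as follows. This is a minimal illustration, not the platform's actual schema: the names `UnitKind`, `Unit`, `Course`, `has_code`, and `insert_unit` are all hypothetical, and the sandbox rule is read directly from the "Sandbox required" row of the table.

```python
from dataclasses import dataclass, field
from enum import Enum


class UnitKind(Enum):
    LESSON = "lesson"
    QUIZ = "quiz"
    ASSIGNMENT = "assignment"
    LIVE_CLASSROOM = "live_classroom"


# Kinds that always need the sandbox, per the "Sandbox required" row.
SANDBOX_ALWAYS = {UnitKind.ASSIGNMENT, UnitKind.LIVE_CLASSROOM}


@dataclass
class Unit:
    unit_id: str
    kind: UnitKind
    title: str
    has_code: bool = False  # lessons and quizzes opt in (snippets, code questions)

    def needs_sandbox(self) -> bool:
        return self.kind in SANDBOX_ALWAYS or self.has_code


@dataclass
class Course:
    title: str
    units: list[Unit] = field(default_factory=list)

    def insert_unit(self, index: int, unit: Unit) -> None:
        """Place any kind of unit at any position; there is no per-kind code path."""
        self.units.insert(index, unit)


course = Course("Intro", [
    Unit("u1", UnitKind.LESSON, "Variables"),
    Unit("u2", UnitKind.LIVE_CLASSROOM, "Week 1 session"),
])
# Dropping an assignment between existing units is a data change, not a release.
course.insert_unit(1, Unit("u3", UnitKind.ASSIGNMENT, "FizzBuzz"))
```

Because every unit shares one type, a single submission pipeline only has to ask `needs_sandbox()` rather than branch on which builder produced the unit.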

When the curriculum team launched an early language track in month seven, they assembled it from existing unit pools in eleven business days — no new components shipped.

Solution Architecture

Microservice decomposition

A handful of Laravel services with hard domain boundaries. Most own their own Postgres database; tightly-coupled domains share a connection rather than forcing a boundary that was not real. Each service exposes a versioned contract — HTTP for synchronous calls, a message queue for async paths.

One frontend, three surfaces

A single Vue (TypeScript) codebase split into three surfaces — learner app, instructor authoring, operator dashboard — sharing the same component library and i18n bundle. Pusher carries real-time presence, live submissions, and instructor notifications.

Isolated code sandbox

The sandbox is a separate domain on its own Kubernetes node pool. A Go orchestrator schedules submissions onto pre-warmed language containers — seccomp-locked, egress-denied, ephemeral rootfs. Ten programming languages by month twelve, p99 cold start under 500 ms.

Closed-loop analytics

Managed change-data-capture streams operational databases into a columnar warehouse. The data team and instructors query learner behaviour, completion curves, and content health on near-real-time dashboards. Reporting is live views with notification triggers — not a morning email.

Domain microservices, one Vue frontend, an isolated sandbox node pool, and a CDC→warehouse analytics spine.

Code Sandbox: Request to Result in Under 500 ms p99

The sandbox is the load-bearing piece of the product. A learner who waits four seconds for their first code run drops; one who waits 200 ms keeps going. Four stages run in sequence on the orchestrator's hot path, each with its own latency budget.

Stage 1: Validate (< 30 ms)

Auth, rate limits, payload size, language whitelist, lightweight syntax pre-check. Hard cut on obvious abuse.
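
A minimal sketch of the cheap checks in this stage, written in Python for brevity even though the orchestrator itself is in Go. Auth and rate limiting are left out. The cap `MAX_PAYLOAD_BYTES`, the contents of `ALLOWED_LANGUAGES`, and the `Submission` type are assumptions for illustration, and the syntax pre-check is shown only for Python source, using the standard-library parser.

```python
import ast
from dataclasses import dataclass

MAX_PAYLOAD_BYTES = 64 * 1024                       # hypothetical cap
ALLOWED_LANGUAGES = {"python", "javascript", "go"}  # hypothetical whitelist


@dataclass
class Submission:
    learner_id: str
    language: str
    source: str


def validate(sub: Submission) -> list[str]:
    """Return rejection reasons; an empty list means the submission may be scheduled."""
    errors = []
    if sub.language not in ALLOWED_LANGUAGES:
        errors.append("language not allowed")
    if len(sub.source.encode("utf-8")) > MAX_PAYLOAD_BYTES:
        errors.append("payload too large")
    # Only parse when the cheaper checks passed, so oversized input is never parsed.
    if sub.language == "python" and not errors:
        try:
            ast.parse(sub.source)
        except SyntaxError as exc:
            errors.append(f"syntax error on line {exc.lineno}")
        except ValueError:
            # Some Python versions raise ValueError for source containing null bytes.
            errors.append("invalid source")
    return errors
```

Rejecting before a container is touched is what keeps this stage inside a small budget: nothing here allocates sandbox resources.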

Stage 2: Schedule (< 80 ms, 18%)

Pick a pre-warmed container from the language-specific pool, attach the submission rootfs, apply the security profile. No image pull on the hot path.
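
The language-specific pool can be pictured as one queue of idle containers per language. The sketch below is an assumption about shape only: `Container`, `WarmPool`, and their methods are hypothetical names, and starting replacement containers (which would happen off the hot path) is not shown.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Container:
    container_id: str
    language: str


class WarmPool:
    """Per-language queues of containers that are already started and idle."""

    def __init__(self) -> None:
        self._idle: dict[str, deque] = {}

    def add(self, container: Container) -> None:
        self._idle.setdefault(container.language, deque()).append(container)

    def acquire(self, language: str) -> Optional[Container]:
        """Hand out the oldest warm container, or None if this pool is drained."""
        queue = self._idle.get(language)
        if not queue:
            return None
        return queue.popleft()

    def idle_count(self, language: str) -> int:
        return len(self._idle.get(language, ()))


pool = WarmPool()
pool.add(Container("c-1", "python"))
pool.add(Container("c-2", "python"))
picked = pool.acquire("python")
```

A `None` from `acquire` is the signal that the request is about to pay a cold start, which is the case the pre-warm sizing is meant to make rare.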

Stage 3: Execute (< 320 ms typical, 62%)

Run the learner's code with a wall-clock timeout, memory cap, CPU quota, and egress-denied network namespace. The hot pool keeps the cold-start tail off the p99.
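
As a rough, process-level illustration of three of those limits (wall-clock timeout, memory cap, CPU quota), the sketch below uses Python's `subprocess` and `resource` modules on a POSIX host. It is not the container isolation the platform uses: there is no seccomp profile and no network namespace here, and the default limit values and the names `run_limited` and `RunResult` are assumptions.

```python
import resource
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    stdout: str
    stderr: str
    exit_code: Optional[int]  # None when the wall-clock timeout fired
    timed_out: bool


def _lower_soft_limit(kind: int, value: int) -> None:
    """Set the soft limit to `value`, never above the existing hard limit."""
    _, hard = resource.getrlimit(kind)
    if hard != resource.RLIM_INFINITY:
        value = min(value, hard)
    resource.setrlimit(kind, (value, hard))


def run_limited(argv, wall_seconds=2.0, cpu_seconds=1,
                memory_bytes=512 * 1024 * 1024) -> RunResult:
    def apply_limits():
        # Runs in the child process just before exec.
        _lower_soft_limit(resource.RLIMIT_CPU, cpu_seconds)
        _lower_soft_limit(resource.RLIMIT_AS, memory_bytes)

    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=wall_seconds, preexec_fn=apply_limits)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child on timeout; partial output is dropped here.
        return RunResult("", "", None, True)
    return RunResult(proc.stdout, proc.stderr, proc.returncode, False)


ok = run_limited([sys.executable, "-c", "print(6 * 7)"], wall_seconds=10.0)
slow = run_limited([sys.executable, "-c", "import time; time.sleep(5)"],
                   wall_seconds=0.5)
```

The wall-clock timeout and the CPU limit are deliberately separate: a sleeping process burns wall time but no CPU, while a busy loop burns both.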

Stage 4: Capture and score (< 70 ms, 14%)

Stream stdout/stderr, diff against expected output for assignments, persist artifacts, emit a submission event to the analytics bus.
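
The "diff against expected output" step could look something like the sketch below. The normalisation rules (ignore trailing whitespace, trailing blank lines, and CRLF line endings) are an assumption about what a learner-friendly comparison would tolerate, and `normalise` and `score` are hypothetical names.

```python
from typing import Optional


def normalise(output: str) -> list[str]:
    """Split into lines, dropping trailing whitespace and trailing blank lines."""
    lines = [line.rstrip() for line in output.replace("\r\n", "\n").split("\n")]
    while lines and lines[-1] == "":
        lines.pop()
    return lines


def score(actual: str, expected: str) -> dict:
    """Compare outputs line by line and report the first differing line (1-based)."""
    got, want = normalise(actual), normalise(expected)
    first_diff: Optional[int] = next(
        (i + 1 for i, (g, w) in enumerate(zip(got, want)) if g != w),
        None,
    )
    # Same prefix but different length: the diff starts right after the shorter one.
    if first_diff is None and len(got) != len(want):
        first_diff = min(len(got), len(want)) + 1
    return {"passed": first_diff is None, "first_diff_line": first_diff}
```

Returning the first differing line, rather than a bare pass/fail, gives the learner surface something concrete to point at.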

The cheapest win was deleting the image-pull step from the hot path: a 3.2-second median cold start in week six became a 90 ms hot start by month nine.
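
The four stage budgets listed above add up to the 500 ms target. A small sketch of how per-stage timings might be checked against them follows; the budget values come from the stage descriptions, while the function name and the shape of the timing data are assumptions.

```python
# Per-stage latency budgets in milliseconds, from the four stages above.
STAGE_BUDGET_MS = {
    "validate": 30,
    "schedule": 80,
    "execute": 320,
    "capture_and_score": 70,
}


def over_budget(timings_ms: dict) -> dict:
    """Return {stage: overshoot in ms} for every stage that exceeded its budget."""
    return {
        stage: timings_ms[stage] - budget
        for stage, budget in STAGE_BUDGET_MS.items()
        if timings_ms.get(stage, 0) > budget
    }


total_budget = sum(STAGE_BUDGET_MS.values())
overshoot = over_budget(
    {"validate": 12, "schedule": 95, "execute": 210, "capture_and_score": 40}
)
```

Tracking overshoot per stage, not only end to end, shows which stage consumed the slack when a request still lands under the total.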

Learner Funnel: Month 6 MVP vs Month 12 at Scale

The MVP shipped at month six with a single language, no real-time, and a single UI language. Month twelve served ten programming languages, real-time classrooms, and instructor-built assignments. The funnel compares cohorts of 1,000 newly signed-up learners on the month-6 MVP against the month-12 production platform, with the same acquisition channel and the same onboarding email.

Step	Before	After
Signed up	1,000	1,000
Completed onboarding	61%	84%
Ran first code submission	44%	78%
Passed first assignment	21%	52%
Active on day 30	11%	22%

The largest single lift came from cutting the first-code-run delay. When the median time from signup to first successful code execution dropped from 4 minutes 12 seconds to 38 seconds, the day-7 retention curve moved with it.
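
The funnel percentages are cumulative shares of the 1,000 sign-ups. Converting them to step-to-step conversion shows where each cohort loses learners. The sketch below does only that arithmetic on the published figures; the function name is hypothetical.

```python
# (step, before %, after %) as cumulative shares of each 1,000-learner cohort.
FUNNEL = [
    ("Completed onboarding", 61, 84),
    ("Ran first code submission", 44, 78),
    ("Passed first assignment", 21, 52),
    ("Active on day 30", 11, 22),
]


def step_conversion(cumulative_pct: list) -> list:
    """Percent of learners at the previous step who reached each step."""
    previous = 100.0
    rates = []
    for pct in cumulative_pct:
        rates.append(round(100.0 * pct / previous, 1))
        previous = pct
    return rates


before = step_conversion([row[1] for row in FUNNEL])
after = step_conversion([row[2] for row in FUNNEL])
# before -> [61.0, 72.1, 47.7, 52.4]
# after  -> [84.0, 92.9, 66.7, 42.3]
```

Read this way, the improvement sits in the first three steps; the share of assignment-passers still active on day 30 is lower in the later cohort, even though the absolute day-30 figure doubled.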

Technical Stack

Customer + operator UI

Vue 3, TypeScript, Pinia

One frontend codebase for learner, instructor, and operator surfaces

Domain services

Laravel (PHP), Per-service Postgres

Domain-rich CRUD that ships weekly across identity, courses, billing, notifications

Sandbox orchestrator

Go, Docker, Custom language images

Submission scheduler in Go for predictable latency; seccomp-locked, egress-denied containers per submission

Real-time

Pusher Channels

Managed because the team was small and the SLA was tight

Analytics

Managed CDC, Columnar warehouse

Bought instead of built — connector quality and schema-drift handling outpaced what an eight-person team could maintain

Infrastructure

Managed Kubernetes, Two node pools (general + sandbox)

Burst capacity for cohort launches; the sandbox node pool runs under tighter isolation than general workloads

Key Technical Decisions

Custom sandbox over a hosted runner

Tradeoff: Six engineering weeks of upfront work and a node pool to operate from day one

Why: Hosted runners either capped at three languages we cared about, charged per-execution at a multiple of our own pool cost at scale, or sat behind a public egress we could not lock down. Owning the sandbox kept p99 cold start under 500 ms, opened the door to ten languages within twelve months, and gave us a single security profile to reason about.

Pusher Channels over a self-hosted WebSocket fleet

Tradeoff: Per-message vendor cost; channel sharding strategy had to be designed early

Why: Real-time would have eaten two engineers for three months. Pusher gave us sub-100 ms fan-out on day one. Channel namespaces by tenant kept the vendor cost linear with paying cohorts, not with total signups.
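
Namespacing channels by tenant mostly comes down to a naming convention. The sketch below shows one possible scheme; the `presence-tenant-...-classroom-...` layout is hypothetical, and the character set and 164-character limit are assumptions that should be checked against Pusher's channel naming documentation before relying on them.

```python
import re

# Assumed Pusher Channels naming constraints; verify against the vendor docs.
_ALLOWED = re.compile(r"^[A-Za-z0-9_\-=@,.;]+$")
MAX_CHANNEL_LENGTH = 164


def classroom_channel(tenant_id: str, classroom_id: str) -> str:
    """Build a presence channel name scoped to one tenant's classroom."""
    name = f"presence-tenant-{tenant_id}-classroom-{classroom_id}"
    if len(name) > MAX_CHANNEL_LENGTH or not _ALLOWED.match(name):
        raise ValueError(f"invalid channel name: {name!r}")
    return name


channel = classroom_channel("acme", "cohort-42")
```

With the tenant in the name, per-tenant message volume can be attributed and capped without inspecting payloads.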

Managed CDC + warehouse over building our own ETL

Tradeoff: Vendor cost on the connector; less control over edge-case schema changes

Why: An in-house ETL would have absorbed a data engineer for six months and produced a worse result. The data team shipped instructor dashboards in month nine instead of month fifteen.

Implementation Timeline

M1–M2: Discovery, architecture, core hires

Domain map, service boundaries, hiring loop designed. First three engineers on board by end of month two.

M3–M4: Identity, courses, authoring shell

Laravel skeleton for identity and courses, Vue shell for instructor authoring, first lesson rendered end-to-end.

M5–M6: MVP (sandbox v1, quiz, assignment, billing)

Code sandbox v1 in production with two languages, quiz and assignment builders shipped, Stripe live. Private beta opens at end of month six.

M7–M8: Real-time classrooms, i18n to 10 UI languages

Pusher integration for classrooms, first cohort runs, i18n bundle to 10 UI languages, more programming languages added.

M9–M10: Analytics warehouse, instructor dashboards

Managed CDC into the warehouse, learner-event schema modelled, near-real-time instructor and CRO dashboards.

M11–M12: Scale hardening and handover

Channel sharding by tenant, sandbox pool sized for peak, observability hardened, on-call handed to the in-house team. Platform exits at 5,000 concurrent and steady.

Challenges and How We Solved Them

The Problem

First sustained-load stress at ~3,000 concurrent. A paid-acquisition push in month eight ran longer than the sandbox pool had been sized for, and the real-time fan-out and Postgres read pressure on the courses service both hit in the same fifteen-minute window.

Approach

Sharded real-time channels by tenant so one noisy classroom could not starve the fleet, added read replicas behind the courses service, moved hot reads for course catalogues into Redis, and resized the sandbox pre-warm pool to track the previous hour's submission rate.
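
Sizing the pre-warm pool from the previous hour's submission rate can be approximated with Little's law: containers in use is roughly arrival rate times how long each one is held. The sketch below is an assumption about how such a rule might be written; the `headroom` multiplier, the `floor`, and the function name are hypothetical.

```python
import math


def target_warm_pool(submissions_last_hour: int, avg_hold_seconds: float,
                     headroom: float = 1.5, floor: int = 2) -> int:
    """Warm containers to keep ready for one language.

    avg_hold_seconds is how long a container is occupied per submission
    (schedule + execute + capture), not just the execution time.
    """
    rate_per_second = submissions_last_hour / 3600
    in_flight = rate_per_second * avg_hold_seconds  # Little's law: L = lambda * W
    return max(floor, math.ceil(in_flight * headroom))


busy_hour = target_warm_pool(36_000, avg_hold_seconds=0.5)
quiet_hour = target_warm_pool(0, avg_hold_seconds=0.5)
```

The floor keeps a language that saw no traffic in the last hour from dropping to zero warm containers, which would put the next learner straight onto the cold path.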

Outcome

Latency stayed inside SLA for the remaining 90 minutes of the campaign. The same architecture absorbed 5,000 concurrent two months later without a second incident. The catalogue cache cut p95 lesson-load from 1.4 s to 320 ms.

The Problem

Sandbox abuse — shortly after opening a lower-level language runner, two learners tested the boundaries. One ran an outbound port scanner from inside the container; another launched a long-lived background process that consumed compute for forty minutes before the wall-clock budget killed it.

Approach

Locked the egress namespace to DNS plus the internal package mirror, tightened the security profile to deny raw sockets, and added a per-container CPU monitor that auto-kills jobs whose wall-clock-to-CPU ratio matches a sustained-abuse profile.
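
One way to express a sustained-abuse rule is to sample each container's cumulative CPU time and flag jobs that stay CPU-saturated across several consecutive intervals once they have run for a while. The thresholds, the `Sample` type, and the function name below are all assumptions; the source only says the rule is based on the relationship between wall-clock and CPU time, and this sketch uses CPU seconds divided by wall seconds.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    wall_seconds: float  # elapsed wall-clock time since the job started
    cpu_seconds: float   # cumulative CPU time charged to the container


def is_sustained_abuse(samples: list, min_wall_seconds: float = 60.0,
                       min_cpu_ratio: float = 0.9, window: int = 3) -> bool:
    """True when the job is old enough and its last `window` intervals were all CPU-bound."""
    if len(samples) < window + 1 or samples[-1].wall_seconds < min_wall_seconds:
        return False
    recent = samples[-(window + 1):]
    for earlier, later in zip(recent, recent[1:]):
        wall = later.wall_seconds - earlier.wall_seconds
        cpu = later.cpu_seconds - earlier.cpu_seconds
        if wall <= 0 or cpu / wall < min_cpu_ratio:
            return False
    return True


miner = [Sample(0, 0), Sample(30, 29), Sample(60, 58), Sample(90, 88)]
idle_repl = [Sample(0, 0), Sample(30, 1), Sample(60, 2), Sample(90, 3)]
```

Requiring several consecutive saturated intervals is what separates a long-running miner from a legitimate submission that is briefly CPU-heavy.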

Outcome

Zero confirmed escape attempts since. The abuse-detection rule has fired three times in nine months — all genuine. The egress-denied default became a documented platform invariant.

Numbers That Moved

Concurrent learners (peak): 5,000+ (was 0)
Code submissions executed: 50,000+ / day (was 900 / day)
Sandbox cold start, p99: 480 ms (was 3.2 s)
Programming languages supported: 10 (was 1)
UI languages shipped: 10 (was 1)
Day-7 learner retention: 41% (was 28%)
MTTR on production incidents: 22 min (was 3 h 40 min)
Engineers on the team (was 0)

Engagement Team

Oleksandr Kotliarov, Engineering Lead
Backend (domain services): 3 engineers
Sandbox / Platform: 1 engineer
Frontend: 2 engineers
Data / Analytics: 1 engineer
SRE / DevOps: 1 engineer (joined month 10)

Lessons Learned

Designing for scalability on day one is cheaper than retrofitting it in month nine. The architectural decisions made before the second engineer joined paid back every time the platform absorbed a step-change in concurrent learners.

Observability is non-negotiable in a code-execution platform. Every container had structured telemetry from the first commit; sandbox abuse incidents were caught and closed in days, not weeks.

Analytics velocity is bought, not built, at this stage. Managed CDC plus a managed warehouse replaced six months of platform engineering — the data engineer instead shipped dashboards the curriculum team used every day.
