AI SaaS Startup

Pryvasee AI — Production Cloud-Native Platform

Built and operated the entire cloud infrastructure and DevOps platform for an AI-powered SaaS startup from zero to production.

Terraform, Azure, AKS, Kubernetes, Flux CD, Helm, Gateway API, Docker, Azure DevOps, TypeORM, cert-manager, External Secrets Operator, Karpenter, OpenSearch, Fluent Bit, Prometheus, Grafana, Alertmanager, Entra ID, WAF, Front Door CDN, AGFC, Node.js, React, Flutter, Fastlane

Overview

The platform serves a React web app, 3 Node.js microservices, and a Flutter mobile app, all running on Kubernetes with full GitOps automation.

As the sole DevOps and platform engineer, I designed, implemented, and maintained every layer of the stack — from Terraform modules provisioning Azure resources to Flux CD automating deployments to Kubernetes.


Architecture

Terraform Layered State Architecture

The infrastructure uses a layered state architecture where each layer references the previous via terraform_remote_state:

base/       → VNet, NSGs, NAT, Storage, SQL, Key Vault, Entra ID
  ↓ remote_state
identity/   → Managed Identities, Role Assignments, Federated Credentials
  ↓ remote_state
aks/        → AKS Cluster, ACR, Front Door, AGFC, Document Intelligence
  ↓ remote_state
flux/       → Flux Extension, Git Repository Source

CI/CD & GitOps Flow

Developer Push → Azure DevOps Pipeline (Build + Test + DB Migration)
   ↓
   ACR (Container Registry)
   ↓
   Flux Image Reflector (polls every 60s)
   ↓
   Flux Image Policy (selects latest build)
   ↓
   Flux Image Update Automation (commits new tag to GitOps repo)
   ↓
   Flux Kustomize Controller (reconciles)
   ↓
   AKS Cluster
   ├── user-service (Pod)
   ├── subscription-service (Pod)
   ├── llm-service (Pod)
   ├── AGFC Gateway (HTTPS + WAF)
   ├── External Secrets (← Key Vault)
   ├── cert-manager (TLS)
   ├── OpenSearch + Fluent Bit (Logging)
   └── Prometheus + Grafana (Monitoring)

   Azure Front Door CDN → React SPA (Blob Storage)

   Users (Web + Mobile)
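
The image-automation part of this flow maps onto three Flux resources. The following is a minimal sketch rather than the project's actual manifests: the registry host `example.azurecr.io`, the `build-<number>` tag scheme, the `gitops-repo` source name, the bot identity, and the `./services` path are all assumptions for illustration, and the `apiVersion` values depend on the installed Flux release.

```yaml
# Hypothetical names throughout; only the 60s polling interval comes from the flow above.
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageRepository
metadata:
  name: user-service
  namespace: flux-system
spec:
  image: example.azurecr.io/user-service    # assumed ACR login server
  interval: 60s                             # scan the registry for new tags every 60 seconds
  provider: azure                           # authenticate with the cluster's Azure identity
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImagePolicy
metadata:
  name: user-service
  namespace: flux-system
spec:
  imageRepositoryRef:
    name: user-service
  filterTags:
    pattern: '^build-(?P<number>[0-9]+)$'   # assumed tag scheme: build-<pipeline build number>
    extract: '$number'
  policy:
    numerical:
      order: asc                            # the highest build number is selected
---
apiVersion: image.toolkit.fluxcd.io/v1beta2
kind: ImageUpdateAutomation
metadata:
  name: services
  namespace: flux-system
spec:
  interval: 1m
  sourceRef:
    kind: GitRepository
    name: gitops-repo                       # assumed GitRepository pointing at the GitOps repo
  git:
    checkout:
      ref:
        branch: main
    commit:
      author:
        name: fluxbot                       # assumed bot identity
        email: fluxbot@example.com
      messageTemplate: 'chore: update image tags'
    push:
      branch: main
  update:
    path: ./services                        # assumed directory holding the service manifests
    strategy: Setters
```

For the automation to rewrite a tag, the image field in the target manifest carries a marker comment such as `# {"$imagepolicy": "flux-system:user-service"}`, which tells Flux which policy owns that line.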

Infrastructure (Terraform)

  • 16 custom, reusable Terraform modules provisioning the entire Azure footprint
  • Layered state architecture: base → identity → aks → flux, each referencing prior state via terraform_remote_state
  • Modules include: AKS cluster (OIDC, Workload Identity, Node Auto-Provisioning), VNet with private/public subnets + NSGs + NAT Gateway, Azure SQL (serverless), PostgreSQL Flexible Server, Key Vault (RBAC + private endpoint), Container Registry (zone-redundant, trust policies), Azure Front Door CDN (custom domains, WAF with rate limiting + managed rulesets), Application Gateway for Containers (AGFC with WAF + Bot Manager), Document Intelligence (AI cognitive service), Entra External ID (B2C — 433-line module for OAuth2/OIDC with social providers), Flux GitOps extension, Azure DevOps OIDC service connection, frontend static storage, customer data storage (Data Lake Gen2)
  • Zero static credentials — Workload Identity Federation and Managed Identities throughout
  • Private endpoints for all data services (SQL, PostgreSQL, Key Vault, Cognitive Services)
# Example: AKS Module with Workload Identity
module "aks" {
  source              = "../../modules/aks"
  cluster_name        = "pryvasee-${var.environment}"
  kubernetes_version  = "1.29"
  node_count          = 3
  vm_size             = "Standard_D4s_v5"

  oidc_issuer_enabled       = true
  workload_identity_enabled = true
  node_auto_provisioning    = true

  network = {
    vnet_id   = data.terraform_remote_state.base.outputs.vnet_id
    subnet_id = data.terraform_remote_state.base.outputs.aks_subnet_id
  }
}

Kubernetes Platform

  • Custom Helm chart (pryvasee-microservices-chart) — single reusable chart deployed per service with value overrides
  • 161-line helpers template with intelligent secret type detection (splitting External Secrets into data vs dataFrom groups)
  • Security hardened: runAsNonRoot, readOnlyRootFilesystem, drop ALL capabilities, resource limits enforced
  • Gateway API with AGFC: HTTPS routing, path-prefix rewriting, TLS termination with cert-manager
  • HPA (CPU/memory), PodDisruptionBudgets, liveness/readiness/startup probes
  • External Secrets Operator syncing secrets from Azure Key Vault via ClusterSecretStore
  • Azure Key Vault CSI driver for direct volume-mounted secrets
  • Cost optimization: AKS Node Auto-Provisioning (NAP, built on Karpenter) with spot node pools
# Example: Security Context
securityContext:
  runAsNonRoot: true
  runAsUser: 1000
  readOnlyRootFilesystem: true
  allowPrivilegeEscalation: false
  capabilities:
    drop:
      - ALL
resources:
  requests:
    cpu: 100m
    memory: 256Mi
  limits:
    cpu: 500m
    memory: 512Mi
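
The data vs dataFrom split mentioned in the list above corresponds to two ways an ExternalSecret can pull from Key Vault: individual keys mapped one by one, or a whole JSON secret expanded into many keys. A minimal sketch follows, assuming a vault at `example-kv.vault.azure.net`, a `services` namespace, and illustrative secret names; none of these identifiers come from the project itself.

```yaml
# Hypothetical store, vault URL, namespaces, and secret names.
apiVersion: external-secrets.io/v1beta1
kind: ClusterSecretStore
metadata:
  name: azure-key-vault
spec:
  provider:
    azurekv:
      authType: WorkloadIdentity            # no static credentials; uses a federated identity
      vaultUrl: https://example-kv.vault.azure.net
      serviceAccountRef:
        name: external-secrets              # assumed service account bound to the managed identity
        namespace: external-secrets
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: user-service
  namespace: services
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: azure-key-vault
  target:
    name: user-service-secrets              # Kubernetes Secret created by the operator
  data:                                     # explicit one-to-one key mappings
    - secretKey: DATABASE_PASSWORD
      remoteRef:
        key: user-service-db-password
  dataFrom:                                 # expand every key of a JSON-valued Key Vault secret
    - extract:
        key: user-service-config
```

A chart helper that sorts configured secrets into these two groups lets each service declare its secrets in one values list while still rendering a valid ExternalSecret.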

GitOps (Flux CD)

  • Three-tier Kustomization hierarchy: flux-system → addons (14 add-ons with dependency ordering) → services
  • Closed-loop automated deployment: code push → Docker build → ACR push → Flux image reflector polls ACR every 60s → ImagePolicy selects latest build → ImageUpdateAutomation commits updated tag → Flux reconciles → deployment rolls out
  • 14 cluster add-ons managed via GitOps, including External Secrets Operator, cert-manager, AGFC Gateway, OpenSearch logging cluster, Fluent Bit (DaemonSet log shipping), kube-prometheus-stack (Grafana + Prometheus + Alertmanager), custom Prometheus alerts, NAP spot NodePool, RBAC roles, self-hosted Azure DevOps agent
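
The dependency ordering between tiers can be expressed with `dependsOn` on Flux Kustomizations. This is a sketch under assumptions: the `./addons` and `./services` paths, the intervals, and the `flux-system` GitRepository name are illustrative, not taken from the actual repository.

```yaml
# Hypothetical paths and source name; shows the addons -> services ordering only.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: addons
  namespace: flux-system
spec:
  interval: 10m
  path: ./addons
  prune: true                # remove cluster objects that were deleted from Git
  wait: true                 # report Ready only after the applied resources are healthy
  timeout: 10m
  sourceRef:
    kind: GitRepository
    name: flux-system
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: services
  namespace: flux-system
spec:
  dependsOn:
    - name: addons           # services reconcile only after the add-ons tier is Ready
  interval: 5m
  path: ./services
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

With `wait: true` on the add-ons tier, services that need CRDs from External Secrets Operator or cert-manager are not applied until those controllers are running. The same `dependsOn` mechanism can order individual add-ons within the tier.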

CI/CD Pipelines (Azure DevOps)

  • Backend: 606-line, 4-stage pipeline — Build (matrix strategy for parallel multi-service builds) → Database Migration (TypeORM with automatic rollback on failure) → Pod Rollout Monitoring (waits for Flux reconciliation, verifies image tags, checks pod health, detects crash loops) → Automated Rollback (kubectl rollout undo, ACR image cleanup, DB migration revert, diagnostic log collection as pipeline artifacts)
  • Frontend: React+Vite build → Azure Blob Storage $web deployment → Front Door CDN cache purge
  • Mobile: 3-stage pipeline (Staging → Production Firebase → App Stores) with parallel Android + iOS jobs, Fastlane, Firebase App Distribution, Match code signing for iOS
  • Self-hosted build agents: custom multi-arch (amd64/arm64) Docker image running on the AKS cluster itself
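
The four backend stages can be outlined in Azure Pipelines YAML as below. This is a heavily condensed sketch, not the 606-line pipeline: the pool name, service connection, Dockerfile paths, tag format, data-source path, and `services` namespace are all assumptions, and cluster authentication, image-tag verification, ACR cleanup, and log collection are omitted.

```yaml
# Sketch only: pool, service connection, paths, and namespace are hypothetical.
trigger:
  branches:
    include:
      - main

pool:
  name: aks-self-hosted                       # assumed name of the self-hosted agent pool

stages:
  - stage: Build
    jobs:
      - job: BuildImage
        strategy:
          matrix:                             # one parallel job per microservice
            user_service:
              serviceName: user-service
            subscription_service:
              serviceName: subscription-service
            llm_service:
              serviceName: llm-service
        steps:
          - task: Docker@2
            displayName: Build and push $(serviceName)
            inputs:
              command: buildAndPush
              containerRegistry: acr-connection   # assumed service connection name
              repository: $(serviceName)
              Dockerfile: services/$(serviceName)/Dockerfile
              tags: build-$(Build.BuildId)

  - stage: Migrate
    dependsOn: Build
    jobs:
      - job: RunMigrations
        steps:
          - script: npx typeorm migration:run -d dist/data-source.js
            displayName: Apply TypeORM migrations

  - stage: VerifyRollout
    dependsOn: Migrate
    jobs:
      - job: WaitForPods
        steps:
          - script: kubectl rollout status deployment/user-service -n services --timeout=300s
            displayName: Wait for the new pods to become ready

  - stage: Rollback
    dependsOn: VerifyRollout
    condition: failed()                       # run only when an earlier stage has failed
    jobs:
      - job: Revert
        steps:
          - script: |
              kubectl rollout undo deployment/user-service -n services
              npx typeorm migration:revert -d dist/data-source.js
            displayName: Undo the rollout and revert the last migration
```

Keeping rollback as its own conditional stage means a healthy run never touches it, while a failed rollout triggers the revert without manual intervention.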

Observability

  • OpenSearch cluster (operator-managed) for centralized logging
  • Fluent Bit DaemonSet shipping logs from all nodes
  • kube-prometheus-stack: Grafana dashboards, Prometheus metrics, Alertmanager
  • Custom Prometheus alert rules with Microsoft Teams integration
  • HTTPRoutes exposing Grafana, Prometheus, Alertmanager dashboards through AGFC Gateway
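
The last two items can be sketched as a PrometheusRule and an HTTPRoute. Everything named here is an assumption for illustration: the `monitoring` and `services` namespaces, the `release` label, the alert threshold, the `agfc-gateway` Gateway, and the `grafana.example.com` hostname. The Microsoft Teams delivery would be configured separately in Alertmanager and is not shown.

```yaml
# Hypothetical names, labels, and thresholds.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: service-alerts
  namespace: monitoring
  labels:
    release: kube-prometheus-stack      # assumed; must match the Prometheus ruleSelector
spec:
  groups:
    - name: pods
      rules:
        - alert: PodCrashLooping
          # fires when a container restarted more than 3 times within 15 minutes
          expr: increase(kube_pod_container_status_restarts_total{namespace="services"}[15m]) > 3
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: 'Pod {{ $labels.pod }} is restarting repeatedly'
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: grafana
  namespace: monitoring
spec:
  parentRefs:
    - name: agfc-gateway                # assumed Gateway backed by AGFC
      namespace: gateway
  hostnames:
    - grafana.example.com
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: kube-prometheus-stack-grafana   # assumed Service name from the Helm release
          port: 80
```

Routing by hostname instead of by sub-path avoids having to reconfigure Grafana's root URL.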

Key Achievements

  • Single-handedly built the entire DevOps platform from zero
  • 16 reusable Terraform modules covering the entire Azure footprint
  • Fully automated deployment pipeline with zero-downtime rollouts
  • Complete observability stack with alerting
  • Cost-optimized with spot instances and right-sized workloads
  • Production-grade security with zero static credentials and private endpoints
