Automated Documentation Engine via Cloud Run
Date posted: 2026-01-19
Objective: Deploy a serverless Python microservice that uses the Gemini API to automatically generate project documentation (READMEs and architecture guides) from source code repositories.
1. Service Architecture & Logic
- Core Engine (`app.py`): A Flask-based service designed to run on Google Cloud Run that orchestrates repository cloning, codebase scanning, and interaction with the Google GenAI SDK.
- Code Analysis: The service performs a "safe read" of source files, filtered by extensions such as `.py`, `.js`, and `.go`.
- Configurable Extraction: By default, the engine extracts the first 500 lines of key files to build a context window for the AI; this value can be overridden by the user to accommodate larger or smaller code samples.
- AI Integration: Uses the `gemini-2.5-flash` model to analyze the provided code samples.
- Template-Driven Generation: Documentation is produced using hardcoded prompts within the service logic that define specific sections such as "Quick Start" and "Dependencies". These prompts can be updated or refactored to align with team standards or evolving documentation needs.
- Dual-Mode Operation: Supports a `/generate` endpoint for processing remote Git repositories and a `/generate-inline` endpoint for direct code snippet analysis.
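The "safe read" and line-capping behavior described above can be sketched as a small helper. This is a minimal illustration, not the actual `app.py` internals: the names `safe_read` and `collect_context` and the exact extension set are assumptions for the example.

```python
from pathlib import Path

# Extensions the scanner accepts (illustrative; mirrors the .py/.js/.go filter).
ALLOWED_EXTENSIONS = {".py", ".js", ".go"}

def safe_read(path: Path, max_lines: int = 500) -> str:
    """Read at most max_lines lines; skip files that are binary or unreadable."""
    try:
        with path.open("r", encoding="utf-8") as fh:
            lines = []
            for i, line in enumerate(fh):
                if i >= max_lines:
                    break
                lines.append(line)
            return "".join(lines)
    except (UnicodeDecodeError, OSError):
        return ""  # non-UTF-8 or unreadable file contributes nothing

def collect_context(repo_root: str, max_lines: int = 500) -> str:
    """Walk the repo and concatenate capped file excerpts into one context string."""
    chunks = []
    for path in sorted(Path(repo_root).rglob("*")):
        if path.is_file() and path.suffix in ALLOWED_EXTENSIONS:
            body = safe_read(path, max_lines)
            if body:
                chunks.append(f"### {path.relative_to(repo_root)}\n{body}")
    return "\n\n".join(chunks)
```

The cap keeps the prompt within a predictable token budget, and the per-file header lets the model attribute findings to specific files.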
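The template-driven generation step could look roughly like the following, assuming the `google-genai` Python SDK. The prompt text and the `build_prompt`/`generate_readme` helpers are illustrative assumptions, not the service's actual hardcoded template:

```python
from string import Template

# Illustrative hardcoded template; the real service defines its own sections.
README_TEMPLATE = Template(
    "You are a technical writer. Using only the source files below,\n"
    "write a README with these sections: Overview, Quick Start,\n"
    "Dependencies, Environment Variables.\n\n$context"
)

def build_prompt(context: str) -> str:
    """Fill the documentation template with the collected code context."""
    return README_TEMPLATE.substitute(context=context)

def generate_readme(context: str) -> str:
    """Send the prompt to Gemini (requires GOOGLE_API_KEY in the environment)."""
    from google import genai  # pip install google-genai
    client = genai.Client()
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=build_prompt(context),
    )
    return response.text
```

Keeping the template separate from the API call makes it easy to refactor the section list without touching the model integration.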
2. CI/CD & Cloud Orchestration
- Cross-Cloud Authentication: The `azure_pipelines.yaml` file establishes a secure handshake by downloading a GCP Service Account JSON key from ADO Secure Files and activating it via the `gcloud` SDK.
- Image Management: The pipeline triggers a `gcloud builds submit` command, offloading the containerization process to Google Cloud Build.
- Container Strategy: The `cloudbuild.yaml` configuration builds a Docker image and tags it with both a semantic version (`v0.0.1`) and a `latest` tag before pushing it to Google Artifact Registry.
- Deployment Target: The service is containerized for Google Cloud Run, allowing it to scale to zero when not in use to minimize costs.
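The dual-tag container strategy can be sketched as a `cloudbuild.yaml` along these lines; the Artifact Registry region and repository name (`us-central1`, `docs-engine`) are placeholders, not the actual values:

```yaml
# Illustrative cloudbuild.yaml; registry path and repo name are assumptions.
steps:
  - name: gcr.io/cloud-builders/docker
    args:
      - build
      - -t
      - us-central1-docker.pkg.dev/$PROJECT_ID/docs-engine/app:v0.0.1
      - -t
      - us-central1-docker.pkg.dev/$PROJECT_ID/docs-engine/app:latest
      - .
images:
  - us-central1-docker.pkg.dev/$PROJECT_ID/docs-engine/app:v0.0.1
  - us-central1-docker.pkg.dev/$PROJECT_ID/docs-engine/app:latest
```

Listing both tags under `images` makes Cloud Build push them after a successful build, so `latest` always tracks the most recent deploy while the semantic tag stays pinned.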
3. Implementation Steps
- Identity Setup: Provisioned a GCP Service Account to allow Azure DevOps (ADO) to manage resources and authenticate during the build process.
- Secure Credentialing: Stored the GCP JSON key within the Azure DevOps "Secure Files" library to prevent credential leakage in the repository.
- Automation Loop: Configured the pipeline to trigger on every push to the `main` branch, ensuring the Cloud Run service always reflects the latest logic.
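The steps above combine into an ADO pipeline roughly like the excerpt below. The secure-file name is a placeholder; `DownloadSecureFile@1` exposes the downloaded key's path via the task's `secureFilePath` output:

```yaml
# Illustrative azure_pipelines.yaml excerpt; the secure-file name is an assumption.
trigger:
  branches:
    include:
      - main

steps:
  - task: DownloadSecureFile@1
    name: gcpKey
    inputs:
      secureFile: gcp-service-account.json
  - script: |
      gcloud auth activate-service-account --key-file=$(gcpKey.secureFilePath)
      gcloud builds submit --config cloudbuild.yaml .
    displayName: Authenticate to GCP and submit build
```

Because the key file lives only in the agent's temp directory for the duration of the job, it never appears in the repository or the build logs.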
4. Technical Rationale ("The Why")
- Context-Aware Docs: By feeding actual code samples to Gemini, the generated documentation accurately reflects the service's dependencies, environment variables, and specific Flask routes.
- Cloud-Native Efficiency: Using Cloud Run ensures the service is "always ready" but only consumes billing seconds during active documentation tasks.
- Security-First Pipeline: The use of `DownloadSecureFile@1` in ADO ensures that the "handshake" between Azure and GCP never exposes raw private keys in logs.
- Standardization: This approach creates a "Golden Path" for internal teams, ensuring consistent documentation quality across various microservices.
This documentation was generated through an iterative AI process, refined by the author for technical accuracy and clarity.