Provisioning and Managing AWS EKS Cluster Using Terraform: Integrated Monitoring with Prometheus and Grafana

For detailed steps and configurations, follow along with the full video here.

GitHub Repository

https://github.com/0xp4ck3t/eks-retail-app-monitoring

You can download the files and test the setup with the following commands:

git clone https://github.com/0xp4ck3t/eks-retail-app-monitoring.git
cd eks-retail-app-monitoring
terraform init
terraform apply 

Use kubectl to run the application:

kubectl apply -f https://raw.githubusercontent.com/aws-containers/retail-store-sample-app/main/dist/kubernetes/deploy.yaml

Get the URL for the frontend load balancer like so:

kubectl get svc ui
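If you would rather script this step, the hostname can be read from the Service's JSON output. The following is a minimal Python sketch, not part of the repository: it assumes kubectl is on the PATH and already points at the new cluster, and the helper names lb_hostname and fetch_service are made up for illustration.

```python
import json
import subprocess
from typing import Optional


def lb_hostname(service: dict) -> Optional[str]:
    """Return the first load balancer hostname (or IP) of a Service, if any."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    if not ingress:
        # The cloud provider has not finished provisioning the load balancer.
        return None
    first = ingress[0]
    return first.get("hostname") or first.get("ip")


def fetch_service(name: str) -> dict:
    """Run `kubectl get svc <name> -o json` and parse the output."""
    result = subprocess.run(
        ["kubectl", "get", "svc", name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

Calling lb_hostname(fetch_service("ui")) returns None until the load balancer has been assigned an address, so a script can poll until a hostname appears.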

Install Prometheus and Grafana on the cluster with Helm:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

Create custom-values.yaml:

prometheus:
  service:
    type: LoadBalancer
grafana:
  service:
    type: LoadBalancer

Then install the chart with these values:

helm upgrade --install kube-prometheus-stack prometheus-community/kube-prometheus-stack -f custom-values.yaml
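After the install, Grafana will ask for a login. Kubernetes stores Secret values base64-encoded, so they have to be decoded before use. The sketch below shows only that decoding step. It assumes the Grafana admin password is stored in a Secret named kube-prometheus-stack-grafana under the key admin-password; both names are assumptions based on the release name used above, so confirm them with kubectl get secrets. The helper name is hypothetical.

```python
import base64
import json


def decode_secret_field(secret: dict, key: str) -> str:
    """Decode one base64-encoded field from a Kubernetes Secret manifest."""
    encoded = secret["data"][key]
    return base64.b64decode(encoded).decode("utf-8")


def decode_from_json(secret_json: str, key: str) -> str:
    """Same as above, starting from the text printed by `kubectl ... -o json`."""
    return decode_secret_field(json.loads(secret_json), key)
```

For example, the output of kubectl get secret kube-prometheus-stack-grafana -o json (assumed name) could be passed to decode_from_json with the key "admin-password".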

Prometheus Queries: Paste these queries into Grafana panels to visualize the metrics.

Uptime
time() - process_start_time_seconds{job="catalog"}

Average Request Duration
sum(rate(gin_request_duration_seconds_sum[5m])) / sum(rate(gin_request_duration_seconds_count[5m]))

Total Requests
sum(increase(gin_requests_total{job="catalog", url!="/health"}[$__range]))

Requests by Status
sum(rate(gin_requests_total{job="catalog"}[15m])) by (code)

Error Rate
sum(increase(gin_requests_total{job="catalog", code="404", url!="/health"}[15m])) / sum(increase(gin_requests_total{job="catalog", url!="/health"}[15m]))
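To sanity-check a query before pasting it into a panel, it can also be sent to the Prometheus HTTP API (the /api/v1/query endpoint). The following is a rough Python sketch using only the standard library. The base URL is a placeholder for your Prometheus load balancer address, and the function names are hypothetical. Note that $__range is a Grafana variable, so replace it with a concrete range such as 1h when querying Prometheus directly.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Tuple


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    query = urllib.parse.urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query}"


def parse_vector(payload: dict) -> List[Tuple[dict, float]]:
    """Turn an instant-vector response into (labels, value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return [
        (sample["metric"], float(sample["value"][1]))
        for sample in payload["data"]["result"]
    ]


def run_query(base_url: str, promql: str) -> List[Tuple[dict, float]]:
    """Send the query and parse the result (requires network access)."""
    with urllib.request.urlopen(instant_query_url(base_url, promql), timeout=10) as resp:
        return parse_vector(json.load(resp))
```

For instance, run_query("http://<prometheus-host>:9090", 'time() - process_start_time_seconds{job="catalog"}') would return one (labels, seconds) pair per catalog instance, assuming Prometheus is reachable at that placeholder address.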

Architecture

1. VPC (Virtual Private Cloud)

The VPC is the core networking layer for this infrastructure, configured with the CIDR block 10.10.0.0/16. It provides isolated networking for all resources in the Amazon EKS cluster.
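To make the address plan concrete, the sketch below uses Python's standard ipaddress module to carve 10.10.0.0/16 into smaller blocks. The /24 subnet size, the count of four, and the public/private labels are assumptions for illustration only; the Terraform code in the repository defines the actual subnets.

```python
import ipaddress

VPC_CIDR = ipaddress.ip_network("10.10.0.0/16")

# AWS reserves 5 addresses in every subnet (the first four and the last one).
AWS_RESERVED_PER_SUBNET = 5


def carve(vpc, new_prefix, count):
    """Return the first `count` subnets of size /new_prefix inside the VPC."""
    subnets = vpc.subnets(new_prefix=new_prefix)
    return [next(subnets) for _ in range(count)]


blocks = carve(VPC_CIDR, 24, 4)

# Hypothetical split: two public and two private subnets.
plan = {"public": blocks[:2], "private": blocks[2:]}

for tier, nets in plan.items():
    for net in nets:
        usable = net.num_addresses - AWS_RESERVED_PER_SUBNET
        print(f"{tier:8} {net}  usable addresses: {usable}")
```

With these assumed sizes, each /24 offers 251 usable addresses, and the /16 leaves room for 256 such subnets.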


2. Subnets


3. EKS Control Plane

The EKS Control Plane is fully managed by AWS, handling cluster operations such as API server management, health monitoring, and scheduling.


4. Worker Nodes


5. Kubernetes Load Balancer

Deployed in the public subnet, the load balancer (provisioned by a Kubernetes Service of type LoadBalancer) exposes applications running in the EKS cluster to external users. It distributes incoming traffic across the worker nodes.


6. NAT Gateway

The NAT Gateway, located in the public subnet, allows resources in private subnets (like worker nodes) to access the internet for tasks such as pulling container images, while blocking unsolicited inbound traffic.


7. Route Tables


8. Internet Gateway

The Internet Gateway enables outbound and inbound communication for resources in the public subnet, such as the Load Balancer.
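The difference between the NAT Gateway and the Internet Gateway comes down to where each subnet's default route points. The sketch below models that with longest-prefix matching, which is how the most specific route is chosen. The route table contents and target labels here are hypothetical and only illustrate the idea; they are not taken from the repository.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical route tables: destination CIDR -> target label.
PUBLIC_ROUTES = {"10.10.0.0/16": "local", "0.0.0.0/0": "internet-gateway"}
PRIVATE_ROUTES = {"10.10.0.0/16": "local", "0.0.0.0/0": "nat-gateway"}


def next_hop(routes: Dict[str, str], destination: str) -> Optional[str]:
    """Pick the target of the most specific route that matches the destination."""
    addr = ipaddress.ip_address(destination)
    matches = []
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if addr in network:
            matches.append((network.prefixlen, target))
    if not matches:
        return None
    # The longest prefix (largest prefixlen) wins.
    return max(matches)[1]
```

Under these assumptions, next_hop(PRIVATE_ROUTES, "10.10.3.7") stays "local", while an internet address such as "52.94.0.10" resolves to "nat-gateway" from a private subnet and to "internet-gateway" from a public one.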


9. Security Groups

Security groups control inbound and outbound traffic between components.


10. Monitoring: Prometheus and Grafana

Prometheus and Grafana are deployed as Kubernetes pods to monitor both cluster and application performance.

Architecture Flow


Traffic Flow
