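Kubernetes Full-Stack Practice: The Complete Loop of Development, Deployment, Operations, and Monitoring

Want to get good at Kubernetes? This hands-on guide takes you from code to monitoring in one pass. It is all practical material with the code up front, so following along step by step covers the whole workflow.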

1. Development: Building a Cloud-Native Application

1.1 Create a Simple Web Application

# app.py
from flask import Flask
import os

app = Flask(__name__)

@app.route('/')
def hello():
    return f"Hello Kubernetes! Host: {os.environ.get('HOSTNAME', 'unknown')}"

@app.route('/health')
def health():
    return {"status": "healthy"}, 200

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
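
The /health route above is the kind of endpoint an HTTP liveness probe calls. As a rough sketch of what such a probe checks, the standard-library function below treats any status from 200 to 399 as healthy and everything else (including timeouts and refused connections) as a failure. The function name `probe` and the one-second default timeout are choices made for this illustration; it approximates the kubelet's behavior and is not its implementation.

```python
import urllib.error
import urllib.request


def probe(url: str, timeout: float = 1.0) -> bool:
    """Return True if an HTTP GET to `url` answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # HTTP 4xx/5xx, refused connections, and timeouts all count as failures.
        return False


# Example (with app.py running locally):
#   probe("http://localhost:5000/health")
```

Running this against the local Flask server before building the image is a quick way to confirm the endpoint behaves as the probe expects.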

1.2 Containerize the Application

# Dockerfile
FROM python:3.9-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY app.py .
EXPOSE 5000

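CMD ["python", "app.py"]

# requirements.txt
flask==2.3.3

Build commands:

docker build -t my-app:v1.0 .
  • Explanation: builds the Docker image and tags it my-app:v1.0
docker tag my-app:v1.0 registry.example.com/my-app:v1.0
docker push registry.example.com/my-app:v1.0
  • Explanation: tags the image for the registry and pushes it so the Kubernetes cluster can pull it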

2. Deployment: Kubernetes in Practice

2.1 Basic Deployment Configuration

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: registry.example.com/my-app:v1.0
        ports:
        - containerPort: 5000
        env:
        - name: ENVIRONMENT
          value: "production"
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
        livenessProbe:
          httpGet:
            path: /health
            port: 5000
          initialDelaySeconds: 30
          periodSeconds: 10
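
To make the requests and limits in this manifest concrete, here is a small sketch that converts the quantity strings into plain numbers and totals what three replicas reserve on the cluster. The helper names `parse_cpu` and `parse_memory` are made up for this example, and only a handful of common suffixes are handled; it is not a full parser for Kubernetes quantities.

```python
# Sketch: convert Kubernetes resource quantity strings to plain numbers.
# Helper names are hypothetical; only a subset of suffixes is supported.

_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}
_DECIMAL_SUFFIXES = {"k": 10**3, "M": 10**6, "G": 10**9}


def parse_cpu(quantity: str) -> float:
    """Return CPU cores, e.g. "100m" (millicores) -> 0.1, "2" -> 2.0."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def parse_memory(quantity: str) -> int:
    """Return bytes, e.g. "128Mi" -> 134217728."""
    for suffix, factor in {**_BINARY_SUFFIXES, **_DECIMAL_SUFFIXES}.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


# Values taken from deployment.yaml: 3 replicas, each requesting 100m CPU and 128Mi.
replicas = 3
total_cpu_request = replicas * parse_cpu("100m")
total_mem_request = replicas * parse_memory("128Mi")
print(f"CPU reserved: {total_cpu_request:.1f} cores")           # CPU reserved: 0.3 cores
print(f"Memory reserved: {total_mem_request // 2**20} Mi")      # Memory reserved: 384 Mi
```

The scheduler places Pods based on the requests, while the limits (200m and 256Mi here) cap what each container may use at runtime.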

2.2 Exposing the Service

# service.yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
  - name: http
    port: 80
    targetPort: 5000
  type: LoadBalancer

2.3 Configuration Management

# configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  app.conf: |
    log_level: INFO
    timeout: 30

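Deployment commands:

kubectl apply -f configmap.yaml
  • Explanation: creates the ConfigMap (the Deployment above does not mount it yet)
kubectl apply -f deployment.yaml
  • Explanation: creates the Deployment that runs the application
kubectl apply -f service.yaml
  • Explanation: creates the Service that exposes the application
kubectl get pods -l app=my-app
  • Explanation: checks that the Pods are running
kubectl get svc my-app-service
  • Explanation: shows the Service's access address (EXTERNAL-IP can stay <pending> until the load balancer is provisioned)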

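3. Operations: Day-to-Day Tasks

3.1 Scaling the Application

# Manual scaling
kubectl scale deployment my-app --replicas=5
  • Explanation: scales the Deployment to 5 replicas
# Autoscaling
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=80
  • Explanation: creates an HPA that keeps between 2 and 10 replicas and scales out when average CPU utilization exceeds 80% of the requested CPU (requires metrics-server in the cluster)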
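
How the autoscaler turns a CPU reading into a replica count can be sketched with the formula from the Kubernetes documentation: desired = ceil(current replicas × current metric / target metric), clamped to the configured minimum and maximum. The function below is a simplified illustration using the values from the command above; the function name is made up here, and the real controller also handles unready Pods, missing metrics, and stabilization windows.

```python
import math


def desired_replicas(current_replicas: int,
                     current_cpu_percent: float,
                     target_cpu_percent: float = 80,
                     min_replicas: int = 2,
                     max_replicas: int = 10,
                     tolerance: float = 0.1) -> int:
    """Simplified HPA calculation for a CPU utilization target."""
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Never go below --min or above --max.
    return max(min_replicas, min(max_replicas, desired))


print(desired_replicas(3, 160))  # 6  (utilization is twice the target)
print(desired_replicas(3, 84))   # 3  (within the 10% tolerance)
print(desired_replicas(5, 20))   # 2  (scales in, but not below --min)
print(desired_replicas(8, 200))  # 10 (capped at --max)
```

The utilization percentage is measured against each container's CPU request, so with the 100m request in deployment.yaml an 80% target corresponds to an average of 80m per Pod.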

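3.2 Rolling Updates and Rollbacks

# Update the image version
kubectl set image deployment/my-app my-app=registry.example.com/my-app:v2.0
  • Explanation: starts a rolling update to v2.0
# Check the update status
kubectl rollout status deployment/my-app
  • Explanation: follows the progress of the rolling update
# Roll back to the previous version
kubectl rollout undo deployment/my-app
  • Explanation: quickly reverts the Deployment to its previous revision
# View the deployment history
kubectl rollout history deployment/my-app
  • Explanation: lists the recorded revisions of the Deployment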
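
deployment.yaml does not set an update strategy, so the rolling update uses the Kubernetes defaults of maxSurge 25% and maxUnavailable 25%. The sketch below works out what those percentages mean for a given replica count: maxSurge is rounded up and maxUnavailable is rounded down. The function name `rollout_bounds` is made up for this example.

```python
import math


def rollout_bounds(replicas: int,
                   max_surge_percent: int = 25,
                   max_unavailable_percent: int = 25) -> tuple:
    """Return (max total Pods, min available Pods) during a rolling update."""
    surge = math.ceil(replicas * max_surge_percent / 100)               # rounds up
    unavailable = math.floor(replicas * max_unavailable_percent / 100)  # rounds down
    return replicas + surge, replicas - unavailable


print(rollout_bounds(3))   # (4, 3)
print(rollout_bounds(5))   # (7, 4)
print(rollout_bounds(10))  # (13, 8)
```

With the 3 replicas from deployment.yaml, at most 4 Pods exist at once and all 3 stay available, so the update replaces one Pod at a time.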

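3.3 Troubleshooting

# Inspect Pod details
kubectl describe pod my-app-xxxxx
  • Explanation: shows the Pod's detailed state and recent events
# View container logs
kubectl logs -f my-app-xxxxx
  • Explanation: streams the application logs in real time
# Open a shell in the container
kubectl exec -it my-app-xxxxx -- /bin/bash
  • Explanation: enters the container to investigate from the inside (use /bin/sh if the image has no bash)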

4. Monitoring: A Complete Monitoring Setup

4.1 Deploy Prometheus

# prometheus-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v2.45.0
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config
          mountPath: /etc/prometheus/
      volumes:
      - name: config
        configMap:
          name: prometheus-config

# prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
    - job_name: 'my-app'
      static_configs:
      - targets: ['my-app-service:80']

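4.2 Exposing Application Metrics

# Additions to app.py (also add prometheus-client to requirements.txt)
from prometheus_client import Counter, generate_latest, CONTENT_TYPE_LATEST

# Counts requests served by the index route
request_count = Counter('http_requests_total', 'Total HTTP Requests')

@app.route('/metrics')
def metrics():
    # Return all registered metrics in the Prometheus text format
    return generate_latest(), 200, {'Content-Type': CONTENT_TYPE_LATEST}

# Replaces the original hello() so that each request increments the counter
@app.route('/')
def hello():
    request_count.inc()
    return f"Hello Kubernetes! Host: {os.environ.get('HOSTNAME', 'unknown')}"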
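
For readers who have not seen what Prometheus scrapes from /metrics, the sketch below renders a single counter by hand in the Prometheus text exposition format. It only illustrates the format: `render_counter` is a made-up helper, and the output of `generate_latest()` contains additional series beyond this one.

```python
def render_counter(name: str, help_text: str, value: float) -> str:
    """Render one counter in the Prometheus text exposition format."""
    return (
        f"# HELP {name} {help_text}\n"   # human-readable description
        f"# TYPE {name} counter\n"       # metric type declaration
        f"{name} {float(value)}\n"       # sample line: name and value
    )


print(render_counter("http_requests_total", "Total HTTP Requests", 3), end="")
# # HELP http_requests_total Total HTTP Requests
# # TYPE http_requests_total counter
# http_requests_total 3.0
```

Because each Pod keeps its own counter in memory, scraping through my-app-service returns the value from whichever Pod answers that request.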

4.3 Deploy Grafana for Visualization

# grafana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:10.0.0
        ports:
        - containerPort: 3000
        env:
        # Demo value only; outside a test cluster, load the password from a Secret
        - name: GF_SECURITY_ADMIN_PASSWORD
          value: "admin123"

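4.4 Monitoring Commands in Practice

# Deploy the monitoring components
kubectl apply -f prometheus-config.yaml
kubectl apply -f prometheus-deployment.yaml
kubectl apply -f grafana-deployment.yaml
  • Explanation: deploys the monitoring stack (Prometheus and Grafana)
# Access through port forwarding
kubectl port-forward deployment/prometheus 9090:9090
  • Explanation: makes the Prometheus UI available at http://localhost:9090 (this forwards to the Deployment, since no Service is defined for Prometheus here)
# Check resource usage
kubectl top pods -l app=my-app
  • Explanation: shows current CPU and memory usage per Pod (requires metrics-server)
# Check cluster status
kubectl get nodes -o wide
  • Explanation: checks the health of the nodes
# Watch events
kubectl get events --sort-by='.lastTimestamp'
  • Explanation: lists cluster events with the most recent last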

5. Complete Example: Deploying an E-commerce Application

5.1 Database Deployment

# mysql-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
      - name: mysql
        image: mysql:8.0
        env:
        - name: MYSQL_ROOT_PASSWORD
          value: "password"
        - name: MYSQL_DATABASE
          value: "ecommerce"
        ports:
        - containerPort: 3306
---
apiVersion: v1
kind: Service
metadata:
  name: mysql-service
spec:
  selector:
    app: mysql
  ports:
  - port: 3306
    targetPort: 3306

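5.2 Complete Deployment Script

#!/bin/bash
# deploy.sh - one-step deployment script
set -euo pipefail  # stop at the first failed command

echo "Starting e-commerce application deployment..."

# Build and push the image (tagged with the registry path so the push can find it)
docker build -t registry.example.com/ecommerce-app:v1.0 .
docker push registry.example.com/ecommerce-app:v1.0

# Deploy the database
kubectl apply -f mysql-deployment.yaml

# Wait for the database rollout to finish (also works before the Pod exists)
kubectl rollout status deployment/mysql --timeout=300s

# Deploy the application (deployment.yaml must reference the image pushed above)
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Deploy monitoring
kubectl apply -f prometheus-config.yaml
kubectl apply -f prometheus-deployment.yaml
kubectl apply -f grafana-deployment.yaml

echo "Deployment complete!"
echo "Application address: $(kubectl get svc my-app-service -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"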
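
The final echo in the script can print an empty address, because a LoadBalancer Service usually takes a while to receive its external IP, and some providers report a hostname instead of an IP. A minimal sketch of a retry loop follows; `wait_for_address` and `kubectl_lb_address` are hypothetical helper names, and the attempt count and delay are arbitrary defaults.

```python
import subprocess
import time
from typing import Callable, Optional


def wait_for_address(fetch: Callable[[], str],
                     attempts: int = 30,
                     delay_seconds: float = 10.0,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Call `fetch` until it returns a non-empty address or attempts run out."""
    for attempt in range(attempts):
        address = fetch().strip()
        if address:
            return address
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return None


def kubectl_lb_address(service: str = "my-app-service") -> str:
    """Read the Service's external IP or hostname; empty string if not assigned."""
    jsonpath = ("{.status.loadBalancer.ingress[0].ip}"
                "{.status.loadBalancer.ingress[0].hostname}")
    result = subprocess.run(
        ["kubectl", "get", "svc", service, "-o", f"jsonpath={jsonpath}"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


# Example: address = wait_for_address(kubectl_lb_address)
```

Passing the fetch function in as a parameter keeps the retry logic testable without a cluster.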

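6. Quick Failure Recovery Script

#!/bin/bash
# quick-recovery.sh

# Check application status
echo "Checking application status..."
kubectl get pods -l app=my-app

# Delete Pods that are not Running; the Deployment recreates them.
# Pods in CrashLoopBackOff still report phase Running, so this filter skips them.
echo "Restarting unhealthy Pods..."
kubectl get pods -l app=my-app --field-selector=status.phase!=Running -o name | xargs -r kubectl delete

# Check resource usage
echo "Checking resource usage..."
kubectl top nodes
kubectl top pods -l app=my-app

# Check recent events
echo "Recent events..."
kubectl get events --sort-by='.lastTimestamp' | tail -10

echo "Recovery steps complete"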
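
The `status.phase!=Running` filter in the script does not catch Pods stuck in CrashLoopBackOff, since their phase is still Running. A more thorough check can look at the container statuses in the output of `kubectl get pods -l app=my-app -o json`. The sketch below shows one way to do that; the function name, the restart threshold, and the sample data are all made up for illustration.

```python
def pods_needing_attention(pod_list: dict, max_restarts: int = 5) -> list:
    """Return (pod name, reason) pairs for Pods that look unhealthy."""
    flagged = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        if phase != "Running":
            flagged.append((name, f"phase={phase}"))
            continue
        for container in status.get("containerStatuses", []):
            waiting = container.get("state", {}).get("waiting")
            if waiting:
                # e.g. CrashLoopBackOff or ImagePullBackOff
                flagged.append((name, waiting.get("reason", "Waiting")))
                break
            if container.get("restartCount", 0) > max_restarts:
                flagged.append((name, f"restarts={container['restartCount']}"))
                break
    return flagged


# Illustrative sample shaped like `kubectl get pods -o json` output (not real data).
sample = {"items": [
    {"metadata": {"name": "my-app-a"},
     "status": {"phase": "Running", "containerStatuses": [
         {"ready": True, "restartCount": 0, "state": {"running": {}}}]}},
    {"metadata": {"name": "my-app-b"},
     "status": {"phase": "Running", "containerStatuses": [
         {"ready": False, "restartCount": 7,
          "state": {"waiting": {"reason": "CrashLoopBackOff"}}}]}},
    {"metadata": {"name": "my-app-c"}, "status": {"phase": "Pending"}},
]}
print(pods_needing_attention(sample))
# [('my-app-b', 'CrashLoopBackOff'), ('my-app-c', 'phase=Pending')]
```

Deleting a crash-looping Pod rarely fixes the underlying cause, so the reasons reported here are best followed up with `kubectl describe pod` and `kubectl logs`.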

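Summary

By working through this hands-on guide, you have covered:

  • ✅ Containerizing an application and managing images
  • ✅ Deploying to Kubernetes and exposing a Service
  • ✅ Automating operations and failure recovery
  • ✅ Setting up a monitoring stack

Key points:

  • Every command has a clear purpose, and together they cover most day-to-day operations
  • Monitoring is what keeps a system running reliably, so always deploy it
  • Scripted operations are faster and reduce human error

Start practicing now and build a complete Kubernetes application stack in about 30 minutes. Run into a problem? Check the logs and events to locate and fix it quickly.

Next steps:

  • Add a CI/CD pipeline to automate deployment
  • Manage configuration across multiple environments
  • Set up network policies and security policies

Remember: real understanding comes from practice, and hands-on work is the only way to truly master Kubernetes!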
