Systemd Services & Monitoring

System Administration

A detailed guide to managing systemd services for White-Label instances, monitoring logs and resource usage, and running maintenance tasks.

Managing Systemd Services

Basic Service Commands

Check Service Status

Check the status and detailed information of an instance

systemctl status ov-panel-instance@<uuid>

Sample output:

● ov-panel-instance@a1b2c3d4-5678-90ab-cdef-1234567890ab.service
     Loaded: loaded (/etc/systemd/system/ov-panel-instance@.service; enabled)
     Active: active (running) since Mon 2025-11-17 10:30:45 UTC; 2h 15min ago
   Main PID: 12345 (python)
      Tasks: 8 (limit: 4915)
     Memory: 85.2M
        CPU: 1min 23.456s
     CGroup: /system.slice/ov-panel-instance@a1b2c3d4.service
             └─12345 /opt/ov-panel/venv/bin/python main.py

Start / Stop / Restart

Control the lifecycle of an instance

Start instance:

systemctl start ov-panel-instance@<uuid>

Stop instance:

systemctl stop ov-panel-instance@<uuid>

Restart instance:

systemctl restart ov-panel-instance@<uuid>

Enable / Disable Auto-Start

Configure an instance to start automatically at boot

Enable auto-start:

systemctl enable ov-panel-instance@<uuid>

The instance will start automatically when the server reboots.

Disable auto-start:

systemctl disable ov-panel-instance@<uuid>

The instance will NOT start automatically on reboot.
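When the commands above are called from scripts, it can help to validate the UUID before it becomes part of a unit name. The following is a minimal sketch; `instance_unit` is a hypothetical helper, not part of OV-Panel, and it assumes instance IDs are canonical 8-4-4-4-12 hexadecimal UUIDs like the ones in the sample output.

```shell
# Hypothetical helper (not part of OV-Panel): print the systemd unit name
# for an instance, rejecting anything that is not a canonical UUID.
instance_unit() {
    # Reject any character other than hex digits and dashes (including newlines).
    case "$1" in
        *[!0-9a-fA-F-]*)
            echo "invalid instance UUID: $1" >&2
            return 1 ;;
    esac
    # Then require the 8-4-4-4-12 layout.
    if ! printf '%s\n' "$1" | grep -Eqx '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}'; then
        echo "invalid instance UUID: $1" >&2
        return 1
    fi
    printf 'ov-panel-instance@%s.service\n' "$1"
}
```

A possible use is `unit=$(instance_unit "$uuid") && systemctl enable --now "$unit"`, where `--now` enables the unit and starts it in one step.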

Batch Management

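Manage multiple instances at once:

Start/Stop All Instances

# Quote the pattern so the shell does not expand it against local files.
# Note: systemctl patterns only match units currently loaded in memory, so
# "start" will not reach instances that are stopped and unloaded.

# Start all instances
systemctl start 'ov-panel-instance@*.service'

# Stop all instances
systemctl stop 'ov-panel-instance@*.service'

# Restart all instances
systemctl restart 'ov-panel-instance@*.service'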
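A glob passed to `systemctl start` only expands to units that systemd currently has in memory, so stopped instances can be missed. One possible workaround is to build the unit names from the instance directories on disk. This sketch assumes every instance has a directory named `instance-<uuid>` under `/opt/ov-panel-instances`, as in the paths used elsewhere in this guide; `list_instance_units` is a hypothetical helper.

```shell
# Hypothetical helper: print one unit name per instance directory on disk.
# The body runs in a subshell so its variables do not leak into the caller.
list_instance_units() (
    base="${1:-/opt/ov-panel-instances}"
    for dir in "$base"/instance-*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        name=$(basename "$dir")        # instance-<uuid>
        printf 'ov-panel-instance@%s.service\n' "${name#instance-}"
    done
)
```

To start every instance found this way, the output can be passed on, for example with `list_instance_units | xargs -r systemctl start`.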

List All Instance Services

systemctl list-units 'ov-panel-instance@*'

Sample output:

UNIT                                                        LOAD   ACTIVE SUB     
ov-panel-instance@a1b2c3d4-5678-90ab-cdef-1234567890ab.service loaded active running
ov-panel-instance@b2c3d4e5-6789-01bc-def0-234567890abc.service loaded active running
ov-panel-instance@c3d4e5f6-7890-12cd-ef01-34567890abcd.service loaded active running
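For scripting, bare UUIDs are often easier to work with than full unit names. The following sketch extracts them from `list-units` output read on standard input; `units_to_uuids` is a hypothetical helper name.

```shell
# Hypothetical helper: read `systemctl list-units` output on stdin and
# print the instance UUID from every line that names an instance unit.
units_to_uuids() {
    sed -n 's/.*ov-panel-instance@\([^ ]*\)\.service.*/\1/p'
}
```

A typical call might be `systemctl list-units 'ov-panel-instance@*' --all --plain --no-legend | units_to_uuids`, where `--all` also includes inactive units that are loaded and `--plain` drops the status marker column.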

Log Management

Journalctl Logs

View Real-time Logs

Follow logs live with journalctl

Follow logs (real-time):

journalctl -u ov-panel-instance@<uuid> -f

Last 100 lines:

journalctl -u ov-panel-instance@<uuid> -n 100

Logs from the last hour:

journalctl -u ov-panel-instance@<uuid> --since "1 hour ago"

Logs for a specific date:

journalctl -u ov-panel-instance@<uuid> \
  --since "2025-11-17" \
  --until "2025-11-18"

File Logs

Each instance writes its logs to its own directory:

File         Description   Command
output.log   Stdout logs   tail -f /opt/ov-panel-instances/instance-<uuid>/logs/output.log
error.log    Stderr logs   tail -f /opt/ov-panel-instances/instance-<uuid>/logs/error.log

Log Commands

# View last 200 lines of output log
tail -n 200 /opt/ov-panel-instances/instance-<uuid>/logs/output.log

# Follow error log
tail -f /opt/ov-panel-instances/instance-<uuid>/logs/error.log

# Search for specific errors
grep -iE "error|exception" /opt/ov-panel-instances/instance-<uuid>/logs/error.log
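When an error log is long, grouping identical lines shows which problems repeat most often. This is a small sketch built only from standard tools; `top_errors` is a hypothetical helper name and the default limit of 10 is arbitrary.

```shell
# Hypothetical helper: print the N most frequent error/exception lines
# in a log file, each prefixed with its number of occurrences.
top_errors() (
    logfile="$1"
    limit="${2:-10}"
    grep -iE 'error|exception' "$logfile" | sort | uniq -c | sort -rn | head -n "$limit"
)
```

For example: `top_errors /opt/ov-panel-instances/instance-<uuid>/logs/error.log 5`. If every log line begins with a timestamp, identical messages will not group together until the timestamp is stripped first.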

Resource Monitoring

CPU & Memory Usage

Check the resource usage of instances

A single instance:

systemctl status ov-panel-instance@<uuid> | grep -E 'Memory|CPU'

# Output:
     Memory: 85.2M
        CPU: 1min 23.456s

All instances:

for svc in $(systemctl list-units 'ov-panel-instance@*' --plain --no-legend | awk '{print $1}'); do
    echo "=== $svc ==="
    systemctl status "$svc" | grep -E 'Memory|CPU'
done
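The `systemctl status` output is meant for people. For scripts, `systemctl show` exposes the same counters as raw properties, for example `MemoryCurrent` in bytes. The sketch below converts such a value to MiB; `bytes_to_mib` is a hypothetical helper, and it prints `n/a` for non-numeric values, since some systemd versions report a placeholder such as `[not set]` when memory accounting is unavailable.

```shell
# Hypothetical helper: convert a byte count to MiB with one decimal place.
# Prints "n/a" when the value is empty or not a plain number.
bytes_to_mib() {
    case "$1" in
        ''|*[!0-9]*) echo "n/a" ;;
        *) awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }' ;;
    esac
}
```

A possible call is `bytes_to_mib "$(systemctl show -p MemoryCurrent --value ov-panel-instance@<uuid>)"`.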

Disk Usage

Check the disk usage of instances

# Instance directories
du -sh /opt/ov-panel-instances/instance-*

# Database sizes
du -sh /opt/ov-panel-instances/instance-*/data/ov-panel.db

# Sample output:
150M    /opt/ov-panel-instances/instance-a1b2c3d4
230M    /opt/ov-panel-instances/instance-b2c3d4e5
180M    /opt/ov-panel-instances/instance-c3d4e5f6
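To spot instances that are growing unusually large, the `du` output can be filtered against a size limit. This is only a sketch: `large_instances` is a hypothetical helper, and the 500 MiB default is an illustrative value, not an OV-Panel setting.

```shell
# Hypothetical helper: list instance directories whose total size exceeds
# a limit given in MiB (second argument, default 500).
large_instances() (
    base="${1:-/opt/ov-panel-instances}"
    limit_kb=$(( ${2:-500} * 1024 ))
    for dir in "$base"/instance-*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        du -sk "$dir"                  # size in KiB, then the path
    done | awk -v limit="$limit_kb" '$1 > limit { printf "%d MiB\t%s\n", $1 / 1024, $2 }'
)
```

For example, `large_instances /opt/ov-panel-instances 200` would print only the directories larger than 200 MiB.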

Automated Monitoring Scripts

Health Check Script

A script that checks instances automatically and alerts when something is wrong

health-check-instances.sh
#!/bin/bash
# /opt/scripts/health-check-instances.sh

# Get all instances from the database.
# Note: this snippet must run where the `backend` package is importable
# (for example, from the OV-Panel project root or with PYTHONPATH set).
instances=$(python3 -c "
from backend.db.engine import sessionLocal
from backend.db.models import WhiteLabelInstance
db = sessionLocal()
try:
    for i in db.query(WhiteLabelInstance).all():
        print(f'{i.instance_id}:{i.port}')
finally:
    db.close()
")

for entry in $instances; do
    IFS=':' read -r uuid port <<< "$entry"

    # HTTP health check; give up after 5 seconds so a hung instance cannot stall the loop
    if curl -sf --max-time 5 "http://localhost:$port/api/health" > /dev/null; then
        echo "✓ Instance $uuid (port $port) is healthy"
    else
        echo "✗ Instance $uuid (port $port) is DOWN"

        # Send alert (email, Slack, etc.)
        # mail -s "Alert: Instance $uuid DOWN" admin@example.com

        # Auto restart (optional)
        # systemctl restart "ov-panel-instance@$uuid"
    fi
done

Setup Cron Job:

# Run the health check every 5 minutes
crontab -e

# Add this line:
*/5 * * * * /opt/scripts/health-check-instances.sh >> /var/log/instance-health.log 2>&1
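A single failed request does not always mean an instance is down, so it may be worth retrying a probe a few times before alerting or restarting. The following is a generic sketch; `retry` is a hypothetical helper, and the attempt count and delay in the usage example are arbitrary.

```shell
# Hypothetical helper: run a command up to ATTEMPTS times, sleeping DELAY
# seconds between tries. Succeeds as soon as one attempt succeeds.
# Usage: retry ATTEMPTS DELAY command [args...]
retry() (
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while ! "$@"; do
        [ "$n" -ge "$attempts" ] && return 1   # out of attempts
        n=$((n + 1))
        sleep "$delay"
    done
)
```

Inside a health check loop, the probe could then be written as `retry 3 2 curl -sf --max-time 5 "http://localhost:$port/api/health" > /dev/null`, which reports failure only after three unsuccessful attempts.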

Error Detection Script

Automatically detect errors in the logs

check-instance-errors.sh
#!/bin/bash
# /opt/scripts/check-instance-errors.sh

for logfile in /opt/ov-panel-instances/instance-*/logs/error.log; do
    [ -f "$logfile" ] || continue
    uuid=$(echo "$logfile" | grep -oP 'instance-\K[^/]+')

    # Only inspect logs modified within the last hour.
    # This counts every matching line in the file, not just those from the last hour.
    errors=$(find "$logfile" -mmin -60 -exec grep -iE "error|exception|critical" {} \; | wc -l)

    if [ "$errors" -gt 10 ]; then
        echo "WARNING: Instance $uuid has $errors error lines in a log modified within the last hour"
        
        # Send alert
        # Example: Slack webhook
        # curl -X POST -H 'Content-type: application/json' \
        #   --data "{\"text\":\"Instance $uuid has $errors errors\"}" \
        #   https://hooks.slack.com/services/YOUR/WEBHOOK/URL
    fi
done
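A file's modification time says little about when each individual line was written. If the instance's error output also reaches the systemd journal (an assumption; it depends on how the service unit routes stderr), the journal's own time filter can bound the count to the last hour. `count_error_lines` below is a hypothetical helper that counts matching lines on standard input.

```shell
# Hypothetical helper: count lines on stdin that mention an error,
# exception, or critical condition. Always exits 0, even when the count is 0.
count_error_lines() {
    grep -ciE 'error|exception|critical' || true
}
```

It could be combined with journalctl as `journalctl -u "ov-panel-instance@$uuid" --since "1 hour ago" --no-pager -o cat | count_error_lines`.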

Backup and Recovery

Automated Backup Script

Back up databases on a schedule

backup-instances.sh
#!/bin/bash
# /opt/scripts/backup-instances.sh

BACKUP_DIR="/opt/backups/instances"
DATE=$(date +%Y%m%d_%H%M%S)

mkdir -p "$BACKUP_DIR"

# Back up each instance database
for instance in /opt/ov-panel-instances/instance-*/; do
    uuid=$(basename "$instance" | sed 's/instance-//')
    db_file="${instance}data/ov-panel.db"
    
    if [ -f "$db_file" ]; then
        cp "$db_file" "$BACKUP_DIR/${uuid}_${DATE}.db"
        echo "Backed up instance $uuid"
    fi
done

# Cleanup backups older than 30 days
find "$BACKUP_DIR" -name "*.db" -mtime +30 -delete

echo "Backup completed at $(date)"

Setup Cron Job (Daily 2AM):

0 2 * * * /opt/scripts/backup-instances.sh >> /var/log/instance-backup.log 2>&1
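For the recovery side, a restore could look roughly like the sketch below. Both functions are hypothetical and not part of OV-Panel. They rely on the `<uuid>_<YYYYMMDD_HHMMSS>.db` naming produced by backup-instances.sh, whose timestamps sort chronologically as text, and on the database path used in that script.

```shell
# Hypothetical helper: print the newest backup file for an instance.
# Shell globs expand in sorted order, so the last match is the most recent.
latest_backup() (
    latest=""
    for f in "${2:-/opt/backups/instances}/${1}_"*.db; do
        [ -f "$f" ] && latest="$f"
    done
    [ -n "$latest" ] && printf '%s\n' "$latest"
)

# Hypothetical restore procedure: stop the instance, keep a copy of the
# current database, put the backup in place, then start the instance again.
restore_instance() (
    uuid="$1"
    backup=$(latest_backup "$uuid" "${2:-/opt/backups/instances}") || {
        echo "no backup found for $uuid" >&2
        return 1
    }
    db="/opt/ov-panel-instances/instance-$uuid/data/ov-panel.db"
    systemctl stop "ov-panel-instance@$uuid" &&
        cp -p "$db" "$db.before-restore" &&
        cp -p "$backup" "$db" &&
        systemctl start "ov-panel-instance@$uuid"
)
```

Calling `restore_instance <uuid>` would restore the newest backup for that instance. If any step fails, the chain stops and the instance stays stopped, so the state can be inspected before anything else is changed.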

Maintenance Tasks

Task              Frequency         Command
Health Check      Every 5 minutes   /opt/scripts/health-check-instances.sh
Database Backup   Daily             /opt/scripts/backup-instances.sh
Log Cleanup       Weekly            find /opt/ov-panel-instances/*/logs/ -type f -mtime +30 -delete
System Updates    Weekly            apt update && apt upgrade
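Before running a cleanup that deletes files, it can be useful to list what would be removed. The sketch below performs the same search as the log cleanup task but only prints the matches; `old_logs` is a hypothetical helper, and the 30-day default mirrors the table above.

```shell
# Hypothetical helper: list log files older than N days (default 30)
# without deleting anything.
old_logs() (
    base="${1:-/opt/ov-panel-instances}"
    days="${2:-30}"
    find "$base"/*/logs -type f -mtime +"$days" -print 2>/dev/null
)
```

Running `old_logs` with no arguments previews the weekly cleanup. Once the list looks right, the same `find` expression can be run with `-delete` in place of `-print`.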