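# API Design Document

## Employee Absenteeism Analysis and Prediction System Based on Multi-dimensional Feature Mining

**Document version**: V1.0
**Date**: March 2026
**Author**: 张硕

---

## 1. Overview

### 1.1 API Conventions

The system exposes a RESTful API. All endpoints follow these conventions:

| Item | Convention |
|------|------------|
| Protocol | HTTP/HTTPS |
| Data format | JSON |
| Character encoding | UTF-8 |
| Time format | ISO 8601 style (`YYYY-MM-DD HH:mm:ss`; strict ISO 8601 separates date and time with `T` rather than a space) |

### 1.2 Base Path

| Environment | Base path |
|-------------|-----------|
| Development | http://localhost:5000/api |
| Production | http://your-domain/api |

Endpoint paths in this document are written in full and already include the `/api` prefix. For example, `GET /api/overview/stats` corresponds to `http://localhost:5000/api/overview/stats` in the development environment.
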
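Because the base path ends in `/api` and the endpoint paths in this document also start with `/api`, a client has to avoid doubling the prefix when it joins the two. The following is a minimal sketch of one way to do that with the Python standard library. The helper name `build_url` and the constant `DEV_BASE` are illustrative assumptions, not part of the system.

```python
from typing import Optional
from urllib.parse import urlencode, urlsplit, urlunsplit

# Development base path taken from the table in section 1.2.
DEV_BASE = "http://localhost:5000/api"


def build_url(base: str, endpoint: str, params: Optional[dict] = None) -> str:
    """Join a base path and an endpoint path without duplicating the prefix.

    Hypothetical helper: accepts endpoints written either as
    "/api/overview/stats" (as in this document) or as "overview/stats".
    """
    parts = urlsplit(base)
    prefix = parts.path.rstrip("/")
    if prefix and (endpoint == prefix or endpoint.startswith(prefix + "/")):
        # The endpoint already carries the base prefix, so use it unchanged.
        path = endpoint
    else:
        path = prefix + "/" + endpoint.lstrip("/")
    query = urlencode(params or {})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))
```

For instance, `build_url(DEV_BASE, "/api/analysis/compare", {"dimension": "drinker"})` yields a URL whose query string carries the `dimension` parameter described in section 3.3.
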
### 1.3 Response Format

#### Success response

```json
{
  "code": 200,
  "message": "success",
  "data": {
    // endpoint-specific payload
  }
}
```

#### Error response

```json
{
  "code": 400,
  "message": "error description",
  "data": null
}
```

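The envelope above can be produced and consumed with a few small helpers. This is a sketch under stated assumptions: the function names `success`, `error`, and `unwrap` are hypothetical, and treating any `code` other than 200 as a failure is an assumption drawn from the two examples rather than a rule the document states.

```python
import json
from typing import Any


def success(data: Any) -> dict:
    """Build the success envelope: code 200, message "success", plus the payload."""
    return {"code": 200, "message": "success", "data": data}


def error(code: int, message: str) -> dict:
    """Build the error envelope: the given code and message, with data set to null."""
    return {"code": code, "message": message, "data": None}


def unwrap(body: str) -> Any:
    """Client side: parse a JSON response body and return its data field.

    Assumption: any code other than 200 is treated as a failure.
    """
    envelope = json.loads(body)
    if envelope.get("code") != 200:
        raise RuntimeError(
            "API error {}: {}".format(envelope.get("code"), envelope.get("message"))
        )
    return envelope["data"]
```

Keeping the envelope in one place means every endpoint returns the same three keys, so the front end can share a single response handler.
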
### 1.4 Status Codes

| Status code | Meaning |
|-------------|---------|
| 200 | Request succeeded |
| 400 | Invalid request parameters |
| 404 | Resource not found |
| 500 | Internal server error |

---

## 2. Data Overview Module

### 2.1 Get Basic Statistics

**Endpoint**: `GET /api/overview/stats`

**Description**: Returns the basic statistical indicators of the dataset.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "total_records": 740,
    "total_employees": 36,
    "total_absent_hours": 1184,
    "avg_absent_hours": 1.6,
    "max_absent_hours": 120,
    "min_absent_hours": 0,
    "high_risk_ratio": 0.15
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| total_records | int | Total number of records |
| total_employees | int | Total number of employees |
| total_absent_hours | float | Total absence time (hours) |
| avg_absent_hours | float | Average absence time (hours) |
| max_absent_hours | int | Maximum absence time (hours) |
| min_absent_hours | int | Minimum absence time (hours) |
| high_risk_ratio | float | Proportion of high-risk employees |

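As a rough illustration of how these indicators relate to the raw records, the sketch below derives them from a list of dictionaries. The record keys `employee_id` and `hours` and the function name `overview_stats` are assumptions made for the example. `high_risk_ratio` is left out on purpose, because this document does not define how a high-risk employee is determined.

```python
from typing import Dict, List


def overview_stats(records: List[Dict]) -> Dict:
    """Compute the basic indicators from absence records.

    Assumed record shape (hypothetical): {"employee_id": int, "hours": number}.
    """
    hours = [r["hours"] for r in records]
    total = sum(hours)
    return {
        "total_records": len(records),
        # Distinct employees, since one employee can have many records.
        "total_employees": len({r["employee_id"] for r in records}),
        "total_absent_hours": total,
        # Average per record, rounded to one decimal as in the sample response.
        "avg_absent_hours": round(total / len(records), 1) if records else 0.0,
        "max_absent_hours": max(hours, default=0),
        "min_absent_hours": min(hours, default=0),
    }
```

With the sample figures, 1184 total hours over 740 records gives the 1.6 average shown in the response example.
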
---

### 2.2 Get Monthly Trend Data

**Endpoint**: `GET /api/overview/trend`

**Description**: Returns the absence trend for all 12 months of the year.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "months": ["1月", "2月", "3月", "4月", "5月", "6月", "7月", "8月", "9月", "10月", "11月", "12月"],
    "total_hours": [80, 65, 90, 75, 100, 85, 110, 95, 70, 88, 92, 78],
    "avg_hours": [1.2, 1.0, 1.4, 1.1, 1.5, 1.3, 1.7, 1.4, 1.1, 1.3, 1.4, 1.2],
    "record_counts": [67, 65, 64, 68, 67, 65, 65, 68, 64, 68, 66, 65]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| months | string[] | List of month labels |
| total_hours | float[] | Total absence time per month |
| avg_hours | float[] | Average absence time per month |
| record_counts | int[] | Number of records per month |

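The four parallel arrays can be built with a single pass over the records. The sketch below assumes each record carries hypothetical `month` (1 to 12) and `hours` keys. Records whose month falls outside 1 to 12 are skipped, which is a choice made for this example rather than documented behaviour.

```python
from typing import Dict, List


def monthly_trend(records: List[Dict]) -> Dict:
    """Aggregate absence records into the parallel arrays of the trend response.

    Assumed record shape (hypothetical): {"month": 1..12, "hours": number}.
    """
    months = list(range(1, 13))
    totals = {m: 0.0 for m in months}
    counts = {m: 0 for m in months}
    for r in records:
        m = r["month"]
        if m not in totals:
            continue  # ignore records without a valid calendar month
        totals[m] += r["hours"]
        counts[m] += 1
    return {
        "months": ["{}月".format(m) for m in months],
        "total_hours": [totals[m] for m in months],
        # Guard against empty months to avoid division by zero.
        "avg_hours": [round(totals[m] / counts[m], 1) if counts[m] else 0.0 for m in months],
        "record_counts": [counts[m] for m in months],
    }
```

The same grouping pattern applies to the weekday distribution in section 2.3, with weekday codes 2 to 6 in place of months.
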
---

### 2.3 Get Weekday Distribution Data

**Endpoint**: `GET /api/overview/weekday`

**Description**: Returns the absence distribution for Monday through Friday.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "weekdays": ["周一", "周二", "周三", "周四", "周五"],
    "weekday_codes": [2, 3, 4, 5, 6],
    "total_hours": [180, 200, 190, 210, 250],
    "avg_hours": [1.2, 1.4, 1.3, 1.4, 1.7],
    "record_counts": [150, 143, 146, 150, 147]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| weekdays | string[] | List of weekday names |
| weekday_codes | int[] | Weekday codes (2 = Monday, 6 = Friday) |
| total_hours | float[] | Total absence time per weekday |
| avg_hours | float[] | Average absence time per weekday |
| record_counts | int[] | Number of records per weekday |

---

### 2.4 Get Absence Reason Distribution

**Endpoint**: `GET /api/overview/reasons`

**Description**: Returns the distribution of absence reasons.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "reasons": [
      {
        "code": 23,
        "name": "医疗咨询",
        "count": 150,
        "percentage": 20.3
      },
      {
        "code": 28,
        "name": "牙科咨询",
        "count": 120,
        "percentage": 16.2
      },
      {
        "code": 27,
        "name": "理疗",
        "count": 100,
        "percentage": 13.5
      }
      // ... more reasons
    ]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| code | int | Absence reason code |
| name | string | Absence reason name |
| count | int | Number of occurrences |
| percentage | float | Share of all records (%) |

**Absence reason code table**:

| Code | Name | Code | Name |
|------|------|------|------|
| 0 | Unknown reason (未知原因) | 15 | Pregnancy-related (妊娠相关) |
| 1 | Infectious diseases (传染病) | 16 | Perinatal conditions (围产期疾病) |
| 2 | Neoplasms (肿瘤) | 17 | Congenital malformations (先天性畸形) |
| 3 | Blood diseases (血液疾病) | 18 | Symptoms and signs (症状体征) |
| 4 | Endocrine diseases (内分泌疾病) | 19 | Injury and poisoning (损伤中毒) |
| 5 | Mental and behavioural disorders (精神行为障碍) | 20 | External causes (外部原因) |
| 6 | Nervous system diseases (神经系统疾病) | 21 | Health-related factors (健康因素) |
| 7 | Eye diseases (眼部疾病) | 22 | Medical follow-up (医疗随访) |
| 8 | Ear diseases (耳部疾病) | 23 | Medical consultation (医疗咨询) |
| 9 | Circulatory system diseases (循环系统疾病) | 24 | Blood donation (献血) |
| 10 | Respiratory system diseases (呼吸系统疾病) | 25 | Laboratory examination (实验室检查) |
| 11 | Digestive system diseases (消化系统疾病) | 26 | Unjustified absence (无故缺勤) |
| 12 | Skin diseases (皮肤疾病) | 27 | Physiotherapy (理疗) |
| 13 | Musculoskeletal diseases (肌肉骨骼疾病) | 28 | Dental consultation (牙科咨询) |
| 14 | Genitourinary diseases (泌尿生殖疾病) | - | - |

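The `count` and `percentage` fields can be derived by tallying reason codes. A minimal sketch follows, assuming the input is simply a list of reason codes. The names `reason_distribution` and `REASON_NAMES` are illustrative, and only three entries of the code table are included to keep the example short.

```python
from collections import Counter
from typing import Dict, List

# Subset of the code table above, for illustration only.
REASON_NAMES = {23: "医疗咨询", 27: "理疗", 28: "牙科咨询"}


def reason_distribution(reason_codes: List[int], names: Dict[int, str] = REASON_NAMES) -> List[Dict]:
    """Count each reason code and express it as a share of all records."""
    counts = Counter(reason_codes)
    total = sum(counts.values())
    return [
        {
            "code": code,
            # Fall back to a generic label for codes missing from the lookup.
            "name": names.get(code, "code {}".format(code)),
            "count": n,
            "percentage": round(100 * n / total, 1),
        }
        for code, n in counts.most_common()  # most frequent reason first
    ]
```

With 740 records in total, 150 occurrences of code 23 give 20.3 percent, which matches the sample response.
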
---

### 2.5 Get Seasonal Distribution Data

**Endpoint**: `GET /api/overview/seasons`

**Description**: Returns the absence distribution across the four seasons.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "seasons": [
      {
        "code": 1,
        "name": "夏季",
        "total_hours": 320,
        "avg_hours": 1.5,
        "record_count": 213,
        "percentage": 27.0
      },
      {
        "code": 2,
        "name": "秋季",
        "total_hours": 290,
        "avg_hours": 1.4,
        "record_count": 207,
        "percentage": 28.0
      },
      {
        "code": 3,
        "name": "冬季",
        "total_hours": 280,
        "avg_hours": 1.3,
        "record_count": 215,
        "percentage": 29.1
      },
      {
        "code": 4,
        "name": "春季",
        "total_hours": 294,
        "avg_hours": 1.4,
        "record_count": 210,
        "percentage": 28.4
      }
    ]
  }
}
```

---

## 3. Influencing Factor Analysis Module

### 3.1 Get Feature Importance Ranking

**Endpoint**: `GET /api/analysis/importance`

**Description**: Returns the weight of each feature's influence on absence.

**Request parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| model | string | No | Model type (`rf`/`xgboost`), default `rf` |

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "model_type": "random_forest",
    "features": [
      {
        "name": "Reason for absence",
        "name_cn": "缺勤原因",
        "importance": 0.35,
        "rank": 1
      },
      {
        "name": "Transportation expense",
        "name_cn": "交通费用",
        "importance": 0.12,
        "rank": 2
      },
      {
        "name": "Distance from Residence to Work",
        "name_cn": "通勤距离",
        "importance": 0.10,
        "rank": 3
      },
      {
        "name": "Service time",
        "name_cn": "工龄",
        "importance": 0.08,
        "rank": 4
      },
      {
        "name": "Age",
        "name_cn": "年龄",
        "importance": 0.07,
        "rank": 5
      },
      {
        "name": "Work load Average/day",
        "name_cn": "日均工作负荷",
        "importance": 0.06,
        "rank": 6
      },
      {
        "name": "Body mass index",
        "name_cn": "BMI指数",
        "importance": 0.05,
        "rank": 7
      },
      {
        "name": "Social drinker",
        "name_cn": "饮酒习惯",
        "importance": 0.04,
        "rank": 8
      },
      {
        "name": "Hit target",
        "name_cn": "达标率",
        "importance": 0.03,
        "rank": 9
      },
      {
        "name": "Son",
        "name_cn": "子女数量",
        "importance": 0.03,
        "rank": 10
      },
      {
        "name": "Pet",
        "name_cn": "宠物数量",
        "importance": 0.02,
        "rank": 11
      },
      {
        "name": "Education",
        "name_cn": "学历",
        "importance": 0.02,
        "rank": 12
      },
      {
        "name": "Social smoker",
        "name_cn": "吸烟习惯",
        "importance": 0.01,
        "rank": 13
      }
    ]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| model_type | string | Model type |
| features | array | List of features |
| name | string | Feature name (English) |
| name_cn | string | Feature name (Chinese) |
| importance | float | Importance score (0-1) |
| rank | int | Rank |

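The `features` array is an importance mapping sorted in descending order with a 1-based rank attached. A small sketch of that step is shown below. The function name `rank_features` and the idea that importances arrive as a plain name-to-score dictionary are assumptions; how the scores are extracted from the trained model is outside the scope of this example.

```python
from typing import Dict, List


def rank_features(importances: Dict[str, float], names_cn: Dict[str, str]) -> List[Dict]:
    """Sort features by importance (highest first) and attach a 1-based rank."""
    # sorted() is stable, so features with equal scores keep their input order.
    ordered = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [
        {
            "name": name,
            "name_cn": names_cn.get(name, name),  # fall back to the English name
            "importance": round(score, 2),
            "rank": position,
        }
        for position, (name, score) in enumerate(ordered, start=1)
    ]
```
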
---

### 3.2 Get Correlation Matrix

**Endpoint**: `GET /api/analysis/correlation`

**Description**: Returns the matrix of correlation coefficients between features.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "features": ["Age", "Service time", "Distance", "Work load", "BMI", "Absent hours"],
    "matrix": [
      [1.00, 0.67, 0.12, 0.08, 0.15, 0.05],
      [0.67, 1.00, 0.10, 0.05, 0.12, 0.08],
      [0.12, 0.10, 1.00, 0.03, 0.05, 0.18],
      [0.08, 0.05, 0.03, 1.00, 0.02, 0.10],
      [0.15, 0.12, 0.05, 0.02, 1.00, 0.06],
      [0.05, 0.08, 0.18, 0.10, 0.06, 1.00]
    ]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| features | string[] | List of feature names |
| matrix | float[][] | Correlation coefficient matrix (n×n) |

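The document does not say which correlation coefficient is used. Assuming the Pearson coefficient, the symmetric n×n matrix could be computed as in the sketch below. The function names and the column-dictionary input format are illustrative, and a constant column is reported as 0.0 by choice, since its correlation is undefined.

```python
import math
from typing import Dict, List


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # constant column: correlation is undefined, report 0
    return cov / (spread_x * spread_y)


def correlation_matrix(columns: Dict[str, List[float]]) -> Dict:
    """Build the features list and the n x n matrix, rounded to two decimals."""
    features = list(columns)
    matrix = [
        [round(pearson(columns[a], columns[b]), 2) for b in features]
        for a in features
    ]
    return {"features": features, "matrix": matrix}
```

The diagonal is always 1.0 for non-constant columns and the matrix is symmetric, which is consistent with the sample response.
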
---

### 3.3 Group Comparison Analysis

**Endpoint**: `GET /api/analysis/compare`

**Description**: Compares absence time between groups along a specified dimension.

**Request parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| dimension | string | Yes | Comparison dimension (`drinker`/`smoker`/`education`/`children`/`pet`) |

**Response example** (`dimension=drinker`):

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "dimension": "drinker",
    "dimension_name": "饮酒习惯",
    "groups": [
      {
        "name": "不饮酒",
        "value": 0,
        "avg_hours": 1.2,
        "count": 400,
        "percentage": 54.1
      },
      {
        "name": "饮酒",
        "value": 1,
        "avg_hours": 2.1,
        "count": 340,
        "percentage": 45.9
      }
    ],
    "difference": {
      "value": 0.9,
      "percentage": 75.0
    }
  }
}
```

**`dimension` parameter values**:

| Value | Meaning | Groups |
|-------|---------|--------|
| drinker | Drinking habit | Non-drinker (0) / drinker (1) |
| smoker | Smoking habit | Non-smoker (0) / smoker (1) |
| education | Education level | High school (1) / bachelor's degree (2) / postgraduate and above (3-4) |
| children | Children | No children (0) / has children (≥1) |
| pet | Pets | No pets (0) / has pets (≥1) |

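The `difference` object is not defined in the text, but the sample numbers are consistent with taking the second group's average minus the first group's, and expressing it relative to the first group: 2.1 - 1.2 = 0.9, and 0.9 / 1.2 = 75%. The sketch below encodes that inferred reading. The function name `compare_groups` is hypothetical, and the formula is an assumption based on the example.

```python
from typing import Dict, Optional


def compare_groups(baseline_avg: float, other_avg: float) -> Dict[str, Optional[float]]:
    """Absolute and relative difference of a group average against a baseline.

    Inferred from the sample response: value = other - baseline,
    percentage = value / baseline * 100.
    """
    diff = other_avg - baseline_avg
    # A zero baseline makes the relative difference undefined.
    pct = diff / baseline_avg * 100 if baseline_avg else None
    return {
        "value": round(diff, 1),
        "percentage": round(pct, 1) if pct is not None else None,
    }
```
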
---

## 4. Prediction Module

### 4.1 Single Absence Prediction

**Endpoint**: `POST /api/predict/single`

**Description**: Predicts absence time from the supplied employee attributes.

**Request headers**:

```
Content-Type: application/json
```

**Request body**:

```json
{
  "reason_for_absence": 23,
  "month_of_absence": 7,
  "day_of_week": 3,
  "seasons": 1,
  "transportation_expense": 289,
  "distance": 36,
  "service_time": 13,
  "age": 33,
  "work_load": 239.55,
  "hit_target": 97,
  "disciplinary_failure": 0,
  "education": 1,
  "son": 2,
  "social_drinker": 1,
  "social_smoker": 0,
  "pet": 1,
  "bmi": 30
}
```

**Parameter descriptions**:

| Parameter | Type | Range | Description |
|-----------|------|-------|-------------|
| reason_for_absence | int | 0-28 | Absence reason code |
| month_of_absence | int | 1-12 | Month of absence |
| day_of_week | int | 2-6 | Day of week (2 = Monday, 6 = Friday) |
| seasons | int | 1-4 | Season (1 = summer, 2 = autumn, 3 = winter, 4 = spring) |
| transportation_expense | int | 100-400 | Transportation expense |
| distance | int | 1-60 | Commuting distance (km) |
| service_time | int | 1-30 | Length of service (years) |
| age | int | 18-60 | Age |
| work_load | float | 200-350 | Average daily workload |
| hit_target | int | 80-100 | Target achievement rate (%) |
| disciplinary_failure | int | 0-1 | Disciplinary failure (0 = no, 1 = yes) |
| education | int | 1-4 | Education level (1 = high school, 4 = doctorate) |
| son | int | 0-5 | Number of children |
| social_drinker | int | 0-1 | Social drinker (0 = no, 1 = yes) |
| social_smoker | int | 0-1 | Social smoker (0 = no, 1 = yes) |
| pet | int | 0-10 | Number of pets |
| bmi | float | 18-40 | Body mass index |

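The ranges in the table lend themselves to a table-driven check before the model is invoked. The sketch below is one possible shape for it. The names `PARAM_RANGES` and `validate_request` are hypothetical, treating all 17 parameters as required is an assumption (the table has no "required" column), and the use of error codes 3001 and 3002 follows the error code table in section 6.

```python
from typing import Dict, Optional, Tuple

# Inclusive ranges copied from the parameter table above.
PARAM_RANGES = {
    "reason_for_absence": (0, 28), "month_of_absence": (1, 12),
    "day_of_week": (2, 6), "seasons": (1, 4),
    "transportation_expense": (100, 400), "distance": (1, 60),
    "service_time": (1, 30), "age": (18, 60),
    "work_load": (200, 350), "hit_target": (80, 100),
    "disciplinary_failure": (0, 1), "education": (1, 4),
    "son": (0, 5), "social_drinker": (0, 1),
    "social_smoker": (0, 1), "pet": (0, 10), "bmi": (18, 40),
}


def validate_request(payload: Dict) -> Optional[Tuple[int, str]]:
    """Return None when the payload is valid, else (error_code, message).

    Assumption: every parameter in PARAM_RANGES is required.
    """
    missing = [name for name in PARAM_RANGES if name not in payload]
    if missing:
        return 3001, "missing parameters: " + ", ".join(missing)
    for name, (low, high) in PARAM_RANGES.items():
        value = payload[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not low <= value <= high:
            return 3002, "{} must be between {} and {}".format(name, low, high)
    return None
```

Driving validation from a single dictionary keeps the documented ranges and the enforced ranges in one place.
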
**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "predicted_hours": 5.2,
    "risk_level": "medium",
    "risk_label": "中风险",
    "confidence": 0.85,
    "model_used": "random_forest"
  }
}
```

**Response field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| predicted_hours | float | Predicted absence time (hours) |
| risk_level | string | Risk level (`low`/`medium`/`high`) |
| risk_label | string | Risk level label in Chinese |
| confidence | float | Model confidence (0-1) |
| model_used | string | Model used for the prediction |

**Risk level rules**:

| Predicted time | risk_level | risk_label | Color |
|----------------|------------|------------|-------|
| < 4 hours | low | 低风险 (low risk) | Green |
| 4-8 hours | medium | 中风险 (medium risk) | Yellow |
| > 8 hours | high | 高风险 (high risk) | Red |

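The risk level rules amount to two threshold comparisons. A minimal sketch, with the boundaries read directly from the table (exactly 4 and exactly 8 hours both fall in the medium band); the function name `classify_risk` is illustrative:

```python
from typing import Tuple


def classify_risk(predicted_hours: float) -> Tuple[str, str]:
    """Map a predicted absence time to (risk_level, risk_label)."""
    if predicted_hours < 4:
        return "low", "低风险"
    if predicted_hours <= 8:
        return "medium", "中风险"
    return "high", "高风险"
```

For the sample prediction of 5.2 hours this returns the `medium` level shown in the response example.
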
---

### 4.2 Get Model Performance Information

**Endpoint**: `GET /api/predict/model-info`

**Description**: Returns the performance metrics of the current prediction models.

**Request parameters**: None

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "models": [
      {
        "name": "random_forest",
        "name_cn": "随机森林",
        "metrics": {
          "r2": 0.82,
          "mse": 15.5,
          "rmse": 3.94,
          "mae": 2.8
        },
        "is_active": true
      },
      {
        "name": "xgboost",
        "name_cn": "XGBoost",
        "metrics": {
          "r2": 0.85,
          "mse": 12.8,
          "rmse": 3.58,
          "mae": 2.5
        },
        "is_active": false
      }
    ],
    "training_info": {
      "train_samples": 592,
      "test_samples": 148,
      "feature_count": 17,
      "training_date": "2026-03-01"
    }
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| r2 | float | Coefficient of determination (closer to 1 is better) |
| mse | float | Mean squared error (lower is better) |
| rmse | float | Root mean squared error (lower is better) |
| mae | float | Mean absolute error (lower is better) |

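For reference, the four metrics follow their standard definitions and can be computed from true and predicted values as sketched below. The function name `regression_metrics` is illustrative; the project may well rely on a library for this instead.

```python
import math
from typing import Dict, List


def regression_metrics(y_true: List[float], y_pred: List[float]) -> Dict[str, float]:
    """Compute R^2, MSE, RMSE, and MAE from true and predicted values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = mse * n                                    # residual sum of squares
    # R^2 is undefined when all true values are equal; report 0.0 in that case.
    r2 = 1 - ss_res / ss_tot if ss_tot else 0.0
    return {"r2": r2, "mse": mse, "rmse": math.sqrt(mse), "mae": mae}
```

Since RMSE is the square root of MSE, the sample values are consistent with each other: the square root of 15.5 is about 3.94, and the square root of 12.8 is about 3.58.
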
---

## 5. Clustering Module

### 5.1 Get Clustering Results

**Endpoint**: `GET /api/cluster/result`

**Description**: Returns the results of the K-Means clustering analysis.

**Request parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| n_clusters | int | No | Number of clusters, default 3 |

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "n_clusters": 3,
    "clusters": [
      {
        "id": 0,
        "name": "模范型员工",
        "member_count": 120,
        "percentage": 33.3,
        "center": {
          "age": 42,
          "service_time": 18,
          "work_load": 240,
          "bmi": 25,
          "absent_tendency": 0.8
        },
        "description": "工龄长、工作稳定、缺勤率低"
      },
      {
        "id": 1,
        "name": "压力型员工",
        "member_count": 100,
        "percentage": 27.8,
        "center": {
          "age": 28,
          "service_time": 5,
          "work_load": 280,
          "bmi": 23,
          "absent_tendency": 2.5
        },
        "description": "年轻、工龄短、工作负荷大、缺勤较多"
      },
      {
        "id": 2,
        "name": "生活习惯型员工",
        "member_count": 140,
        "percentage": 38.9,
        "center": {
          "age": 35,
          "service_time": 10,
          "work_load": 250,
          "bmi": 30,
          "absent_tendency": 1.5
        },
        "description": "BMI偏高、有饮酒习惯、中等缺勤率"
      }
    ]
  }
}
```

---

### 5.2 Get Employee Profile Data

**Endpoint**: `GET /api/cluster/profile`

**Description**: Returns the employee profile data used to draw the radar chart.

**Request parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| n_clusters | int | No | Number of clusters, default 3 |

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "dimensions": ["年龄", "工龄", "工作负荷", "BMI", "缺勤倾向"],
    "dimension_keys": ["age", "service_time", "work_load", "bmi", "absent_tendency"],
    "clusters": [
      {
        "id": 0,
        "name": "模范型",
        "values": [0.75, 0.90, 0.60, 0.55, 0.20]
      },
      {
        "id": 1,
        "name": "压力型",
        "values": [0.35, 0.20, 0.85, 0.45, 0.70]
      },
      {
        "id": 2,
        "name": "生活习惯型",
        "values": [0.55, 0.50, 0.65, 0.80, 0.45]
      }
    ]
  }
}
```

**Field descriptions**:

| Field | Type | Description |
|-------|------|-------------|
| dimensions | string[] | Radar chart dimension names (Chinese) |
| dimension_keys | string[] | English key for each dimension |
| clusters | array | Profile data for each cluster |
| values | float[] | Normalized feature values (0-1) |

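The document states only that `values` are normalized to the 0-1 range, not how. One common choice is min-max scaling of each cluster center against per-feature bounds, sketched below as an assumption. The function names and the bounds used in the usage note are hypothetical, so the results are not expected to reproduce the sample `values`.

```python
from typing import Dict, List, Tuple


def min_max_normalize(value: float, low: float, high: float) -> float:
    """Scale value into [0, 1] relative to [low, high], clamped and rounded."""
    if high == low:
        return 0.0  # degenerate range: no spread to scale against
    scaled = (value - low) / (high - low)
    return round(min(1.0, max(0.0, scaled)), 2)


def radar_values(center: Dict[str, float],
                 bounds: Dict[str, Tuple[float, float]],
                 keys: List[str]) -> List[float]:
    """Normalize a cluster center in the order given by dimension_keys."""
    return [min_max_normalize(center[k], *bounds[k]) for k in keys]
```

For example, with illustrative bounds of 18 to 60 for `age`, a center age of 42 maps to (42 - 18) / (60 - 18), which rounds to 0.57.
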
---

### 5.3 Get Cluster Scatter Plot Data

**Endpoint**: `GET /api/cluster/scatter`

**Description**: Returns the cluster distribution data used to draw the scatter plot.

**Request parameters**:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| n_clusters | int | No | Number of clusters, default 3 |
| x_axis | string | No | X-axis dimension, default `age` |
| y_axis | string | No | Y-axis dimension, default `absent_hours` |

**Response example**:

```json
{
  "code": 200,
  "message": "success",
  "data": {
    "x_axis": "age",
    "x_axis_name": "年龄",
    "y_axis": "absent_hours",
    "y_axis_name": "缺勤时长",
    "points": [
      {
        "employee_id": 11,
        "x": 33,
        "y": 4,
        "cluster_id": 2
      },
      {
        "employee_id": 36,
        "x": 50,
        "y": 0,
        "cluster_id": 0
      }
      // ... more data points
    ],
    "cluster_colors": {
      "0": "#67C23A",
      "1": "#E6A23C",
      "2": "#F56C6C"
    }
  }
}
```

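Charting libraries usually expect one series per cluster rather than a flat point list. The sketch below regroups the `data` object of this response accordingly. The function name `scatter_series` is hypothetical. Note that `cluster_colors` is keyed by strings in the JSON while `cluster_id` is an integer, so the lookup converts the id first.

```python
from collections import defaultdict
from typing import Dict, List


def scatter_series(data: Dict) -> List[Dict]:
    """Group scatter points by cluster and attach each cluster's color."""
    grouped = defaultdict(list)
    for point in data["points"]:
        grouped[point["cluster_id"]].append((point["x"], point["y"]))
    colors = data.get("cluster_colors", {})
    return [
        {
            "cluster_id": cluster_id,
            # JSON object keys are strings, so look the color up by str(id).
            "color": colors.get(str(cluster_id)),
            "points": points,
        }
        for cluster_id, points in sorted(grouped.items())
    ]
```
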
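---

## 6. Error Code Definitions

| Error code | Meaning | Resolution |
|------------|---------|------------|
| 1001 | Data file not found | Check the data file path |
| 1002 | Invalid data file format | Check the CSV file format |
| 2001 | Model file not found | Train the model first |
| 2002 | Model failed to load | Retrain and save the model |
| 3001 | Missing parameter | Check the required parameters |
| 3002 | Parameter value out of range | Check the allowed parameter ranges |
| 4001 | Invalid number of clusters | `n_clusters` must be between 2 and 10 |
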
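As an illustration of how one of these codes might be raised, the sketch below parses the optional `n_clusters` query parameter used by the clustering endpoints and reports 4001 when it falls outside 2 to 10. The function name `parse_n_clusters` and the tuple return convention are assumptions. How the business error code is carried in the response envelope is not specified in this document, so the example returns it to the caller instead of building a response.

```python
from typing import Optional, Tuple

DEFAULT_N_CLUSTERS = 3         # default documented for the clustering endpoints
INVALID_CLUSTER_COUNT = 4001   # error code from the table above


def parse_n_clusters(raw: Optional[str]) -> Tuple[Optional[int], Optional[int]]:
    """Return (n_clusters, None) on success or (None, error_code) on failure."""
    if raw is None:
        return DEFAULT_N_CLUSTERS, None  # parameter omitted: use the default
    try:
        n = int(raw)
    except (TypeError, ValueError):
        return None, INVALID_CLUSTER_COUNT
    if not 2 <= n <= 10:
        return None, INVALID_CLUSTER_COUNT
    return n, None
```
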
---

## 7. Appendix

### 7.1 API Summary

| Module | Endpoint | Method | Description |
|--------|----------|--------|-------------|
| Data overview | /api/overview/stats | GET | Basic statistics |
| Data overview | /api/overview/trend | GET | Monthly trend |
| Data overview | /api/overview/weekday | GET | Weekday distribution |
| Data overview | /api/overview/reasons | GET | Reason distribution |
| Data overview | /api/overview/seasons | GET | Seasonal distribution |
| Factor analysis | /api/analysis/importance | GET | Feature importance |
| Factor analysis | /api/analysis/correlation | GET | Correlation matrix |
| Factor analysis | /api/analysis/compare | GET | Group comparison |
| Prediction | /api/predict/single | POST | Single prediction |
| Prediction | /api/predict/model-info | GET | Model information |
| Clustering | /api/cluster/result | GET | Clustering results |
| Clustering | /api/cluster/profile | GET | Employee profiles |
| Clustering | /api/cluster/scatter | GET | Scatter plot data |

### 7.2 Revision History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| V1.0 | 2026-03 | 张硕 | Initial version |

---

**End of document**