Building AI Service API with WebSocket


Building a real-time AI service requires careful consideration of API design, data structures, and communication patterns. In this article, we'll explore how to create a WebSocket-based AI Service API for conducting mock interviews with AI, supporting multiple roles and interview types.

System Design Overview

The AI Service API is designed around these key components:

  1. WebSocket Interface: Real-time bidirectional communication
  2. Role-based Access: Support for interviewer and candidate roles
  3. Multiple Interview Types: Behavioral, coding, and HR interviews
  4. Language Support: Multi-language responses
  5. Customizable Prompts: Flexible interview scenarios
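
To make these components concrete, here is a minimal sketch of the WebSocket entry point. It assumes the Node.js `ws` package, and `handleAIRequest` is a placeholder name for the streaming logic covered in the rest of the article:

import { WebSocketServer, WebSocket } from 'ws';
 
// Placeholder for the streaming AI call described later in this article.
async function handleAIRequest(
  request: unknown,
  send: (response: unknown) => void
): Promise<void> {
  send({ type: 'chunk', content: 'Hello from the AI service' });
  send({ type: 'done' });
}
 
// Minimal entry point: one JSON request in, a stream of JSON responses out.
const wss = new WebSocketServer({ port: 8080 });
 
wss.on('connection', (socket: WebSocket) => {
  socket.on('message', async (raw) => {
    try {
      const request = JSON.parse(raw.toString());
      await handleAIRequest(request, response => socket.send(JSON.stringify(response)));
    } catch (err) {
      socket.send(JSON.stringify({ type: 'error', error: String(err) }));
    }
  });
});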

Core Data Structures

Request Interface

The API uses a well-defined request structure:

interface AIRequest {
  // Required fields
  transcripts: TranscriptType[];      // Conversation history
  model: ModelType;                   // AI model selection
  promptType: PromptType;             // Interview type
  role: Role;                         // User role
  company: string;                    // Target company
  position: string;                   // Job position
  sessionId: string;                  // Session identifier
 
  // Optional fields
  language?: string;                  // Response language
  userEmail?: string;                 // User identification
  stories?: Story[];                  // Behavioral examples
  personalInfo?: PersonalInfo;        // User background
}
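
The request interface references several supporting types whose exact shapes are not shown here. The following is an illustrative sketch of what they might look like; every field name and literal value below is an assumption:

// Illustrative sketches only; field names and literal values are assumptions.
type ModelType = string;                    // identifier for the underlying AI model
type PromptType = 'BQ' | 'CODE' | 'HR';
type Role = 'interviewer' | 'candidate';
 
interface TranscriptType {
  id: string;            // Referenced by originalTranscriptId in responses
  speaker: Role;
  text: string;
  timestamp: number;
}
 
interface Story {
  title: string;         // e.g. a STAR-format behavioral example
  situation: string;
  action: string;
  result: string;
}
 
interface PersonalInfo {
  name: string;
  yearsOfExperience?: number;
  skills?: string[];
}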

Response Structure

Responses are streamed in chunks for real-time interaction:

interface AIResponse {
  type: 'chunk' | 'error' | 'done';  // Response type
  responseType: 'mock' | 'assistant'; // AI role
  content?: string;                  // Response content
  error?: string;                    // Error details
  originalTranscriptId: string;      // Reference ID
  sessionId: string;                 // Session tracking
}
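
For one streamed answer, the client typically sees a short sequence of messages like this (the IDs and text are illustrative):

// Illustrative wire sequence for a single streamed answer; values are made up.
const exampleStream: AIResponse[] = [
  { type: 'chunk', responseType: 'mock', content: 'Tell me about a time ', originalTranscriptId: 't-42', sessionId: 's-1' },
  { type: 'chunk', responseType: 'mock', content: 'you resolved a conflict.', originalTranscriptId: 't-42', sessionId: 's-1' },
  { type: 'done', responseType: 'mock', originalTranscriptId: 't-42', sessionId: 's-1' }
];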

Feature Implementation

1. Language Support

The service supports multiple languages with easy extensibility:

// Note: a plain `'en' | 'zh' | string` union collapses to `string`;
// intersecting with `{}` keeps editor suggestions for the known codes.
type SupportedLanguage = 'en' | 'zh' | (string & {});
 
const languageHandler: Record<string, () => { locale: string; format: string }> = {
  en: () => ({ locale: 'en-US', format: 'MM/DD/YYYY' }),
  zh: () => ({ locale: 'zh-CN', format: 'YYYY年MM月DD日' }),
  // Add more languages as needed
};
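
A small resolver keeps unknown language codes from breaking a session by falling back to a default. This is a sketch; treating English as the default is an assumption:

// Resolve a language config, falling back to English for unsupported codes.
const resolveLanguage = (language?: SupportedLanguage) =>
  (languageHandler[language ?? 'en'] ?? languageHandler.en)();
 
resolveLanguage('zh');   // { locale: 'zh-CN', format: 'YYYY年MM月DD日' }
resolveLanguage('fr');   // falls back to the English config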

2. Role-based Access

Implement role-based features and permissions:

enum UserRole {
  INTERVIEWER = 'interviewer',
  CANDIDATE = 'candidate',
  ADMIN = 'admin'
}
 
interface AccessControl {
  role: UserRole;
  permissions: string[];
  features: Set<string>;
}
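
Each role can then map to a default access profile, with a small helper to gate features. The permission and feature names below are illustrative placeholders:

// Default access profiles per role; permission/feature names are illustrative.
const defaultAccess: Record<UserRole, AccessControl> = {
  [UserRole.INTERVIEWER]: {
    role: UserRole.INTERVIEWER,
    permissions: ['ask_question', 'view_evaluation'],
    features: new Set(['mock_interview', 'scoring'])
  },
  [UserRole.CANDIDATE]: {
    role: UserRole.CANDIDATE,
    permissions: ['answer_question'],
    features: new Set(['mock_interview', 'assistant'])
  },
  [UserRole.ADMIN]: {
    role: UserRole.ADMIN,
    permissions: ['*'],
    features: new Set(['*'])
  }
};
 
// A feature is usable if the role has it explicitly or holds the wildcard.
const canUse = (access: AccessControl, feature: string): boolean =>
  access.features.has('*') || access.features.has(feature);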

3. Interview Types

Support different interview scenarios:

enum InterviewType {
  BEHAVIORAL = 'BQ',
  CODING = 'CODE',
  HR = 'HR'
}
 
interface InterviewConfig {
  type: InterviewType;
  prompts: string[];
  evaluation: EvaluationCriteria;
  timeLimit?: number;
}
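
Each type can then be mapped to a default configuration. `EvaluationCriteria` is not defined above, so a minimal shape is assumed here, and the prompts and limits are illustrative:

// Assumed minimal shape for EvaluationCriteria (not defined in the article).
interface EvaluationCriteria {
  dimensions: string[];     // e.g. communication, problem solving
  scale: [number, number];  // score range
}
 
const interviewConfigs: Record<InterviewType, InterviewConfig> = {
  [InterviewType.BEHAVIORAL]: {
    type: InterviewType.BEHAVIORAL,
    prompts: ['Ask one STAR-style question at a time.'],
    evaluation: { dimensions: ['communication', 'impact'], scale: [1, 5] }
  },
  [InterviewType.CODING]: {
    type: InterviewType.CODING,
    prompts: ['Present a coding problem and probe for complexity analysis.'],
    evaluation: { dimensions: ['correctness', 'complexity'], scale: [1, 5] },
    timeLimit: 45 * 60    // seconds
  },
  [InterviewType.HR]: {
    type: InterviewType.HR,
    prompts: ['Cover motivation, expectations, and logistics.'],
    evaluation: { dimensions: ['fit', 'clarity'], scale: [1, 5] }
  }
};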

Implementation Best Practices

1. Error Handling

Implement comprehensive error handling:

class AIServiceError extends Error {
  constructor(
    public code: string,
    message: string,        // passed straight to Error, which stores it
    public details?: any
  ) {
    super(message);
    this.name = 'AIServiceError';
  }
}
 
const errorHandler = {
  handleConnectionError: (error: Error) => {
    // Handle connection issues
  },
  handleModelError: (error: Error) => {
    // Handle AI model errors
  },
  handleValidationError: (error: Error) => {
    // Handle request validation errors
  }
};
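
In practice, each handler also feeds an error response back over the socket so the client can recover. The dispatch below is a sketch, and the 'VALIDATION' error code is an illustrative convention rather than part of the original API:

// Map a caught error onto a handler and notify the client over the socket.
const reportError = (
  error: unknown,
  send: (response: AIResponse) => void,
  sessionId: string,
  originalTranscriptId: string
) => {
  const err = error instanceof Error ? error : new Error(String(error));
 
  if (err instanceof AIServiceError && err.code === 'VALIDATION') {
    errorHandler.handleValidationError(err);
  } else {
    errorHandler.handleModelError(err);
  }
 
  send({
    type: 'error',
    responseType: 'assistant',
    error: err.message,
    originalTranscriptId,
    sessionId
  });
};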

2. Request Validation

Validate incoming requests:

const validateRequest = (request: AIRequest): boolean => {
  // Typing the list as keyof AIRequest keeps the index access type-safe.
  const requiredFields: (keyof AIRequest)[] = [
    'transcripts',
    'model',
    'promptType',
    'role',
    'company',
    'position',
    'sessionId'
  ];
 
  return requiredFields.every(field => 
    request[field] !== undefined && request[field] !== null
  );
};
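
Validation then becomes the first step of the message handler, so invalid requests are rejected before any model call. A minimal usage sketch:

// Reject a request up front if any required field is missing.
const ensureValidRequest = (request: AIRequest): void => {
  if (!validateRequest(request)) {
    throw new AIServiceError('VALIDATION', 'Missing required request fields');
  }
};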

3. Response Streaming

Handle streaming responses efficiently:

const handleStreamResponse = (
  response: AIResponse,
  onChunk: (chunk: string) => void,
  onComplete: () => void
) => {
  if (response.type === 'chunk') {
    onChunk(response.content ?? '');   // content is optional on the wire
  } else if (response.type === 'done') {
    onComplete();
  } else {
    // type === 'error': surface the server-provided detail if present
    throw new AIServiceError(
      'STREAM_ERROR',
      response.error ?? 'Error in response stream'
    );
  }
};
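
On the client side, the chunks are usually accumulated into the full answer as they arrive. This sketch assumes the browser WebSocket API and that each server message is a JSON-encoded AIResponse:

// Client-side sketch: accumulate streamed chunks into one complete answer.
const connectClient = (url: string, onAnswer: (text: string) => void) => {
  const socket = new WebSocket(url);
  let buffer = '';
 
  socket.onmessage = (event) => {
    const response: AIResponse = JSON.parse(event.data);
    handleStreamResponse(
      response,
      chunk => { buffer += chunk; },
      () => { onAnswer(buffer); buffer = ''; }
    );
  };
 
  return socket;
};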

Security Considerations

  1. Authentication

    • Implement JWT validation (see the sketch after this list)
    • Session management
    • Rate limiting
  2. Data Protection

    • Encrypt sensitive information
    • Validate input data
    • Sanitize responses
  3. Access Control

    • Role-based permissions
    • Feature flags
    • Usage quotas
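
As a sketch of the authentication point above, a JWT can be verified when the WebSocket connection is established. The `jsonwebtoken` package, the query-parameter token transport, and the JWT_SECRET environment variable are all assumptions here:

import { WebSocketServer } from 'ws';
import jwt from 'jsonwebtoken';
 
// Sketch: verify a JWT passed as a query parameter before trusting the socket.
const wss = new WebSocketServer({ port: 8080 });
 
wss.on('connection', (socket, request) => {
  try {
    const url = new URL(request.url ?? '', 'http://localhost');
    const token = url.searchParams.get('token') ?? '';
    const claims = jwt.verify(token, process.env.JWT_SECRET ?? '');
    // Attach claims to the session here for later role/permission checks.
    console.log('authenticated connection', claims);
  } catch {
    socket.close(4401, 'Unauthorized');  // application-defined close code
  }
});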

Performance Optimization

  1. Connection Management

    • Keep-alive mechanisms (see the sketch after this list)
    • Automatic reconnection
    • Connection pooling
  2. Data Efficiency

    • Message compression
    • Batch processing
    • Caching strategies
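
The keep-alive point above maps directly onto WebSocket ping/pong frames. A sketch using the `ws` package, with an assumed 30-second heartbeat interval:

import { WebSocketServer, WebSocket } from 'ws';
 
// Heartbeat sketch: ping every 30s and drop sockets that never pong back.
const wss = new WebSocketServer({ port: 8080 });
const alive = new WeakMap<WebSocket, boolean>();
 
wss.on('connection', socket => {
  alive.set(socket, true);
  socket.on('pong', () => alive.set(socket, true));
});
 
setInterval(() => {
  for (const socket of wss.clients) {
    if (!alive.get(socket)) {
      socket.terminate();       // no pong since the last ping
      continue;
    }
    alive.set(socket, false);
    socket.ping();
  }
}, 30_000);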

Monitoring and Logging

Implement comprehensive monitoring:

interface AIServiceMetrics {
  requestCount: number;
  responseTime: number;
  errorRate: number;
  activeConnections: number;
}
 
const monitoringService = {
  logRequest: (request: AIRequest) => {
    // Log request details
  },
  trackMetrics: (metrics: AIServiceMetrics) => {
    // Update monitoring dashboards
  },
  alertOnError: (error: AIServiceError) => {
    // Send alerts for critical issues
  }
};

Testing Strategy

  1. Unit Tests

    • Request validation (see the example after this list)
    • Error handling
    • Data transformations
  2. Integration Tests

    • WebSocket connections
    • Response streaming
    • Error scenarios
  3. Load Tests

    • Connection limits
    • Response times
    • Error rates
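
As an example at the unit-test level, `validateRequest` is straightforward to cover. The sketch below assumes a Jest-style runner and that the request types and validator are importable from the service code:

import { describe, expect, it } from '@jest/globals';
 
describe('validateRequest', () => {
  // A minimal valid request; the string values stand in for the real types.
  const base = {
    transcripts: [],
    model: 'mock-model',
    promptType: 'BQ',
    role: 'candidate',
    company: 'Acme',
    position: 'Frontend Engineer',
    sessionId: 's-1'
  } as unknown as AIRequest;
 
  it('accepts a request with all required fields', () => {
    expect(validateRequest(base)).toBe(true);
  });
 
  it('rejects a request with a missing required field', () => {
    const invalid = { ...base, sessionId: undefined } as unknown as AIRequest;
    expect(validateRequest(invalid)).toBe(false);
  });
});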

Conclusion

Building a real-time AI Service API requires careful attention to:

  1. Clear interface definitions
  2. Robust error handling
  3. Efficient data streaming
  4. Security measures
  5. Performance optimization
  6. Comprehensive monitoring

By following these patterns and best practices, you can build a reliable, scalable AI service that delivers real-time interview simulations across different roles and interview types.
