API Workflow Architecture
This document presents architectural patterns for integrating RunAnythingAI's APIs into your applications. These workflows are designed to balance performance, user experience, and error resilience across different use cases.
Core Workflow: Text Generation with Status Management
The foundational workflow for text generation follows an asynchronous pattern with status polling:
sequenceDiagram
participant Client
participant TextAPI as RunAnythingAI Text API
participant StatusAPI as RunAnythingAI Status API
Client->>TextAPI: POST /api/text/{model_id}
Note over Client,TextAPI: Request queued for processing
TextAPI-->>Client: {id: "request-id", tokens: -1}
activate Client
loop Until Completed or Error
Client->>StatusAPI: GET /api/v2/status/{id}
alt Processing
StatusAPI-->>Client: {status: "processing", progress: 0.45}
Note over Client: Wait with exponential backoff
else Completed
StatusAPI-->>Client: {status: "completed", reply: "Generated text"}
Note over Client: Process and display result
else Error
StatusAPI-->>Client: {status: "error", error: "Error message"}
Note over Client: Handle error appropriately
end
end
deactivate Client
Production Implementation
/**
* Generates text using RunAnythingAI's API with robust error handling and backoff strategy
*
* @param {string} model - Model ID (e.g., "default", "Witch", "Mage", "Succubus", "Lightning")
* @param {Array} messages - Conversation history
* @param {string} persona - Character persona description (optional)
* @param {string} botName - Name of the AI assistant/character
* @param {Object} options - Additional configuration options
* @returns {Promise<string>} The generated text response
*/
async function generateText(model, messages, persona, botName, options = {}) {
const {
maxTokens = 150,
temperature = 0.7,
apiKey = process.env.RUNANYTHING_API_KEY,
initialPollingDelay = 1000,
maxPollingDelay = 10000,
backoffFactor = 1.5,
maxAttempts = 30,
timeout = 120000,
onProgress = null
} = options;
// Validate required parameters
if (!model) throw new Error('Model ID is required');
if (!messages || !Array.isArray(messages)) throw new Error('Messages must be an array');
if (!apiKey) throw new Error('API key is required');
try {
// Step 1: Request text generation
console.log(`Requesting generation from model: ${model}`);
const requestBody = {
messages,
samplingParams: {
max_tokens: maxTokens,
temperature
}
};
// Add optional parameters if provided
if (persona) requestBody.persona = persona;
if (botName) requestBody.botName = botName;
const genResponse = await fetch(`https://api.runanythingai.com/api/text/${model}`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${apiKey}`
},
body: JSON.stringify(requestBody)
});
if (!genResponse.ok) {
const errorData = await genResponse.json().catch(() => ({}));
throw new Error(
`Generation request failed (${genResponse.status}): ${errorData.error || genResponse.statusText}`
);
}
const responseData = await genResponse.json();
const requestId = responseData.id;
if (!requestId) {
throw new Error('No request ID returned from generation API');
}
console.log(`Generation queued with ID: ${requestId}`);
// Step 2: Poll for completion with exponential backoff
let attempts = 0;
let currentDelay = initialPollingDelay;
const startTime = Date.now();
while (attempts < maxAttempts) {
// Check for timeout
if (Date.now() - startTime > timeout) {
throw new Error(`Request timed out after ${timeout}ms`);
}
attempts++;
const statusResponse = await fetch(`https://api.runanythingai.com/api/v2/status/${requestId}`, {
headers: { 'Authorization': `Bearer ${apiKey}` }
});
if (!statusResponse.ok) {
// Special handling for rate limiting
if (statusResponse.status === 429) {
console.warn('Rate limit hit during status check, increasing backoff...');
currentDelay = Math.min(currentDelay * 2, maxPollingDelay * 2);
await new Promise(resolve => setTimeout(resolve, currentDelay));
continue;
}
const errorText = await statusResponse.text();
throw new Error(`Status check failed (${statusResponse.status}): ${errorText}`);
}
const statusData = await statusResponse.json();
if (statusData.status === 'completed') {
console.log(`Generation completed in ${statusData.processingTime ? statusData.processingTime + 'ms' : 'unknown time'}`);
return statusData.reply;
} else if (statusData.status === 'error') {
throw new Error(`Generation error: ${statusData.error}`);
} else if (statusData.status === 'processing') {
// Call progress callback if provided
if (onProgress && typeof onProgress === 'function') {
onProgress({
progress: statusData.progress,
estimatedTimeRemaining: statusData.estimatedTimeRemaining,
attempts
});
}
// Apply exponential backoff with jitter
const jitter = Math.random() * 0.3 + 0.85; // 0.85-1.15
currentDelay = Math.min(currentDelay * backoffFactor * jitter, maxPollingDelay);
await new Promise(resolve => setTimeout(resolve, currentDelay));
} else {
throw new Error(`Unknown status: ${statusData.status}`);
}
}
throw new Error(`Maximum polling attempts (${maxAttempts}) reached`);
} catch (error) {
console.error('Text generation failed:', error);
throw error;
}
}
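The polling delays above grow geometrically until they hit `maxPollingDelay`. With jitter set aside, the schedule can be expressed as a pure function (a hypothetical helper for illustration, not part of the API):

```javascript
// Compute the polling delays generateText would use with jitter disabled:
// each delay is the previous one times backoffFactor, capped at maxDelay.
function backoffSchedule(initialDelay, factor, maxDelay, attempts) {
  const delays = [];
  let current = initialDelay;
  for (let i = 0; i < attempts; i++) {
    current = Math.min(current * factor, maxDelay);
    delays.push(current);
  }
  return delays;
}

// With the defaults above (1000ms, 1.5, 10000ms):
// 1500, 2250, 3375, 5062.5, 7593.75, 10000, ...
```

Isolating the schedule like this makes the polling behavior easy to unit-test without touching the network.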
Advanced Workflow: Embodied Character Interaction
This workflow orchestrates character generation and text-to-speech synthesis to create immersive character experiences:
sequenceDiagram
participant Client
participant CharAPI as Character Endpoint
participant StatusAPI as Status API
participant TTSAPI as Text-to-Speech API
Client->>CharAPI: POST /api/text/{character} with persona
Note over CharAPI: Character endpoint prioritizes consistency with persona
CharAPI-->>Client: {id: "request-id"}
activate Client
loop With exponential backoff
Client->>StatusAPI: GET /api/v2/status/{id}
alt Still processing
StatusAPI-->>Client: {status: "processing", progress: 0.6}
Note over Client: Update UI with generation progress
else Completed
StatusAPI-->>Client: {status: "completed", reply: "Character response"}
Note over Client: Proceed to speech synthesis
else Error
StatusAPI-->>Client: {status: "error", error: "Details"}
Note over Client: Handle error or retry
end
end
deactivate Client
Client->>TTSAPI: POST /api/audio/full
Note over TTSAPI: Match voice to character persona
TTSAPI-->>Client: Audio binary data
Note over Client: Display text and play audio simultaneously
Enterprise Implementation
/**
* Orchestrates a complete character interaction with text generation and voice synthesis
*
* @param {string} character - Character endpoint ("Witch", "Mage", "Succubus", "Lightning")
* @param {Array} messages - Conversation history with proper structure
* @param {string} persona - Detailed character persona
* @param {string} botName - Character name
* @param {Object} options - Advanced configuration options
* @returns {Promise<Object>} Object containing text and audio response
*/
async function characterInteraction(character, messages, persona, botName, options = {}) {
const {
voice = "af_nicole",
speed = 1.0,
apiKey = process.env.RUNANYTHING_API_KEY,
maxTokens = 150,
temperature = 0.7,
pollingStrategy = "exponential", // "fixed", "exponential", or "adaptive"
onProgress = null,
audioFormat = "mp3",
cacheResults = true,
fallbackVoice = "af_james",
retryOptions = {
maxRetries: 3,
initialDelay: 1000,
maxDelay: 8000
}
} = options;
// Track performance metrics
const metrics = {
startTime: Date.now(),
generationTime: null,
ttsTime: null,
totalTime: null,
attempts: {
generation: 0,
status: 0,
tts: 0
}
};
try {
// Generate a unique request key for potential caching
const requestKey = cacheResults ?
generateCacheKey(character, messages, persona, maxTokens, temperature) : null;
// Check cache if enabled
if (cacheResults && requestKey) {
const cachedResult = checkResultCache(requestKey);
if (cachedResult) {
console.log('Returning cached character interaction');
return cachedResult;
}
}
// Step 1: Generate character response
console.log(`Initiating character interaction with ${character}`);
// Keep track of generation attempts for retries
let generationAttempt = 0;
let generationSuccess = false;
let generationId;
while (generationAttempt < retryOptions.maxRetries && !generationSuccess) {
try {
metrics.attempts.generation++;
generationAttempt++;
const genResponse = await fetch(`https://api.runanythingai.com/api/text/${character}`, {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${apiKey}`
},
body: JSON.stringify({
messages,
persona,
botName,
samplingParams: {
max_tokens: maxTokens,
temperature
}
})
});
if (!genResponse.ok) {
const errorBody = await genResponse.text();
throw new Error(`API error (${genResponse.status}): ${errorBody}`);
}
const data = await genResponse.json();
generationId = data.id;
if (!generationId) {
throw new Error('No request ID returned from API');
}
generationSuccess = true;
} catch (error) {
if (generationAttempt >= retryOptions.maxRetries) {
throw new Error(`Generation failed after ${retryOptions.maxRetries} attempts: ${error.message}`);
}
console.warn(`Generation attempt ${generationAttempt} failed, retrying: ${error.message}`);
// Exponential backoff, capped at the configured maxDelay
await sleep(Math.min(retryOptions.initialDelay * Math.pow(2, generationAttempt - 1), retryOptions.maxDelay));
}
}
// Step 2: Poll for completion using the appropriate strategy
let characterResponse;
let completed = false;
let statusAttempts = 0;
let currentDelay = 1000; // Start with 1 second
const pollStart = Date.now();
while (!completed && statusAttempts < 50) { // Safety cap at 50 attempts
metrics.attempts.status++;
statusAttempts++;
try {
const statusResponse = await fetch(`https://api.runanythingai.com/api/v2/status/${generationId}`, {
headers: { 'Authorization': `Bearer ${apiKey}` }
});
if (!statusResponse.ok) {
// Special handling for rate limiting
if (statusResponse.status === 429) {
console.warn('Rate limit hit during status check');
await sleep(Math.min(currentDelay * 2, 10000)); // Longer wait on rate limit
continue;
}
throw new Error(`Status check failed: ${statusResponse.status}`);
}
const statusData = await statusResponse.json();
if (statusData.status === 'completed') {
completed = true;
characterResponse = statusData.reply;
metrics.generationTime = Date.now() - metrics.startTime;
console.log(`Character response generated in ${metrics.generationTime}ms`);
// Trigger progress callback with completion status
if (onProgress) {
onProgress({
stage: 'generation',
status: 'completed',
progress: 1,
response: characterResponse
});
}
} else if (statusData.status === 'error') {
throw new Error(`Generation error: ${statusData.error}`);
} else {
// Still processing
if (onProgress) {
onProgress({
stage: 'generation',
status: 'processing',
progress: statusData.progress ?? Math.min(statusAttempts / 30, 1), // Estimate when the API omits progress, capped at 1
estimatedRemaining: statusData.estimatedTimeRemaining
});
}
// Apply the selected polling strategy
if (pollingStrategy === 'fixed') {
await sleep(1000);
} else if (pollingStrategy === 'exponential') {
currentDelay = Math.min(currentDelay * 1.5, 8000);
await sleep(currentDelay);
} else if (pollingStrategy === 'adaptive') {
// Adaptive strategy based on progress indicators
if (statusData.progress && statusData.progress > 0.8) {
await sleep(500); // Shorter waits when almost done
} else if (statusData.progress && statusData.progress > 0.5) {
await sleep(1000); // Medium waits in the middle
} else {
currentDelay = Math.min(currentDelay * 1.3, 5000);
await sleep(currentDelay);
}
}
}
} catch (error) {
console.warn(`Status check attempt ${statusAttempts} failed: ${error.message}`);
if (statusAttempts >= 10) {
throw error;
}
// Backoff on errors
await sleep(Math.min(currentDelay * 2, 10000));
currentDelay *= 2;
}
}
if (!completed) {
throw new Error('Failed to get completion after maximum attempts');
}
// Step 3: Convert response to speech with retry logic
console.log('Generating speech for character response');
let ttsAttempt = 0;
let ttsSuccess = false;
let audioBlob;
while (ttsAttempt < retryOptions.maxRetries && !ttsSuccess) {
try {
metrics.attempts.tts++;
ttsAttempt++;
if (onProgress) {
onProgress({
stage: 'tts',
status: 'processing',
attempt: ttsAttempt
});
}
const ttsResponse = await fetch('https://api.runanythingai.com/api/audio/full', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': `Bearer ${apiKey}`
},
body: JSON.stringify({
text: characterResponse,
voice: ttsAttempt > 1 && fallbackVoice ? fallbackVoice : voice,
speed,
format: audioFormat
})
});
if (!ttsResponse.ok) {
throw new Error(`TTS failed with status: ${ttsResponse.status}`);
}
audioBlob = await ttsResponse.blob();
ttsSuccess = true;
metrics.ttsTime = Date.now() - (metrics.startTime + metrics.generationTime);
} catch (error) {
if (ttsAttempt >= retryOptions.maxRetries) {
console.error(`TTS generation failed after ${retryOptions.maxRetries} attempts`);
// Continue with text-only response but mark TTS as failed
break;
}
console.warn(`TTS attempt ${ttsAttempt} failed, retrying: ${error.message}`);
await sleep(Math.min(retryOptions.initialDelay * Math.pow(1.5, ttsAttempt - 1), retryOptions.maxDelay));
}
}
// Step 4: Return the complete interaction result
metrics.totalTime = Date.now() - metrics.startTime;
const result = {
text: characterResponse,
audio: ttsSuccess ? audioBlob : null,
metrics: {
generationTime: metrics.generationTime,
ttsTime: metrics.ttsTime,
totalTime: metrics.totalTime,
attempts: metrics.attempts
},
character: {
type: character,
name: botName
}
};
// Cache results if enabled
if (cacheResults && requestKey && ttsSuccess) {
storeInResultCache(requestKey, result);
}
return result;
} catch (error) {
console.error('Character interaction failed:', error);
throw error;
}
}
// Helper functions
function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
// Simple in-memory result cache (a sketch; swap in Redis or similar for production)
const resultCache = new Map();
function generateCacheKey(character, messages, persona, maxTokens, temperature) {
// Key on the last few messages plus the parameters that affect the output
const recentMessages = JSON.stringify(messages.slice(-3));
return `${character}|${maxTokens}|${temperature}|${persona || ''}|${recentMessages}`;
}
function checkResultCache(key) {
return resultCache.get(key) || null;
}
function storeInResultCache(key, result) {
resultCache.set(key, result);
}
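The adaptive polling branch inside `characterInteraction` can be factored into a pure helper (a sketch using our own names, not part of the API), which makes the strategy easier to reason about and test:

```javascript
// Mirror of the adaptive strategy above: short waits near completion,
// medium waits mid-flight, and growing waits (capped at 5s) when the
// reported progress is low or absent.
function adaptiveDelay(progress, currentDelay) {
  if (progress > 0.8) return { wait: 500, next: currentDelay };
  if (progress > 0.5) return { wait: 1000, next: currentDelay };
  const next = Math.min(currentDelay * 1.3, 5000);
  return { wait: next, next };
}
```

Returning both the wait and the updated delay keeps the caller's backoff state explicit instead of hiding it in shared mutable variables.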
Specialty Workflow: Parallel Processing with Fan-out/Fan-in Pattern
For high-volume applications requiring concurrent character interactions or multiple TTS conversions:
sequenceDiagram
participant Client
participant Orchestrator
participant CharAPI as Character APIs
participant StatusAPI as Status API
participant TTSAPI as TTS API
Client->>Orchestrator: Request batch processing
activate Orchestrator
par Character 1
Orchestrator->>CharAPI: Generate (Character 1)
CharAPI-->>Orchestrator: ID 1
and Character 2
Orchestrator->>CharAPI: Generate (Character 2)
CharAPI-->>Orchestrator: ID 2
end
par Status Check 1
loop Until complete
Orchestrator->>StatusAPI: Check status (ID 1)
StatusAPI-->>Orchestrator: Status/Result 1
end
and Status Check 2
loop Until complete
Orchestrator->>StatusAPI: Check status (ID 2)
StatusAPI-->>Orchestrator: Status/Result 2
end
end
par TTS 1
Orchestrator->>TTSAPI: Convert to speech (Result 1)
TTSAPI-->>Orchestrator: Audio 1
and TTS 2
Orchestrator->>TTSAPI: Convert to speech (Result 2)
TTSAPI-->>Orchestrator: Audio 2
end
Orchestrator->>Client: Combined results
deactivate Orchestrator
Parallel Processing Implementation
/**
* Process multiple character interactions in parallel with controlled concurrency
*
* @param {Array} interactionRequests - Array of character interaction requests
* @param {Object} options - Configuration options
* @returns {Promise<Array>} Results for all interactions
*/
async function batchProcessCharacterInteractions(interactionRequests, options = {}) {
const {
maxConcurrency = 5,
timeout = 300000,
apiKey = process.env.RUNANYTHING_API_KEY,
continueOnError = true
} = options;
console.log(`Processing ${interactionRequests.length} interactions with max concurrency of ${maxConcurrency}`);
// Track batch processing metrics
const batchMetrics = {
startTime: Date.now(),
completed: 0,
failed: 0,
totalRequests: interactionRequests.length
};
// Process in batches to control concurrency
const results = [];
// Process requests in chunks based on maxConcurrency
for (let i = 0; i < interactionRequests.length; i += maxConcurrency) {
const chunk = interactionRequests.slice(i, i + maxConcurrency);
const chunkPromises = chunk.map(async (request, index) => {
try {
const result = await characterInteraction(
request.character,
request.messages,
request.persona,
request.botName,
{
...request.options,
apiKey
}
);
batchMetrics.completed++;
return {
success: true,
requestIndex: i + index,
result
};
} catch (error) {
batchMetrics.failed++;
console.error(`Interaction ${i + index} failed:`, error.message);
if (!continueOnError) {
throw error;
}
return {
success: false,
requestIndex: i + index,
error: error.message
};
}
});
const chunkResults = await Promise.all(chunkPromises);
results.push(...chunkResults);
}
batchMetrics.totalTime = Date.now() - batchMetrics.startTime;
batchMetrics.successRate = (batchMetrics.completed / batchMetrics.totalRequests) * 100;
console.log(`Batch processing completed: ${batchMetrics.completed}/${batchMetrics.totalRequests} successful (${batchMetrics.successRate.toFixed(1)}%) in ${batchMetrics.totalTime}ms`);
return {
results: results.sort((a, b) => a.requestIndex - b.requestIndex),
metrics: batchMetrics
};
}
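The concurrency control above comes down to slicing the request list into fixed-size groups, each awaited in full before the next begins. That slicing can be shown in isolation (an illustrative helper, not part of the function's surface):

```javascript
// Split an array into groups of at most maxConcurrency items; the loop in
// batchProcessCharacterInteractions awaits each group (via Promise.all)
// before starting the next one.
function chunkRequests(requests, maxConcurrency) {
  const chunks = [];
  for (let i = 0; i < requests.length; i += maxConcurrency) {
    chunks.push(requests.slice(i, i + maxConcurrency));
  }
  return chunks;
}
```

One trade-off of chunked batching: a single slow request stalls its entire chunk. A worker-pool pattern that starts a new request as soon as any slot frees up would keep concurrency at the cap throughout the batch.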
Enterprise API Integration Best Practices
Ensure production reliability and optimal performance by following these advanced integration practices:
- Multi-layered Error Resilience
  - Implement circuit breaker patterns to prevent cascading failures
  - Design graceful degradation pathways for each component failure
  - Create comprehensive error taxonomies for accurate diagnosis
  - Establish error budgets and monitoring thresholds by endpoint
- Performance Engineering
  - Implement predictive prefetching for common interaction patterns
  - Utilize smart caching with TTL strategies based on content volatility
  - Design parallel processing pipelines with controlled concurrency
  - Employ result reuse strategies for similar queries across users
- User Experience Optimization
  - Implement progressive response rendering with placeholder animations
  - Display estimated completion times based on model and query complexity
  - Design contextual loading states that match character personalities
  - Create fallback response libraries for offline or degraded operations
- Resource Orchestration
  - Implement token usage forecasting to prevent quota exhaustion
  - Design adaptive rate limiting based on response times and error rates
  - Create prioritization schemas for critical vs. non-critical requests
  - Establish request budgeting across different application components
- Security and Compliance
  - Never expose API keys in client-side code
  - Implement request signing for sensitive operations
  - Design content moderation pipelines for user-influenced generations
  - Create comprehensive audit trails for AI-generated content
- Monitoring and Analytics
  - Track end-to-end latency across the full interaction lifecycle
  - Establish baseline performance metrics for each character endpoint
  - Monitor character consistency and drift over large interaction sets
  - Analyze user engagement patterns with different character types
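The circuit-breaker pattern from the first practice above can be sketched in a few lines (illustrative only; the class and method names are our own, and the thresholds should be tuned per endpoint):

```javascript
// Minimal circuit breaker: opens after `threshold` consecutive failures,
// rejects calls immediately while open, and allows a trial call ("half-open")
// once `cooldownMs` has elapsed. A successful call closes the circuit.
class CircuitBreaker {
  constructor(threshold = 5, cooldownMs = 30000) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.failures = 0;
    this.openedAt = null;
  }

  get state() {
    if (this.openedAt === null) return 'closed';
    return Date.now() - this.openedAt >= this.cooldownMs ? 'half-open' : 'open';
  }

  async call(fn) {
    if (this.state === 'open') {
      throw new Error('Circuit open, request rejected');
    }
    try {
      const result = await fn();
      this.failures = 0;   // Success closes the circuit and resets the count
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }
}
```

Wrapping the generation and TTS fetches in a breaker like this keeps a degraded endpoint from absorbing your entire request budget while it recovers.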