Prerequisites: You need an existing Vapi or Retell AI voice agent. If you don’t have one yet, create one first in your platform of choice.
Step 1: Create Your Relyable Account
1
Sign Up
Visit app.relyable.ai and create your account. You’ll be taken to your workspace dashboard.
2
Create a Workspace
If you’re starting fresh, you’ll create your first workspace. Think of workspaces as separate environments for different clients or projects.
Step 2: Import Your Voice Agent
1
Navigate to Agents
In your Relyable dashboard, click on Agents in the top navigation, then click Create New Agent.
2
Select Your Platform
Choose your AI provider:
- Vapi
- Retell AI
3
Get Your Credentials
You’ll need two things from your voice platform: an API key and your Assistant/Agent ID.
For Vapi:
- Go to your Vapi Dashboard
- Navigate to Settings → API Keys
- Copy your Private Key
- Go to your assistant and copy the Assistant ID from the URL or settings
For Retell AI:
- Go to your Retell dashboard
- Get your API key from Settings
- Copy your agent ID
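If you copy the Assistant ID from a Vapi URL, it is easy to pick up extra path segments or query parameters. The sketch below pulls the ID out of a pasted URL. It assumes the Assistant ID is a standard UUID, which you should confirm in your own dashboard. The function name and the example URL are hypothetical, and the helper does not apply to IDs in other formats.

```python
import re
from typing import Optional

# Matches a standard 8-4-4-4-12 hexadecimal UUID anywhere in a string.
# Assumption: the Assistant ID is a UUID; check this against your dashboard.
UUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def extract_assistant_id(text: str) -> Optional[str]:
    """Return the first UUID found in a pasted URL or ID (lowercased), else None."""
    match = UUID_PATTERN.search(text.strip())
    return match.group(0).lower() if match else None


if __name__ == "__main__":
    # Hypothetical URL, used only to illustrate the extraction.
    pasted = "https://dashboard.example.com/assistants/3F2504E0-4F89-11D3-9A0C-0305E82C3301?tab=settings"
    print(extract_assistant_id(pasted))
```

A result of None means no UUID was found in the pasted text, so copy the ID from the assistant's settings instead.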
4
Import the Agent
Paste your API key and Assistant/Agent ID into Relyable. Click Next.
Relyable will automatically:
- Import your agent configuration
- Pull in your system prompt
- Sync your phone number (if configured)
- Load your agent settings
5
Verify Import
You should see your agent’s prompt displayed in Relyable. Review it to make sure everything synced correctly, then click Create Agent.
Step 3: Generate Test Cases
Test cases are the criteria Relyable uses to evaluate whether your agent is following your instructions correctly.
1
Navigate to Test Cases
Click on Test Cases in your agent’s navigation menu.
2
Generate AI Test Cases
Click Generate Test Cases.
Relyable’s AI will analyze your prompt and suggest 15-20 test cases. For example:
- “Agent must introduce itself as Emily from Inflate Real Estate”
- “Agent must ask for the caller’s full name and spell it back letter by letter”
- “Agent must confirm the address by reading it back to the caller”
3
Review and Customize
Go through each test case and:
- Verify it’s accurate for your use case
- Adjust the priority level (Critical, High, Medium, Low)
- Edit the description if needed
- Add custom test cases for specific scenarios
Example Test Cases
Here are some examples of well-written test cases:
Critical Priority:
- “Agent must transfer emergency calls to a human operator within 10 seconds”
- “Agent must capture and confirm the customer’s email address before ending the call”
- “Agent must spell back the customer’s name letter by letter for confirmation”
- “Agent must follow the conversation flow step-by-step without skipping questions”
- “Agent should maintain a friendly and professional tone throughout the call”
- “Agent should handle interruptions gracefully and not lose context”
- “Agent should use the caller’s name at least once during the conversation”
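If you keep your own copy of the test cases (for review or version control), a small record type with an ordered priority makes them easy to sort. This is a minimal sketch, not Relyable's data model: the class names, field names, and the Medium priority assigned to the second example are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Priority(IntEnum):
    """The four priority levels from the guide; lower value means more urgent."""
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


@dataclass(frozen=True)
class AgentTestCase:
    """Hypothetical local record of one test case."""
    description: str
    priority: Priority


def by_priority(cases: List[AgentTestCase]) -> List[AgentTestCase]:
    """Return cases ordered from Critical to Low, keeping input order within a level."""
    return sorted(cases, key=lambda case: case.priority)


if __name__ == "__main__":
    cases = [
        # Priority chosen for illustration only.
        AgentTestCase("Agent should maintain a friendly and professional tone throughout the call", Priority.MEDIUM),
        AgentTestCase("Agent must transfer emergency calls to a human operator within 10 seconds", Priority.CRITICAL),
    ]
    for case in by_priority(cases):
        print(f"{case.priority.name}: {case.description}")
```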
Step 4: Create a Persona
Personas simulate different types of callers to stress-test your agent with realistic scenarios.
1
Navigate to Personas
Go to the Simulation section and click on the Personas tab.
2
Generate a Persona
Click Generate Personality and describe the type of caller you want to simulate.
Examples:
- “A 65-year-old male who is frustrated and has no patience”
- “A 28-year-old female professional who speaks quickly and is tech-savvy”
- “A non-native English speaker with a strong accent who speaks slowly”
3
Review the Generated Persona
Relyable will create a detailed persona with:
- Name and demographics
- Personality traits
- Communication style
- Speaking patterns
- Accent/voice characteristics
Step 5: Run Your First Test
1
Create a Scenario
Go to the Run tab in the Simulation section.
Click Generate Tests and:
- Select a persona (like the one you just created)
- Select which test cases to evaluate
- Let AI generate a realistic scenario, or write your own
Example scenario:
“You are Stanley Miller, a 79-year-old retired mechanic. Your kids are forcing you to sell your house you’ve lived in for 50 years because it’s getting too hard to manage. You’re calling the agency reluctantly and annoyed by the whole process.”
2
Run the Test
Select your scenario(s) and click Run Test.
You can:
- Run a single scenario multiple times
- Run multiple different scenarios at once
- Give your test run a descriptive name
3
Wait for Results
Relyable will make actual phone calls to your agent using the personas and scenarios. This typically takes 2-3 minutes per call.
You can see the progress in real time on the Results page.
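To budget time for a test run, you can multiply the number of calls by the 2-3 minute range above. The sketch below assumes calls run one after another. The guide does not say whether calls run in parallel, so treat the result as a rough upper estimate. The function name is hypothetical.

```python
from typing import Tuple


def estimate_run_minutes(
    scenarios: int,
    runs_per_scenario: int = 1,
    minutes_per_call: Tuple[int, int] = (2, 3),
) -> Tuple[int, int]:
    """Return (low, high) total minutes, assuming calls run sequentially."""
    if scenarios < 0 or runs_per_scenario < 0:
        raise ValueError("scenarios and runs_per_scenario must be non-negative")
    calls = scenarios * runs_per_scenario
    low, high = minutes_per_call
    return calls * low, calls * high


if __name__ == "__main__":
    low, high = estimate_run_minutes(scenarios=4, runs_per_scenario=3)
    print(f"Roughly {low}-{high} minutes for 12 calls")
```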
Step 6: Analyze Your Results
1
View Your Score
Once testing completes, you’ll see:
- Overall Score (aim for 70%+ before going to production)
- Average Latency (response time)
- Words Per Minute (speaking speed)
- Test Case Results (which ones passed/failed)
2
Review Failed Test Cases
Click on any failed test case to see:
- The exact conversation where it failed
- Why it failed (AI explanation)
- Suggestions for fixing it
Example failure explanation:
“Agent asked ‘Would you like to book a walkthrough?’ but skipped the required question ‘What is your timeline for moving in?’ The prompt requires asking about the timeline BEFORE offering the walkthrough.”
3
Listen to Recordings
Click on any conversation to:
- Listen to the full audio recording
- Read the full transcript
- See which test cases passed/failed in that specific call
Step 7: Iterate and Improve
1
Fix Your Prompt
Based on the failures, update your prompt in Vapi or Retell. Common fixes:
- Add step-by-step numbering to enforce question order
- Add explicit confirmation steps (repeat name, address back)
- Add handling for edge cases discovered in testing
2
Sync to Relyable
After updating your prompt in Vapi/Retell, go to your agent in Relyable and click Sync Prompt to pull the latest version.
3
Test Again
Run your tests again with the same scenarios to see if your score improved. Keep iterating until you consistently score 70%+.
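"Consistently" is worth pinning down before you start iterating. One possible rule, sketched below, is to require the last few runs to all meet the 70% threshold. The window of three runs and the function name are assumptions for illustration, not a Relyable setting.

```python
from typing import Sequence


def consistently_passing(
    scores: Sequence[float],
    threshold: float = 70.0,
    window: int = 3,
) -> bool:
    """Return True if the most recent `window` scores all meet the threshold."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = scores[-window:]
    # Fewer than `window` runs is not enough evidence yet.
    return len(recent) == window and all(score >= threshold for score in recent)


if __name__ == "__main__":
    history = [55, 68, 72, 75, 81]  # overall scores from successive test runs
    print(consistently_passing(history))
```

Rerunning the same scenarios between prompt changes keeps the scores comparable from one run to the next.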
Step 8: Enable Live Monitoring (Optional)
Once you’re happy with your test scores, enable live monitoring for production calls:
1
Enable Call Monitoring
Go to your agent’s Settings and toggle Enable Call Monitoring.
This adds a webhook to your Vapi/Retell agent that sends call data to Relyable.
2
Monitor Real Calls
Every real call to your agent will now be:
- Evaluated against your test cases
- Scored automatically
- Visible in your Relyable dashboard under the Monitoring tab
3
Get Alerted
Set up alerts for critical failures so you know immediately if your agent starts having issues in production.
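The rule behind a critical-failure alert can be as simple as "any failed test case marked Critical". The sketch below shows that rule applied to one call's results. The result shape (dicts with "test_case", "priority", and "passed" keys) is a hypothetical example, not Relyable's actual data format.

```python
from typing import Any, Dict, List


def critical_failures(results: List[Dict[str, Any]]) -> List[str]:
    """Return descriptions of failed test cases whose priority is Critical."""
    return [
        result["test_case"]
        for result in results
        if result["priority"] == "Critical" and not result["passed"]
    ]


if __name__ == "__main__":
    # Hypothetical evaluation results for a single monitored call.
    call_results = [
        {"test_case": "Transfer emergency calls to a human operator", "priority": "Critical", "passed": False},
        {"test_case": "Maintain a friendly and professional tone", "priority": "Medium", "passed": False},
        {"test_case": "Capture and confirm the customer's email", "priority": "Critical", "passed": True},
    ]
    failures = critical_failures(call_results)
    if failures:
        print("ALERT:", "; ".join(failures))
```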
What’s a Good Score?
Here’s what scores typically mean:
| Score | Status | What It Means |
|---|---|---|
| 90%+ | Excellent | Production-ready, minimal failures |
| 70-89% | Good | Acceptable for production with monitoring |
| 50-69% | Fair | Needs improvement before production |
| Below 50% | Poor | Significant issues, not production-ready |
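If you track scores in your own scripts or reports, the table above translates into a small lookup. This is a sketch with a hypothetical function name. It treats fractional scores between 89% and 90% as Good, which the table does not specify.

```python
def score_status(score: float) -> str:
    """Map an overall score (0-100) to the status labels in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Fair"
    return "Poor"


if __name__ == "__main__":
    for value in (95, 72, 55, 30):
        print(value, score_status(value))
```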
Next Steps
Deep Dive Development Guide
Learn advanced prompt engineering and testing strategies
API Integration
Automate testing with the Relyable API
Best Practices
Learn from production voice AI experts
Join Community
Connect with other Relyable users
Common Issues
My agent won't import
Make sure you’re using the correct API key and Assistant/Agent ID. For Vapi, you need the Private Key (not Public Key). Check that your agent is published in your voice platform.
Test calls aren't working
Verify that your agent has a phone number configured in Vapi/Retell. Relyable needs a phone number to call for testing.
Scores seem too low
This is normal! Most agents that seem to work fine in manual testing score 50-60% initially. This is why automated testing is so valuable: it finds issues you wouldn’t discover manually.
How do I improve my score?
Focus on test cases marked as Critical or High priority first. Read the AI explanations for why tests failed, update your prompt, sync, and test again. It’s an iterative process.
Need Help?
- Check out the full development guide for detailed prompt engineering tips
- Review the API documentation for programmatic access
- Reach out on X/Twitter @relyableai