use shrp from your scripts, automations, and ai agents. rest apis with bearer token auth and json responses. built for developers who want speech, voice, and transcript capabilities without building the infrastructure.
generate your api key to get started
free accounts get 2 keys. authenticated requests only; no anonymous access.
all requests require a bearer token in the Authorization header. generate your key at /dashboard/api-keys. keys begin with shrp_live_ and are shown only once, so store them securely.
auth header
Authorization: Bearer shrp_live_<your-key>
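a minimal sketch of attaching the header above to a json POST, using only the python standard library. the base url is a placeholder (this page lists paths only), and the Content-Type header and the `build_request` helper are assumptions for illustration, not part of the documented api.

```python
import json
import urllib.request

# Assumption: placeholder host; this page documents paths, not a base URL.
BASE_URL = "https://shrp.example"


def build_request(path: str, api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    # Keys are documented to begin with shrp_live_; fail early on anything else.
    if not api_key.startswith("shrp_live_"):
        raise ValueError("api key should begin with shrp_live_")
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: JSON bodies are sent with the usual content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

passing the returned object to `urllib.request.urlopen` would send it. read the key from an environment variable or secret store rather than hard-coding it, since it is shown only once.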
200: request succeeded
401: missing or invalid api key
429: rate limit or quota reached
400: bad request. check field names and values
404: resource not found (e.g. no captions)
502 / 504: upstream provider error or timeout
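one way a client might act on these codes, as a sketch. the codes and their meanings come from the list above; the four-way policy (`ok`, `retry`, `wait`, `fix`) and the `next_action` name are assumptions, not behaviour the api prescribes.

```python
# Status codes and meanings as listed on this page.
STATUS_MEANINGS = {
    200: "request succeeded",
    400: "bad request",
    401: "missing or invalid api key",
    404: "resource not found",
    429: "rate limit or quota reached",
    502: "upstream provider error or timeout",
    504: "upstream provider error or timeout",
}


def next_action(status: int) -> str:
    """Suggest what a caller should do next for a given status code."""
    if status == 200:
        return "ok"
    if status in (502, 504):
        # Upstream failures are often transient, so a retry may succeed.
        return "retry"
    if status == 429:
        # Retrying immediately will not help once a limit or quota is hit.
        return "wait"
    # 400, 401, 404 and anything unlisted: the request itself needs changing.
    return "fix"
```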
// endpoints
POST
/api/v1/youtube-transcript (live)
extract a transcript from any youtube video. returns plain text or srt subtitle format. no credits charged; rate limited to 20 requests per key per day.
POST
/api/v1/tts (live)
convert text to speech using google neural voices. returns base64-encoded mp3 audio. uses your existing tts character quota (same as the web ui).
POST
/api/v1/transcribe (coming soon)
submit an audio or video file url for ai transcription via assemblyai. async: returns a job id. polling endpoint coming with the same release.
// youtube transcript
RATE LIMIT
20 requests / key / day
CREDITS
none charged
FORMATS
text (default), srt
METHOD
POST
REQUEST BODY
json
{
  "url": "https://youtube.com/watch?v=...", // required: full URL or 11-char video ID
  "format": "text" // optional: "text" (default) or "srt"
}
RESPONSE
json
{
  "success": true,
  "videoId": "dQw4w9WgXcQ",
  "language": "en",
  "transcript": "We're no strangers to love...",
  "format": "text",
  "wordCount": 312
}
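the request and response shapes above can be handled with a small payload builder and parser. this is a sketch: `Transcript`, `transcript_payload`, and `parse_transcript` are illustrative names, and treating `wordCount` as optional (for example in srt responses) is an assumption, since only a text response is shown here.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transcript:
    video_id: str
    language: str
    text: str
    fmt: str
    word_count: Optional[int]


def transcript_payload(url_or_id: str, fmt: str = "text") -> dict:
    """Build the request body; fmt must be one of the documented formats."""
    if fmt not in ("text", "srt"):
        raise ValueError('format must be "text" or "srt"')
    return {"url": url_or_id, "format": fmt}


def parse_transcript(raw: str) -> Transcript:
    """Parse a response body shaped like the sample response."""
    data = json.loads(raw)
    if not data.get("success"):
        raise RuntimeError("transcript request did not succeed")
    return Transcript(
        video_id=data["videoId"],
        language=data["language"],
        text=data["transcript"],
        fmt=data["format"],
        # Assumption: wordCount may be absent, so it is read defensively.
        word_count=data.get("wordCount"),
    )
```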
// text to speech
VOICES
2,000+ google voices
OUTPUT
base64 mp3
QUOTA
shared with web ui
METHOD
POST
REQUEST BODY
json
{
  "text": "Hello from SHRP.", // required: the text to synthesize
  "voiceName": "en-US-Neural2-J", // required: see /api/tts/voices for options
  "languageCode": "en-US", // required: BCP-47 language code
  "speed": 1.0 // optional: 0.25 to 4.0, default 1.0
}
to list available voices and their language codes, call GET /api/tts/voices (no auth required). quota is shared with your web ui usage: the same monthly character allowance applies.
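a sketch of building the tts request body and decoding the returned audio. the field names and the speed range come from the request body above; the helper names are illustrative. this page does not name the response field that carries the base64 audio, so `decode_mp3` takes the base64 string directly rather than guessing at it.

```python
import base64


def tts_payload(text: str, voice_name: str, language_code: str,
                speed: float = 1.0) -> dict:
    """Build the request body, checking speed against the documented range."""
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "text": text,
        "voiceName": voice_name,
        "languageCode": language_code,
        "speed": speed,
    }


def decode_mp3(audio_b64: str) -> bytes:
    """Turn the base64-encoded audio string into raw mp3 bytes."""
    return base64.b64decode(audio_b64)
```

the decoded bytes can be written to disk with `open("out.mp3", "wb")`. since quota is counted in characters, checking `len(text)` before sending is a cheap way to estimate what a request will consume.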
when a limit is hit, the api returns 429 with an error field describing what was exceeded. credits and quotas are tied to the user account associated with the api key, not to the key itself.
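a sketch of surfacing that error field to the caller. the `LimitReached` exception and `raise_for_limit` helper are illustrative names, and it is an assumption that the error field sits at the top level of a json body.

```python
import json


class LimitReached(Exception):
    """Raised when the api reports a rate limit or quota as exceeded."""


def raise_for_limit(status: int, body: str) -> None:
    """Raise LimitReached, carrying the api's error text, on a 429."""
    if status != 429:
        return
    try:
        # Assumption: the body is a JSON object with a top-level "error" field.
        detail = json.loads(body).get("error", "")
    except (ValueError, AttributeError):
        # Body was not JSON, or not a JSON object; fall back to a generic message.
        detail = ""
    raise LimitReached(detail or "rate limit or quota reached")
```

because limits belong to the account rather than the key, rotating to a second key on the same account should not be expected to clear a 429.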
// why build with shrp
no infra to manage
google cloud tts, assemblyai, and youtube caption extraction handled for you. one endpoint, one key.
works in any language
60+ languages for tts. youtube transcripts in whatever language the video was recorded in.
designed for agents
all responses are clean json. consistent error codes. predictable schema. easy to chain into llm pipelines and automations.
same quota as the ui
api usage and web ui usage share the same character balance. no double billing, no separate api quota to track.
// mcp: for ai agents
if you are building with claude, cursor, or windsurf, use the mcp server instead of the rest api. your ai agent calls shrp tools natively: no fetch calls, no response parsing, no glue code.
mcp uses the same api key and the same quota as the rest api: no separate auth, no separate billing. see the full mcp docs.
// coming soon
/api/v1/transcribe
submit an audio or video file url for ai-powered transcription via assemblyai. async pattern: submit returns a job id, then poll the status endpoint until complete. supports speaker diarization, confidence scores, and 100+ languages. charges credits from your account.
in development. follow the changelog or check back soon.
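until the endpoint ships, the submit-then-poll pattern described above can only be sketched generically. everything specific here is hypothetical: the `poll_job` helper, the `status` field, and the `completed` / `failed` values are assumptions, and `fetch_status` stands in for whatever call the released status endpoint requires.

```python
import time
from typing import Callable


def poll_job(fetch_status: Callable[[], dict],
             interval: float = 2.0,
             max_attempts: int = 30,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the job reaches a terminal state or attempts run out."""
    for _ in range(max_attempts):
        job = fetch_status()
        # Hypothetical terminal states; the real values are not yet documented.
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval)
    raise TimeoutError("job did not finish within the allowed attempts")
```

capping the number of attempts keeps a stuck job from looping forever, and the injectable `sleep` makes the loop easy to test without waiting.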
mcp server
use shrp directly inside claude, cursor, and windsurf via the model context protocol: no fetch calls, no glue code. your ai agent calls youtube_transcript, text_to_speech, and transcribe as native tools.