use shrp from your scripts, automations, and ai agents. build with a speech-to-text api, audio transcription api, video transcription api, text-to-speech api, and youtube transcript api using bearer token auth and predictable json responses.
generate your api key to get started
free accounts get 2 keys. authenticated requests only — no anonymous access.
all requests require a bearer token in the Authorization header. generate your key at /dashboard/api-keys. keys begin with shrp_live_ and are shown only once — store them securely.
auth header
Authorization: Bearer shrp_live_<your-key>
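a minimal python sketch of attaching the bearer header to a json POST request. the host in `BASE_URL` and the helper name `build_request` are hypothetical (this page does not state the api host); only the header format and the `shrp_live_` prefix come from the text above.

```python
import json
import urllib.request

# Hypothetical host: the page does not state where the API is served from.
BASE_URL = "https://shrp.example"
KEY_PREFIX = "shrp_live_"


def build_request(path: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError("SHRP keys are expected to begin with shrp_live_")
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be inspected without network access.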
200: request succeeded
401: missing or invalid api key
429: rate limit or quota reached
400: bad request; check field names and values
404: resource not found (e.g. no transcript available)
502 / 504: upstream provider error or timeout
// endpoints
POST /api/v1/youtube-transcript (live)
youtube transcript api for retrieving the available transcript from a video url. returns readable text or srt subtitle format. no credits charged — rate limited to 20 requests per key per day.
POST /api/v1/tts (live)
text-to-speech api for generating natural voice audio with google neural voices. returns base64-encoded mp3 audio and uses your existing tts character quota.
POST /api/v1/transcribe (live)
speech-to-text api for audio transcription and video transcription via assemblyai. returns transcript text, duration, word count, confidence, and optional speaker labels.
{
"success": true,
"videoId": "dQw4w9WgXcQ",
"language": "en",
"transcript": "We're no strangers to love...",
"format": "text",
"wordCount": 312
}
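the json above appears to be a sample youtube transcript response. a small sketch of reading it into a typed object in python follows; the class and function names are hypothetical, and the field names are taken only from that sample, so a real response may carry more fields.

```python
import json
from dataclasses import dataclass


@dataclass
class YouTubeTranscript:
    video_id: str
    language: str
    transcript: str
    format: str
    word_count: int


def parse_transcript_response(body: str) -> YouTubeTranscript:
    """Parse a response body shaped like the sample shown above."""
    data = json.loads(body)
    if not data.get("success"):
        raise ValueError("response did not report success")
    return YouTubeTranscript(
        video_id=data["videoId"],
        language=data["language"],
        transcript=data["transcript"],
        format=data["format"],
        word_count=data["wordCount"],
    )
```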
// text-to-speech api
VOICES: 2,000+ google voices
OUTPUT: base64 mp3
QUOTA: shared with web ui
METHOD: POST
REQUEST BODY: json
{
"text": "Hello from SHRP.", // required — the text to synthesize
"voiceName": "en-US-Neural2-J", // required — see /api/tts/voices for options
"languageCode": "en-US", // required — BCP-47 language code
"speed": 1.0 // optional — 0.25 to 4.0, default 1.0
}
to list available voices and their language codes, call GET /api/tts/voices (no auth required). quota is shared with your web ui usage — the same monthly character allowance applies.
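a short python sketch of assembling the tts request body and decoding the returned audio. the field names and the 0.25 to 4.0 speed range come from the request body above; the helper names are hypothetical, and because this page does not name the response field that carries the base64 audio, `decode_mp3` takes the encoded string directly.

```python
import base64


def build_tts_payload(text: str, voice_name: str, language_code: str,
                      speed: float = 1.0) -> dict:
    """Build the JSON body for POST /api/v1/tts, checking the documented limits."""
    if not text:
        raise ValueError("text is required")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "text": text,
        "voiceName": voice_name,
        "languageCode": language_code,
        "speed": speed,
    }


def decode_mp3(audio_base64: str) -> bytes:
    """Turn the base64-encoded MP3 string from a response into raw bytes."""
    return base64.b64decode(audio_base64)
```

the decoded bytes can be written to a file opened in binary mode, for example `open("out.mp3", "wb")`.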
file=@meeting.mp3 // required — audio or video file
language=en // optional — BCP-47 language code
speaker_labels=true // optional — Starter/Pro only
this audio transcription api returns transcript json directly for uploaded audio and video files. it does not store your uploaded file or create a dashboard history item. use the web upload flow if you need saved transcription history.
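a sketch of encoding the form fields above as a multipart/form-data body using only the python standard library. the field names (`file`, `language`, `speaker_labels`) come from the list above; the function name and the `audio/mpeg` default content type are assumptions made for illustration.

```python
import uuid


def encode_multipart(fields: dict, filename: str, file_bytes: bytes,
                     file_content_type: str = "audio/mpeg"):
    """Return (body, content_type_header) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields such as language and speaker_labels.
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n".encode("utf-8")
        )
    # The uploaded audio or video file goes in the "file" field.
    parts.append(
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {file_content_type}\r\n\r\n".encode("utf-8")
    )
    parts.append(file_bytes)
    parts.append(f"\r\n--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"
```

the second return value is meant for the `Content-Type` request header, sent alongside the `Authorization` header.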
when a limit is hit, the api returns 429 with an error field describing what was exceeded. credits and quotas are tied to the user account associated with the api key — not to the key itself.
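one way to turn a status code and body into a readable error in python, sketched under assumptions: the `error` field is documented here only for 429, so reading it on other statuses is a guess, and the fallback messages reuse the status descriptions listed earlier. `ShrpApiError` and `check_response` are hypothetical names.

```python
import json

# Fallback descriptions, copied from the status codes listed earlier on this page.
STATUS_MEANINGS = {
    400: "bad request",
    401: "missing or invalid api key",
    404: "resource not found",
    429: "rate limit or quota reached",
    502: "upstream provider error or timeout",
    504: "upstream provider error or timeout",
}


class ShrpApiError(Exception):
    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def check_response(status: int, body: str) -> dict:
    """Return the parsed JSON on 200, otherwise raise ShrpApiError."""
    try:
        data = json.loads(body) if body else {}
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    if status == 200:
        return data
    # Prefer the server's "error" field; fall back to the documented meaning.
    message = data.get("error") or STATUS_MEANINGS.get(status, "unexpected status")
    raise ShrpApiError(status, message)
```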
// why build with shrp
no infra to manage
google cloud tts, assemblyai, and youtube transcript retrieval handled for you. one api, one key.
works in any language
60+ languages for tts, 100+ transcription languages, and youtube transcripts in whatever language the video was recorded in.
designed for agents
all responses are clean json. consistent error codes. predictable schema. easy to chain into llm pipelines and automations.
same quota as the ui
api usage and web ui usage share the same character balance. no double billing, no separate api quota to track.
// mcp — for ai agents
if you are building with claude, cursor, or windsurf, use the mcp server instead of the rest api. your ai agent calls shrp tools natively — no fetch calls, no response parsing, no glue code.
mcp uses the same api key and same quota as the rest api — not separate auth, not separate billing. full mcp docs →
// mcp server for ai agents
mcp server
use shrp directly inside claude, cursor, and windsurf via the model context protocol, with no fetch calls and no glue code. your ai agent calls youtube_transcript and text_to_speech as native tools. speech-to-text is available today through the rest api; the mcp transcribe tool will be added separately once file upload semantics are finalized.
Does SHRP have a speech-to-text API?
Yes. POST /api/v1/transcribe accepts multipart audio or video uploads and returns transcript text, confidence, duration, word count, and optional speaker-labelled utterances.
Does SHRP have a text-to-speech API?
Yes. POST /api/v1/tts converts text to speech and returns base64 MP3 audio using the same SHRP TTS quota as the web app.
Can I extract YouTube transcripts with the API?
Yes. POST /api/v1/youtube-transcript retrieves the available transcript for a YouTube video as plain text or SRT.
Is the API designed for AI agents?
Yes. SHRP returns predictable JSON responses and uses bearer token authentication, so it works well in scripts, automations, and AI agent workflows.
api usage draws from your existing shrp plan and credits — no separate api subscription required.