Model licenses and attribution notices stay public.

Vox Jot uses third-party speech, OCR, and audio models. This page publishes the license and attribution notices users need, so that nobody has to access a private source repository to find them.

Last Updated

May 15, 2026

Ownership and distribution

Vox Jot does not claim ownership of third-party model weights. Unless a model is listed as a platform runtime, Vox Jot either downloads the upstream model directly or mirrors/converts assets for app-managed installation.

App-managed mirrors and converted assets should preserve upstream license files, model cards, notices, source links, attribution, and lineage metadata where available.
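As a rough illustration of the preservation rule above, a mirror manifest could be checked for missing items before it is published. This is only a sketch: the field names and the `missing_notice_fields` helper are hypothetical and do not describe Vox Jot's actual mirror metadata.

```python
# Hypothetical field names; the real mirror metadata format is not published here.
REQUIRED_FIELDS = (
    "license_file",
    "model_card",
    "notices",
    "source_link",
    "attribution",
    "lineage",
)


def missing_notice_fields(manifest: dict) -> list:
    """Return the preservation fields that are absent or empty in a manifest."""
    return [name for name in REQUIRED_FIELDS if not manifest.get(name)]


# Placeholder manifest for illustration only.
example = {
    "license_file": "LICENSE",
    "model_card": "README.md",
    "source_link": "https://example.com/upstream-model",
    "attribution": "Example Upstream Authors",
}
print(missing_notice_fields(example))  # ['notices', 'lineage']
```

Because the rule applies "where available", a reported field is a prompt to check the upstream release, not necessarily an error.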

Notice rules

  • MIT, BSD, and Apache-2.0 models require the copyright notice and license text to be preserved.
  • CC-BY models require attribution, license link, source link, and conversion or mirror notes.
  • Gated models require upstream provider terms acceptance before download.
  • Non-commercial, research-only, custom, or unknown licenses require separate legal/product review.
  • Voice cloning models require permission for any reference voice sample, regardless of model license.
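The rules above can be read as a small lookup from a catalog row to its obligations. The sketch below assumes a hypothetical `CatalogRow` shape and treats every license identifier starting with `CC-BY` (including NC and SA variants) as part of the CC-BY family; neither assumption comes from the real catalog.

```python
from dataclasses import dataclass

# Assumption: license identifiers roughly follow SPDX-style spelling.
PERMISSIVE_PREFIXES = ("MIT", "BSD", "Apache-2.0")
REVIEW_KINDS = {"non-commercial", "research-only", "custom", "unknown"}


@dataclass
class CatalogRow:
    """Hypothetical catalog row; the real catalog schema may differ."""
    name: str
    license_id: str           # e.g. "MIT", "CC-BY-4.0"
    kind: str = "standard"    # or one of REVIEW_KINDS
    gated: bool = False
    voice_cloning: bool = False


def notice_requirements(row: CatalogRow) -> list:
    """Map one row to the notice rules, in the order they are listed."""
    reqs = []
    if row.license_id.startswith(PERMISSIVE_PREFIXES):
        reqs.append("preserve copyright and license text")
    if row.license_id.startswith("CC-BY"):
        reqs.append("attribution, license link, source link, conversion or mirror notes")
    if row.gated:
        reqs.append("accept upstream provider terms before download")
    if row.kind in REVIEW_KINDS:
        reqs.append("separate legal/product review")
    if row.voice_cloning:
        reqs.append("permission for any reference voice sample")
    return reqs


row = CatalogRow("Coqui XTTS v2", "Coqui Public Model License",
                 kind="non-commercial", gated=True, voice_cloning=True)
for item in notice_requirements(row):
    print("-", item)
```

The rules are additive: a single row can trigger several of them at once, which is why the function collects a list instead of returning on the first match.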

Speech-to-text and file ASR

  • OpenAI Whisper / whisper.cpp assets - MIT. Sources: openai/whisper and ggerganov/whisper.cpp.
  • NVIDIA Parakeet - CC-BY-4.0; attribution required. Vox Jot may distribute converted or quantized assets through its managed model mirror.
  • Useful Sensors Moonshine - MIT.
  • FunAudioLLM SenseVoice Small - custom model license requiring review before normal commercial distribution.
  • GigaAM v3 - MIT.
  • Breeze ASR 25 whisper.cpp conversion - Apache-2.0.
  • Qwen3 ASR MLX conversions - Apache-2.0.
  • FireRedASR2 AED MLX conversion - Apache-2.0 per current catalog policy.
  • Microsoft VibeVoice ASR MLX conversion - MIT.
  • Apple SpeechAnalyzer - platform runtime governed by Apple platform terms.

Speech analysis and speaker isolation

  • IBM Granite Speech 4.1 2B - Apache-2.0.
  • Cohere Transcribe 03-2026 - Apache-2.0 with gated provider terms.
  • pyannote Speaker Diarization Community-1 - CC-BY-4.0; attribution required, and gated Hugging Face terms apply.
  • pyannote Speaker Diarization 3.1 - MIT; gated Hugging Face terms apply.
  • NVIDIA Sortformer v2.1 MLX conversion - CC-BY-4.0; attribution required.
  • NVIDIA Sortformer v1 - CC-BY-NC-4.0; gated before download for non-commercial terms acknowledgement.
  • Rev.ai Reverb Diarization V2 - custom/gated terms; gated before download.
  • Polyvoice ONNX diarization uses WeSpeaker embeddings and Silero VAD assets.

Text-to-speech

  • Qwen3 TTS - Apache-2.0.
  • OpenVoice - MIT.
  • Coqui XTTS v2 - Coqui Public Model License; gated before download for non-commercial terms and reference-voice consent.
  • Supertonic 3 - OpenRAIL-M; requires legal/product review before normal commercial distribution.
  • Resemble AI Chatterbox - MIT.
  • Kokoro - Apache-2.0.
  • Dia - Apache-2.0.
  • Sesame CSM - Apache-2.0.
  • Spark TTS - CC-BY-NC-SA-4.0; gated before download for non-commercial terms.
  • Llama OuteTTS - CC-BY-NC-SA-4.0; gated before download for non-commercial terms and reference-voice consent.
  • Ming Omni TTS - Apache-2.0.
  • KugelAudio - MIT.
  • Bark Small MLX conversion - MIT.
  • Fish Audio S2 Pro - Fish Audio Research License; gated before download, and commercial use requires a separate license.
  • Liquid LFM2.5 Audio 1.5B - LFM Open License v1.0; gated before download for custom license review.
  • LongCat AudioDiT - MIT.
  • Soprano - Apache-2.0.
  • MeloTTS English MLX - MIT.
  • Higgs Audio v2 - Apache-2.0.
  • MOSS-TTS Nano - Apache-2.0.
  • Irodori TTS - MIT.
  • IndexTTS - Apache-2.0.
  • OmniVoice - Apache-2.0.
  • VibeVoice Realtime - MIT.
  • Voxtral TTS - CC-BY-NC-4.0; gated before download for non-commercial terms.
  • VoxCPM2 - Apache-2.0.
  • Pocket TTS - CC-BY-4.0; attribution required.
  • k2-fsa sherpa-onnx TTS runtime and VITS voice packs - preserve upstream runtime and voice-pack notices.

OCR

  • PaddlePaddle PP-OCRv5 - Apache-2.0.
  • LightOnOCR-2 1B - Apache-2.0.
  • Allen AI olmOCR-2 7B - Apache-2.0.
  • GLM-OCR - MIT.
  • Chandra OCR 2 - Modified OpenRAIL-M; gated before download for custom license review.
  • Qwen2.5-VL OCR - Qwen Research License; gated before download for non-commercial terms.
  • Tesseract tessdata_best - Apache-2.0.

Known review-required models

The following catalog rows need legal/product review before normal commercial distribution, or should be explicitly gated: Coqui XTTS v2, Fish Audio S2 Pro, Spark TTS, Llama OuteTTS, Voxtral TTS, NVIDIA Sortformer v1, Chandra OCR 2, Qwen2.5-VL OCR, Rev.ai Reverb Diarization V2, Liquid LFM2.5 Audio 1.5B, FunAudioLLM SenseVoice Small, Supertonic 3, and any row whose current license is Other, Custom, Research-only, or missing.
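A minimal sketch of how the catch-all in the paragraph above might be checked, assuming a hypothetical `needs_review` helper. The name set is deliberately abbreviated to a few of the listed rows and is not the complete list.

```python
from typing import Optional

# Abbreviated subset of the rows named above, for illustration only.
REVIEW_REQUIRED = {
    "Fish Audio S2 Pro",
    "Spark TTS",
    "Voxtral TTS",
    "Chandra OCR 2",
}

# License labels that trigger review regardless of the model name.
CATCH_ALL_LABELS = {"other", "custom", "research-only"}


def needs_review(name: str, license_label: Optional[str]) -> bool:
    """True if a row is listed by name or its license label is a catch-all or missing."""
    if name in REVIEW_REQUIRED:
        return True
    label = (license_label or "").strip().lower()
    return label == "" or label in CATCH_ALL_LABELS


print(needs_review("Spark TTS", "CC-BY-NC-SA-4.0"))  # True
print(needs_review("Kokoro", "Apache-2.0"))          # False
print(needs_review("Unlisted Model", None))          # True
```

Treating a missing label the same as "Custom" keeps a row with incomplete metadata from slipping through as distributable by default.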