Pipeline API
Pretok
The main pipeline class for processing text.
Constructor
Pretok(
config: PretokConfig | None = None,
*,
target_language: str | None = None,
model_id: str | None = None,
detector: BaseDetector | None = None,
translator: BaseTranslator | None = None,
cache: Cache | None = None,
)
Parameters:
config: Full configuration objecttarget_language: Target language for translation (overrides config)model_id: Model ID for capability lookupdetector: Custom detector instancetranslator: Custom translator instancecache: Custom cache instance
Methods
process
Process text through the pipeline.
Parameters:
- text: Input text to process
Returns: PipelineResult with processed text and metadata
detect
Detect language of text.
Parameters:
- text: Text to analyze
Returns: DetectionResult with language and confidence
translate
def translate(
self,
text: str,
target_language: str,
source_language: str | None = None,
) -> TranslationResult
Translate text to target language.
Parameters:
- text: Text to translate
- target_language: Target language code
- source_language: Source language code (optional)
Returns: TranslationResult with translated text
PipelineResult
Result from pipeline processing.
@dataclass
class PipelineResult:
original_text: str
processed_text: str
segments: list[Segment]
detections: list[DetectionResult]
translations: list[TranslationResult]
from_cache: bool
metadata: dict[str, Any]
@property
def was_modified(self) -> bool: ...
Factory Function
create_pretok
def create_pretok(
model_id: str | None = None,
target_language: str | None = None,
**kwargs,
) -> Pretok
Factory function to create a Pretok instance.
Parameters:
- model_id: Model ID for capability lookup
- target_language: Target language (overrides model's primary language)
- **kwargs: Additional arguments passed to Pretok
Returns: Configured Pretok instance