Changelog
This document tracks all notable changes to the toksum library.
Version 0.7.0 (Current)
Major Documentation Enhancement:
- ✅ Comprehensive Sphinx Documentation - Added professional-grade documentation for all modules
- ✅ Enhanced Docstrings - All functions, classes, and methods now have detailed Google/NumPy style docstrings
- ✅ Rich Examples - Code examples for every major feature and use case
- ✅ API Reference - Complete API documentation with cross-references
- ✅ Usage Guides - Comprehensive examples including advanced patterns and integrations
- ✅ Error Handling - Detailed exception documentation with handling patterns
- ✅ Build Tools - Convenient documentation build and serve scripts
- ✅ Mobile Responsive - Read the Docs theme with search functionality
Documentation Features:
- 📚 200+ models from 25+ providers fully documented
- 💡 Rich code examples with syntax highlighting
- 🛡️ Comprehensive error handling patterns
- ⚡ Performance optimization guidance
- 🔗 Cross-referenced API documentation
- 📱 Mobile-responsive design
- 🔍 Full-text search functionality
Build System:
- Sphinx configuration with autodoc and Napoleon extensions
- Read the Docs theme integration
- Automated API reference generation
- Local development server support
Version 0.6.0 (Previous)
Features:
- Comprehensive support for 200+ models from 25+ providers
- Precise token counting for OpenAI models using tiktoken
- Intelligent approximation algorithms for all other providers
- Cost estimation with USD/INR currency support
- Chat message format token counting
- Case-insensitive model name matching
- Command-line interface with comprehensive options
Supported Providers:
- OpenAI (GPT-4, GPT-3.5, GPT-4o, O1 models, embeddings)
- Anthropic (Claude 3/3.5, Claude 2, Instant)
- Google (Gemini Pro/Flash, Gemini 1.5/2.0, PaLM)
- Meta (LLaMA 2/3/3.1/3.2/3.3 variants)
- Mistral (Mistral 7B, Mixtral, Large variants)
- Cohere (Command, Command-R, Command-R+)
- xAI (Grok models)
- Alibaba (Qwen models)
- Baidu (ERNIE models)
- Huawei (PanGu models)
- Yandex (YaLM models)
- DeepSeek (Coder, VL, LLM models)
- Tsinghua (ChatGLM models)
- Databricks (DBRX, Dolly models)
- Voyage AI (embedding models)
- And 10+ more providers
API:
- count_tokens(text, model) - Quick token counting
- TokenCounter(model) - Reusable token counter class
- get_supported_models() - List all supported models
- estimate_cost(tokens, model) - Cost estimation
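The entry points above might be combined as in the following minimal sketch. It assumes that `count_tokens` and `estimate_cost` are importable from the top-level `toksum` package and take the arguments shown in the list; the helper name `token_report` is hypothetical and not part of the library.

```python
from typing import Any, Dict


def token_report(text: str, model: str) -> Dict[str, Any]:
    """Count tokens for `text` and estimate their cost for `model`.

    Hypothetical helper, not part of toksum. It assumes count_tokens and
    estimate_cost can be imported from the top-level package.
    """
    # Imported inside the function so this file can be loaded even when
    # toksum is not installed.
    from toksum import count_tokens, estimate_cost

    tokens = count_tokens(text, model)   # signature as listed: (text, model)
    cost = estimate_cost(tokens, model)  # signature as listed: (tokens, model)
    return {"model": model, "tokens": tokens, "cost": cost}
```

With toksum installed, `token_report("Hello, world!", "gpt-4")` would return a dictionary holding the model name, the token count, and the estimated cost. `TokenCounter(model)` is the better fit when many texts are counted against one model, and `get_supported_models()` can be used to validate a model name beforehand.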
CLI:
- Basic token counting: toksum "text" model
- File input: toksum --file document.txt model
- Cost estimation: toksum --cost "text" model
- Model listing: toksum --list-models
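When the CLI is driven from a script, the invocation forms above can be assembled as argument lists. The sketch below is only an illustration: `toksum_command` is a hypothetical helper, and it deliberately covers just the forms shown in the list, without assuming that `--cost` and `--file` can be combined.

```python
from typing import List, Optional


def toksum_command(model: str, text: Optional[str] = None,
                   file: Optional[str] = None, cost: bool = False) -> List[str]:
    """Build an argv list for one of the invocation forms listed above."""
    if (text is None) == (file is None):
        raise ValueError("pass exactly one of text or file")
    if cost and file is not None:
        # The changelog only shows --cost with inline text, so this sketch
        # does not assume the two flags can be used together.
        raise ValueError("--cost with --file is not covered by this sketch")

    argv = ["toksum"]
    if cost:
        argv.append("--cost")
    if file is not None:
        argv += ["--file", file]
    else:
        argv.append(text)
    argv.append(model)
    return argv
```

The resulting list can be passed to `subprocess.run(argv, capture_output=True, text=True)` on a machine where the `toksum` command is installed; passing a list rather than a single string avoids shell quoting problems with the input text.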
Error Handling:
- UnsupportedModelError - For unsupported models
- TokenizationError - For tokenization failures
- EmptyTextError - For empty text validation
- InvalidTokenError - For token-specific issues
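One way these exceptions might be handled is sketched below. The import path `toksum.exceptions` is an assumption (the list above names the classes but not their module), and `safe_count` is a hypothetical wrapper, not a library function.

```python
from typing import Optional


def safe_count(text: str, model: str) -> Optional[int]:
    """Return the token count, or None if the model or the text is rejected.

    Hypothetical wrapper, not part of toksum.
    """
    # Lazy imports so the file loads without toksum installed.
    # Assumption: the exception classes live in toksum.exceptions.
    from toksum import count_tokens
    from toksum.exceptions import TokenizationError, UnsupportedModelError

    try:
        return count_tokens(text, model)
    except UnsupportedModelError:
        # The model name is not among the supported models.
        return None
    except TokenizationError:
        # The text could not be tokenized for this model.
        return None
```

`EmptyTextError` and `InvalidTokenError` could be caught the same way if a caller needs to distinguish those cases. Returning `None` is only one design choice; re-raising with extra context or logging the failure may suit an application better.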
Future Versions
Planned Features:
- Additional model providers as they become available
- Enhanced approximation algorithms based on user feedback
- Streaming token counting for large texts
- Batch processing optimizations
- Integration with more LLM libraries
Version History:
- v0.7.0 - Current release with comprehensive documentation
- v0.6.0 - Previous comprehensive release
- v0.5.x - Earlier versions with limited provider support
- v0.4.x - Early releases with basic OpenAI support
- v0.3.x - Initial public releases
- v0.2.x - Beta versions
- v0.1.x - Alpha versions
Contributing
We welcome contributions to expand model support and improve approximation accuracy. Please see the project repository for contribution guidelines.
Areas for Contribution:
- New model provider support
- Approximation algorithm improvements
- Documentation enhancements
- Test coverage expansion
- Performance optimizations
Reporting Issues:
- Model support requests
- Accuracy improvements
- Bug reports
- Feature requests
License
toksum is released under the MIT License. See the LICENSE file for details.