DeepEval is a simple-to-use, open-source framework for evaluating large language model (LLM) systems. It is similar to Pytest but specialized for unit testing LLM apps. DeepEval incorporates ...
Note: Nightly builds include the latest features and bug fixes but may be less stable than official releases. They follow the version format X.Y.Z.devYYYYMMDDHHMM.

We welcome contributions! Please ...
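
As a rough illustration of the nightly version format X.Y.Z.devYYYYMMDDHHMM noted above, the sketch below separates such a string into its release numbers and build timestamp using only the Python standard library. The helper name `parse_nightly_version` and the sample version string are hypothetical and are not part of DeepEval; the sketch also assumes the timestamp is always exactly 12 digits.

```python
import re
from datetime import datetime

# Pattern for the X.Y.Z.devYYYYMMDDHHMM format (assumes a 12-digit timestamp).
NIGHTLY_RE = re.compile(r"(\d+)\.(\d+)\.(\d+)\.dev(\d{12})")


def parse_nightly_version(version):
    """Hypothetical helper: return ((major, minor, patch), build_time),
    or None if the string is not a nightly-style version."""
    match = NIGHTLY_RE.fullmatch(version)
    if match is None:
        return None
    major, minor, patch, stamp = match.groups()
    try:
        # Interpret the 12 digits as year, month, day, hour, minute.
        build_time = datetime.strptime(stamp, "%Y%m%d%H%M")
    except ValueError:
        # Digits matched, but they do not form a valid date and time.
        return None
    return (int(major), int(minor), int(patch)), build_time


# Made-up version string, used only to show the shape of the result.
release, built = parse_nightly_version("1.2.3.dev202501151230")
print(release, built.isoformat())
```

A check like this could help a script decide whether an installed build is a nightly or an official release, since official release strings would not match the pattern.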