Introduction
Welcome to EzLLM, the best way to develop LLM Applications
Goal
The goal of EzLLM is to provide the simplest interface for creating LLM based applications
EzLLM simplifies the process of
- File parsing
- Storage
- Retrieval
- LLM optimization
- LLM output parsing
Fundamental Concepts
Document
A document is a file uploaded to and parsed by EzLLm.
Valid file types include
- txt
- md
- html
- csv, tsv
- png, jpeg, webp
Collection
a collection is a set of documents with shared method of storage. There’s two main methods of storage within a collection.
2. Embedded Collection
where the data is ran through an embedding model and stored in a vector database, enabling semantic search on the documents within the collection.
3. Text Collection
Where the data is stored as plain text, reducing cost & parsing time. This is useful when you don’t need to search a collection or document.
Developer Console
you can view your usage and diagnostics in the developer console
Quickstart
to get started, read our quickstart
Use Cases
1. Semantic Search
If you want to build a search engine based off of PDFs & websites.
The following is an example searching on EzLLM documentation
related_documents = c.search("What is a collection?")
print(related_documents)
"""
[
Document("a collection is a set of documents with shared method of storage. ..."),
...
]
"""
It’s that simple to add semantic search to your project, no need to manage
- document parsing
- embeddings
- vector databases
2. Extraction
It has never been easier to extract structured data from unstructured input.
The following is an example of using LLM to parse contact information from a PDF
contacts = doc.scan().run(Extract(model=ContactInformationModel))
print(contacts)
"""
[
{
"name" : "John Smith",
"phone" : "111-111-1111",
"email" : "john@example.com"
},
{
"name" : "Jane Smith",
"phone" : "111-111-1111",
"email" : "jane@example.com"
},
...
]
"""
3. Q&A With Documents
This is an example of Question and Answering w/ Semantic Search.
We first search for similar documents, based off of a user query - “What is a collection?” and then we provide the relevant documents to the LLM to produce a concise answer.
# QA Semantically Similar Context of an Individual Document
response = doc.search("What is a collection?").run(QA([
"What is a collection?",
"How do you instantiate a collection?"
]))
print(response)
"""
{
"docs" : [
Document("a collection is a set of documents with shared method of storage. ..."),
],
"output" : [
{
"question" : "What is a collection?",
"output" : "a collection is a set of documents",
},
{
"question" : "How do you instantiate a collection?",
"output" : "There are multiple ways to instantiate a collection\n col = Collection()",
},
]
}
"""
4. Web Scrape
the following is an example of a webscraper that extracts all the items being sold on an ecommerce page, the descriptions and the prices
import ezllm
from ezllm.methods import Extract
from pydantic import BaseModel, Field
# 1. define the structure of the data you want to extract
class ItemModel(BaseModel):
"""extract the fields for each item"""
item: str = Field(
...,
description="name of the item"
)
description: str = Field(
...,
description="description of the item"
)
price: float = Field(
...,
description="price of the item"
)
class ExtractModel(BaseModel):
"""extract data from item"""
items: List[ItemModel] = Field(
...,
description="list of items extracted from provided text"
)
# 2. Create the web document
doc = WebDocument("https://example.com")
# 3. run the extract method
doc.scan().run(Extract(model=ExtractModel))