OpenAI Hosted File Search and Vector Stores in Python
This document explains what the OpenAI Python client supports for file ingestion, vector stores, semantic search, attribute filtering, and how those pieces connect to “talk to your documents” experiences, while keeping the focus on API support.
What OpenAI is solving here
OpenAI covers both halves of the problem:
- Hosted search: you upload files into a vector store, then run semantic search over the chunked content and get back matching chunks plus similarity scores.
- Hosted retrieval plus generation: you can enable the file_search tool in the Responses API, so the model can automatically retrieve from your vector stores before generating an answer.
If you only want retrieval, use vector store search directly. If you want “chat with docs”, use Responses with file_search.
Core objects and the workflow
Files
You upload a document and get back a file_id. For file search workflows, the docs show creating the file with purpose="assistants".
Simple file upload example (local file):
Vector stores
A vector store is a hosted index used by the Retrieval guide and by the file_search tool.
Create a vector store:
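For instance (a sketch; the store name is arbitrary and `client` is an openai.OpenAI instance):

```python
def create_store(client, name: str) -> str:
    # client: an openai.OpenAI instance
    # A vector store is the hosted index that searches run against
    store = client.vector_stores.create(name=name)
    return store.id  # an id like "vs_..."
```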
Attach files to a vector store and wait for indexing
You can attach a file by file_id, then poll status until it is completed.
Polling style check:
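A sketch of the attach-then-poll loop (helper name and poll interval are my own choices):

```python
import time

def attach_and_wait(client, vector_store_id: str, file_id: str, interval: float = 1.0) -> str:
    # client: an openai.OpenAI instance
    client.vector_stores.files.create(
        vector_store_id=vector_store_id, file_id=file_id
    )
    while True:
        vs_file = client.vector_stores.files.retrieve(
            vector_store_id=vector_store_id, file_id=file_id
        )
        if vs_file.status in ("completed", "failed", "cancelled"):
            return vs_file.status  # "completed" means the file is searchable
        time.sleep(interval)
```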
SDK helper option for local files, one shot upload plus poll:
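Something like the following, using the SDK's upload_and_poll helper:

```python
def upload_local_file(client, vector_store_id: str, path: str) -> str:
    # client: an openai.OpenAI instance
    # upload_and_poll uploads the local file, attaches it to the store,
    # and blocks until indexing finishes
    with open(path, "rb") as f:
        vs_file = client.vector_stores.files.upload_and_poll(
            vector_store_id=vector_store_id, file=f
        )
    return vs_file.status  # "completed" on success
```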
Searching a vector store
Vector store search returns relevant chunks, similarity scores, and file info.
Minimal search:
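A sketch that prints each hit's filename, score, and chunk text:

```python
def show_hits(client, vector_store_id: str, query: str) -> None:
    # client: an openai.OpenAI instance
    results = client.vector_stores.search(
        vector_store_id=vector_store_id, query=query
    )
    for hit in results.data:
        # Each hit carries the source file, a relevance score, and chunk text parts
        print(hit.filename, hit.score)
        for part in hit.content:
            print(part.text)
```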
Search request options include filters, max_num_results (1 to 50), and rewrite_query.
If you want a custom ordering, sort client side using score:
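A small helper for that, written to accept either SDK result objects or plain dicts:

```python
def sort_hits_by_score(hits, descending: bool = True):
    # Works with SDK result objects (hit.score) or plain dicts ({"score": ...})
    def score(h):
        return h["score"] if isinstance(h, dict) else h.score
    return sorted(hits, key=score, reverse=descending)
```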
Attribute metadata, filtering, and update
Each vector store file can have up to 16 key value pairs as attributes, and search filters can reference them.
Attach a file with attributes:
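For example (the attribute keys and values here are purely illustrative):

```python
def attach_with_attributes(client, vector_store_id: str, file_id: str):
    # client: an openai.OpenAI instance
    # Attribute keys ("author", "year", ...) are illustrative; up to 16 pairs allowed
    return client.vector_stores.files.create(
        vector_store_id=vector_store_id,
        file_id=file_id,
        attributes={"author": "Jane Doe", "year": 2024, "department": "legal"},
    )
```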
Filter search by attributes:
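A sketch of a compound filter, again with made-up attribute names:

```python
def filtered_search(client, vector_store_id: str, query: str):
    # client: an openai.OpenAI instance
    # Compound filter: department == "legal" AND year >= 2023
    return client.vector_stores.search(
        vector_store_id=vector_store_id,
        query=query,
        filters={
            "type": "and",
            "filters": [
                {"type": "eq", "key": "department", "value": "legal"},
                {"type": "gte", "key": "year", "value": 2023},
            ],
        },
        max_num_results=10,
    )
```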
Update attributes later:
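Something like:

```python
def update_attributes(client, vector_store_id: str, file_id: str, attributes: dict):
    # client: an openai.OpenAI instance
    # Replaces the attribute set on an already-attached vector store file
    return client.vector_stores.files.update(
        vector_store_id=vector_store_id, file_id=file_id, attributes=attributes
    )
```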
Chunking, limits, and supported file types
For Retrieval indexing, the docs state:
- Max file size is 512 MB, with a maximum of 5,000,000 tokens per file when attached.
- Default chunking is 800-token chunks with a 400-token overlap, and you can set a custom chunking_strategy within documented constraints.
- A list of supported file types is provided, including pdf, docx, pptx, md, txt, and various code formats.
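A static chunking_strategy can be supplied when attaching a file; the values below are illustrative and must stay within the documented limits:

```python
def attach_with_custom_chunks(client, vector_store_id: str, file_id: str):
    # client: an openai.OpenAI instance
    # Static chunking: smaller chunks, smaller overlap than the defaults
    return client.vector_stores.files.create(
        vector_store_id=vector_store_id,
        file_id=file_id,
        chunking_strategy={
            "type": "static",
            "static": {"max_chunk_size_tokens": 400, "chunk_overlap_tokens": 100},
        },
    )
```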
Getting an AI answer using file search in Responses
File search is a built in hosted tool in the Responses API. The model can automatically call it, retrieve from your vector store IDs, and then generate a response.
Basic Responses call with file_search enabled:
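A sketch (the model name is illustrative; any Responses-capable model works):

```python
def ask(client, vector_store_id: str, question: str) -> str:
    # client: an openai.OpenAI instance
    response = client.responses.create(
        model="gpt-4o-mini",  # illustrative model name
        input=question,
        tools=[{"type": "file_search", "vector_store_ids": [vector_store_id]}],
    )
    return response.output_text  # the generated, retrieval-grounded answer
```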
If you want the raw retrieval hits returned in the API response, use include=["file_search_call.results"].
You can also guide tool usage using tool_choice in the request.
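Both options together might look like this sketch:

```python
def ask_with_raw_hits(client, vector_store_id: str, question: str):
    # client: an openai.OpenAI instance
    return client.responses.create(
        model="gpt-4o-mini",  # illustrative model name
        input=question,
        tools=[{"type": "file_search", "vector_store_ids": [vector_store_id]}],
        include=["file_search_call.results"],  # raw retrieval hits in the output
        tool_choice="auto",  # or force the tool, e.g. {"type": "file_search"}
    )
```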
Manual RAG style synthesis from search results
The Retrieval guide also shows a manual pattern: call vector store search, format the chunks, then send them to a model to synthesize a grounded answer.
A simplified version:
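A simplified sketch of that manual pattern (prompt wording and model name are my own):

```python
def manual_rag(client, vector_store_id: str, question: str) -> str:
    # client: an openai.OpenAI instance
    results = client.vector_stores.search(
        vector_store_id=vector_store_id, query=question
    )
    # Format retrieved chunks into a plain-text context block
    context = "\n\n".join(
        f"Source: {hit.filename}\n" + "".join(part.text for part in hit.content)
        for hit in results.data
    )
    response = client.responses.create(
        model="gpt-4o-mini",  # illustrative model name
        input=f"Answer using only the sources below.\n\n{context}\n\nQuestion: {question}",
    )
    return response.output_text
```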
Operational details worth knowing
Costs
Pricing includes:
- File Search storage: 0.10 USD per GB of vector storage per day, with the first GB free.
- File Search tool calls (Responses API): priced per tool call.
Expiration and cost control
Vector stores support expires_after, and the Retrieval guide notes that once expired, associated vector store files are deleted and you stop being charged for them.
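A sketch of creating a store that expires seven days after it was last used (the anchor and day count are illustrative):

```python
def create_expiring_store(client, name: str):
    # client: an openai.OpenAI instance
    # Store expires 7 days after last activity; its files are then deleted
    return client.vector_stores.create(
        name=name,
        expires_after={"anchor": "last_active_at", "days": 7},
    )
```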
Multipart uploads
Uploads can accept up to 8 GB total and expire about an hour after creation, producing a normal File when completed.
This matters if your ingestion pipeline needs multipart transport, even though Retrieval indexing has its own file size limits.
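A sketch of the create-parts-complete flow in the Uploads API (chunk size and mime type are my own choices; each part can be at most 64 MB):

```python
import os

def multipart_upload(client, path: str, mime_type: str = "application/pdf") -> str:
    # client: an openai.OpenAI instance
    upload = client.uploads.create(
        purpose="assistants",
        filename=os.path.basename(path),
        bytes=os.path.getsize(path),
        mime_type=mime_type,
    )
    part_ids = []
    with open(path, "rb") as f:
        while chunk := f.read(64 * 1024 * 1024):  # 64 MB parts
            part = client.uploads.parts.create(upload_id=upload.id, data=chunk)
            part_ids.append(part.id)
    completed = client.uploads.complete(upload_id=upload.id, part_ids=part_ids)
    return completed.file.id  # a normal File object, usable like any upload
```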
Inspect parsed content
You can retrieve the parsed content of a vector store file, which is useful for debugging chunking and extraction.
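Something like this sketch, using the vector store file content endpoint:

```python
def parsed_chunks(client, vector_store_id: str, file_id: str):
    # client: an openai.OpenAI instance
    # Returns the text as the index sees it -- useful for debugging extraction
    page = client.vector_stores.files.content(
        vector_store_id=vector_store_id, file_id=file_id
    )
    return [item.text for item in page.data]
```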
Removing a file from a vector store is not the same as deleting the file
Deleting a vector store file removes it from the vector store but does not delete the underlying file object.
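The distinction in code (a sketch):

```python
def remove_from_store(client, vector_store_id: str, file_id: str):
    # client: an openai.OpenAI instance
    # Detaches the file from the index; the File object itself still exists
    client.vector_stores.files.delete(
        vector_store_id=vector_store_id, file_id=file_id
    )
    # To also delete the underlying file object:
    # client.files.delete(file_id)
```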
Listing and sorting vector stores
The list endpoint supports sorting by created_at via the order query parameter.
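For example, newest stores first:

```python
def newest_stores(client, limit: int = 20):
    # client: an openai.OpenAI instance
    # order="desc" sorts by created_at, newest first
    return client.vector_stores.list(order="desc", limit=limit)
```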
Data controls
OpenAI’s platform docs say API data is not used to train or improve models by default unless you explicitly opt in.
Assistants deprecation timeline
OpenAI’s deprecations page states the Assistants API is deprecated as of Aug 26, 2025 and will shut down one year later, on Aug 26, 2026, with guidance to migrate to the Responses API. https://platform.openai.com/docs/deprecations
Example
End to end search plus an AI answer
This example shows the full flow: create a vector store, upload and index a document, run a direct search, then ask for a grounded answer using Responses with file_search.