Highly efficient data storage and search system

Indexing and Vector Database

A new generation of vector databases

Highly efficient vector data indexing and querying

Build generative AI solutions at scale, optimize infrastructure requirements, and use Shapelets Core to index and manage data in a vector database in real time.

Python

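# Imports (paths as in classic LangChain releases; newer releases relocate them)
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain.embeddings.sentence_transformer import SentenceTransformerEmbeddings
# `sh` is assumed to be the already-imported Shapelets module

# Load your documents
loader = TextLoader("../state_of_the_union.txt")
documents = loader.load()

# Split the document into chunks
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=50)
docs = text_splitter.split_documents(documents)

# Create an embedding function
embedding_function = SentenceTransformerEmbeddings(model_name="all-MiniLM-L6-v2")

# Convert the chunks into embeddings and load them into the vector DB
db = sh.from_documents(docs, embedding_function)

# Run a similarity query
query = "What did the president say about Ketanji Brown Jackson?"
docs = db.similarity_search(query)

# Print the retrieved results
print(docs[0].page_content)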
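
To make the chunk_size and chunk_overlap parameters above concrete, here is a minimal sketch of fixed-size chunking with overlap. It is a simplification written for illustration: the helper split_with_overlap is hypothetical, and LangChain's CharacterTextSplitter actually splits on a separator before merging pieces, so its output will differ.

```python
def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 50) -> list[str]:
    """Cut text into character windows; consecutive windows share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical sample input: 2500 characters of repeating digits.
sample = "".join(str(i % 10) for i in range(2500))
chunks = split_with_overlap(sample)
print([len(c) for c in chunks])
```

The overlap means a sentence cut at a chunk boundary still appears whole, or nearly whole, in one of the two neighbouring chunks.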
        
      

Challenges

Do you struggle to use context to find relevant data, or to search large vector stores?

Long Indexing Times

Newly acquired data points must be indexed before they become available to search.

Advanced indexing

Use various index types to contextualize your data and speed up geospatial and date/time-based searches.

Scalability

Many systems cannot cope with large streams of data while keeping a low memory footprint.

Accuracy & Relevance

Speed usually comes at the cost of lower accuracy. Not with Shapelets Core.

Security & Privacy

Pure SaaS offerings are not suitable for everyone.

ARCHITECTURE

Core Architecture Scheme

Fast

Achieve not only fast query responses but also indexing times on the order of milliseconds.

Accurate

Obtain excellent recall metrics for both exact and approximate similarity searches.

Context Rich

Index all kinds of vectors and scalar data, including dates, times, durations and geospatial data.

Versatile

Compatible with any LLM/model that produces embeddings/vectors and integrated with popular frameworks such as LangChain.
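
As a rough illustration of the recall metric mentioned under "Accurate", the sketch below compares an exact brute-force search with a stand-in approximate search. Everything here is an assumption made for illustration: the random vectors, the helper names, and the "approximate" strategy of scanning only every other vector, which is not how Shapelets Core works.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query, vectors, candidate_ids, k):
    """Return the ids of the k candidates most similar to the query."""
    ranked = sorted(candidate_ids, key=lambda i: cosine(query, vectors[i]), reverse=True)
    return ranked[:k]


def recall_at_k(exact_ids, approx_ids):
    """Fraction of the true top-k that the approximate search also returned."""
    return len(set(exact_ids) & set(approx_ids)) / len(exact_ids)


rng = random.Random(0)
vectors = [[rng.gauss(0, 1) for _ in range(16)] for _ in range(500)]
query = [rng.gauss(0, 1) for _ in range(16)]

# Exact search: score every vector.
exact = top_k(query, vectors, range(len(vectors)), k=10)

# Stand-in for an approximate index: score only every other vector.
approx = top_k(query, vectors, range(0, len(vectors), 2), k=10)

recall = recall_at_k(exact, approx)
print(f"recall@10 = {recall:.2f}")
```

An exact search has recall 1.0 by definition; an approximate index trades some of that recall for speed, which is why the two are usually reported together.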

Indexing

Shapelets Core provides millisecond-scale indexing and querying, with minimal computational requirements.

Efficient indexing is crucial to accelerating similarity searches, but it is often neglected in favour of fast query responses. Indexing can be distributed across multiple nodes for horizontal scaling, allowing for real-time indexing operations. Accelerate contextualized searches by using not just vector indices but also indices for scalar, date/time and geospatial data.
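
One way to picture a contextualized search is to let a date index narrow the candidates before any vector comparison happens. The sketch below is a toy under stated assumptions: TinyStore, its methods and the sample data are hypothetical, and it does not reflect the Shapelets Core API or its internal index structures.

```python
import bisect
import math
from datetime import date


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class TinyStore:
    """Toy store: embeddings plus a sorted date index (hypothetical)."""

    def __init__(self):
        self.vectors = {}  # item id -> embedding
        self.dates = []    # dates, kept sorted
        self.ids = []      # item ids, parallel to self.dates

    def add(self, item_id, vector, day):
        self.vectors[item_id] = vector
        pos = bisect.bisect_right(self.dates, day)
        self.dates.insert(pos, day)
        self.ids.insert(pos, item_id)

    def ids_between(self, start, end):
        """Ids dated within [start, end], located by binary search."""
        lo = bisect.bisect_left(self.dates, start)
        hi = bisect.bisect_right(self.dates, end)
        return self.ids[lo:hi]

    def search(self, query, start, end, k=3):
        """Rank only the date-filtered candidates by similarity."""
        candidates = self.ids_between(start, end)
        ranked = sorted(candidates, key=lambda i: cosine(query, self.vectors[i]), reverse=True)
        return ranked[:k]


store = TinyStore()
store.add("a", [1.0, 0.0], date(2023, 1, 10))
store.add("b", [0.9, 0.1], date(2023, 6, 1))
store.add("c", [0.0, 1.0], date(2023, 6, 15))
store.add("d", [1.0, 0.05], date(2024, 2, 1))

hits = store.search([1.0, 0.0], date(2023, 5, 1), date(2023, 12, 31), k=2)
print(hits)  # ['b', 'c']
```

Items "a" and "d" are close to the query vector but fall outside the date range, so they are never scored at all.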

Store

Depending on requirements, data can be stored in memory, on local disk or in the cloud to optimize performance and costs.

The use of optimizations like cache line alignment reduces cache misses and improves overall efficiency. Furthermore, indices are based on compressible bitmaps with minimal size and always stored in memory, making search processes extremely efficient.
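
To show why bitmap-based indices make filtering cheap, here is a minimal sketch that uses Python integers as uncompressed bitmaps. The BitmapIndex class and the sample rows are hypothetical, and compression is left out entirely; the source does not say which bitmap format Shapelets Core uses.

```python
from collections import defaultdict


class BitmapIndex:
    """One bitmap per distinct value; bit i is set when row i holds that value."""

    def __init__(self):
        self.bitmaps = defaultdict(int)

    def add(self, row_id: int, value) -> None:
        self.bitmaps[value] |= 1 << row_id

    def rows_with(self, value) -> int:
        # .get avoids creating an entry for values that were never indexed
        return self.bitmaps.get(value, 0)


def to_row_ids(bitmap: int) -> list[int]:
    """Expand a bitmap into the sorted list of row ids whose bit is set."""
    ids = []
    row = 0
    while bitmap:
        if bitmap & 1:
            ids.append(row)
        bitmap >>= 1
        row += 1
    return ids


country = BitmapIndex()
year = BitmapIndex()
rows = [("ES", 2023), ("US", 2023), ("ES", 2024), ("ES", 2023)]
for row_id, (c, y) in enumerate(rows):
    country.add(row_id, c)
    year.add(row_id, y)

# A conjunction of two predicates becomes a single bitwise AND.
matches = country.rows_with("ES") & year.rows_with(2023)
print(to_row_ids(matches))  # [0, 3]
```

Because each predicate is already a bitmap, combining filters costs a few bitwise operations rather than a scan over the rows.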

Shapelets Core Benchmarks

Incredible speed with minimal memory footprint

In this benchmark we compared the indexing time of Shapelets Core with that of other popular databases. Especially at high throughput, Shapelets Core can run up to two orders of magnitude faster than other solutions.

Use Case / RAG

Shapelets Vector DB is perfect for retrieval augmented generation (RAG) applications based on sets of documents that grow periodically.

Using standard databases to feed dashboards and information systems that require multiple queries or views over big data usually involves a high total cost of ownership (TCO) and slow response times.

Building a system in which users interact with a corpus of legal documents is hard when new documents are continuously added, because the computing cost of indexing climbs quickly. Shapelets Core uses highly efficient indexing algorithms, offering real-time indexing with minimal CPU and memory requirements.

Use it as a server-based vector DB or integrate it into your projects as a Python library.
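
The growing-corpus scenario described above can be sketched as an index that accepts one document at a time and never rebuilds what is already stored. This is an illustration only: GrowingCorpus, the bag-of-words embed function and the sample legal texts are all hypothetical, and a real deployment would use an embedding model and the vector DB itself.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class GrowingCorpus:
    """Append-only document index (hypothetical, for illustration)."""

    def __init__(self):
        self.texts = []
        self.vectors = []

    def add(self, text: str) -> int:
        """Index one new document; existing entries are left untouched."""
        self.texts.append(text)
        self.vectors.append(embed(text))
        return len(self.texts) - 1

    def retrieve(self, question: str, k: int = 1) -> list[str]:
        """Return the k stored documents most similar to the question."""
        q = embed(question)
        order = sorted(range(len(self.texts)), key=lambda i: cosine(q, self.vectors[i]), reverse=True)
        return [self.texts[i] for i in order[:k]]


corpus = GrowingCorpus()
corpus.add("The lease term is five years with an option to renew.")
corpus.add("Either party may terminate the contract with ninety days notice.")

# A document that arrives later is searchable as soon as add() returns.
corpus.add("The data processing agreement requires breach notification within 72 hours.")

question = "How quickly must a data breach be reported?"
context = corpus.retrieve(question, k=1)
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: {question}"
print(prompt)
```

The retrieved text is placed into the prompt that goes to the language model, which is the retrieval step of a RAG pipeline; the cost of each add() depends only on the new document, not on the size of the corpus.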

Indexing and Vector Database

A scalable and multidimensional indexing solution

Core Spheres Abstract Representation

Shapelets Vector DB offers both efficient storage and efficient indexing.

Just Storage…

  • ‘Archive and Move On’ scenarios (Compliance, Regulations, Proof of Record)
  • Deferred Processing scenarios (Backtesting, System Of Record)

Just Index…

  • Great for complementing your existing storage solutions
  • Integration with LLM / ML solutions
  • Complex IoT and metric scenarios

Combined Solution

Remove the need to integrate and maintain multiple systems with an all-in-one solution.

Accelerate your data access today

Shapelets Core helps data scientists and data engineers with their daily big-data tasks. Contact us today for a free demo.
