Paper
Samyama: A Unified Graph-Vector Database with In-Database Optimization, Agentic Enrichment, and Hardware Acceleration
Authors
Madhulatha Mandarapu, Sandeep Kunkunuru
Abstract
Modern data architectures are fragmented across graph databases, vector stores, analytics engines, and optimization solvers, resulting in complex ETL pipelines and synchronization overhead. We present Samyama, a high-performance graph-vector database written in Rust that unifies these workloads into a single engine. Samyama combines a RocksDB-backed persistent store with a versioned-arena MVCC model, a vectorized query executor with 35 physical operators, a cost-based query planner with plan enumeration and predicate pushdown, a dedicated CSR-based analytics engine, and native RDF/SPARQL support. The system integrates 22 metaheuristic optimization solvers directly into its query language, implements HNSW vector indexing (Malkov and Yashunin, 2020) with Graph RAG capabilities, and introduces "Agentic Enrichment" for autonomous graph expansion via LLMs. The Enterprise Edition adds GPU acceleration via wgpu, production-grade observability, point-in-time recovery, and hardened high availability with HTTP/2 Raft transport. Our evaluation on commodity hardware (Mac Mini M4, 16 GB RAM) demonstrates: ingestion at 255K nodes/s (CPU) and 412K nodes/s (GPU-accelerated); 115K Cypher queries/sec at 1M nodes; 4.0–4.7× latency reduction from late materialization on multi-hop traversals; 8.2× GPU PageRank speedup at 1M nodes; and 100% LDBC Graphalytics validation (28/28 tests). These results demonstrate that a unified graph-vector-optimization engine can achieve competitive performance on commodity hardware while maintaining Rust's memory safety guarantees.
Metadata
arXiv ID: 2603.08036v1
Abstract page: https://arxiv.org/abs/2603.08036v1
PDF: https://arxiv.org/pdf/2603.08036v1
Primary category: cs.DB
Published: 2026-03-09T07:17:17Z
Updated: 2026-03-09T07:17:17Z
Comment: 16 pages, 4 figures, 12 tables