Paper
Dataflow-Oriented Classification and Performance Analysis of GPU-Accelerated Homomorphic Encryption
Authors
Ai Nozaki, Takuya Kojima, Hideki Takase, Hiroshi Nakamura
Abstract
Fully Homomorphic Encryption (FHE) enables secure computation over encrypted data, but its computational cost remains a major obstacle to practical deployment. To mitigate this overhead, many studies have explored GPU acceleration for the CKKS scheme, which is widely used for approximate arithmetic. CKKS parameters are configured for each workload by balancing multiplicative depth, security requirements, and performance. These parameters significantly affect ciphertext size and thereby determine how the memory footprint fits within the GPU memory hierarchy. Nevertheless, prior studies typically apply their proposed optimizations uniformly, without considering differences in CKKS parameter configurations. In this work, we demonstrate that the optimal GPU optimization strategy for CKKS depends on the parameter configuration. We first classify prior optimizations by two aspects of dataflow that affect memory footprint, and then conduct both qualitative and quantitative performance analyses. Our analysis shows that, even on the same GPU architecture, the optimal strategy varies with CKKS parameters, with performance differences of up to $1.98\times$ between strategies, and that the criteria for selecting an appropriate strategy differ across GPU architectures.
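To make the link between CKKS parameters and memory footprint concrete, the following is a minimal sketch of how ciphertext size could be estimated. It assumes the common RNS representation, in which a ciphertext holds two polynomials of degree N, each stored as one 64-bit residue per coefficient per limb. The function names, the example parameter sets, and the 40 MiB cache capacity are hypothetical illustrations, not values or methods taken from the paper.

```python
def ckks_ciphertext_bytes(log_n: int, num_limbs: int,
                          word_bytes: int = 8, num_polys: int = 2) -> int:
    """Estimate the size of one RNS-CKKS ciphertext in bytes.

    Assumption: num_polys polynomials, each with N = 2**log_n coefficients,
    each coefficient stored as num_limbs residues of word_bytes bytes.
    """
    n = 1 << log_n
    return num_polys * n * num_limbs * word_bytes


def fits_in(size_bytes: int, capacity_bytes: int) -> bool:
    """Return True if a buffer of size_bytes fits within capacity_bytes."""
    return size_bytes <= capacity_bytes


MIB = 1024 * 1024
HYPOTHETICAL_L2_BYTES = 40 * MIB  # illustrative cache size, not from the paper

# Two illustrative parameter sets (hypothetical, chosen only for contrast).
small = ckks_ciphertext_bytes(log_n=14, num_limbs=8)    # 2 MiB
large = ckks_ciphertext_bytes(log_n=16, num_limbs=48)   # 48 MiB

for name, size in [("small", small), ("large", large)]:
    print(name, size // MIB, "MiB, fits in L2:",
          fits_in(size, HYPOTHETICAL_L2_BYTES))
```

Under these assumptions the smaller configuration fits in the assumed cache with room to spare while the larger one does not, which is the kind of difference that could make one dataflow strategy preferable to another.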
Metadata
arXiv ID: 2603.16692v1
Published: 2026-03-17T15:49:32Z
Updated: 2026-03-17T15:49:32Z
Primary category: cs.DC
Comment: This work has been submitted to the IEEE for possible publication
Abstract page: https://arxiv.org/abs/2603.16692v1
PDF: https://arxiv.org/pdf/2603.16692v1