This is a 3 month program of professional development that also furthers the mission of Kwaai. 8hrs/week with daily standups Mon-Thurs.
Introduction to AI - Matrix Math of Neural Networks
Intro to Vectors in AI - Semantic representation, Vector DB
Advances VectorDB - Distributed Knowledge
Research Project - Decentralized AI Infrastructure
Journal Publication - co-author a research paper
Showcase your work
This is a 6 month program of research and development that will contribute to the Kwaai mission. 8-20hrs/week with daily standups Mon-Thurs.
Advanced AI Infrastructure - get up to speed on Kwaai's research
Confidential Vector Search - homomorphic encryption
Distributed Knowledge base - sharding vector DBs
Research Project - Decentralized AI Infrastructure
Journal Publication - get credit in high impact publication
Showcase your work
Deadline Extended
Kwaai is a 501c3 nonprofit. Internships are unpaid, volunteer programs to further the Kwaai charitable mission.
Figure 1: Multi-Database Scaling Performance Comparison. Comprehensive analysis across seven vector databases with N=10 statistical rigor. (a) Query latency scaling with power-law complexity exponents—FAISS demonstrates sub-linear scaling (α=0.48) while Chroma achieves near-constant time (α=0.02). Linear y-axis (0-70ms) optimized for data visibility. (b) Query throughput reveals Chroma's dominance (130-144 QPS), pgvector's excellent 101 QPS at 50k scale, and FAISS's sustained performance (90+ QPS even at 2.2M chunks). (c) Data ingestion time on log-log scale showing FAISS as fastest across all scales. (d) Ingestion throughput consistency with coefficient of variation annotations—FAISS demonstrates exceptional consistency (CV=2.5%) while OpenSearch shows problematic variance (CV=45-94%). Error bars show ±1σ with horizontal jittering to reduce overlap; bars exceeding ±50% of mean are capped and marked with ‡ symbol for readability while maintaining statistical honesty. Databases are differentiated using both distinct colors and line styles (solid, dashed, dotted, dash-dot) for optimal accessibility.