Projects (others are listed on my GitHub main page)

  • GeoSpatial RAG System Development 10 2024 -
    • Guided by Professor Ibrahim Sabek and Cyrus Shahabi.
    • Used PostGIS, Neo4j, Milvus, and OpenStreetMap data to build a question-answering RAG system prototype.
    • Designed geospatial questions and tested them on a local LLM and the OpenAI API.
    • Designing geospatial information embeddings to improve LLM performance.
  • Enhanced AI Query Optimizer with Continual Learning and Uncertainty Estimation 09 2024 - 12 2024
    • An extension of my research on a lifelong reinforcement learning query optimizer. The goal is to make the model more general, to handle multiple environments with less overhead, and to address the one environment left unsolved in that research.
    • This work is now part of the future work described in my paper submitted to VLDB 2025.
  • Supported Adaptive Radix Tree (ART) Index in RoseDB Internals GitHub 05 2024 - 06 2024
    • Identified the best available Go implementation of the ART index and tested its performance and robustness.
    • Wrapped the ART index with a write-ahead log (WAL) and supported operations such as Put, Get, Delete, Add, and Iteration, exposed through interfaces so that more index structures can be added in the future.
    • Passed all RoseDB test cases and compared the performance of the ART index against the B+ tree index in the original internals.
  • Supported Roaring Bitmap in DuckDB Internals GitHub, Technical Report 02 2024 - 05 2024
    • Debugged DuckDB internals (version 0.10.2) using GDB on Linux with the VS Code C++ extensions.
    • Studied how Roaring Bitmap works and used it as a standalone library. Documented how hash joins and equality filters are executed in the DuckDB internals.
    • Identified the entry point for the modification, then heavily modified the source code to attach a Roaring Bitmap as a member object of each Row Group.
    • Drawing on DuckDB's existing ART index support, followed the pipeline paradigm and created a wrapper class for CreateRoaringbitmap along with several utility C++ classes (such as a new parser and modified data organization classes).
    • After thorough testing and experiments, achieved performance improvements on TPC-H, the Join Order Benchmark, and a self-defined workload.
  • Data Backup System Development GitHub, 09 2022 - 12 2022
    • Independently developed data backup software that supports both local and cloud backup, with functions including: (1) filtering of special file types and alternative files on Windows; (2) compression and decompression, packing and unpacking, and encrypted storage of files (MD5, SHA algorithms, etc.); (3) user management, transmission encryption, and incremental backup.
    • Developed the software in Java, improving development efficiency with Maven for build management, the IntelliJ IDEA IDE, and the Spring Framework.
    • Used JavaFX, AWS S3, AWS Storage Gateway, and AliCloud RDS to develop the GUI and the netdisk model.
    • Designed use case, class, and sequence diagrams in PowerDesigner, and built the conceptual and physical models of the database.