OpenDCAI

Define the future of Data-centric AI together

👋 Welcome

✨ We are dedicated to advancing research and open-source tools in Data-Centric Artificial Intelligence (DCAI). ✨

🚀 Our goal is to develop effective and efficient DCAI systems and algorithms that support and enhance the performance of AI models and applications.

🤝 Community

[Community QR code image]

Pinned

  1. DataFlow Public

    Easy Data Preparation with the latest LLM-based Operators and Pipelines.

    Python · 2.9k stars · 192 forks

  2. MyScaleDB Public

    Forked from OriginHubAI/MyScaleDB

    AI Database for unified, scalable SQL + vector data management, search and analytics

    C++ · 39 stars · 1 fork

  3. DataFlex Public

    DataFlex is a data-centric training framework that enhances model performance by selecting the most influential samples, optimizing their weights, or adjusting their mixing ratios.

    Python · 113 stars · 10 forks

  4. Paper2Any Public

    Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

    Python · 1.8k stars · 123 forks

  5. AgentFlow Public

    The First Unified Agent Data Synthesis Framework for Custom Agentic Tasks with an all-in-one environment.

    Python · 26 stars

Repositories

Showing 10 of 29 repositories
  • OpenPrism Public

    An open-source implementation of an AI-powered academic writing workspace inspired by OpenAI Prism, featuring LaTeX editing, PDF preview, and intelligent AI assistance.

    TypeScript · 211 stars · 14 forks · 2 issues (1 issue needs help) · 2 pull requests · Updated Mar 2, 2026
  • leonai Public
    Python · 22 stars · MIT · 1 fork · 0 issues · 4 pull requests · Updated Mar 2, 2026
  • DataFlow-Doc Public

    Documentation for DataFlow, a data-centric AI system for LLMs.

    Python · 11 stars · 30 forks · 4 issues · 0 pull requests · Updated Mar 1, 2026
  • DataFlow Public

    Easy Data Preparation with the latest LLM-based Operators and Pipelines.

    Python · 2,871 stars · Apache-2.0 · 192 forks · 12 issues · 2 pull requests · Updated Mar 1, 2026
  • Paper2Any Public

    Turn paper/text/topic into editable research figures, technical route diagrams, and presentation slides.

    Python · 1,837 stars · Apache-2.0 · 122 forks · 5 issues · 4 pull requests · Updated Feb 28, 2026
  • DataFlow-WebUI Public
    Python · 17 stars · 12 forks · 0 issues · 0 pull requests · Updated Feb 28, 2026
  • DataFlow-MM-Doc Public

    Documentation for DataFlow-MM

    Python · 2 stars · 8 forks · 0 issues · 0 pull requests · Updated Feb 27, 2026
  • AgentFlow Public

    The First Unified Agent Data Synthesis Framework for Custom Agentic Tasks with an all-in-one environment.

    Python · 26 stars · 0 forks · 0 issues · 0 pull requests · Updated Feb 27, 2026
  • DataFlow-Agent Public

    Agent for DataFlow: Automatic Data Workflow Design

    Python · 53 stars · Apache-2.0 · 11 forks · 1 issue · 0 pull requests · Updated Feb 27, 2026
  • Open-NotebookLM Public

    An open-source implementation of NotebookLM.

    Python · 34 stars · Apache-2.0 · 5 forks · 2 issues · 1 pull request · Updated Feb 27, 2026