Graphbook

The Framework for AI-Driven Data Pipelines

GitHub
Example Workflow with Llama

Build AI/ML Pipelines

Graphbook is a framework for building efficient, interactive DAG-structured data pipelines composed of your own custom-built nodes, and it all works best with PyTorch, Hugging Face, and more...

Get started

1. Build

Write processing nodes using Python in your favorite code editor and keep your code version controlled

@step("DinoV2")      # register this function as a step named "DinoV2"
@batch(8, "images")  # feed the "images" input in batches of 8
@param("model", "transformers/automodel", default="facebook/dinov2-base")
@param("processor", "transformers/autoimageprocessor", default="facebook/dinov2-base")
def process_images(ctx, images, items, notes):
    # Parameters declared above are available on ctx
    inputs = ctx.processor(images=images, return_tensors='pt')
    outputs = ctx.model(**inputs)
    last_hidden_states = outputs.last_hidden_state
    return last_hidden_states
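The `@batch(8, "images")` decorator above groups incoming items into fixed-size batches before the function runs. As a rough illustration of that idea only (not Graphbook's actual implementation), a batching decorator can be sketched in plain Python:

```python
from functools import wraps

def batch(size, key):
    """Illustrative stand-in for a batching decorator: call the wrapped
    function once per fixed-size batch of items instead of once per item.
    `key` names the input being batched (unused in this sketch)."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(items):
            results = []
            for i in range(0, len(items), size):
                results.append(fn(items[i:i + size]))
            return results
        return wrapper
    return decorator

@batch(8, "images")
def process_images(images):
    # Placeholder for model inference on one batch
    return len(images)

# 20 items are processed as batches of 8, 8, and 4
print(process_images(list(range(20))))  # [8, 8, 4]
```

In Graphbook itself, batching and parameter binding are handled by the framework; this sketch only shows why a batch decorator changes the function's calling convention from per-item to per-batch.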

2. Assemble

Assemble a pipeline in our graph-based editor with your own processing nodes

[Screenshot: an abstract workflow in the graph editor]

3. Run

Run, monitor, and adjust parameters in your workflow

[Screenshot: an elaborate workflow running]

Build AI Pipelines

Iterate, operate, and monitor an ML-based data processing pipeline all in one tool. Connect to any data source, use your own PyTorch or TensorFlow models, and maximize your GPU utilization without having to write tedious multiprocessing code.

Expedite Development

Graphbook makes it easy to develop solutions to the many tasks that require AI/ML inference pipelines. It speeds up development with interactivity, visualizations, and multiprocessing I/O, significantly reducing the time it takes to build a pipeline.

Always Open-Source

You shouldn't have to trust a third party with your data. Graphbook is always free and open source. Deploy your own Graphbook instances on-premise or in the cloud, and start building.

Learn more

Core features

At its core, Graphbook is a framework for building efficient DAG-structured AI/ML data pipelines, but there are many features that help you build.

Web UI
Graphbook offers a visual workflow editor that lets users effortlessly combine the functional units they previously wrote in Python. This makes the editor accessible to everyone, from ML engineers to non-technical professionals, so business logic can always be adjusted.
Extensible
Extend the capabilities of Graphbook by writing your own functional nodes in Python. Create an ML processing node to annotate a data point, a data source node to ingest from a TCP stream, or a human-in-the-loop node that waits for human feedback.
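The core logic of such a node is ordinary Python. As a hypothetical sketch (the function and field names here are illustrative, not part of Graphbook's API), an annotation node's body might look like this; in Graphbook it would be wrapped with a `@step` decorator like the DinoV2 example above:

```python
def annotate(items):
    """Hypothetical node logic: attach a 'label' field to each data point.
    In a real node this rule would typically be a model inference call."""
    for item in items:
        text = item.get("text", "")
        item["label"] = "long" if len(text) > 40 else "short"
    return items

data = [{"text": "hi"}, {"text": "a" * 50}]
annotate(data)
print(data[0]["label"], data[1]["label"])  # short long
```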
Data
Seamlessly connect to multiple data sources, including S3 and local storage, to process unstructured data within your workflow. Explore and analyze your entire dataset directly within the editor interface, simplifying data exploration and analysis.
Models
Integrate your own ML models or off-the-shelf models from Hugging Face. Experiment with different models to find the best fit for your data processing workflow. Automatic UI elements on each node simplify parameter adjustments and checkpoint management, enabling quick and seamless experimentation.
Views
Gain valuable insights into your data processing workflow with automated visualizations of node outputs, facilitating human-in-the-loop analysis. Monitor model categorizations, sentiment analysis, segmentations, and other labelings directly within the editor interface, empowering informed decision-making based on real-time insights.
Logs
Access detailed logs from each node or the entire system to generate valuable reports and diagnose workflow issues effectively. Enhance visibility by inserting custom log statements into the code of any node's execution lifecycle, and conveniently view them within the editor interface for comprehensive monitoring and troubleshooting.
Interactive
While iterating on your workflow, you can run an individual node, which also executes the nodes it depends on. You can sample single batches of data by stepping through a node, or run the entire workflow with all of your data when you're ready. You can even pause and resume your data pipeline mid-execution.
Optimization
Graphbook prioritizes the optimization of ML workflow execution, handling multiprocessing I/O for reading and writing, batching inputs, and maximizing GPU and CPU utilization. ML engineers can concentrate on refining business logic, knowing that workflow performance is optimized for efficiency and scalability.

Hosting Options

Self-Hosted | Always Free
The software is free. Visit our open-source repository, clone it, and deploy it however is most convenient for you.
Download
Graphbook Cloud
Get started quickly without worrying about provisioning resources such as GPUs.
Coming soon

Our most frequently asked questions.

Is this no-code ML?

No. But you can build no-code ML for your customers and internal teams with this framework.

Can I use a VCS with Graphbook?

Yes. Your nodes are written in Python and your workflows are serialized as .json files. We recommend tracking everything with Git.

Can I write entire pipelines in Python?

Not yet, but we plan to add this feature soon. For now, you must assemble pipelines in the UI.

How can we deploy to production?

In Graphbook, you can continue to use your workflow as-is, or set new variables directly in the workflow, such as the location of your production database.

Can I use an LLM like GPT?

Yes. Graphbook is abstract enough that you can implement anything that can be written in Python, including sending API requests to OpenAI.

What makes Graphbook efficient?

The framework has a custom implementation of multiprocessing workers that run in the background for both loading and dumping data, keeping your GPU at maximum utilization.
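The general producer-consumer pattern behind this is illustrated below (sketched with threads for brevity; Graphbook's own workers are separate processes, and this is not its internal code). A background worker reads and decodes inputs into a bounded queue so the consumer loop, e.g. GPU inference, rarely waits on I/O:

```python
import threading
import queue
import time

def loader(paths, q):
    """Background worker: read/decode inputs and push them onto a queue."""
    for p in paths:
        time.sleep(0.01)          # simulate slow disk or network I/O
        q.put(f"decoded:{p}")
    q.put(None)                   # sentinel: no more work

def consume(paths):
    """Main loop (e.g. GPU inference) pulls ready items off the queue."""
    q = queue.Queue(maxsize=4)    # bounded buffer applies backpressure
    t = threading.Thread(target=loader, args=(paths, q), daemon=True)
    t.start()
    results = []
    while (item := q.get()) is not None:
        results.append(item)      # stand-in for model inference
    t.join()
    return results

print(consume(["a.jpg", "b.jpg"]))  # ['decoded:a.jpg', 'decoded:b.jpg']
```

The bounded queue is the key design choice: it lets loading run ahead of inference by a few items without buffering the whole dataset in memory.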

How do I contribute?

We are actively looking for collaborators. You are very welcome to contribute! Visit our repo.