Container Tools Tutorial

This tutorial explains how to use the Kubiya SDK to create tools that run in Docker containers, so you can integrate existing software into your AI workflows.

Prerequisites

  • Python 3.8 or higher
  • Docker installed and running
  • Basic understanding of Python and Docker
  • Kubiya SDK installed (pip install kubiya-sdk)

What You'll Learn

  1. How to create simple container tools
  2. Using existing Docker images for tools
  3. Passing parameters to container tools
  4. Working with more complex container configurations
  5. Running tools on different infrastructures

1. Creating Your First Container Tool

Let's start with a simple tool that uses a Python container:

Python
from kubiya_sdk import tool

@tool(image="python:3.12-slim")
def hello_world(name: str) -> str:
    """A simple greeting tool"""
    return f"Hello, {name}!"

# Use the tool
result = hello_world("Kubiya")
print(result)  # Output: Hello, Kubiya!

What's happening here:

  • We decorated a function with @tool and specified a Docker image.
  • When we call the function, Kubiya:
      1. Pulls the specified image (if not already available)
      2. Creates a container from this image
      3. Executes the function code inside the container
      4. Returns the result
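
The four steps above can be sketched with the plain Docker CLI. This is only a rough illustration of the lifecycle, not Kubiya's actual implementation. The helper name docker_steps is hypothetical, and it only builds the argument lists, so it runs without Docker installed.

```python
from typing import List


def docker_steps(image: str, command: List[str]) -> List[List[str]]:
    """Return Docker CLI invocations that mirror the steps above.

    Hypothetical helper for illustration: it builds argument lists only
    and does not start any container.
    """
    return [
        # Step 1: fetch the image
        ["docker", "pull", image],
        # Steps 2-4: create the container, run the command, print its output;
        # --rm removes the container afterwards
        ["docker", "run", "--rm", image, *command],
    ]


steps = docker_steps("python:3.12-slim", ["python", "-c", "print('Hello, Kubiya!')"])
for argv in steps:
    print(" ".join(argv))
```

To actually execute a step, each list could be passed to subprocess.run(argv, capture_output=True, text=True, check=True) on a machine where Docker is running.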

2. Using Existing Docker Images

Kubiya can also run an existing Docker image directly, with no Python code in the function body:

Python
from kubiya_sdk import tool

@tool(
    image="alpine:latest",
    command=["echo", "Hello, ${NAME}!"]
)
def alpine_greeting(name: str) -> str:
    """Greeting using Alpine Linux"""
    # No code needed as we're using the container's command directly
    pass

# Use the tool
result = alpine_greeting("Kubiya")
print(result)  # Output: Hello, Kubiya!

In this example:

  • We use the Alpine Linux image
  • We provide a command to run in the container
  • The ${NAME} in the command is replaced with the name parameter
  • No Python function body is needed, as execution happens via the command
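
The placeholder replacement described above can be approximated with the standard library's string.Template. This is a sketch of the idea only. The render_command helper is hypothetical, and the rule that a parameter called name fills the upper-case ${NAME} placeholder is an assumption inferred from the example, not documented SDK behavior.

```python
from string import Template
from typing import Dict, List


def render_command(command: List[str], params: Dict[str, object]) -> List[str]:
    """Substitute ${NAME}-style placeholders in each command element.

    Assumption: parameter names map to upper-case placeholder names.
    """
    values = {key.upper(): str(value) for key, value in params.items()}
    # safe_substitute leaves unknown placeholders untouched instead of raising
    return [Template(part).safe_substitute(values) for part in command]


rendered = render_command(["echo", "Hello, ${NAME}!"], {"name": "Kubiya"})
print(rendered)
```

Because the command stays a list, the substituted value remains a single argument even if it contains spaces.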

3. Passing Parameters to Containers

There are several ways to pass parameters to containers:

Environment Variables

Python
from kubiya_sdk import tool

@tool(
    image="postgres:14",
    command=["psql", "-c", "${QUERY}"],  # run the query itself, not a string literal
    environment={
        "PGUSER": "${username}",
        "PGPASSWORD": "${password}",
        "PGDATABASE": "${database}",
        "PGHOST": "${host}",
        "QUERY": "${query}"
    }
)
def run_sql(query: str, username: str, password: str, database: str, host: str) -> str:
    """Run a SQL query against a PostgreSQL database"""
    # Execution happens in the container
    pass
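
psql reads PGUSER, PGPASSWORD, PGDATABASE, and PGHOST from its environment, which is why no connection flags appear in the command. The sketch below shows the same mechanism in miniature, with a Python child process standing in for the container. The connection values are made up for illustration.

```python
import os
import subprocess
import sys

# Hypothetical connection values, for illustration only
container_env = {
    "PGUSER": "analyst",
    "PGDATABASE": "sales",
    "PGHOST": "db.example.internal",
}

# The child process sees the parent's environment plus the extra variables
child = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['PGUSER'] + '@' + os.environ['PGHOST'])"],
    env={**os.environ, **container_env},
    capture_output=True,
    text=True,
    check=True,
)
seen_by_child = child.stdout.strip()
print(seen_by_child)
```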

Command Arguments

Python
from kubiya_sdk import tool

@tool(
    image="node:18-slim",
    command=["node", "-e", "console.log(`Processing ${ITEM_COUNT} items: ${ITEMS}`);"],
    environment={
        "ITEMS": "${items}",
        "ITEM_COUNT": "${count}"
    }
)
def process_with_node(items: str, count: int) -> str:
    """Process items using Node.js"""
    # Execution happens in the container
    pass
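
Environment variables are always strings, so a non-string argument such as count has to be converted before it reaches the container. The sketch below assumes the ${param} placeholders in the environment mapping are filled from the function arguments. resolve_environment is a hypothetical helper, not part of the SDK.

```python
from string import Template
from typing import Dict


def resolve_environment(environment: Dict[str, str],
                        params: Dict[str, object]) -> Dict[str, str]:
    """Fill ${param} placeholders in an environment mapping.

    Hypothetical helper: Template.substitute converts each value with
    str(), so an int such as 3 becomes the string "3".
    """
    return {name: Template(value).substitute(params)
            for name, value in environment.items()}


env = resolve_environment(
    {"ITEMS": "${items}", "ITEM_COUNT": "${count}"},
    {"items": "a,b,c", "count": 3},
)
print(env)
```

Inside the Node.js script, the same values are also available as process.env.ITEMS and process.env.ITEM_COUNT.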

4. Working with Files

You can place files in the container with FileSpec, which pairs a destination path with the file's content:

Python
from kubiya_sdk import tool, FileSpec

@tool(
    image="python:3.12-slim",
    with_files=[
        FileSpec(
            destination="/app/script.py",
            content="""
import sys
name = sys.argv[1]
print(f"Hello, {name}! This is from a file.")
"""
        )
    ],
    command=["python", "/app/script.py", "${name}"]
)
def file_based_greeting(name: str) -> str:
    """Greeting using a file-based script"""
    # Execution happens in the container
    pass
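
To check a script like the one above before packaging it, you can run it locally with the same argument. The sketch below writes the content to a temporary directory, which stands in for /app inside the container, and runs it with the current Python interpreter. It does not use the Kubiya SDK or Docker.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

script_content = """
import sys
name = sys.argv[1]
print(f"Hello, {name}! This is from a file.")
"""

with tempfile.TemporaryDirectory() as workdir:
    # A temporary directory stands in for /app inside the container
    script_path = Path(workdir) / "script.py"
    script_path.write_text(script_content)

    # Equivalent of: python /app/script.py Kubiya
    completed = subprocess.run(
        [sys.executable, str(script_path), "Kubiya"],
        capture_output=True,
        text=True,
        check=True,
    )

file_output = completed.stdout.strip()
print(file_output)
```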

5. Using Python Packages

You can install Python packages in the container:

Python
from kubiya_sdk import tool

@tool(
    image="python:3.12-slim",
    requirements=["requests", "beautifulsoup4"]
)
def scrape_webpage(url: str) -> dict:
    """Scrape a web page and extract basic information"""
    import requests
    from bs4 import BeautifulSoup

    response = requests.get(url)
    soup = BeautifulSoup(response.text, "html.parser")

    return {
        "title": soup.title.string if soup.title else "",
        "links": len(soup.find_all("a", href=True)),
        "paragraphs": len(soup.find_all("p")),
        "headings": len(soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"]))
    }
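
The counting logic can be tried offline on an inline HTML string before it runs against live pages. The sketch below deliberately swaps in the standard library's html.parser for BeautifulSoup so that it needs no extra packages. The PageStats class and the sample markup are illustrative only.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Count links, paragraphs, and headings in an HTML document."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self) -> None:
        super().__init__()
        self.links = 0
        self.paragraphs = 0
        self.headings = 0

    def handle_starttag(self, tag, attrs):
        # Only anchors that carry an href attribute count as links
        if tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1
        elif tag == "p":
            self.paragraphs += 1
        elif tag in self.HEADINGS:
            self.headings += 1


sample_html = (
    "<h1>Docs</h1><p>Intro</p>"
    "<p>See <a href='/api'>the API</a>.</p><a>no href</a>"
)
stats = PageStats()
stats.feed(sample_html)
page_summary = {
    "links": stats.links,
    "paragraphs": stats.paragraphs,
    "headings": stats.headings,
}
print(page_summary)
```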

6. Running Tools on Different Infrastructures

Kubiya tools can run on local Docker, on Kubernetes, or on cloud container services:

Local Docker (Default)

By default, tools run using your local Docker installation:

Python
from kubiya_sdk import tool

@tool(image="python:3.12-slim")
def local_tool(data: str) -> str:
    """Runs on local Docker"""
    return f"Processed: {data}"

# Runs on local Docker
result = local_tool("test data")

Kubernetes

You can run tools on Kubernetes:

Python
from kubiya_sdk import tool
from kubiya_sdk.execution import execute_tool_in_kubernetes

@tool(image="python:3.12-slim")
def data_processor(data: list) -> dict:
    """Process data items"""
    return {"processed_count": len(data)}

# Run the tool on Kubernetes
result = execute_tool_in_kubernetes(
    "data_processor",
    {"data": [1, 2, 3, 4, 5]},
    namespace="kubiya-tools",
    service_account="tool-runner"
)
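
For orientation, the namespace and service_account arguments correspond to standard fields of a Kubernetes Job. The sketch below builds a minimal batch/v1 Job manifest as a plain dictionary. How the SDK actually schedules the tool is not covered here, so treat the job_manifest helper and the command shown as assumptions.

```python
import json
from typing import Any, Dict, List


def job_manifest(name: str, image: str, command: List[str],
                 namespace: str, service_account: str) -> Dict[str, Any]:
    """Build a minimal batch/v1 Job manifest as a plain dict (illustrative)."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backoffLimit": 0,  # do not retry a failed run
            "template": {
                "spec": {
                    "serviceAccountName": service_account,
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            },
        },
    }


manifest = job_manifest(
    name="data-processor",  # Kubernetes object names cannot contain underscores
    image="python:3.12-slim",
    command=["python", "-c", "print(len([1, 2, 3, 4, 5]))"],
    namespace="kubiya-tools",
    service_account="tool-runner",
)
print(json.dumps(manifest, indent=2))
```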

Other Environments

Kubiya also supports cloud execution environments such as AWS, GCP, and Azure. This example reuses the data_processor tool defined in the Kubernetes section:

Python
from kubiya_sdk import tool
from kubiya_sdk.execution import execute_tool_in_cloud

# Run on cloud provider
result = execute_tool_in_cloud(
    "data_processor",
    {"data": [1, 2, 3, 4, 5]},
    provider="aws",
    service="ecs",
    cluster="processing-cluster"
)

7. Advanced Container Configuration

For more complex use cases, you can configure your containers extensively:

Python
from kubiya_sdk import tool

@tool(
    image="python:3.12-slim",
    requirements=["pandas", "numpy", "matplotlib"],
    container_name="data-analyzer",
    network="kubiya-network",
    working_dir="/app/data",
    user="appuser",
    volumes=[
        {"source": "/tmp/data", "target": "/app/data"},
        {"source": "/tmp/config", "target": "/app/config", "readonly": True}
    ],
    environment={
        "PYTHONPATH": "/app",
        "CONFIG_FILE": "/app/config/settings.json",
        "DATA_DIR": "/app/data"
    },
    ports=[{"container": 8080, "host": 8080}],
    command_timeout=300  # 5 minutes
)
def analyze_data(dataset: str) -> dict:
    """Analyze a dataset with advanced container configuration"""
    import json
    import os

    import pandas as pd

    # Read configuration
    with open(os.environ["CONFIG_FILE"], "r") as f:
        config = json.load(f)

    # Load dataset
    data_path = os.path.join(os.environ["DATA_DIR"], dataset)
    df = pd.read_csv(data_path)

    # Perform analysis
    return {
        "rows": len(df),
        "columns": len(df.columns),
        "mean": df.mean(numeric_only=True).to_dict(),  # skip non-numeric columns
        "config": config
    }
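
Several of these options have direct counterparts among the docker run flags. The sketch below shows one plausible mapping. It is not how the SDK is known to translate them, and docker_run_flags is a hypothetical helper that only builds the flag list.

```python
from typing import Any, Dict, List


def docker_run_flags(config: Dict[str, Any]) -> List[str]:
    """Translate container settings into docker run flags (illustrative)."""
    flags: List[str] = []
    # Simple one-to-one options
    for key, flag in (("container_name", "--name"), ("network", "--network"),
                      ("working_dir", "-w"), ("user", "-u")):
        if key in config:
            flags += [flag, str(config[key])]
    # Bind mounts use source:target, with :ro appended for read-only mounts
    for volume in config.get("volumes", []):
        spec = f"{volume['source']}:{volume['target']}"
        if volume.get("readonly"):
            spec += ":ro"
        flags += ["-v", spec]
    # Port publishing uses host:container order
    for port in config.get("ports", []):
        flags += ["-p", f"{port['host']}:{port['container']}"]
    return flags


flags = docker_run_flags({
    "container_name": "data-analyzer",
    "network": "kubiya-network",
    "working_dir": "/app/data",
    "user": "appuser",
    "volumes": [
        {"source": "/tmp/data", "target": "/app/data"},
        {"source": "/tmp/config", "target": "/app/config", "readonly": True},
    ],
    "ports": [{"container": 8080, "host": 8080}],
})
print(" ".join(flags))
```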

8. Handling Tool Output

Kubiya tools can return structured output, such as a dictionary that combines statistics with a base64-encoded chart:

Python
from kubiya_sdk import tool
from typing import Dict, List, Any

@tool(
    image="python:3.12-slim",
    requirements=["pandas", "matplotlib"]
)
def visualize_data(data: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Visualize data and return statistics with a chart"""
    import pandas as pd
    import matplotlib.pyplot as plt
    import base64
    from io import BytesIO

    # Convert to DataFrame
    df = pd.DataFrame(data)

    # Create a bar chart; DataFrame.plot opens its own figure, so set the size here
    ax = df.plot(kind="bar", figsize=(8, 6))
    ax.set_title("Data Visualization")

    # Save the plot to a buffer
    buffer = BytesIO()
    plt.savefig(buffer, format="png")
    buffer.seek(0)

    # Convert to base64 for embedding
    image_str = base64.b64encode(buffer.read()).decode("utf-8")

    # Return both statistics and the visualization
    return {
        "statistics": {
            "mean": df.mean(numeric_only=True).to_dict(),
            "sum": df.sum(numeric_only=True).to_dict(),
            "count": len(df)
        },
        "visualization": {
            "format": "png",
            "encoding": "base64",
            "data": image_str
        }
    }
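
On the receiving side, the chart has to be decoded before it can be saved or displayed. The sketch below assumes a result shaped like the dictionary returned above. save_visualization is a hypothetical helper, and the stand-in payload holds only the PNG signature plus filler bytes, not a real image.

```python
import base64
import tempfile
from pathlib import Path
from typing import Any, Dict

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first eight bytes of every PNG file


def save_visualization(result: Dict[str, Any], path: Path) -> int:
    """Decode the base64 chart in a tool result and write it to path.

    Returns the number of bytes written.
    """
    viz = result["visualization"]
    if viz["encoding"] != "base64":
        raise ValueError(f"unsupported encoding: {viz['encoding']}")
    raw = base64.b64decode(viz["data"])
    if viz["format"] == "png" and not raw.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not look like a PNG file")
    path.write_bytes(raw)
    return len(raw)


# Stand-in result for illustration: signature plus filler, not a real chart
fake_result = {
    "visualization": {
        "format": "png",
        "encoding": "base64",
        "data": base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii"),
    }
}

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "chart.png"
    written = save_visualization(fake_result, out_path)
    roundtrip_ok = out_path.read_bytes().startswith(PNG_SIGNATURE)

print(written, roundtrip_ok)
```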

Next Steps

Now that you've learned how to create container tools, you can:

  1. Explore more advanced tool configurations in our reference documentation
  2. Learn about file mounting for more complex data handling
  3. Discover integration with service providers
  4. Explore our container reference for all available options

Happy building with Kubiya!