Container Tools Tutorial
This tutorial shows how to use the Kubiya SDK to build tools that run in Docker containers, so you can integrate existing software into your AI workflows.
Prerequisites
- Python 3.8 or higher
- Docker installed and running
- Basic understanding of Python and Docker
- Kubiya SDK installed (pip install kubiya-sdk)
What You'll Learn
- How to create simple container tools
- How to use existing Docker images as tools
- How to pass parameters to container tools
- How to work with more complex container configurations
- How to run tools on different infrastructures
1. Creating Your First Container Tool
Let's start with a simple tool that uses a Python container:
from kubiya_sdk import tool

@tool(image="python:3.12-slim")
def hello_world(name: str) -> str:
    """A simple greeting tool"""
    return f"Hello, {name}!"

# Use the tool
result = hello_world("Kubiya")
print(result)  # Output: Hello, Kubiya!
What's happening here:
- We decorated a function with @tool and specified a Docker image
- When we call the function, Kubiya:
  1. Pulls the specified image (if not already available)
  2. Creates a container from this image
  3. Executes the function code inside the container
  4. Returns the result
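The steps above map roughly onto what a plain `docker run` does. The following is a minimal sketch of that idea, not Kubiya's actual implementation: `build_docker_run` is a hypothetical helper that only assembles the command line, and shipping the function body as a `python -c` script is an assumption made for illustration.

```python
import shlex
from typing import List


def build_docker_run(image: str, code: str) -> List[str]:
    """Assemble a `docker run` command that executes Python code in a container.

    Hypothetical helper for illustration; it is not part of the Kubiya SDK.
    """
    return [
        "docker", "run",
        "--rm",   # remove the container once the command exits
        image,    # Docker pulls the image first if it is not available locally
        "python", "-c", code,
    ]


argv = build_docker_run("python:3.12-slim", 'print("Hello, Kubiya!")')
print(shlex.join(argv))
```

On a machine with Docker running, passing `argv` to `subprocess.run(argv, capture_output=True, text=True)` would return the greeting on stdout.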
2. Using Existing Docker Images
Kubiya can also run tools directly from existing Docker images, with the container's command doing the work instead of a Python function body:
from kubiya_sdk import tool

@tool(
    image="alpine:latest",
    command=["echo", "Hello, ${NAME}!"]
)
def alpine_greeting(name: str) -> str:
    """Greeting using Alpine Linux"""
    # No code needed as we're using the container's command directly
    pass

# Use the tool
result = alpine_greeting("Kubiya")
print(result)  # Output: Hello, Kubiya!
In this example:
- We use the Alpine Linux image
- We provide a command to run in the container
- The ${NAME} in the command is replaced with the name parameter
- No Python function body is needed as execution happens via the command
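The placeholder substitution described above can be pictured with Python's `string.Template`, which uses the same `${NAME}` syntax. This is only a sketch: `render_command` is a hypothetical helper, and upper-casing the parameter name to match the placeholder is an assumption, not documented Kubiya behavior.

```python
from string import Template
from typing import Dict, List


def render_command(command: List[str], params: Dict[str, str]) -> List[str]:
    """Replace ${NAME}-style placeholders in each command element.

    Parameter names are upper-cased to match the ${NAME} placeholder in the
    example above; that mapping is an assumption made for illustration.
    """
    values = {key.upper(): str(value) for key, value in params.items()}
    # safe_substitute leaves unknown placeholders untouched instead of raising
    return [Template(part).safe_substitute(values) for part in command]


rendered = render_command(["echo", "Hello, ${NAME}!"], {"name": "Kubiya"})
print(rendered)
```

Substituting per list element, rather than into one joined shell string, keeps each argument intact even when a value contains spaces.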
3. Passing Parameters to Containers
There are several ways to pass parameters to containers:
Environment Variables
from kubiya_sdk import tool

@tool(
    image="postgres:14",
    # Pass the query text itself to psql; wrapping it in SELECT '...' would
    # only echo the query back as a string instead of running it
    command=["psql", "-c", "${QUERY}"],
    environment={
        "PGUSER": "${username}",
        "PGPASSWORD": "${password}",
        "PGDATABASE": "${database}",
        "PGHOST": "${host}",
        "QUERY": "${query}"
    }
)
def run_sql(query: str, username: str, password: str, database: str, host: str) -> str:
    """Run a SQL query against a PostgreSQL database"""
    # Execution happens in the container
    pass
Command Arguments
from kubiya_sdk import tool

@tool(
    image="node:18-slim",
    command=["node", "-e", "console.log(`Processing ${ITEM_COUNT} items: ${ITEMS}`);"],
    environment={
        "ITEMS": "${items}",
        "ITEM_COUNT": "${count}"
    }
)
def process_with_node(items: str, count: int) -> str:
    """Process items using Node.js"""
    # Execution happens in the container
    pass
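Environment variables always reach the container as strings, so a parameter declared as `count: int` has to be converted back on the container side. The sketch below shows that conversion in Python rather than Node.js. `read_inputs` and the comma-separated format of `ITEMS` are assumptions made for illustration.

```python
import os
from typing import List, Mapping, Tuple


def read_inputs(env: Mapping[str, str] = os.environ) -> Tuple[List[str], int]:
    """Read ITEMS and ITEM_COUNT from the environment and convert their types."""
    raw_items = env.get("ITEMS", "")
    # Assumption: items are passed as a comma-separated string
    items = [item.strip() for item in raw_items.split(",") if item.strip()]
    # Environment values are strings, so the count is parsed explicitly
    count = int(env.get("ITEM_COUNT", "0"))
    return items, count


items, count = read_inputs({"ITEMS": "a, b, c", "ITEM_COUNT": "3"})
print(f"Processing {count} items: {items}")
```

Accepting the environment as an argument makes the parsing easy to check without setting real environment variables.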
4. Working with Files
You can include files in your container tools:
from kubiya_sdk import tool, FileSpec

@tool(
    image="python:3.12-slim",
    with_files=[
        FileSpec(
            destination="/app/script.py",
            content="""
import sys
name = sys.argv[1]
print(f"Hello, {name}! This is from a file.")
"""
        )
    ],
    command=["python", "/app/script.py", "${name}"]
)
def file_based_greeting(name: str) -> str:
    """Greeting using a file-based script"""
    # Execution happens in the container
    pass
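To try the embedded script without Docker, the same idea can be simulated locally: write the file content to its destination under a temporary root, then run it with an argument. This is a sketch under stated assumptions. `FileSpecSketch`, `materialize`, and `run_greeting` are hypothetical stand-ins and say nothing about how the SDK places files in the container.

```python
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileSpecSketch:
    """Stand-in with the two fields used above (hypothetical, not the SDK class)."""
    destination: str
    content: str


def materialize(spec: FileSpecSketch, root: Path) -> Path:
    """Write the file under `root`, mirroring its in-container destination."""
    target = root / spec.destination.lstrip("/")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(spec.content, encoding="utf-8")
    return target


SCRIPT = """
import sys
name = sys.argv[1]
print(f"Hello, {name}! This is from a file.")
"""


def run_greeting(name: str) -> str:
    """Write the script to a temporary directory and run it with this interpreter."""
    with tempfile.TemporaryDirectory() as tmp:
        script = materialize(FileSpecSketch("/app/script.py", SCRIPT), Path(tmp))
        completed = subprocess.run(
            [sys.executable, str(script), name],
            capture_output=True, text=True, check=True,
        )
    return completed.stdout.strip()


print(run_greeting("Kubiya"))
```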
5. Using Python Packages
You can install Python packages in the container:
from kubiya_sdk import tool

@tool(
    image="python:3.12-slim",
    requirements=["requests", "beautifulsoup4"]
)
def scrape_webpage(url: str) -> dict:
    """Scrape a web page and extract basic information"""
    import requests
    from bs4 import BeautifulSoup

    # A timeout keeps the tool from hanging on an unresponsive server
    response = requests.get(url, timeout=10)
    # Fail early on HTTP errors instead of parsing an error page
    response.raise_for_status()
    soup = BeautifulSoup(response.text, "html.parser")

    return {
        "title": soup.title.string if soup.title else "",
        "links": len(soup.find_all("a", href=True)),
        "paragraphs": len(soup.find_all("p")),
        "headings": len(soup.find_all(["h1", "h2", "h3", "h4", "h5", "h6"]))
    }
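To check the counting logic of a tool like this without network access or a container, the same summary can be approximated with the standard library. The sketch below uses `html.parser.HTMLParser` instead of BeautifulSoup. `PageStats` and `summarize_html` are hypothetical names, and edge cases (malformed markup, nested tags inside the title) may be handled differently from the BeautifulSoup version.

```python
from html.parser import HTMLParser
from typing import Dict

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class PageStats(HTMLParser):
    """Count links, paragraphs, and headings, and capture the page title."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.links = 0
        self.paragraphs = 0
        self.headings = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a" and any(name == "href" for name, _ in attrs):
            self.links += 1  # only anchors that carry an href attribute
        elif tag == "p":
            self.paragraphs += 1
        elif tag in HEADINGS:
            self.headings += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize_html(html: str) -> Dict[str, object]:
    """Return the same keys as the scraping tool, computed from an HTML string."""
    parser = PageStats()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "links": parser.links,
        "paragraphs": parser.paragraphs,
        "headings": parser.headings,
    }


SAMPLE = (
    "<html><head><title>Demo</title></head><body><h1>Hi</h1>"
    "<p>One</p><p>Two <a href='/x'>x</a> <a>no href</a></p></body></html>"
)
print(summarize_html(SAMPLE))
```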
6. Running Tools on Different Infrastructures
Kubiya tools can run on various infrastructures:
Local Docker (Default)
By default, tools run using your local Docker installation:
from kubiya_sdk import tool

@tool(image="python:3.12-slim")
def local_tool(data: str) -> str:
    """Runs on local Docker"""
    return f"Processed: {data}"

# Runs on local Docker
result = local_tool("test data")
Kubernetes
You can run tools on Kubernetes:
from kubiya_sdk import tool
from kubiya_sdk.execution import execute_tool_in_kubernetes

@tool(image="python:3.12-slim")
def data_processor(data: list) -> dict:
    """Process data items"""
    return {"processed_count": len(data)}

# Run the tool on Kubernetes
result = execute_tool_in_kubernetes(
    "data_processor",
    {"data": [1, 2, 3, 4, 5]},
    namespace="kubiya-tools",
    service_account="tool-runner"
)
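For orientation, the `namespace` and `service_account` arguments correspond to standard fields of a Kubernetes Pod. The sketch below builds such a manifest as a plain dictionary. How Kubiya actually schedules the tool is not described here, so `build_pod_manifest`, the use of a bare Pod, and the container command are all assumptions made for illustration.

```python
import json
from typing import Any, Dict, List


def build_pod_manifest(
    tool_name: str,
    image: str,
    command: List[str],
    namespace: str,
    service_account: str,
) -> Dict[str, Any]:
    """Build a minimal Pod manifest that runs one container to completion."""
    # Kubernetes object names may not contain underscores
    pod_name = tool_name.replace("_", "-").lower()
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name, "namespace": namespace},
        "spec": {
            "serviceAccountName": service_account,
            "restartPolicy": "Never",  # run once, do not restart on exit
            "containers": [
                {"name": pod_name, "image": image, "command": command}
            ],
        },
    }


manifest = build_pod_manifest(
    "data_processor",
    "python:3.12-slim",
    ["python", "-c", "print('ok')"],  # placeholder command
    namespace="kubiya-tools",
    service_account="tool-runner",
)
print(json.dumps(manifest, indent=2))
```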
Other Environments
Kubiya supports other execution environments such as AWS, GCP, and Azure:
from kubiya_sdk import tool
from kubiya_sdk.execution import execute_tool_in_cloud

# Run the data_processor tool from the Kubernetes example on a cloud provider
result = execute_tool_in_cloud(
    "data_processor",
    {"data": [1, 2, 3, 4, 5]},
    provider="aws",
    service="ecs",
    cluster="processing-cluster"
)
7. Advanced Container Configuration
For more complex use cases, you can configure your containers extensively:
from kubiya_sdk import tool

@tool(
    image="python:3.12-slim",
    requirements=["pandas", "numpy", "matplotlib"],
    container_name="data-analyzer",
    network="kubiya-network",
    working_dir="/app/data",
    user="appuser",
    volumes=[
        {"source": "/tmp/data", "target": "/app/data"},
        {"source": "/tmp/config", "target": "/app/config", "readonly": True}
    ],
    environment={
        "PYTHONPATH": "/app",
        "CONFIG_FILE": "/app/config/settings.json",
        "DATA_DIR": "/app/data"
    },
    ports=[{"container": 8080, "host": 8080}],
    command_timeout=300  # 5 minutes
)
def analyze_data(dataset: str) -> dict:
    """Analyze a dataset with advanced container configuration"""
    import json
    import os

    import pandas as pd

    # Read configuration
    with open(os.environ["CONFIG_FILE"], "r") as f:
        config = json.load(f)

    # Load dataset
    data_path = os.path.join(os.environ["DATA_DIR"], dataset)
    df = pd.read_csv(data_path)

    # Perform analysis; numeric_only skips text columns, which would
    # otherwise raise a TypeError in pandas 2.x
    return {
        "rows": len(df),
        "columns": len(df.columns),
        "mean": df.mean(numeric_only=True).to_dict(),
        "config": config
    }
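Most of the options above have a direct counterpart among the `docker run` flags, which can help when debugging a configuration by hand. The mapping below is a sketch, not the SDK's implementation: `docker_flags` is a hypothetical helper, and `command_timeout` is left out because it has no `docker run` flag equivalent.

```python
from typing import Any, Dict, List


def docker_flags(config: Dict[str, Any]) -> List[str]:
    """Translate a container configuration dictionary into `docker run` flags."""
    flags: List[str] = []
    simple = {
        "container_name": "--name",
        "network": "--network",
        "working_dir": "--workdir",
        "user": "--user",
    }
    for key, flag in simple.items():
        if key in config:
            flags += [flag, str(config[key])]
    for volume in config.get("volumes", []):
        mount = f"{volume['source']}:{volume['target']}"
        if volume.get("readonly"):
            mount += ":ro"  # mount read-only
        flags += ["--volume", mount]
    for name, value in config.get("environment", {}).items():
        flags += ["--env", f"{name}={value}"]
    for port in config.get("ports", []):
        # docker expects host:container order
        flags += ["--publish", f"{port['host']}:{port['container']}"]
    return flags


example = {
    "container_name": "data-analyzer",
    "network": "kubiya-network",
    "working_dir": "/app/data",
    "user": "appuser",
    "volumes": [
        {"source": "/tmp/data", "target": "/app/data"},
        {"source": "/tmp/config", "target": "/app/config", "readonly": True},
    ],
    "environment": {"DATA_DIR": "/app/data"},
    "ports": [{"container": 8080, "host": 8080}],
}
flags = docker_flags(example)
print(" ".join(flags))
```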
8. Handling Tool Output
Kubiya tools can return structured output, such as a dictionary that combines statistics with a base64-encoded chart:
from kubiya_sdk import tool
from typing import Dict, List, Any

@tool(
    image="python:3.12-slim",
    requirements=["pandas", "matplotlib"]
)
def visualize_data(data: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Visualize data and return statistics with a chart"""
    import pandas as pd
    import matplotlib.pyplot as plt
    import base64
    from io import BytesIO

    # Convert to DataFrame
    df = pd.DataFrame(data)

    # Create the visualization on an explicit axes so that figsize applies
    # to the chart (df.plot would otherwise open a separate figure)
    fig, ax = plt.subplots(figsize=(8, 6))
    df.plot(kind="bar", ax=ax)
    ax.set_title("Data Visualization")

    # Save the plot to a buffer and release the figure
    buffer = BytesIO()
    fig.savefig(buffer, format="png")
    plt.close(fig)
    buffer.seek(0)

    # Convert to base64 for embedding
    image_str = base64.b64encode(buffer.read()).decode("utf-8")

    # Return both statistics and the visualization;
    # numeric_only skips text columns, which cannot be averaged or summed meaningfully
    return {
        "statistics": {
            "mean": df.mean(numeric_only=True).to_dict(),
            "sum": df.sum(numeric_only=True).to_dict(),
            "count": len(df)
        },
        "visualization": {
            "format": "png",
            "encoding": "base64",
            "data": image_str
        }
    }
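On the receiving side, the base64 payload has to be decoded before the image can be saved or displayed. The sketch below shows one way to do that for the result shape returned above. `decode_visualization` is a hypothetical helper, and the sample payload is only the PNG file signature, not a real chart.

```python
import base64
from typing import Any, Dict

# Every PNG file starts with these eight bytes
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_visualization(result: Dict[str, Any]) -> bytes:
    """Decode the base64 image from a tool result and sanity-check its format."""
    viz = result["visualization"]
    if viz["encoding"] != "base64":
        raise ValueError(f"unsupported encoding: {viz['encoding']}")
    image_bytes = base64.b64decode(viz["data"])
    if viz["format"] == "png" and not image_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("payload does not look like a PNG file")
    return image_bytes


# Stand-in result: the payload is just the PNG signature, not a real chart
sample_result = {
    "visualization": {
        "format": "png",
        "encoding": "base64",
        "data": base64.b64encode(PNG_SIGNATURE).decode("utf-8"),
    }
}
image_bytes = decode_visualization(sample_result)
print(len(image_bytes), "bytes decoded")
```

With a real result, `pathlib.Path("chart.png").write_bytes(image_bytes)` would write the chart to disk.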
Next Steps
Now that you've learned how to create container tools, you can:
- Explore more advanced tool configurations in our reference documentation
- Learn about file mounting for more complex data handling
- Discover integration with service providers
- Explore our container reference for all available options
Happy building with Kubiya!