Real-World Examples
Complete, working examples for common ML use cases and infrastructure patterns
E-commerce Recommendation Pipeline
A complete end-to-end ML pipeline for product recommendations using collaborative filtering and content-based approaches. This example demonstrates multi-service orchestration across AWS Glue (ETL), SageMaker (training), and Spark (similarity processing).
Pipeline Architecture
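[Pipeline diagram: data cleaning (Glue) -> feature engineering (SageMaker) -> similarity computation (Spark) -> model training (SageMaker)]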
Configuration (mlknife-compose.yaml)
name: ecommerce-recommendation-pipeline
author: ml-team
description: End-to-end product recommendation system

parameters:
  environment: dev
  data_path: s3://ml-pipeline-data/ecommerce/
  model_path: s3://ml-models/recommendation/

executors:
  glue_etl:
    type: glue_etl
    job_name: "ecommerce-data-preprocessing"
    runtime: python3.9
    role: AWSGlueServiceRole
    glue_version: "5.0"
    worker_type: G.2X
    number_of_workers: 5
  python_processor:
    type: sagemaker_processor
    class: sagemaker.sklearn.processing.SKLearnProcessor
    role: ${pipeline.role}
    instance_type: ml.c5.2xlarge
    framework_version: 1.2-1
    max_runtime_in_seconds: 7200
  spark_processor:
    type: sagemaker_processor
    class: sagemaker.spark.processing.PySparkProcessor
    role: ${pipeline.role}
    instance_type: ml.c5.xlarge
    instance_count: 2
    max_runtime_in_seconds: 7200

services:
  recommendation_endpoint:
    type: sagemaker_endpoint
    repository: ../services/
    configuration:
      endpoint_name: "ecommerce-recommendations-${parameters.environment}"
      model_name: "recommendation-model-v1"
      instance_type: ml.m5.large
      initial_instance_count: 2
    depends_on: []
    tags:
      service_type: sagemaker_endpoint

modules:
  data_cleaning:
    repository: ../modules
    executor: ${executors.glue_etl}
    entry_point: ./glue_jobs/clean_ecommerce_data.py
    description: "Clean and validate raw ecommerce data"
    job_parameters:
      input_path: ${parameters.data_path}raw/
      output_path: ${parameters.data_path}cleaned/
    depends_on: []
  feature_engineering:
    repository: ../modules
    executor: ${executors.python_processor}
    entry_point: ./jobs/build_features.py
    description: "Engineer features for model training"
    depends_on: [data_cleaning]
    job_parameters:
      input_path: ${parameters.data_path}cleaned/
      output_path: ${parameters.data_path}features/
  similarity_computation:
    executor: ${executors.spark_processor}
    entry_point: com.company.SimilarityJob
    build_command: "mvn clean package"
    depends_on: [feature_engineering]
    job_parameters:
      input_path: ${parameters.data_path}features/
      output_path: ${parameters.data_path}similarities/
  model_training:
    executor: ${executors.python_processor}
    entry_point: train_recommendation_model.py
    depends_on: [similarity_computation]
    job_parameters:
      features_path: ${parameters.data_path}features/
      similarity_path: ${parameters.data_path}similarities/
      model_path: ${parameters.model_path}
      recommendation_table: "${services.recommendation_table.outputs.table_name}"
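The depends_on entries in the modules section define a dependency graph, and the run order follows from it. The sketch below is an illustration only, not ModelKnife's scheduler: it copies the four depends_on lists into a plain dictionary and orders them with Python's standard graphlib module. The names MODULE_DEPENDENCIES and execution_order are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# depends_on lists copied from the modules section of the configuration.
# Each key is a module; each value lists the modules that must finish first.
MODULE_DEPENDENCIES = {
    "data_cleaning": [],
    "feature_engineering": ["data_cleaning"],
    "similarity_computation": ["feature_engineering"],
    "model_training": ["similarity_computation"],
}


def execution_order(dependencies):
    """Return module names so that each one follows everything it depends on."""
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as err:
        # The second element of args holds the modules that form the cycle.
        raise ValueError(f"circular depends_on: {err.args[1]}") from err


if __name__ == "__main__":
    print(" -> ".join(execution_order(MODULE_DEPENDENCIES)))
```

Because this pipeline is a straight chain, only one order is possible. With branching dependencies, several valid orders can exist and independent modules could in principle run in parallel.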
Try This Example
cd examples/pipeline-examples/ecommerce-recommendation/config/
mk p deploy
mk p visualize
Located in: examples/pipeline-examples/ecommerce-recommendation/
Semantic Search Service
A complete search service built on OpenSearch Serverless, Lambda APIs, and Bedrock embeddings (the configuration below uses the cohere.embed-multilingual-v3 model). It supports hybrid search (vector + keyword) with automatic vectorization during indexing.
Infrastructure Architecture
Configuration Highlights
services:
  search_backend_service:
    type: search_service
    configuration:
      service_name: "hermes-user-search-${parameters.version}-${parameters.env}"
      search_type: "hybrid_search"
      performance_tier: "balanced"
      indices:
        - name: "posts"
          fields:
            - name: "summary"
              type: "text"
              searchable: true
            - name: "embedding_en"
              type: "vector"
              dimensions: 1024
              similarity_function: "cosine"
      embedding_config:
        model_id: "cohere.embed-multilingual-v3"
        service: "bedrock"
        auto_vectorize: true
        languages: ["english", "german", "french", "italian", "spanish", "chinese", "turkish"]

  search_api:
    type: lambda_function
    configuration:
      function_name: "hermes-search-api-${parameters.version}-${parameters.env}"
      runtime: "python3.9"
      timeout: 60
      environment:
        OPENSEARCH_ENDPOINT: "${services.search_backend_service.outputs.search_endpoint}"
        BEDROCK_EMBED_MODEL_ID: "cohere.embed-multilingual-v3"

  search_api_gateway:
    type: api_gateway
    configuration:
      api_name: "hermes-search-api-gw-${parameters.version}-${parameters.env}"
      resources:
        - path: "posts/search"
          methods: ["GET", "POST", "OPTIONS"]
          integration:
            type: "AWS_PROXY"
            lambda_function: "hermes-search-api-${parameters.version}-${parameters.env}"
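With an AWS_PROXY integration, API Gateway passes the whole HTTP request to the Lambda function as an event and expects a statusCode/headers/body dictionary back. The following is a minimal sketch of what a handler behind posts/search might look like, assuming the REST API (v1) proxy event shape. It is not the handler shipped with the example: the request parameters q and size, the helper names, and the injected embed and search callables are all hypothetical, and the query body is one plausible way to combine a keyword clause on summary with a k-NN clause on embedding_en.

```python
import json

DEFAULT_SIZE = 10  # hypothetical default page size


def extract_query(event):
    """Read the search text and result size from a REST API proxy event."""
    if event.get("httpMethod") == "POST":
        payload = json.loads(event.get("body") or "{}")
    else:
        # queryStringParameters is None when the request has no query string.
        payload = event.get("queryStringParameters") or {}
    return payload.get("q", ""), int(payload.get("size", DEFAULT_SIZE))


def build_search_body(text, vector, size):
    """Combine a keyword match on `summary` with a k-NN clause on `embedding_en`."""
    return {
        "size": size,
        "query": {
            "bool": {
                "should": [
                    {"match": {"summary": {"query": text}}},
                    {"knn": {"embedding_en": {"vector": vector, "k": size}}},
                ]
            }
        },
    }


def respond(status, payload):
    """Build the response dictionary that a proxy integration expects."""
    return {
        "statusCode": status,
        "headers": {
            "Content-Type": "application/json",
            "Access-Control-Allow-Origin": "*",
        },
        "body": json.dumps(payload),
    }


def make_handler(embed, search):
    """Create a handler; `embed(text)` and `search(index, body)` are injected stand-ins."""

    def handler(event, context):
        if event.get("httpMethod") == "OPTIONS":
            return respond(200, {})  # CORS preflight
        text, size = extract_query(event)
        if not text:
            return respond(400, {"error": "missing parameter 'q'"})
        body = build_search_body(text, embed(text), size)
        return respond(200, {"results": search("posts", body)})

    return handler
```

Injecting embed and search keeps the sketch runnable without AWS credentials. In a deployed function they would wrap a Bedrock embedding call and a request to the endpoint in OPENSEARCH_ENDPOINT.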
Try This Example
cd examples/service-examples/semantic-search-service/
mk s deploy
mk s status
Located in: examples/service-examples/semantic-search-service/
Basic DynamoDB Setup
A simple service deployment example that creates a DynamoDB table with automatic security configuration. It is a good starting point for learning the basics of ModelKnife service deployment.
Complete Configuration
name: basic-dynamodb-example
author: team
description: Simple DynamoDB table deployment

parameters:
  environment: dev
  table_name: "user-profiles-${parameters.environment}"

services:
  user_profiles_table:
    type: dynamodb_table
    configuration:
      table_name: ${parameters.table_name}
      partition_key: "user_id"
      partition_key_type: "S"
      sort_key: "created_at"
      sort_key_type: "S"
      billing_mode: "PAY_PER_REQUEST"
      # Global Secondary Index
      global_secondary_indexes:
        - index_name: "email-index"
          partition_key: "email"
          partition_key_type: "S"
          projection_type: "ALL"
      # Tags for resource management
      tags:
        Environment: ${parameters.environment}
        Project: basic-example
        Owner: ml-team
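The configuration relies on ${...} placeholders, and table_name is itself built from another parameter. The sketch below shows one way such nested references could be expanded. It is an assumption about the behavior, not ModelKnife's actual resolver; PLACEHOLDER, lookup, and resolve are hypothetical names, and the depth limit of 10 is arbitrary.

```python
import re

# Matches ${dotted.path} and captures the dotted path.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")
MAX_DEPTH = 10  # arbitrary guard against self-referencing parameters


def lookup(config, dotted_path):
    """Walk a nested dictionary using a path such as 'parameters.environment'."""
    node = config
    for key in dotted_path.split("."):
        node = node[key]
    return node


def resolve(value, config, depth=0):
    """Expand every placeholder in `value`, following nested references."""
    if depth > MAX_DEPTH:
        raise ValueError("placeholder nesting too deep (possible cycle)")
    if not isinstance(value, str):
        return value

    def substitute(match):
        referenced = lookup(config, match.group(1))
        return str(resolve(referenced, config, depth + 1))

    return PLACEHOLDER.sub(substitute, value)


# The parameters block from the configuration, as a plain dictionary.
CONFIG = {
    "parameters": {
        "environment": "dev",
        "table_name": "user-profiles-${parameters.environment}",
    },
}

if __name__ == "__main__":
    print(resolve("${parameters.table_name}", CONFIG))
```

Under these assumptions, the table_name reference in the service configuration expands in two steps: first to the parameter's own template, then to the value with the environment filled in.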
Try This Example
cd examples/service-examples/basic-dynamodb/
mk s validate
mk s deploy
mk s status
Located in: examples/service-examples/basic-dynamodb/
More Examples
Additional examples and templates available in the repository
Bedrock Batch Inference
Batch AI: Large-scale batch inference using Amazon Bedrock, with managed processing and automatic scaling.
Social Media Platform
Complete App: Full social media application with user management, content storage, and real-time APIs.
Need a Custom Example?
If you have a specific use case that isn't covered by these examples, let us know! We're continuously adding new examples based on community needs.