Services Reference
Complete reference for all infrastructure service types available in ModelKnife
Available Service Types
Compute
Storage & Data
Search & API
Streaming & Events
Environment Variable Usage Examples
Standard Syntax: Use consistent syntax for environment variables and service references.
Parameter References
- ${parameters.environment} - Current environment (dev, staging, prod)
- ${parameters.region} - AWS region
- ${parameters.project_name} - Project identifier
Service Output References
- ${services.my_function.outputs.arn} - Lambda function ARN
- ${services.data_bucket.outputs.bucket_name} - S3 bucket name
- ${services.user_table.outputs.table_name} - DynamoDB table name
- ${services.api.outputs.invoke_url} - API Gateway invoke URL
Common Patterns
- resource-name-${parameters.environment} - Environment-specific naming
- ${parameters.project_name}-${parameters.environment} - Project and environment
- arn:aws:service:region:account:resource/${services.service.outputs.name} - ARN construction
Validation Tips
- Ensure referenced services exist in your configuration
- Use correct output property names for each service type
- Maintain consistent naming conventions across environments
- Test parameter substitution in different environments
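As an illustrative sketch of these patterns working together (the service names data_bucket and ingest_function are hypothetical), a bucket output can be wired into a Lambda environment variable:

```yaml
services:
  data_bucket:
    type: s3_bucket
    bucket_name: "${parameters.project_name}-data-${parameters.environment}"

  ingest_function:
    type: lambda_function
    repository: "../src"
    function_name: "ingest-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    environment:
      # Resolved at deploy time from the data_bucket service outputs
      DATA_BUCKET: "${services.data_bucket.outputs.bucket_name}"
```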
Lambda Function
Serverless compute for APIs, real-time inference, and event processing
Key Features
Supports Python, Go, Node.js, and Java with automatic dependency management. Python functions auto-detect native dependencies and choose optimal build strategy. Features include EventBridge scheduling, VPC access, and dependency layer separation.
services:
  simple_function:
    type: lambda_function
    repository: "../src"
    function_name: "simple-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    # Uses defaults: memory_size=128MB, timeout=30s
services:
  api_function:
    type: lambda_function
    repository: "../src"
    function_name: "api-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    memory_size: 512 # Override default (128 MB)
    timeout: 60 # Override default (30 seconds)
    environment:
      API_KEY: "${parameters.api_key}"
services:
  # Python: entry point = "filename.function_name"
  # Expects requirements.txt in the code_path directory
  python_api:
    type: lambda_function
    repository: "../src"
    function_name: "python-api-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler" # Calls lambda_handler() in app.py
    code_path: "python-lambda/"
    build_strategy: "auto"
    build_layer: true
  # Node.js: entry point = "filename.function_name"
  nodejs_api:
    type: lambda_function
    repository: "../src"
    function_name: "nodejs-api-${parameters.environment}"
    runtime: "nodejs18.x"
    entry_point: "index.handler" # Calls exports.handler in index.js
    code_path: "nodejs-lambda"
  # Java: entry point = "package.Class::method"
  java_api:
    type: lambda_function
    repository: "../src"
    function_name: "java-api-${parameters.environment}"
    runtime: "java11"
    entry_point: "com.example.Handler::handleRequest"
    code_path: "java-lambda"
    memory_size: 1024
    timeout: 60
  # Go: entry point = executable name (typically "main")
  go_api:
    type: lambda_function
    repository: "../src"
    function_name: "go-api-${parameters.environment}"
    runtime: "go1.x"
    entry_point: "main" # Compiled binary name
    code_path: "go-lambda/api"
Project Organization
The following folder structure shows how to organize code based on the above examples. The final code path is resolved as: configuration_file_directory/repository/code_path

- configuration_file_directory: Directory containing mlknife-compose.yaml (e.g., conf/)
- repository: Code repository path relative to the configuration file directory (e.g., "../src")
- code_path: Specific code directory (e.g., "python-lambda", "go-lambda/api")
my-lambda-project/
├── conf/
│ └── mlknife-compose.yaml # Configuration file
└── src/ # Source code directory
├── python-lambda/ # Python Lambda functions
│ ├── app.py # Main entry point file
│ ├── processor.py # Additional modules
│ └── requirements.txt # Python dependencies
├── nodejs-lambda/ # Node.js Lambda functions
│ ├── index.js # Main entry point file
│ ├── utils.js # Helper modules
│ └── package.json # Node.js dependencies
├── java-lambda/ # Java Lambda functions
│ ├── src/
│ │ └── main/
│ │ └── java/
│ │ └── com/
│ │ └── example/
│ │ └── Handler.java
│ ├── pom.xml # Maven dependencies
│ └── build.gradle # Gradle dependencies (alternative)
└── go-lambda/ # Go Lambda functions
├── api/
│ ├── main.go # Main entry point file
│ └── go.mod # Go module file
└── processor/
├── main.go
└── go.mod
services:
  scheduled_processor:
    type: lambda_function
    repository: "../src"
    function_name: "processor-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "processor.main"
    code_path: "python-lambda"
    memory_size: 1024
    timeout: 300
    # EventBridge scheduling
    schedule:
      cron: "0 8 * * 1-5" # Weekdays at 8 AM
      timezone: "America/New_York"
      enabled: true
    # VPC access for private resources
    vpc_config:
      subnet_ids: ["subnet-12345678"]
      security_group_ids: ["sg-abcdef12"]
    # Custom layers
    layers:
      - "arn:aws:lambda:us-east-1:123456789012:layer:utils:1"
    environment:
      DB_HOST: "${parameters.database_host}"
Configuration Parameters
Basic Configuration
- function_name (string, optional) - Lambda function name
- repository (string, required) - Code repository location (set at the service level)
Service Configuration
- build_strategy (string, optional) - Python build strategy: "local", "docker", "auto" (default: "auto")
- code_path (string, optional) - Path within repository
- environment (map, optional) - Environment variables (key-value pairs, default: empty)
- entry_point (string, optional) - Entry point or handler function (format varies by runtime; see Entry Point Formats by Runtime)
- runtime (string, optional) - Runtime (python3.8-3.12, nodejs18.x, java11, go1.x, etc.)
Performance Settings
- memory_size (integer, optional) - Memory in MB (128-10240, 64MB increments, default: 128)
- timeout (integer, optional) - Max execution time in seconds (1-900, default: 30)
Security Configuration
- vpc_config (map, optional) - VPC access (subnet_ids, security_group_ids, default: none)
Scheduling Configuration
- schedule (map, optional) - EventBridge cron scheduling (cron, timezone, enabled, default: none)
Advanced Options
- build_layer (boolean, optional) - Separate dependencies into a reusable layer (default: false)
- layers (list, optional) - List of Lambda layer ARNs (default: empty)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- function_name: 1-64 characters, alphanumeric and hyphens only. Examples: my-function, data-processor-prod, api-handler-v2
- memory_size: 128-10240 MB in 64MB increments. Examples: 128, 512, 1024
- timeout: 1-900 seconds. Examples: 30, 300, 900
- runtime: python3.8, python3.9, python3.10, python3.11, python3.12, nodejs18.x, java11, go1.x. Examples: python3.11, nodejs18.x, java11
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Entry Point Formats by Runtime
- Python: filename.function_name (e.g., app.lambda_handler calls lambda_handler() in app.py)
- Node.js: filename.function_name (e.g., index.handler calls exports.handler in index.js)
- Java: package.Class::method (e.g., com.example.Handler::handleRequest)
- Go: main or bootstrap (executable name, typically main)
- .NET: Assembly::Namespace.Class::Method (e.g., MyApp::MyApp.Function::Handler)
- Ruby: filename.method_name (e.g., lambda_function.lambda_handler)
Runtime-Specific Features
- Python (3.8-3.12): Automatically installs dependencies from requirements.txt located in the code_path directory. Uses local pip for pure Python packages (fast) or Docker for native dependencies (reliable). Auto-detects packages like numpy, pandas, and scikit-learn that require compiled C extensions. Supports build_strategy and build_layer options.
- Node.js (14.x-20.x): Automatic npm install for package.json dependencies. Supports ES modules and CommonJS.
- Java (8, 11, 17, 21): Auto-detects Maven (pom.xml) or Gradle (build.gradle) and runs the build automatically. Maven: mvn clean package -DskipTests; Gradle: ./gradlew build -x test.
- Go (1.x, provided.al2): Compiles to a bootstrap executable. The binary must be named according to the handler value.
- Other Runtimes: .NET (dotnet6-8) and Ruby (2.7-3.3) with automatic dependency resolution.
Python Requirements & Build Strategies
Python functions automatically look for requirements.txt in the code_path directory and choose the optimal build strategy:
- Local Build: Fast pip install with manylinux targeting. Works for pure Python packages (requests, boto3, pydantic)
- Docker Build: Uses AWS Lambda Python base image. Required for native dependencies with C extensions (pandas, numpy, scikit-learn, opencv)
- Auto Mode: Automatically detects native packages in requirements.txt and chooses docker when needed
Example requirements.txt location: ./src/requirements.txt for code_path: "src"
Common packages:
- Pure Python: requests==2.31.0, boto3>=1.26.0, pydantic==2.0.0
- Native deps: pandas==1.5.0, numpy>=1.20.0, scikit-learn==1.3.0
Advanced Features
- EventBridge Scheduling: Cron expressions with timezone support ("0 8 * * 1-5" = weekdays at 8 AM)
- VPC Access: Private subnet access with vpc_config
- Layer Management: Automatic layer creation and reuse for dependencies
SageMaker Endpoint
Real-time ML model inference endpoints with auto-scaling
services:
  model_endpoint:
    type: sagemaker_endpoint
    repository: "."
    configuration:
      endpoint_name: "ml-model-${parameters.environment}"
      model_name: "my-model"
      code_path: "inference"
      instance_type: "ml.m5.large"
      initial_instance_count: 1
      requirements_file: "requirements.txt"
      environment_variables:
        MODEL_VERSION: "v1.0"
Configuration Parameters
Basic Configuration
- endpoint_name (string, optional) - SageMaker endpoint name
- instance_type (string, optional) - EC2 instance type for hosting
- model_name (string, optional) - Model name for the endpoint
Service Configuration
- code_path (string, optional) - Path to inference code directory
- initial_instance_count (integer, optional) - Number of instances to start with
- requirements_file (string, optional) - Python dependencies file
Advanced Options
- environment_variables (map, optional) - Environment variables for the model
S3 Bucket
Object storage for data lakes, model artifacts, and file storage
services:
  data_lake:
    type: s3_bucket
    bucket_name: "ml-data-${parameters.environment}"
    versioning: true
services:
  secure_data_lake:
    type: s3_bucket
    configuration:
      bucket_name: "ml-secure-data-${parameters.environment}"
      versioning: true
      encryption:
        SSEAlgorithm: "aws:kms"
        KMSMasterKeyID: "alias/s3-encryption-key"
        BucketKeyEnabled: true
      lifecycle_configuration:
        - Id: "DeleteOldVersions"
          Status: "Enabled"
          NoncurrentVersionExpiration:
            NoncurrentDays: 30
        - Id: "TransitionToIA"
          Status: "Enabled"
          Transitions:
            - Days: 30
              StorageClass: "STANDARD_IA"
            - Days: 90
              StorageClass: "GLACIER"
Configuration Parameters
Basic Configuration
- bucket_name (string, required) - S3 bucket name (must be globally unique, 3-63 characters, letters/numbers/hyphens/periods only)
Service Configuration
- BucketKeyEnabled (boolean, optional) - Use S3 Bucket Keys to reduce KMS costs
- Expiration (map, optional) - Object expiration settings
- Id (string, optional) - Unique rule identifier
- KMSMasterKeyID (string, optional) - KMS key ID or alias (required for aws:kms)
- NoncurrentVersionExpiration (map, optional) - Expiration for non-current versions
- SSEAlgorithm (string, optional) - Encryption algorithm ("AES256" or "aws:kms")
- Status (string, optional) - Rule status ("Enabled" or "Disabled")
- Transitions (list, optional) - List of transition rules with Days and StorageClass
Storage Configuration
- versioning (boolean, optional, default: false) - Enable object versioning (cannot be disabled once enabled, only suspended)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- bucket_name: 3-63 characters, lowercase letters, numbers, hyphens, and periods only. Examples: my-data-bucket, company-logs-2024, ml-model-artifacts
- versioning: Boolean value. Examples: true, false
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Implementation Notes
- Bucket names must be globally unique across all AWS accounts
- Versioning cannot be disabled once enabled, only suspended
- Lifecycle rules are processed in the order they appear in the configuration
- Encryption settings apply to all objects in the bucket by default
- Use KMS encryption for enhanced security and compliance requirements
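The examples above use KMS encryption; for the simpler S3-managed option listed under SSEAlgorithm, a minimal sketch (bucket name is illustrative) might look like:

```yaml
services:
  basic_encrypted_bucket:
    type: s3_bucket
    configuration:
      bucket_name: "ml-artifacts-${parameters.environment}"
      versioning: true
      encryption:
        # S3-managed keys; no KMSMasterKeyID is needed for AES256
        SSEAlgorithm: "AES256"
```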
DynamoDB Table
NoSQL database for feature stores, metadata, and real-time data access with advanced scaling and security features
services:
  feature_store:
    type: dynamodb_table
    configuration:
      table_name: "ml-features-${parameters.environment}"
      attribute_definitions:
        feature_id: "S" # String type
        timestamp: "S" # String type
        user_id: "S" # String type for GSI
      partition_key: "feature_id"
      sort_key: "timestamp"
      billing_mode: "PAY_PER_REQUEST" # On-demand billing
      point_in_time_recovery: true
      deletion_protection: true
      global_secondary_indexes:
        - index_name: "user-index"
          partition_key: "user_id"
          projection:
            projection_type: "ALL"
services:
  advanced_table:
    type: dynamodb_table
    configuration:
      table_name: "ml-features-${parameters.environment}"
      attribute_definitions:
        feature_id: "S"
        timestamp: "S"
        user_id: "S"
        score: "N"
      partition_key: "feature_id"
      sort_key: "timestamp"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 10
        WriteCapacityUnits: 10
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
          write_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
        global_secondary_indexes:
          user-index:
            read_capacity:
              min_capacity: 5
              max_capacity: 50
              target_utilization: 70.0
      server_side_encryption:
        enabled: true
        kms_key_id: "alias/dynamodb-key"
      global_secondary_indexes:
        - index_name: "user-index"
          partition_key: "user_id"
          sort_key: "score"
          projection:
            projection_type: "ALL"
      local_secondary_indexes:
        - index_name: "score-index"
          sort_key: "score"
          projection:
            projection_type: "KEYS_ONLY"
      stream_specification:
        stream_enabled: true
        stream_view_type: "NEW_AND_OLD_IMAGES"
      backup_configuration:
        on_demand_backup:
          backup_name: "ml-features-backup"
        scheduled_backup:
          schedule_expression: "cron(0 2 * * ? *)"
          retention_period_days: 30
      table_class: "STANDARD"
# Table-level auto-scaling only
services:
  scalable_table:
    type: dynamodb_table
    configuration:
      table_name: "user-sessions-${parameters.environment}"
      attribute_definitions:
        session_id: "S"
        user_id: "S"
      partition_key: "session_id"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 200
            target_utilization: 70.0
            scale_in_cooldown: 300 # 5 minutes
            scale_out_cooldown: 60 # 1 minute
          write_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 80.0
---
# GSI auto-scaling configuration
services:
  multi_index_table:
    type: dynamodb_table
    configuration:
      table_name: "product-catalog-${parameters.environment}"
      attribute_definitions:
        product_id: "S"
        category: "S"
        price: "N"
        brand: "S"
      partition_key: "product_id"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 10
        WriteCapacityUnits: 10
      global_secondary_indexes:
        - index_name: "category-price-index"
          partition_key: "category"
          sort_key: "price"
          projection:
            projection_type: "ALL"
          provisioned_throughput:
            ReadCapacityUnits: 5
            WriteCapacityUnits: 5
        - index_name: "brand-index"
          partition_key: "brand"
          projection:
            projection_type: "KEYS_ONLY"
          provisioned_throughput:
            ReadCapacityUnits: 3
            WriteCapacityUnits: 3
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
          write_capacity:
            min_capacity: 5
            max_capacity: 50
            target_utilization: 70.0
        global_secondary_indexes:
          category-price-index:
            read_capacity:
              min_capacity: 5
              max_capacity: 50
              target_utilization: 75.0
            write_capacity:
              min_capacity: 5
              max_capacity: 25
              target_utilization: 75.0
          brand-index:
            read_capacity:
              min_capacity: 3
              max_capacity: 20
              target_utilization: 80.0
            write_capacity:
              min_capacity: 3
              max_capacity: 10
              target_utilization: 80.0
# Comprehensive index configuration
services:
  analytics_table:
    type: dynamodb_table
    configuration:
      table_name: "user-analytics-${parameters.environment}"
      attribute_definitions:
        user_id: "S" # Table partition key
        timestamp: "S" # Table sort key
        event_type: "S" # GSI partition key
        session_id: "S" # GSI sort key
        score: "N" # LSI sort key
        region: "S" # Additional GSI partition key
      partition_key: "user_id"
      sort_key: "timestamp"
      billing_mode: "PAY_PER_REQUEST"
      # Global Secondary Indexes (different partition key)
      global_secondary_indexes:
        # GSI with both partition and sort key
        - index_name: "event-session-index"
          partition_key: "event_type"
          sort_key: "session_id"
          projection:
            projection_type: "ALL" # Include all attributes
        # GSI with partition key only
        - index_name: "region-index"
          partition_key: "region"
          projection:
            projection_type: "KEYS_ONLY" # Only key attributes
        # GSI with selective attribute projection
        - index_name: "event-timestamp-index"
          partition_key: "event_type"
          sort_key: "timestamp"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "user_id"
              - "score"
              - "session_id"
      # Local Secondary Indexes (same partition key, different sort key)
      local_secondary_indexes:
        # LSI for querying by score within user_id
        - index_name: "user-score-index"
          sort_key: "score"
          projection:
            projection_type: "KEYS_ONLY"
        # LSI for querying by event_type within user_id
        - index_name: "user-event-index"
          sort_key: "event_type"
          projection:
            projection_type: "ALL"
---
# Provisioned billing with GSI-specific capacity
services:
  ecommerce_table:
    type: dynamodb_table
    configuration:
      table_name: "orders-${parameters.environment}"
      attribute_definitions:
        order_id: "S"
        customer_id: "S"
        order_date: "S"
        status: "S"
        total_amount: "N"
      partition_key: "order_id"
      sort_key: "order_date"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 20
        WriteCapacityUnits: 10
      global_secondary_indexes:
        # Customer orders index with custom capacity
        - index_name: "customer-date-index"
          partition_key: "customer_id"
          sort_key: "order_date"
          projection:
            projection_type: "ALL"
          provisioned_throughput:
            ReadCapacityUnits: 10 # Different from table capacity
            WriteCapacityUnits: 5
        # Status index for order management
        - index_name: "status-date-index"
          partition_key: "status"
          sort_key: "order_date"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "customer_id"
              - "total_amount"
          provisioned_throughput:
            ReadCapacityUnits: 5
            WriteCapacityUnits: 2
      local_secondary_indexes:
        # Sort orders by total amount within order_id
        - index_name: "order-amount-index"
          sort_key: "total_amount"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "customer_id"
              - "status"
Configuration Parameters
Basic Configuration
- index_name (string, required) - Unique index name (applies to each GSI and LSI)
- table_name (string, required) - DynamoDB table name (supports environment variables). Must be 3-255 characters, alphanumeric with underscores, hyphens, and periods allowed.
- global_secondary_indexes.{index_name} (map, optional) - Per-GSI scaling configuration
- stream_view_type (string, optional) - Stream content: "KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"
Service Configuration
- partition_key (string, required) - Partition key attribute name for the table or a GSI. Must be defined in attribute_definitions.
- sort_key (string, optional) - Sort key attribute name for the table, a GSI, or an LSI. Must be defined in attribute_definitions; the table sort key must differ from partition_key, and an LSI sort key must differ from the table sort key.
- cross_account_access (map, optional) - Cross-account access configuration with trusted_account_ids and access_level
- enabled (boolean, optional) - Enable/disable encryption
- iam_policy_templates (list, optional) - List of predefined policy templates: read_only, read_write, stream_consumer, ml_feature_store, admin
- on_demand_backup (map, optional) - Manual backup configuration with backup_name
- point_in_time_recovery (boolean, optional, default: false) - Enable continuous backups for 35 days with point-in-time recovery.
- projection (map, optional) - Index projection configuration: projection_type ("ALL", "KEYS_ONLY", "INCLUDE") and non_key_attributes for INCLUDE
- resource_policy (map, optional) - Custom resource-based policy document
- route_table_ids (list, optional) - List of route table IDs
- stream_enabled (boolean, optional) - Enable/disable streams
- table.read_capacity/write_capacity (map, optional) - Table-level scaling with min_capacity, max_capacity, target_utilization (0-100)
- vpc_endpoint_id (string, optional) - VPC endpoint ID (vpce-*)
Performance Settings
- provisioned_throughput (map, required for PROVISIONED billing) - Capacity settings with ReadCapacityUnits and WriteCapacityUnits (minimum 1 each); may also be set per GSI to give an index its own capacity.
- billing_mode (string, optional, default: "PAY_PER_REQUEST") - Billing mode: "PAY_PER_REQUEST" for on-demand or "PROVISIONED" for predictable capacity.
Security Configuration
- kms_key_id (string, optional) - KMS key ID, alias, or ARN. Use "alias/aws/dynamodb" for AWS managed key
Network Configuration
- security_group_ids (list, optional) - List of security group IDs
- subnet_ids (list, optional) - List of subnet IDs
Scheduling Configuration
- scheduled_backup (map, optional) - Automated backup with schedule_expression (cron format) and retention_period_days (1-35)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- table_name: 3-255 characters, alphanumeric, hyphens, underscores, and periods only. Examples: user-sessions, product_catalog, event.logs
- billing_mode: PAY_PER_REQUEST or PROVISIONED. Examples: PAY_PER_REQUEST, PROVISIONED
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Implementation Notes
- Billing Mode Selection: Use PAY_PER_REQUEST for unpredictable workloads and PROVISIONED with auto-scaling for consistent traffic patterns.
- Auto-Scaling: Requires PROVISIONED billing mode and appropriate IAM permissions for CloudWatch and Application Auto Scaling.
- Global Secondary Indexes (GSI):
  - Maximum 20 GSIs per table
  - Each GSI consumes additional read/write capacity and storage
  - Can have different partition and sort keys from the main table
  - Can be created after table creation
  - In PROVISIONED billing mode, each GSI can have independent capacity settings
  - Projection types: ALL (all attributes), KEYS_ONLY (key attributes only), INCLUDE (specified attributes)
- Local Secondary Indexes (LSI):
  - Maximum 10 LSIs per table
  - Can only be created at table creation time (not after)
  - Must use the same partition key as the main table
  - Must have a different sort key from the main table
  - Share read/write capacity with the main table
  - Item collection size limit of 10 GB per partition key value (including all LSI items)
- Index Validation Rules:
  - All partition and sort keys used in indexes must be defined in attribute_definitions
  - Index names must be unique within the table
  - LSI sort key cannot be the same as the table sort key
  - For INCLUDE projection, non_key_attributes must not include key attributes
- Encryption: Encryption at rest is enabled by default with AWS managed keys. Customer managed KMS keys provide additional control but incur extra costs.
- Streams: Stream records are retained for 24 hours. Use for real-time processing, replication, or analytics.
- Table Class: STANDARD_INFREQUENT_ACCESS reduces storage costs by ~60% but increases access costs. Suitable for infrequently accessed data.
- Capacity Planning: For PROVISIONED billing, consider query patterns when setting GSI capacity. Read-heavy GSIs may need higher read capacity than the main table.
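As a minimal sketch of the STANDARD_INFREQUENT_ACCESS table class mentioned in the notes above (table and attribute names are illustrative):

```yaml
services:
  archive_table:
    type: dynamodb_table
    configuration:
      table_name: "audit-log-${parameters.environment}"
      attribute_definitions:
        record_id: "S"
      partition_key: "record_id"
      billing_mode: "PAY_PER_REQUEST"
      # ~60% lower storage cost, higher per-access cost;
      # suited to rarely read data such as audit history
      table_class: "STANDARD_INFREQUENT_ACCESS"
```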
SageMaker Feature Store
Centralized feature repository for ML model training and inference
services:
  user_features:
    type: sagemaker_feature_store
    configuration:
      feature_group_name: "user-features-${parameters.environment}"
      record_identifier_name: "user_id"
      event_time_feature_name: "event_time"
      feature_definitions:
        user_id: "String"
        age: "Integral"
        income: "Fractional"
        event_time: "String"
      online_store_config:
        enable_online_store: true
      offline_store_config:
        s3_storage_config:
          s3_uri: "s3://${services.feature_bucket.outputs.bucket_name}/features/"
Configuration Parameters
Basic Configuration
- event_time_feature_name (string, optional) - Timestamp field name
- feature_group_name (string, optional) - Feature group name in SageMaker
- record_identifier_name (string, optional) - Primary key field name
Service Configuration
- feature_definitions (map, optional) - Feature schema with data types
- offline_store_config (map, optional) - Offline store configuration for batch processing
- online_store_config (map, optional) - Online store configuration for real-time access
Glue Database
Data catalog database for schema management and metadata storage
services:
  analytics_db:
    type: glue_database
    configuration:
      database_name: "analytics_${parameters.environment}"
      description: "Analytics data warehouse database"
services:
  data_catalog:
    type: glue_database
    configuration:
      database_name: "data_catalog_${parameters.environment}"
      description: "Centralized data catalog for ML features"
      location_uri: "s3://${services.data_lake.outputs.bucket_name}/catalog/"
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Glue database name (lowercase, underscore-separated)
Service Configuration
- description (string, optional) - Database description
- location_uri (string, optional) - Default storage location S3 URI
Parameter Validation
General Validation: All parameters are validated according to AWS service limits and naming conventions.
- Names: Must follow AWS naming conventions (alphanumeric, hyphens, underscores)
- Required Parameters: All required parameters must be provided
- Type Validation: Parameters must match expected data types
- Range Validation: Numeric parameters must be within allowed ranges
Database Management
- Schema Registry: Central location for table schemas and metadata
- Table Organization: Tables are organized under databases
- Integration: Works with Athena, EMR, and other analytics services
Glue Table
Data catalog table definitions for structured data schema management
Simplified Column Format
Glue tables use a simplified column definition format: column_name: "type". This keeps schema definitions clean and readable while supporting all standard data types.
services:
  events_table:
    type: glue_table
    configuration:
      table_name: "events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/events/"
      columns:
        event_id: "string"
        user_id: "string"
        event_type: "string"
        timestamp: "bigint"
        amount: "double"
        is_premium: "boolean"
services:
  partitioned_events:
    type: glue_table
    configuration:
      table_name: "partitioned_events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/partitioned-events/"
      columns:
        event_id: "string"
        user_id: "string"
        event_type: "string"
        timestamp: "bigint"
        year: "int"
        month: "int"
        day: "int"
      partition_keys: ["year", "month", "day"]
      description: "Events table partitioned by date for efficient querying"
services:
  json_table:
    type: glue_table
    configuration:
      table_name: "json_events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/json-events/"
      columns:
        event_id: "string"
        payload: "struct<user_id:string,action:string,metadata:map<string,string>>"
        timestamp: "timestamp"
      serde_library: "org.apache.hive.hcatalog.data.JsonSerDe"
      serde_parameters:
        "serialization.format": "1"
      input_format: "org.apache.hadoop.mapred.TextInputFormat"
      output_format: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
services:
  complex_table:
    type: glue_table
    configuration:
      table_name: "complex_data"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/complex-data/"
      columns:
        id: "string"
        tags: "array<string>"
        metadata: "map<string,string>"
        user_profile: "struct<name:string,age:int,preferences:array<string>>"
        scores: "array<double>"
        created_at: "timestamp"
        is_active: "boolean"
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Glue database name (reference to database service)
- table_name (string, optional) - Glue table name (lowercase, underscore-separated)
- table_type (string, optional) - Table type ("EXTERNAL_TABLE" or "VIRTUAL_VIEW", default: "EXTERNAL_TABLE")
Service Configuration
- description (string, optional) - Table description
- input_format (string, optional) - Input format class name
- output_format (string, optional) - Output format class name
- parameters (map, optional) - Additional table parameters
- partition_keys (list, optional) - List of column names to use as partitions
- serde_library (string, optional) - SerDe library class name
- serde_parameters (map, optional) - SerDe configuration parameters
- storage_location (string, optional) - S3 URI where table data is stored
Parameter Validation
General Validation: All parameters are validated according to AWS service limits and naming conventions.
- Names: Must follow AWS naming conventions (alphanumeric, hyphens, underscores)
- Required Parameters: All required parameters must be provided
- Type Validation: Parameters must match expected data types
- Range Validation: Numeric parameters must be within allowed ranges
Supported Data Types
- Primitive Types: string, int, bigint, double, float, boolean, timestamp, date, binary, decimal
- Complex Types: array<type>, map<key_type,value_type>, struct<field:type,field:type>
- Examples: array<string>, map<string,int>, struct<name:string,age:int>
Partitioning
- Partition Keys: List of column names that define table partitions
- Performance: Partitioning improves query performance by limiting data scanned
- Common Patterns: Date-based partitioning (year, month, day) or categorical partitioning
- Validation: Partition key columns must be defined in the columns section
SerDe Configuration
- JSON SerDe: org.apache.hive.hcatalog.data.JsonSerDe for JSON data
- Parquet: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
- CSV/TSV: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Custom Parameters: Configure SerDe behavior through serde_parameters
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Parent Glue database name
- table_name (string, optional) - Glue table name
Service Configuration
- columns (map, optional) - Table column definitions in simplified format (column_name: type)
- input_format (string, optional) - Data input format class
- output_format (string, optional) - Data output format class
- partition_keys (list, optional) - Column names to use as partition keys
- serde_library (string, optional) - Serialization/deserialization library
- serde_parameters (map, optional) - SerDe-specific parameters
- storage_location (string, optional) - S3 location for table data
Simplified Configuration
- Columns: Use simple key-value format: column_name: "type"
- Partition Keys: List column names that exist in the columns definition
- Convention: Follows ModelKnife's "less configuration" principle
- Deployer Mapping: Automatically converts to AWS Glue API format
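To make the deployer mapping concrete, here is a hedged sketch of the conversion from the simplified `column_name: type` format to the AWS Glue `CreateTable` shape (`Columns` and `PartitionKeys` lists of `{"Name", "Type"}` dicts). The function name is illustrative, not a ModelKnife internal.

```python
# Sketch: convert simplified column definitions into the AWS Glue API format.
# Partition key columns are removed from Columns and emitted as PartitionKeys,
# matching Glue's requirement that the two lists do not overlap.

def columns_to_glue(columns: dict, partition_keys: list = ()) -> dict:
    """Split simplified columns into Glue Columns and PartitionKeys."""
    def as_glue(name, col_type):
        return {"Name": name, "Type": col_type}

    return {
        "Columns": [
            as_glue(name, col_type)
            for name, col_type in columns.items()
            if name not in partition_keys
        ],
        "PartitionKeys": [as_glue(name, columns[name]) for name in partition_keys],
    }

mapping = columns_to_glue(
    {"id": "string", "created_at": "timestamp", "year": "string"},
    partition_keys=["year"],
)
```

Note that partition key columns must appear in the simplified `columns` map so their types can be carried over, which is consistent with the validation rule above.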
Search Service
Business-oriented search and analytics with automatic infrastructure optimization
services:
  product_search:
    type: search_service
    configuration:
      service_name: "product-search-${parameters.environment}"
      search_type: "product_search"
      environment: "production"
      performance_tier: "high_performance"
      access_level: "private"
      indices:
        - name: "products"
          fields:
            - name: "title"
              type: "text"
              analyzer: "english"
              copy_to: ["search_all"]
            - name: "category"
              type: "keyword"
              facetable: true
            - name: "price"
              type: "number"
              facetable: true
            - name: "brand"
              type: "keyword"
              facetable: true
            - name: "search_all"
              type: "search_as_you_type"
              max_shingle_size: 3
      languages: ["english"]

services:
  document_search:
    type: search_service
    configuration:
      service_name: "doc-search-${parameters.environment}"
      search_type: "vector_search"
      environment: "production"
      performance_tier: "balanced"
      access_level: "team_access"
      embedding_config:
        model_id: "amazon.titan-embed-text-v1"
        service: "bedrock"
        batch_size: 25
        auto_vectorize: true
      indices:
        - name: "documents"
          fields:
            - name: "content"
              type: "text"
              analyzer: "english"
            - name: "title"
              type: "text"
              analyzer: "english"
            - name: "embedding"
              type: "vector"
              dimensions: 1536
              similarity_function: "cosine"
            - name: "document_type"
              type: "keyword"
              facetable: true
            - name: "created_date"
              type: "date"
              facetable: true
      languages: ["english"]

services:
  hybrid_search:
    type: search_service
    configuration:
      service_name: "hybrid-search-${parameters.environment}"
      search_type: "hybrid_search"
      environment: "production"
      performance_tier: "high_performance"
      access_level: "public"
      embedding_config:
        model_id: "amazon.titan-embed-text-v1"
        service: "bedrock"
        batch_size: 50
        auto_vectorize: true
      indices:
        - name: "content"
          fields:
            - name: "title"
              type: "text"
              analyzer: "english"
              searchable: true
            - name: "content"
              type: "text"
              analyzer: "english"
              searchable: true
            - name: "embedding"
              type: "vector"
              dimensions: 1536
              similarity_function: "cosine"
            - name: "tags"
              type: "keyword"
              facetable: true
            - name: "category"
              type: "keyword"
              facetable: true
      languages: ["english", "spanish"]
Business-Oriented Search Types
- full_text_search - Optimized for text search and analysis with advanced text processing
- vector_search - Optimized for semantic/vector search with embedding integration
- product_search - Optimized for e-commerce search with faceting and filtering
- log_search - Optimized for time-series log analysis and monitoring
- document_search - Optimized for document content search and retrieval
- hybrid_search - Combines full-text and vector search for comprehensive results
Configuration Parameters
Basic Configuration
- search_type (string, required) - Business-oriented search type (see above)
- service_name (string, required) - Search service name (1-32 characters, lowercase, alphanumeric and hyphens)
Service Configuration
- access_level (string, optional) - Access level ("public", "private", "team_access")
- data_sources (list, optional) - Data ingestion source configurations
- embedding_config (map, optional) - Embedding model configuration for vector search
- environment (string, optional) - Deployment environment ("development", "staging", "production")
- indices (list, optional) - Index configurations with field definitions
- languages (list, optional) - Supported languages for text analysis
- performance_tier (string, optional) - Performance tier ("development", "balanced", "high_performance", "cost_optimized")
Field Types
- text - Full-text searchable content with analyzers
- keyword - Exact match, faceting, and filtering
- number - Numeric values for range queries and faceting
- date - Date/timestamp fields for temporal filtering
- vector - Vector embeddings for semantic search
- search_as_you_type - Auto-complete and search suggestions
Performance Tiers
- development - Cost-optimized for development and testing
- balanced - Balanced performance and cost for most use cases
- high_performance - Performance-optimized for production workloads
- cost_optimized - Minimum cost configuration for light usage
Access Levels
- public - Public internet access with authentication
- private - Private VPC access only
- team_access - Team-based access control with IAM integration
Embedding Configuration
- model_id - Embedding model identifier (e.g., "amazon.titan-embed-text-v1")
- service - Embedding service ("bedrock", "openai", "huggingface")
- batch_size - Batch size for embedding generation (1-100)
- timeout_seconds - Request timeout for embedding service
- auto_vectorize - Automatically generate embeddings for text fields
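The embedding configuration above can be pictured at runtime roughly as follows: text fields are grouped into chunks of `batch_size`, then each text is embedded via the configured service. The batching helper is generic; the `invoke_model` call shown is the standard Bedrock API for `amazon.titan-embed-text-v1` (ModelKnife's actual internals may differ).

```python
# Sketch of auto_vectorize behavior: batch texts by batch_size, then
# request embeddings from Amazon Bedrock. Titan text embeddings accept
# one inputText per request, so embed_batch loops within each batch.
import json

def batched(texts, batch_size):
    """Yield successive chunks no larger than batch_size (1-100)."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

def embed_batch(client, texts, model_id="amazon.titan-embed-text-v1"):
    """Return one embedding vector per input text."""
    vectors = []
    for text in texts:
        resp = client.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": text}),
        )
        vectors.append(json.loads(resp["body"].read())["embedding"])
    return vectors

# Usage (requires AWS credentials; left commented so the sketch stays runnable):
# client = boto3.client("bedrock-runtime")
# for chunk in batched(documents, batch_size=25):
#     vectors = embed_batch(client, chunk)
```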
Implementation Notes
- Search service automatically selects optimal infrastructure (serverless vs managed) based on configuration
- Vector search types require embedding_config for automatic vectorization
- Field configurations determine index mappings and query capabilities
- Performance tiers affect underlying infrastructure provisioning and costs
- Access levels control network access patterns and authentication requirements
- Multi-language support affects text analyzers and tokenization strategies
- Data sources enable automatic ingestion and transformation pipelines
API Gateway (REST)
REST API management for serverless applications and microservices
Recommendation: Use API Gateway v2 (HTTP APIs)
For new projects, we recommend API Gateway v2 (HTTP APIs) instead of REST APIs. HTTP APIs are up to 70% cheaper and 60% faster, and they use the same configuration format. Simply use type: api_gateway_v2 instead of type: api_gateway.
services:
  ml_api:
    type: api_gateway
    configuration:
      api_name: "ml-api-${parameters.environment}"
      stage_name: "prod"
      resources:
        - path: "/predict"
          method: "POST"
          integration_type: "lambda"
          lambda_function: "${services.inference_lambda.function_name}"
        - path: "/health"
          method: "GET"
          integration_type: "mock"
Configuration Parameters
Basic Configuration
- api_name (string, optional) - API Gateway REST API name
- stage_name (string, optional) - Deployment stage name (e.g., dev, prod)
Service Configuration
- resources (list, optional) - API resource definitions with paths and methods
Network Configuration
- cors_enabled (boolean, optional) - Enable CORS for cross-origin requests
API Gateway v2 (HTTP APIs)
Modern HTTP APIs with JWT/REQUEST authorizers, 70% cheaper and 60% faster than REST APIs
services:
  my_api:
    type: api_gateway_v2
    configuration:
      api_name: "my-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        # Lambda integration
        - path: "/users"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
        # HTTP integration (public endpoints)
        - path: "/external"
          methods: ["GET"]
          integration_type: "http"
          integration_uri: "https://api.external-service.com/data"
        # Mock integration (testing)
        - path: "/health"
          methods: ["GET"]
          integration_type: "mock"
          mock_response:
            status_code: 200
            response_body: '{"status": "healthy"}'
Private Integration
The integration_type: "private" option provides a simplified and more intuitive way to configure private resource integrations. It automatically handles path extraction, parameter conversion, and VPC Link setup, reducing configuration complexity and errors.
Why Private Integration?
Connect API Gateway to internal services (ECS, EKS, EC2) without exposing them to the internet. Ideal for security, compliance, and enterprise architectures requiring private backend connectivity.
services:
  enrichment_api:
    type: api_gateway_v2
    configuration:
      api_name: "enrichment-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        - path: "/users"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/net/my-nlb/abcd1234/efgh5678#/v1/users"
          vpc_link_id: "abc123"
        - path: "/users/{id}"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/net/my-nlb/abcd1234/efgh5678#/v1/users/{id}"
          vpc_link_id: "abc123"

services:
  internal_api:
    type: api_gateway_v2
    configuration:
      api_name: "internal-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        - path: "/api/data"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "https://internal-service.vpc.local/v1/data"
          vpc_link_id: "abc123"
        - path: "/api/data/{category}"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "https://internal-service.vpc.local/v1/data/{category}"
          vpc_link_id: "abc123"
Private Integration URI Formats
Supported URI Formats
- ELB Listener ARN: arn:aws:elasticloadbalancing:region:account:listener/net/nlb-name/id/listener-id#/path
- Cloud Map Service ARN: arn:aws:servicediscovery:region:account:service/service-id#/path
- HTTP/HTTPS URL: https://internal-service.vpc.local/path
- Path Parameters: Use {paramName} format (automatically converted to AWS format)
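The path-extraction rule behind these URI formats can be sketched simply: everything after `#` is the route path, and `{param}` placeholders become API Gateway path parameters. The function below is a hypothetical illustration of that split, not ModelKnife's actual parser.

```python
# Sketch: split a private integration URI of the form "<target>#<path>"
# into the backend target, the route path, and its path parameter names.
import re

def split_private_uri(integration_uri: str):
    """Return (target, path, param_names) for a private integration URI."""
    target, _, path = integration_uri.partition("#")
    params = re.findall(r"\{(\w+)\}", path)  # {id} -> "id"
    return target, path or "/", params

target, path, params = split_private_uri(
    "arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
    "listener/net/my-nlb/abcd1234/efgh5678#/v1/users/{id}"
)
```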
Authentication
services:
  secure_api:
    type: api_gateway_v2
    configuration:
      api_name: "secure-api-${parameters.environment}"
      stage_name: "prod"
      authorizers:
        - name: "jwt-auth"
          type: "JWT"
          configuration:
            # Cognito User Pool
            issuer: "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_XXXXXXXXX"
            audience: ["your-app-client-id"]
            # Google OAuth (alternative)
            # issuer: "https://accounts.google.com"
            # audience: ["your-google-client-id.apps.googleusercontent.com"]
            # Auth0 (alternative)
            # issuer: "https://your-domain.auth0.com/"
            # audience: ["your-auth0-api-identifier"]
      resources:
        - path: "/protected"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
          authorizer: "jwt-auth"

services:
  auth_api:
    type: api_gateway_v2
    configuration:
      api_name: "auth-api-${parameters.environment}"
      stage_name: "prod"
      authorizers:
        - name: "custom-auth"
          type: "REQUEST"
          configuration:
            authorizer_uri: "${services.auth_lambda.function_arn}"
            identity_sources: ["$request.header.Authorization"]
            authorizer_result_ttl_in_seconds: 300
            # Simplified format (alternative)
            # lambda_function: "custom-authorizer"
            # cache_ttl: 300
            # identity_sources: ["$request.header.Authorization", "$request.header.X-API-Key"]
      resources:
        - path: "/protected"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
          authorizer: "custom-auth"
Advanced Features
services:
  production_api:
    type: api_gateway_v2
    configuration:
      api_name: "production-api-${parameters.environment}"
      stage_name: "prod"
      # CORS configuration
      cors:
        allow_credentials: true
        allow_headers: ["Content-Type", "Authorization"]
        allow_methods: ["GET", "POST", "PUT", "DELETE"]
        allow_origins: ["https://myapp.com"]
        max_age: 3600
      # Rate limiting
      throttling:
        burst_limit: 5000
        rate_limit: 2000.0
      resources:
        - path: "/api/{proxy+}"
          methods: ["ANY"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
Configuration Reference
- api_name - HTTP API name (supports variables)
- stage_name - Deployment stage (default: "dev")
- description - API description
- resources - Route definitions with paths and methods
- authorizers - JWT and REQUEST authorizer definitions
- cors - Cross-origin resource sharing settings
- throttling - Rate limiting configuration
Integration Types
HTTP API v2 supports four integration types for connecting routes to backend services:
- lambda - AWS Lambda proxy integration for serverless functions
  integration_type: "lambda"
  lambda_function: "my-function-name"
- http - HTTP proxy integration for public HTTP/HTTPS endpoints
  integration_type: "http"
  integration_uri: "https://api.external.com/endpoint"
- private - Simplified private resource integration
  integration_type: "private"
  integration_uri: "arn:aws:elasticloadbalancing:region:account:listener/net/nlb/id/listener-id#/v1/path/{id}"
  vpc_link_id: "vpc-link-id"
- mock - Mock responses for testing and prototyping
  integration_type: "mock"
  mock_response:
    status_code: 200
    response_body: '{"status": "ok"}'
Recommendation: Use Private Integration for VPC Resources
For private resources behind VPC Links, use integration_type: "private" instead of integration_type: "http". The private integration type automatically handles path extraction, parameter conversion, and VPC Link configuration.
Authorizer Types
- JWT Authorizers: Support Cognito User Pools and OIDC providers (Auth0, Google, etc.)
- REQUEST Authorizers: Custom Lambda-based authorization with configurable identity sources
- Route-level Authorization: Apply different authorizers to different routes
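For REQUEST authorizers, the Lambda function receives the configured identity sources and returns an authorization decision. The sketch below uses the HTTP API v2 "simple response" format (`{"isAuthorized": bool}`); the header name and key value are hypothetical, and a real deployment should compare against a secret store, not a constant.

```python
# Minimal sketch of a Lambda REQUEST authorizer for HTTP APIs (simple
# response format). HTTP APIs lowercase header names in the event.
EXPECTED_KEY = "example-secret"  # illustrative only

def lambda_handler(event, context):
    # identity_sources such as $request.header.X-API-Key arrive in headers
    api_key = (event.get("headers") or {}).get("x-api-key", "")
    return {
        "isAuthorized": api_key == EXPECTED_KEY,
        # Optional context is forwarded to the integration
        "context": {"principal": "api-key-client"},
    }

decision = lambda_handler({"headers": {"x-api-key": "example-secret"}}, None)
```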
CORS & Throttling
- cors.allow_credentials - Allow cookies and authorization headers
- cors.allow_headers/methods/origins - Configure allowed headers, methods, and origins
- throttling.burst_limit - Maximum concurrent requests
- throttling.rate_limit - Steady-state requests per second
Key Benefits
- Cost: Up to 70% cheaper than REST APIs
- Performance: Up to 60% faster response times
- Configuration: Simple and intuitive resource definitions
- Features: Modern JWT/REQUEST authorizers, enhanced CORS, better throttling
services:
  complete_api:
    type: api_gateway_v2
    configuration:
      api_name: "complete-api-${parameters.environment}"
      stage_name: "prod"
      description: "Complete HTTP API with all features"
      # JWT and REQUEST authorizers
      authorizers:
        - name: "jwt-auth"
          type: "JWT"
          configuration:
            issuer: "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_XXXXXXXXX"
            audience: ["client-id"]
        - name: "custom-auth"
          type: "REQUEST"
          configuration:
            authorizer_uri: "${services.auth_lambda.function_arn}"
            identity_sources: ["$request.header.X-API-Key"]
            authorizer_result_ttl_in_seconds: 300
      # CORS configuration
      cors:
        allow_credentials: true
        allow_headers: ["Content-Type", "Authorization"]
        allow_methods: ["GET", "POST", "PUT", "DELETE"]
        allow_origins: ["https://myapp.com"]
        max_age: 3600
      # Throttling limits
      throttling:
        burst_limit: 5000
        rate_limit: 2000.0
      resources:
        # Public endpoints
        - path: "/health"
          methods: ["GET"]
          integration_type: "mock"
          mock_response:
            status_code: 200
            response_body: '{"status": "healthy"}'
        # JWT protected endpoints
        - path: "/users"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
          authorizer: "jwt-auth"
        - path: "/users/{id}"
          methods: ["GET", "PUT", "DELETE"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
          authorizer: "jwt-auth"
        # Custom auth protected endpoints
        - path: "/admin/stats"
          methods: ["GET"]
          integration_type: "lambda"
          lambda_function: "${services.admin_lambda.function_name}"
          authorizer: "custom-auth"
        # HTTP proxy integration
        - path: "/external/{proxy+}"
          methods: ["GET", "POST"]
          integration_type: "http"
          integration_uri: "https://api.external.com/{proxy}"
      # Stage-specific configuration
      stage_configuration:
        auto_deploy: true
        throttling_burst_limit: 4000
        throttling_rate_limit: 1500.0
Configuration Options
Resource Configuration: Define API endpoints using the resources format with paths, methods, and integrations.
Route-based Configuration: HTTP APIs also support the native routes format for advanced use cases.
Kinesis Stream
Real-time data streaming for event processing and analytics
Real-time Data Streaming
Kinesis Data Streams enables real-time processing of streaming data at massive scale with automatic scaling and built-in durability.
services:
  event_stream:
    type: kinesis_stream
    configuration:
      stream_name: "events-${parameters.environment}"
      shard_count: 2
      retention_period: 48

services:
  encrypted_stream:
    type: kinesis_stream
    configuration:
      stream_name: "secure-events-${parameters.environment}"
      shard_count: 4
      retention_period: 168 # 7 days
      encryption_type: "KMS"
      kms_key_id: "alias/aws/kinesis"
Configuration Parameters
Basic Configuration
- encryption_type (string, optional) - Encryption type ("KMS" for server-side encryption)
- stream_name (string, optional) - Kinesis stream name (supports environment variables)
Service Configuration
- retention_period (integer, optional) - Data retention in hours (24-8760)
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- shard_count (integer, optional) - Number of shards (1-1000), affects throughput capacity
Security Configuration
- kms_key_id (string, optional) - KMS key for encryption (alias or key ID)
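A producer writing to a stream defined above typically uses the standard Kinesis `put_record` API. The helper below builds the call arguments; the stream name and payload are illustrative. The partition key determines shard placement, so a high-cardinality key (such as a user ID) spreads load evenly across shards.

```python
# Sketch: build put_record arguments for a Kinesis stream. The actual
# client call is left commented since it requires AWS credentials.
import json

def make_record(stream_name, payload: dict, partition_key: str):
    """Build kwargs for kinesis.put_record from a JSON-serializable payload."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),  # Kinesis Data is bytes
        "PartitionKey": partition_key,                # determines the shard
    }

record = make_record("events-dev", {"event_type": "click", "user_id": "u-42"}, "u-42")
# kinesis = boto3.client("kinesis")
# kinesis.put_record(**record)
```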
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- stream_name: 1-128 characters; alphanumeric, hyphens, underscores, and periods only. Examples: event-stream, user_activity, log.stream
- shard_count: 1-1000 shards. Examples: 1, 5, 100
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
IAM Integration
- Default: Uses global kinesis_stream_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role
- Validation: Custom roles are validated for correct permissions
Kinesis Firehose
Managed data delivery service for streaming data to data lakes and analytics services
Data Delivery Modes
Kinesis Data Firehose supports two delivery modes: Direct PUT (applications write directly to Firehose) and Kinesis Data Streams as source (Firehose reads from an existing stream). Both modes support buffering, compression, and format conversion.
Source Configuration Options
For stream-to-S3 delivery, you can specify the source Kinesis stream using either: source_service (recommended - references service name) or source_stream_arn (direct ARN reference). The source_service approach is more maintainable and follows ModelKnife service reference patterns.
services:
  data_pipeline:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "events-firehose-${parameters.environment}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "events/"
      buffer_size: 5
      buffer_interval: 300
      compression_format: "GZIP"

services:
  stream_to_s3:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "stream-to-s3-${parameters.environment}"
      source_service: "event_stream" # Reference to Kinesis stream service
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "processed-events/"
      buffer_size: 10
      buffer_interval: 60
      compression_format: "GZIP"

services:
  stream_to_s3_alt:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "stream-alt-${parameters.environment}"
      source_stream_arn: "${services.event_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "processed-events/"
      buffer_size: 10
      buffer_interval: 60
      compression_format: "GZIP"

services:
  advanced_firehose:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "advanced-firehose-${parameters.environment}"
      source_stream_arn: "${services.event_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "events/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/"
      error_output_prefix: "errors/"
      buffer_size: 64
      buffer_interval: 300
      compression_format: "GZIP"
      # Convert JSON to Parquet format
      data_format_conversion:
        enabled: true
        output_format: "parquet"
        schema_database: "${services.analytics_db.outputs.database_name}"
        schema_table: "events"
      # Custom IAM role for cross-account access
      role_arn: "arn:aws:iam::123456789012:role/FirehoseDeliveryRole"

services:
  high_throughput_firehose:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "high-throughput-${parameters.environment}"
      source_stream_arn: "${services.high_volume_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "high-volume-data/"
      # Optimize for high throughput
      buffer_size: 128 # Maximum buffer size
      buffer_interval: 60 # Minimum buffer interval
      compression_format: "GZIP"
      tags:
        Environment: "${parameters.environment}"
        DataType: "HighVolume"
        CostCenter: "Analytics"
Configuration Parameters
Basic Configuration
- delivery_stream_name (string, optional) - Firehose delivery stream name (1-64 characters, alphanumeric, hyphens, underscores)
Service Configuration
- buffer_interval (integer, optional) - Buffer interval in seconds (60-900, default: 60)
- buffer_size (integer, optional) - Buffer size in MB (1-128, default: 64)
- compression_format (string, optional) - Data compression format (GZIP, ZIP, Snappy, HADOOP_SNAPPY, UNCOMPRESSED, default: GZIP)
- data_format_conversion (map, optional) - Convert data format (JSON to Parquet/ORC)
- destination_s3_bucket (string, required) - Target S3 bucket for data delivery
- destination_s3_prefix (string, optional) - S3 key prefix for data organization (supports dynamic partitioning)
- error_output_prefix (string, optional) - S3 prefix for error records
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- source_service (string, optional) - Source Kinesis stream service name (alternative to source_stream_arn)
- source_stream_arn (string, optional) - Source Kinesis stream ARN (alternative to source_service, for stream-to-S3 mode)
Advanced Options
- tags (map, optional) - Resource tags
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- delivery_stream_name: 1-64 characters; alphanumeric, hyphens, underscores, and periods only. Examples: events-firehose, stream-to-s3-prod
- buffer_size: 1-128 MB. Examples: 5, 64, 128
- buffer_interval: 60-900 seconds. Examples: 60, 300, 900
Data Format Conversion
- Supported Formats: Convert JSON input to Parquet or ORC format
- Schema Integration: Uses AWS Glue Data Catalog for schema information
- Configuration:
  - enabled - Enable format conversion (true/false)
  - output_format - Target format ("parquet" or "orc")
  - schema_database - Glue database name for schema
  - schema_table - Glue table name for schema
- Benefits: Improved query performance and reduced storage costs
Dynamic Partitioning
- Time-based Partitioning: Use timestamp expressions in S3 prefix
  - year=!{timestamp:yyyy}/month=!{timestamp:MM}/ - Year/month partitions
  - !{timestamp:yyyy/MM/dd/HH}/ - Hourly partitions
- Content-based Partitioning: Extract values from record content
  - event_type=!{partitionKeyFromQuery:event_type}/ - Partition by event type
- Performance: Improves query performance by reducing data scanned
Buffering and Delivery
- Buffer Size: Amount of data (1-128 MB) to buffer before delivery
- Buffer Interval: Maximum time (60-900 seconds) to wait before delivery
- Delivery Trigger: Whichever condition is met first triggers delivery
- Optimization: Larger buffers reduce S3 PUT costs but increase latency
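The first-condition-wins rule can be made concrete with a small worked example: at a steady ingest rate, whichever of buffer size or buffer interval is reached first triggers delivery. This sketch estimates the trigger for illustration; real ingest rates are rarely constant.

```python
# Worked example of Firehose buffering: given a steady ingest rate,
# determine whether buffer_size or buffer_interval triggers the flush.
def flush_trigger(ingest_mb_per_s: float, buffer_size_mb: int, buffer_interval_s: int):
    """Return (trigger, seconds_until_flush) for a steady ingest rate."""
    seconds_to_fill = (
        buffer_size_mb / ingest_mb_per_s if ingest_mb_per_s else float("inf")
    )
    if seconds_to_fill <= buffer_interval_s:
        return "buffer_size", seconds_to_fill
    return "buffer_interval", float(buffer_interval_s)

# At 2 MB/s, a 64 MB buffer fills in 32 s, well before a 300 s interval.
trigger, wait = flush_trigger(2.0, 64, 300)
```

This is why high-throughput streams favor large buffers (fewer, bigger S3 objects), while low-throughput streams are governed mostly by the interval.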
Compression Options
- GZIP: Good compression ratio, widely supported (recommended)
- ZIP: Compatible with many tools, moderate compression
- Snappy: Fast compression/decompression, lower ratio
- HADOOP_SNAPPY: Hadoop-compatible Snappy format
- UNCOMPRESSED: No compression, fastest delivery
Delivery Modes
- Direct PUT: Applications write directly to Firehose (no source configuration needed)
- Kinesis Data Streams Source: Firehose reads from an existing stream
  - source_service - Reference to Kinesis stream service name (recommended)
  - source_stream_arn - Direct ARN reference (alternative approach)
- Use Cases: Direct PUT for simple ingestion, stream source for complex processing pipelines
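For the Direct PUT mode, applications call the Firehose API themselves. The standard `put_record_batch` call accepts at most 500 records per request, so a chunking helper keeps batches within the limit; the delivery stream name below is illustrative.

```python
# Sketch: prepare Direct PUT batches for firehose.put_record_batch,
# which accepts at most 500 records per call. A trailing newline per
# record keeps JSON records separable in the delivered S3 objects.
import json

def to_batches(events, max_records=500):
    """Convert events to Firehose Record dicts, chunked to the API limit."""
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    return [records[i:i + max_records] for i in range(0, len(records), max_records)]

batches = to_batches([{"id": n} for n in range(1200)])
# firehose = boto3.client("firehose")
# for batch in batches:
#     firehose.put_record_batch(DeliveryStreamName="events-firehose-dev", Records=batch)
```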
Error Handling
- Error Records: Failed records are delivered to error output prefix
- Retry Logic: Automatic retries for transient failures
- Monitoring: CloudWatch metrics for delivery success/failure rates
IAM Integration
- Default: Uses global kinesis_firehose_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role
- Permissions: Role needs S3 write permissions and Kinesis read permissions (if using source stream)
- Cross-Account: Custom roles enable cross-account S3 delivery
EventBridge Pipe
Event-driven integrations with filtering, transformation, and routing
Simplified Service Configuration
EventBridge Pipes support both simplified service name references (source_service, target_service) and direct ARN specification. Service names are automatically resolved to ARNs during deployment, following ModelKnife's "less configuration" principle.
services:
  event_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "event-processor-${parameters.environment}"
      source_service: "event_stream"
      target_service: "processor_lambda"
      description: "Process events from stream to Lambda"

services:
  external_pipe:
    type: eventbridge_pipe
    configuration:
      pipe_name: "external-pipe-${parameters.environment}"
      source_arn: "arn:aws:kinesis:us-east-1:123456789012:stream/external-stream"
      target_arn: "arn:aws:lambda:us-east-1:123456789012:function:external-processor"
      description: "Process events from external Kinesis stream"

services:
  filtered_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "filtered-processor-${parameters.environment}"
      source_service: "event_stream"
      target_service: "processor_lambda"
      description: "Process only high-priority events"
      source_parameters:
        kinesis_stream_parameters:
          batch_size: 10
          starting_position: "LATEST"
          maximum_batching_window_in_seconds: 5
      filter_criteria:
        filters:
          - pattern: '{"event_type": ["click", "purchase", "signup"]}'
          - pattern: '{"priority": ["high", "critical"]}'

services:
  sqs_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "sqs-processor-${parameters.environment}"
      source_service: "message_queue"
      target_service: "message_processor"
      description: "Process SQS messages asynchronously"
      source_parameters:
        sqs_queue_parameters:
          batch_size: 5
          maximum_batching_window_in_seconds: 10
      target_parameters:
        lambda_function_parameters:
          invocation_type: "FIRE_AND_FORGET"

services:
  # Primary processing pipe
  db_stream_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "db-stream-processor-${parameters.environment}"
      source_arn: "${services.user_table.outputs.stream_arn}"
      target_service: "user_change_processor"
      description: "Process user table changes"
      source_parameters:
        dynamodb_stream_parameters:
          batch_size: 5
          starting_position: "LATEST"
      filter_criteria:
        filters:
          - pattern: '{"eventName": ["INSERT", "MODIFY"]}'
  # Analytics pipe
  db_analytics_pipe:
    type: eventbridge_pipe
    configuration:
      pipe_name: "db-analytics-pipe-${parameters.environment}"
      source_arn: "${services.user_table.outputs.stream_arn}"
      target_service: "analytics_processor"
      description: "Send user changes to analytics"
      filter_criteria:
        filters:
          - pattern: '{"eventName": ["INSERT", "REMOVE"]}'

services:
  # Multi-stage processing with enrichment
  enriched_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "enriched-processor-${parameters.environment}"
      source_service: "raw_events_stream"
      target_service: "enriched_processor_lambda"
      description: "Enrich and process raw events"
      source_parameters:
        kinesis_stream_parameters:
          batch_size: 25
          starting_position: "TRIM_HORIZON"
          maximum_batching_window_in_seconds: 5
      target_parameters:
        lambda_function_parameters:
          invocation_type: "REQUEST_RESPONSE"
      filter_criteria:
        filters:
          - pattern: '{"source": ["web", "mobile"]}'
          - pattern: '{"event_type": {"exists": true}}'
          - pattern: '{"user_id": {"exists": true}}'
      # Custom IAM role for cross-account access
      role_arn: "arn:aws:iam::123456789012:role/EventBridgePipeRole"
Configuration Parameters
Basic Configuration
- pipe_name (string, optional) - EventBridge pipe name (1-64 characters, alphanumeric, periods, hyphens, underscores)
Service Configuration
- description (string, optional) - Pipe description
- enrichment (map, optional) - Event enrichment configuration
- filter_criteria (map, optional) - Event filtering patterns using JSON pattern matching
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- source_arn (string, optional) - Source ARN (alternative to source_service, for external or direct ARN specification)
- source_parameters (map, optional) - Source-specific configuration parameters
- source_service (string, optional) - Name of source service (simplified approach, mutually exclusive with source_arn)
- target_arn (string, optional) - Target ARN (alternative to target_service, for external or direct ARN specification)
- target_parameters (map, optional) - Target-specific configuration parameters
- target_service (string, optional) - Name of target service (simplified approach, mutually exclusive with target_arn)
Advanced Options
- tags (map, optional) - Resource tags
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- pipe_name: 1-64 characters; alphanumeric, periods, hyphens, and underscores only. Examples: event-processor, db-stream-processor, sqs_processor
- source_service/target_service: Must reference services defined in your configuration, and are mutually exclusive with source_arn/target_arn
Source Parameters
- Kinesis Stream Parameters:
  - batch_size - Number of records per batch (1-10000, default varies by source)
  - starting_position - Where to start reading (TRIM_HORIZON, LATEST, AT_TIMESTAMP)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch (0-300 seconds)
- DynamoDB Stream Parameters:
  - batch_size - Number of records per batch (1-1000)
  - starting_position - Where to start reading (TRIM_HORIZON, LATEST)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch
- SQS Parameters:
  - batch_size - Number of messages per batch (1-10)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch
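As a sketch of how the Kinesis source options could be supplied (the service name and surrounding pipe definition are hypothetical):

```yaml
source_service: "order_stream"
source_parameters:
  batch_size: 100                         # 1-10000 for Kinesis
  starting_position: "LATEST"             # or TRIM_HORIZON, AT_TIMESTAMP
  maximum_batching_window_in_seconds: 30  # wait up to 30s to fill a batch
```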
Target Parameters
- Lambda Function Parameters:
  - invocation_type - How to invoke Lambda (REQUEST_RESPONSE, FIRE_AND_FORGET)
- SQS Parameters:
  - message_group_id - Message group ID for FIFO queues
  - message_deduplication_id - Deduplication ID for FIFO queues
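For example, a FIFO queue target might carry a message group ID; the queue name here is a hypothetical placeholder, and the key casing is assumed to follow the list above:

```yaml
target_service: "order_queue"  # a FIFO queue defined elsewhere in the configuration
target_parameters:
  message_group_id: "orders"   # preserves ordering within the group
```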
Event Sources & Targets
- Supported Sources:
  - Kinesis Data Streams
  - Amazon SQS queues
  - DynamoDB Streams
- Supported Targets:
  - AWS Lambda functions
  - Amazon SQS queues
  - Amazon SNS topics
  - EventBridge event buses
  - Kinesis Data Streams
  - Kinesis Data Firehose
  - AWS Step Functions state machines
Event Filtering
- JSON Pattern Matching: Use JSON patterns to filter events based on content
- Multiple Filters: Apply multiple filter patterns (OR logic between filters)
- Pattern Examples:
  - {"event_type": ["purchase", "signup"]} - Match specific event types
  - {"amount": [{"numeric": [">", 100]}]} - Numeric comparisons
  - {"user_id": [{"exists": true}]} - Check field existence
  - {"source": [{"prefix": "web-"}]} - String prefix matching
- Performance: Filtering reduces downstream processing and costs
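A sketch of attaching such patterns through filter_criteria; the nesting under filters/pattern mirrors the AWS FilterCriteria shape and is an assumption about ModelKnife's YAML, not confirmed by this reference:

```yaml
filter_criteria:
  filters:                                              # assumed key names
    - pattern: '{"event_type": ["purchase", "signup"]}'
    - pattern: '{"amount": [{"numeric": [">", 100]}]}'  # OR logic between filters
```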
Simplified Configuration Benefits
- Service Names: Use source_service and target_service instead of complex ARNs
- Automatic ARN Resolution: Deployer automatically resolves service names to ARNs during deployment
- Convention over Configuration: Follows ModelKnife's principle of minimal configuration
- Flexibility: Can still use source_arn and target_arn for external resources or direct ARN specification
- Validation: Service names are validated against defined services in the same configuration
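The two styles side by side, as a sketch (service names, account IDs, and ARNs are placeholders):

```yaml
# Simplified: names of services defined in the same configuration
source_service: "order_stream"
target_service: "order_processor"

# Direct ARNs: for resources living outside this configuration
# source_arn: "arn:aws:kinesis:us-east-1:123456789012:stream/external-stream"
# target_arn: "arn:aws:lambda:us-east-1:123456789012:function:external-fn"
```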
IAM Integration
- Default: Uses global eventbridge_pipe_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role (useful for cross-account access)
- Permissions: Role needs permissions to read from source and write to target
- Validation: Custom roles are validated for correct permissions during deployment
Lambda Service Dependency Management
Installing Python dependencies for Lambda functions with native binaries
Overview
ModelKnife provides optimized dependency management for Lambda functions, especially for packages with native binaries like pandas, numpy, and scikit-learn that require Linux-compatible compilation.
Native Binary Challenge
Pure pip installation fails for native packages because they contain binaries compiled for your host OS (macOS/Windows). AWS Lambda requires Linux-compatible binaries. ModelKnife solves this with Docker-based builds using official Lambda base images.
Configuration Parameters
services:
  iris_inference:
    type: lambda_function
    repository: "./src"
    handler: iris_inference_with_model.lambda_handler
    runtime: python3.9
    code_path: "lambda_functions/"
    requirements_file: "requirements.txt"
    # Dependency build configuration
    build_layer: true        # Create separate dependency layer
    build_strategy: "auto"   # auto|local|docker
    # Lambda configuration
    timeout: 300
    memory_size: 512
    environment_variables:
      MODEL_S3_PATH: "s3://my-bucket/models/"
Build Configuration Options
build_layer
- true - Package dependencies as a Lambda Layer (recommended)
- false - Bundle dependencies with function code
- Benefits: Faster deployments, shared across functions, 50MB+ size limit
build_strategy
- auto - Auto-detect based on requirements (default)
- local - Use local pip (pure Python packages only)
- docker - Force Docker build (native binaries)
Build Strategy Details
AUTO Strategy (Recommended)
Automatically detects whether your requirements contain native packages and chooses the appropriate build method:
- Pure Python packages: Uses fast local pip installation
- Native packages detected: Switches to Docker build automatically
- Known native packages: pandas, numpy, scikit-learn, scipy, pillow, lxml, psycopg2
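Complementing the LOCAL and DOCKER examples that follow, an AUTO configuration needs no extra flags; the service and handler names here are hypothetical:

```yaml
# AUTO Strategy - ModelKnife inspects requirements.txt
services:
  mixed_lambda:
    type: lambda_function
    handler: app.main
    requirements_file: "requirements.txt"  # scanned for known native packages
    build_strategy: "auto"                 # local pip if pure Python, docker otherwise
```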
# LOCAL Strategy - Fast, pure Python only
services:
  simple_lambda:
    type: lambda_function
    handler: handler.main
    requirements_file: "requirements.txt"  # requests, boto3, etc.
    build_strategy: "local"                # Fast pip install

# DOCKER Strategy - Native binaries supported
services:
  ml_lambda:
    type: lambda_function
    handler: inference.predict
    requirements_file: "requirements.txt"  # pandas, numpy, sklearn
    build_strategy: "docker"               # Linux-compatible build
    build_layer: true                      # Recommended for large deps
Performance Optimizations
Smart Caching & Build Skipping
ModelKnife includes several optimizations to minimize build times:
- Build Hash Validation: Skips dependency builds when requirements.txt unchanged
- Layer Reuse: Checks AWS for existing layers before building
- Source-Only Updates: ~50% faster deployments when only function code changes
- Incremental Builds: Separate source and dependency lifecycles
# First deployment - full build
🔧 Building dependencies with DOCKER strategy
📦 Creating Lambda layer ZIP: 42.3 MB
🚀 Publishing layer to AWS: mlknife-layer-76daf7e3
✅ Function deployed: 35.2 seconds
# Source-only change - optimized
🚀 Skipping dependency ZIP build - using existing layer
🎯 Layer already has ARN, skipping layer publishing
✅ Function updated: 16.1 seconds (50% faster!)
Docker Build Process
Docker Requirements
Docker build strategy requires Docker installed and running on your system. ModelKnife uses official AWS Lambda base images to ensure compatibility.
# ModelKnife runs these commands automatically:
docker run --platform linux/amd64 \
--entrypoint "" \
-v /local/requirements:/var/task \
-v /output:/var/runtime \
public.ecr.aws/lambda/python:3.9 \
pip install -r /var/task/requirements.txt -t /var/runtime
# Benefits:
# ✅ Linux-compatible binaries
# ✅ Exact Lambda runtime environment
# ✅ Handles complex native dependencies
# ✅ Automatic platform detection (arm64/amd64)
Common Use Cases
🔬 ML Inference Functions
pandas==2.0.3
numpy==1.24.3
scikit-learn==1.3.0
joblib==1.3.1
Strategy: AUTO (→ DOCKER)
Layer: Recommended
Build Time: ~45-60s first, ~15s updates
⚡ API Gateway Functions
requests==2.31.0
boto3==1.28.17
pydantic==2.1.1
Strategy: AUTO (→ LOCAL)
Layer: Optional
Build Time: ~10-15s first, ~8s updates
Requirements File Best Practices
requirements.txt Tips
- Pin versions: pandas==2.0.3 instead of pandas>=2.0
- Minimize dependencies: Only include packages you actually import
- Check compatibility: Ensure all packages support your Python runtime
- Consider alternatives: Use boto3 (already in Lambda) instead of requests when possible
Troubleshooting
Common Issues
- Docker not running: Ensure Docker is installed and started
- Platform mismatch: ModelKnife automatically handles ARM64 vs AMD64
- Large layers: Layers >250MB use S3 upload (automatic)
- Build failures: Check requirements.txt for version conflicts
# Deploy with detailed logging
mk s deploy --detail
# Check deployment logs
mk s status iris_inference
# Validate function
mk s test iris_inference
Layer Management
ModelKnife automatically manages Lambda layers:
- Naming: mlknife-layer-{hash} based on requirements content
- Versioning: New layer versions created only when requirements change
- Cleanup: Old layers remain until manually deleted (AWS best practice)
- Sharing: Same layer hash can be reused across multiple functions