Services Reference
Complete reference for all infrastructure service types available in ModelKnife
Available Service Types
Compute
Storage & Data
Search & API
Streaming & Events
Environment Variable Usage Examples
Standard Syntax: Use consistent syntax for environment variables and service references.
Parameter References
- ${parameters.environment} - Current environment (dev, staging, prod)
- ${parameters.region} - AWS region
- ${parameters.project_name} - Project identifier
Service Output References
- ${services.my_function.outputs.arn} - Lambda function ARN
- ${services.data_bucket.outputs.bucket_name} - S3 bucket name
- ${services.user_table.outputs.table_name} - DynamoDB table name
- ${services.api.outputs.invoke_url} - API Gateway invoke URL
Common Patterns
- resource-name-${parameters.environment} - Environment-specific naming
- ${parameters.project_name}-${parameters.environment} - Project and environment
- arn:aws:service:region:account:resource/${services.service.outputs.name} - ARN construction
Validation Tips
- Ensure referenced services exist in your configuration
- Use correct output property names for each service type
- Maintain consistent naming conventions across environments
- Test parameter substitution in different environments
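As an illustrative sketch of these patterns working together (the service names data_bucket and ingest_function are hypothetical), a bucket output can be wired into a Lambda environment variable:

```yaml
services:
  data_bucket:
    type: s3_bucket
    bucket_name: "${parameters.project_name}-data-${parameters.environment}"

  ingest_function:
    type: lambda_function
    repository: "../src"
    function_name: "ingest-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    environment:
      # Resolved at deploy time from the data_bucket service outputs
      DATA_BUCKET: "${services.data_bucket.outputs.bucket_name}"
```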
Lambda Function
Serverless compute for APIs, real-time inference, and event processing
Key Features
Supports Python, Go, Node.js, and Java with automatic dependency management. Python functions auto-detect native dependencies and choose optimal build strategy. Features include EventBridge scheduling, VPC access, and dependency layer separation.
services:
  simple_function:
    type: lambda_function
    repository: "../src"
    function_name: "simple-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    # Uses defaults: memory_size=128MB, timeout=30s
services:
  api_function:
    type: lambda_function
    repository: "../src"
    function_name: "api-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler"
    code_path: "python-lambda"
    memory_size: 512 # Override default (128 MB)
    timeout: 60 # Override default (30 seconds)
    environment:
      API_KEY: "${parameters.api_key}"
services:
  # Python: entry point = "filename.function_name"
  # Expects requirements.txt in the code_path directory
  python_api:
    type: lambda_function
    repository: "../src"
    function_name: "python-api-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "app.lambda_handler" # Calls lambda_handler() in app.py
    code_path: "python-lambda/"
    build_strategy: "auto"
    build_layer: true
  # Node.js: entry point = "filename.function_name"
  nodejs_api:
    type: lambda_function
    repository: "../src"
    function_name: "nodejs-api-${parameters.environment}"
    runtime: "nodejs18.x"
    entry_point: "index.handler" # Calls exports.handler in index.js
    code_path: "nodejs-lambda"
  # Java: entry point = "package.Class::method"
  java_api:
    type: lambda_function
    repository: "../src"
    function_name: "java-api-${parameters.environment}"
    runtime: "java11"
    entry_point: "com.example.Handler::handleRequest"
    code_path: "java-lambda"
    memory_size: 1024
    timeout: 60
  # Go: entry point = executable name (typically "main")
  go_api:
    type: lambda_function
    repository: "../src"
    function_name: "go-api-${parameters.environment}"
    runtime: "go1.x"
    entry_point: "main" # Compiled binary name
    code_path: "go-lambda/api"
Project Organization
The following folder structure shows how to organize code based on the above examples. The final code path is resolved as: configuration_file_directory/repository/code_path

- configuration_file_directory: Directory containing mlknife-compose.yaml (e.g., conf/)
- repository: Code repository path relative to the configuration file directory (e.g., "../src")
- code_path: Specific code directory (e.g., "python-lambda", "go-lambda/api")
my-lambda-project/
├── conf/
│ └── mlknife-compose.yaml # Configuration file
└── src/ # Source code directory
├── python-lambda/ # Python Lambda functions
│ ├── app.py # Main entry point file
│ ├── processor.py # Additional modules
│ └── requirements.txt # Python dependencies
├── nodejs-lambda/ # Node.js Lambda functions
│ ├── index.js # Main entry point file
│ ├── utils.js # Helper modules
│ └── package.json # Node.js dependencies
├── java-lambda/ # Java Lambda functions
│ ├── src/
│ │ └── main/
│ │ └── java/
│ │ └── com/
│ │ └── example/
│ │ └── Handler.java
│ ├── pom.xml # Maven dependencies
│ └── build.gradle # Gradle dependencies (alternative)
└── go-lambda/ # Go Lambda functions
├── api/
│ ├── main.go # Main entry point file
│ └── go.mod # Go module file
└── processor/
├── main.go
└── go.mod
services:
  scheduled_processor:
    type: lambda_function
    repository: "../src"
    function_name: "processor-${parameters.environment}"
    runtime: "python3.11"
    entry_point: "processor.main"
    code_path: "python-lambda"
    memory_size: 1024
    timeout: 300
    # EventBridge scheduling
    schedule:
      cron: "0 8 * * 1-5" # Weekdays at 8 AM
      timezone: "America/New_York"
      enabled: true
    # VPC access for private resources
    vpc_config:
      subnet_ids: ["subnet-12345678"]
      security_group_ids: ["sg-abcdef12"]
    # Custom layers
    layers:
      - "arn:aws:lambda:us-east-1:123456789012:layer:utils:1"
    environment:
      DB_HOST: "${parameters.database_host}"
Configuration Parameters
Basic Configuration
- function_name (string, optional) - Lambda function name
- repository (string, required) - Code repository location (set at the service level)
Service Configuration
- build_strategy (string, optional) - Python build strategy: "local", "docker", "auto" (default: "auto")
- code_path (string, optional) - Path within repository
- environment (map, optional) - Environment variables (key-value pairs, default: empty)
- entry_point (string, optional) - Entry point or handler function (format varies by runtime; see Entry Point Formats by Runtime)
- runtime (string, optional) - Runtime (python3.8-3.12, nodejs18.x, java11, go1.x, etc.)
Performance Settings
- memory_size (integer, optional) - Memory in MB (128-10240, 64MB increments, default: 128)
- timeout (integer, optional) - Max execution time in seconds (1-900, default: 30)
Security Configuration
- vpc_config (map, optional) - VPC access (subnet_ids, security_group_ids, default: none)
Scheduling Configuration
- schedule (map, optional) - EventBridge cron scheduling (cron, timezone, enabled, default: none)
Advanced Options
- build_layer (boolean, optional) - Separate dependencies into a reusable layer (default: false)
- layers (list, optional) - List of Lambda layer ARNs (default: empty)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- function_name: 1-64 characters, alphanumeric and hyphens only. Examples: my-function, data-processor-prod, api-handler-v2
- memory_size: 128-10240 MB in 64MB increments. Examples: 128, 512, 1024
- timeout: 1-900 seconds. Examples: 30, 300, 900
- runtime: python3.8, python3.9, python3.10, python3.11, python3.12, nodejs18.x, java11, go1.x. Examples: python3.11, nodejs18.x, java11
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Entry Point Formats by Runtime
- Python: filename.function_name (e.g., app.lambda_handler calls lambda_handler() in app.py)
- Node.js: filename.function_name (e.g., index.handler calls exports.handler in index.js)
- Java: package.Class::method (e.g., com.example.Handler::handleRequest)
- Go: main or bootstrap (executable name, typically main)
- .NET: Assembly::Namespace.Class::Method (e.g., MyApp::MyApp.Function::Handler)
- Ruby: filename.method_name (e.g., lambda_function.lambda_handler)
Runtime-Specific Features
- Python (3.8-3.12): Automatically installs dependencies from requirements.txt located in the code_path directory. Uses local pip for pure Python packages (fast) or Docker for native dependencies (reliable). Auto-detects packages like numpy, pandas, and scikit-learn that require compiled C extensions. Supports build_strategy and build_layer options.
- Node.js (14.x-20.x): Automatic npm install for package.json dependencies. Supports ES modules and CommonJS.
- Java (8, 11, 17, 21): Auto-detects Maven (pom.xml) or Gradle (build.gradle) and runs the build automatically. Maven: mvn clean package -DskipTests; Gradle: ./gradlew build -x test.
- Go (1.x, provided.al2): Compiles to a bootstrap executable. The binary must be named according to the handler value.
- Other Runtimes: .NET (dotnet6-8) and Ruby (2.7-3.3) with automatic dependency resolution.
Python Requirements & Build Strategies
Python functions automatically look for requirements.txt in the code_path directory and choose the optimal build strategy:
- Local Build: Fast pip install with manylinux targeting. Works for pure Python packages (requests, boto3, pydantic)
- Docker Build: Uses AWS Lambda Python base image. Required for native dependencies with C extensions (pandas, numpy, scikit-learn, opencv)
- Auto Mode: Automatically detects native packages in requirements.txt and chooses docker when needed
Example requirements.txt location: ./src/requirements.txt for code_path: "src"
Common packages:
- Pure Python: requests==2.31.0, boto3>=1.26.0, pydantic==2.0.0
- Native deps: pandas==1.5.0, numpy>=1.20.0, scikit-learn==1.3.0
Advanced Features
- EventBridge Scheduling: Cron expressions with timezone support ("0 8 * * 1-5" = weekdays at 8 AM)
- VPC Access: Private subnet access with vpc_config
- Layer Management: Automatic layer creation and reuse for dependencies
SageMaker Endpoint
Real-time ML model inference endpoints with auto-scaling
services:
  model_endpoint:
    type: sagemaker_endpoint
    repository: "."
    configuration:
      endpoint_name: "ml-model-${parameters.environment}"
      model_name: "my-model"
      code_path: "inference"
      instance_type: "ml.m5.large"
      initial_instance_count: 1
      requirements_file: "requirements.txt"
      environment_variables:
        MODEL_VERSION: "v1.0"
Configuration Parameters
Basic Configuration
- endpoint_name (string, optional) - SageMaker endpoint name
- instance_type (string, optional) - EC2 instance type for hosting
- model_name (string, optional) - Model name for the endpoint
Service Configuration
- code_path (string, optional) - Path to inference code directory
- initial_instance_count (integer, optional) - Number of instances to start with
- requirements_file (string, optional) - Python dependencies file
Advanced Options
- environment_variables (map, optional) - Environment variables for the model
S3 Bucket
Object storage for data lakes, model artifacts, and file storage
services:
  data_lake:
    type: s3_bucket
    bucket_name: "ml-data-${parameters.environment}"
    versioning: true
services:
  secure_data_lake:
    type: s3_bucket
    configuration:
      bucket_name: "ml-secure-data-${parameters.environment}"
      versioning: true
      encryption:
        SSEAlgorithm: "aws:kms"
        KMSMasterKeyID: "alias/s3-encryption-key"
        BucketKeyEnabled: true
      lifecycle_configuration:
        - Id: "DeleteOldVersions"
          Status: "Enabled"
          NoncurrentVersionExpiration:
            NoncurrentDays: 30
        - Id: "TransitionToIA"
          Status: "Enabled"
          Transitions:
            - Days: 30
              StorageClass: "STANDARD_IA"
            - Days: 90
              StorageClass: "GLACIER"
Configuration Parameters
Basic Configuration
- bucket_name (string, required) - S3 bucket name (must be globally unique, 3-63 characters, letters/numbers/hyphens/periods only)
Service Configuration
- BucketKeyEnabled (boolean, optional) - Use S3 Bucket Keys to reduce KMS costs
- Expiration (map, optional) - Object expiration settings
- Id (string, optional) - Unique rule identifier
- KMSMasterKeyID (string, optional) - KMS key ID or alias (required for aws:kms)
- NoncurrentVersionExpiration (map, optional) - Expiration for non-current versions
- SSEAlgorithm (string, optional) - Encryption algorithm ("AES256" or "aws:kms")
- Status (string, optional) - Rule status ("Enabled" or "Disabled")
- Transitions (list, optional) - List of transition rules with Days and StorageClass
Storage Configuration
- versioning (boolean, optional, default: false) - Enable object versioning (cannot be disabled once enabled, only suspended)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- bucket_name: 3-63 characters, lowercase letters, numbers, hyphens, and periods only. Examples: my-data-bucket, company-logs-2024, ml-model-artifacts
- versioning: Boolean value. Examples: true, false
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Implementation Notes
- Bucket names must be globally unique across all AWS accounts
- Versioning cannot be disabled once enabled, only suspended
- Lifecycle rules are processed in the order they appear in the configuration
- Encryption settings apply to all objects in the bucket by default
- Use KMS encryption for enhanced security and compliance requirements
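The examples above use KMS encryption; for the simpler S3-managed option listed under SSEAlgorithm, a minimal sketch (bucket name is illustrative) might look like:

```yaml
services:
  basic_encrypted_bucket:
    type: s3_bucket
    configuration:
      bucket_name: "ml-artifacts-${parameters.environment}"
      versioning: true
      encryption:
        # S3-managed keys; no KMSMasterKeyID is needed for AES256
        SSEAlgorithm: "AES256"
```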
DynamoDB Table
NoSQL database for feature stores, metadata, and real-time data access with advanced scaling and security features
services:
  feature_store:
    type: dynamodb_table
    configuration:
      table_name: "ml-features-${parameters.environment}"
      attribute_definitions:
        feature_id: "S" # String type
        timestamp: "S" # String type
        user_id: "S" # String type for GSI
      partition_key: "feature_id"
      sort_key: "timestamp"
      billing_mode: "PAY_PER_REQUEST" # On-demand billing
      point_in_time_recovery: true
      deletion_protection: true
      global_secondary_indexes:
        - index_name: "user-index"
          partition_key: "user_id"
          projection:
            projection_type: "ALL"
services:
  advanced_table:
    type: dynamodb_table
    configuration:
      table_name: "ml-features-${parameters.environment}"
      attribute_definitions:
        feature_id: "S"
        timestamp: "S"
        user_id: "S"
        score: "N"
      partition_key: "feature_id"
      sort_key: "timestamp"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 10
        WriteCapacityUnits: 10
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
          write_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
        global_secondary_indexes:
          user-index:
            read_capacity:
              min_capacity: 5
              max_capacity: 50
              target_utilization: 70.0
      server_side_encryption:
        enabled: true
        kms_key_id: "alias/dynamodb-key"
      global_secondary_indexes:
        - index_name: "user-index"
          partition_key: "user_id"
          sort_key: "score"
          projection:
            projection_type: "ALL"
      local_secondary_indexes:
        - index_name: "score-index"
          sort_key: "score"
          projection:
            projection_type: "KEYS_ONLY"
      stream_specification:
        stream_enabled: true
        stream_view_type: "NEW_AND_OLD_IMAGES"
      backup_configuration:
        on_demand_backup:
          backup_name: "ml-features-backup"
        scheduled_backup:
          schedule_expression: "cron(0 2 * * ? *)"
          retention_period_days: 30
      table_class: "STANDARD"
# Table-level auto-scaling only
services:
  scalable_table:
    type: dynamodb_table
    configuration:
      table_name: "user-sessions-${parameters.environment}"
      attribute_definitions:
        session_id: "S"
        user_id: "S"
      partition_key: "session_id"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 5
        WriteCapacityUnits: 5
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 200
            target_utilization: 70.0
            scale_in_cooldown: 300 # 5 minutes
            scale_out_cooldown: 60 # 1 minute
          write_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 80.0
---
# GSI auto-scaling configuration
services:
  multi_index_table:
    type: dynamodb_table
    configuration:
      table_name: "product-catalog-${parameters.environment}"
      attribute_definitions:
        product_id: "S"
        category: "S"
        price: "N"
        brand: "S"
      partition_key: "product_id"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 10
        WriteCapacityUnits: 10
      global_secondary_indexes:
        - index_name: "category-price-index"
          partition_key: "category"
          sort_key: "price"
          projection:
            projection_type: "ALL"
          provisioned_throughput:
            ReadCapacityUnits: 5
            WriteCapacityUnits: 5
        - index_name: "brand-index"
          partition_key: "brand"
          projection:
            projection_type: "KEYS_ONLY"
          provisioned_throughput:
            ReadCapacityUnits: 3
            WriteCapacityUnits: 3
      auto_scaling:
        table:
          read_capacity:
            min_capacity: 5
            max_capacity: 100
            target_utilization: 70.0
          write_capacity:
            min_capacity: 5
            max_capacity: 50
            target_utilization: 70.0
        global_secondary_indexes:
          category-price-index:
            read_capacity:
              min_capacity: 5
              max_capacity: 50
              target_utilization: 75.0
            write_capacity:
              min_capacity: 5
              max_capacity: 25
              target_utilization: 75.0
          brand-index:
            read_capacity:
              min_capacity: 3
              max_capacity: 20
              target_utilization: 80.0
            write_capacity:
              min_capacity: 3
              max_capacity: 10
              target_utilization: 80.0
# Comprehensive index configuration
services:
  analytics_table:
    type: dynamodb_table
    configuration:
      table_name: "user-analytics-${parameters.environment}"
      attribute_definitions:
        user_id: "S" # Table partition key
        timestamp: "S" # Table sort key
        event_type: "S" # GSI partition key
        session_id: "S" # GSI sort key
        score: "N" # LSI sort key
        region: "S" # Additional GSI partition key
      partition_key: "user_id"
      sort_key: "timestamp"
      billing_mode: "PAY_PER_REQUEST"
      # Global Secondary Indexes (different partition key)
      global_secondary_indexes:
        # GSI with both partition and sort key
        - index_name: "event-session-index"
          partition_key: "event_type"
          sort_key: "session_id"
          projection:
            projection_type: "ALL" # Include all attributes
        # GSI with partition key only
        - index_name: "region-index"
          partition_key: "region"
          projection:
            projection_type: "KEYS_ONLY" # Only key attributes
        # GSI with selective attribute projection
        - index_name: "event-timestamp-index"
          partition_key: "event_type"
          sort_key: "timestamp"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "user_id"
              - "score"
              - "session_id"
      # Local Secondary Indexes (same partition key, different sort key)
      local_secondary_indexes:
        # LSI for querying by score within user_id
        - index_name: "user-score-index"
          sort_key: "score"
          projection:
            projection_type: "KEYS_ONLY"
        # LSI for querying by event_type within user_id
        - index_name: "user-event-index"
          sort_key: "event_type"
          projection:
            projection_type: "ALL"
---
# Provisioned billing with GSI-specific capacity
services:
  ecommerce_table:
    type: dynamodb_table
    configuration:
      table_name: "orders-${parameters.environment}"
      attribute_definitions:
        order_id: "S"
        customer_id: "S"
        order_date: "S"
        status: "S"
        total_amount: "N"
      partition_key: "order_id"
      sort_key: "order_date"
      billing_mode: "PROVISIONED"
      provisioned_throughput:
        ReadCapacityUnits: 20
        WriteCapacityUnits: 10
      global_secondary_indexes:
        # Customer orders index with custom capacity
        - index_name: "customer-date-index"
          partition_key: "customer_id"
          sort_key: "order_date"
          projection:
            projection_type: "ALL"
          provisioned_throughput:
            ReadCapacityUnits: 10 # Different from table capacity
            WriteCapacityUnits: 5
        # Status index for order management
        - index_name: "status-date-index"
          partition_key: "status"
          sort_key: "order_date"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "customer_id"
              - "total_amount"
          provisioned_throughput:
            ReadCapacityUnits: 5
            WriteCapacityUnits: 2
      local_secondary_indexes:
        # Sort orders by total amount within order_id
        - index_name: "order-amount-index"
          sort_key: "total_amount"
          projection:
            projection_type: "INCLUDE"
            non_key_attributes:
              - "customer_id"
              - "status"
Configuration Parameters
Basic Configuration
- index_name (string, required) - Unique index name (applies to each GSI and LSI)
- table_name (string, required) - DynamoDB table name (supports environment variables). Must be 3-255 characters, alphanumeric with underscores, hyphens, and periods allowed.
- global_secondary_indexes.{index_name} (map, optional) - Per-GSI scaling configuration
- stream_view_type (string, optional) - Stream content: "KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"
Service Configuration
- partition_key (string, required) - Partition key attribute name for the table or a GSI. Must be defined in attribute_definitions.
- sort_key (string, optional) - Sort key attribute name for the table, a GSI, or an LSI. Must be defined in attribute_definitions; the table sort key must differ from partition_key, and an LSI sort key must differ from the table sort key.
- cross_account_access (map, optional) - Cross-account access configuration with trusted_account_ids and access_level
- enabled (boolean, optional) - Enable/disable encryption
- iam_policy_templates (list, optional) - List of predefined policy templates: read_only, read_write, stream_consumer, ml_feature_store, admin
- on_demand_backup (map, optional) - Manual backup configuration with backup_name
- point_in_time_recovery (boolean, optional, default: false) - Enable continuous backups for 35 days with point-in-time recovery.
- projection (map, optional) - Index projection configuration: projection_type ("ALL", "KEYS_ONLY", "INCLUDE") and non_key_attributes for INCLUDE
- resource_policy (map, optional) - Custom resource-based policy document
- route_table_ids (list, optional) - List of route table IDs
- stream_enabled (boolean, optional) - Enable/disable streams
- table.read_capacity/write_capacity (map, optional) - Table-level scaling with min_capacity, max_capacity, target_utilization (0-100)
- vpc_endpoint_id (string, optional) - VPC endpoint ID (vpce-*)
Performance Settings
- provisioned_throughput (map, required for PROVISIONED billing) - Capacity settings with ReadCapacityUnits and WriteCapacityUnits (minimum 1 each); may also be set per GSI to give an index its own capacity.
- billing_mode (string, optional, default: "PAY_PER_REQUEST") - Billing mode: "PAY_PER_REQUEST" for on-demand or "PROVISIONED" for predictable capacity.
Security Configuration
- kms_key_id (string, optional) - KMS key ID, alias, or ARN. Use "alias/aws/dynamodb" for AWS managed key
Network Configuration
- security_group_ids (list, optional) - List of security group IDs
- subnet_ids (list, optional) - List of subnet IDs
Scheduling Configuration
- scheduled_backup (map, optional) - Automated backup with schedule_expression (cron format) and retention_period_days (1-35)
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- table_name: 3-255 characters, alphanumeric, hyphens, underscores, and periods only. Examples: user-sessions, product_catalog, event.logs
- billing_mode: PAY_PER_REQUEST or PROVISIONED. Examples: PAY_PER_REQUEST, PROVISIONED
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
Implementation Notes
- Billing Mode Selection: Use PAY_PER_REQUEST for unpredictable workloads and PROVISIONED with auto-scaling for consistent traffic patterns.
- Auto-Scaling: Requires PROVISIONED billing mode and appropriate IAM permissions for CloudWatch and Application Auto Scaling.
- Global Secondary Indexes (GSI):
  - Maximum 20 GSIs per table
  - Each GSI consumes additional read/write capacity and storage
  - Can have different partition and sort keys from the main table
  - Can be created after table creation
  - In PROVISIONED billing mode, each GSI can have independent capacity settings
  - Projection types: ALL (all attributes), KEYS_ONLY (key attributes only), INCLUDE (specified attributes)
- Local Secondary Indexes (LSI):
  - Maximum 10 LSIs per table
  - Can only be created at table creation time (not after)
  - Must use the same partition key as the main table
  - Must have a different sort key from the main table
  - Share read/write capacity with the main table
  - Item collection size limit of 10 GB per partition key value (including all LSI items)
- Index Validation Rules:
  - All partition and sort keys used in indexes must be defined in attribute_definitions
  - Index names must be unique within the table
  - LSI sort key cannot be the same as the table sort key
  - For INCLUDE projection, non_key_attributes must not include key attributes
- Encryption: Encryption at rest is enabled by default with AWS managed keys. Customer managed KMS keys provide additional control but incur extra costs.
- Streams: Stream records are retained for 24 hours. Use for real-time processing, replication, or analytics.
- Table Class: STANDARD_INFREQUENT_ACCESS reduces storage costs by ~60% but increases access costs. Suitable for infrequently accessed data.
- Capacity Planning: For PROVISIONED billing, consider query patterns when setting GSI capacity. Read-heavy GSIs may need higher read capacity than the main table.
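As a minimal sketch of the STANDARD_INFREQUENT_ACCESS table class mentioned in the notes above (table and attribute names are illustrative):

```yaml
services:
  archive_table:
    type: dynamodb_table
    configuration:
      table_name: "audit-log-${parameters.environment}"
      attribute_definitions:
        record_id: "S"
      partition_key: "record_id"
      billing_mode: "PAY_PER_REQUEST"
      # ~60% lower storage cost, higher per-access cost;
      # suited to rarely read data such as audit history
      table_class: "STANDARD_INFREQUENT_ACCESS"
```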
SageMaker Feature Store
Centralized feature repository for ML model training and inference
services:
  user_features:
    type: sagemaker_feature_store
    configuration:
      feature_group_name: "user-features-${parameters.environment}"
      record_identifier_name: "user_id"
      event_time_feature_name: "event_time"
      feature_definitions:
        user_id: "String"
        age: "Integral"
        income: "Fractional"
        event_time: "String"
      online_store_config:
        enable_online_store: true
      offline_store_config:
        s3_storage_config:
          s3_uri: "s3://${services.feature_bucket.outputs.bucket_name}/features/"
Configuration Parameters
Basic Configuration
- event_time_feature_name (string, optional) - Timestamp field name
- feature_group_name (string, optional) - Feature group name in SageMaker
- record_identifier_name (string, optional) - Primary key field name
Service Configuration
- feature_definitions (map, optional) - Feature schema with data types
- offline_store_config (map, optional) - Offline store configuration for batch processing
- online_store_config (map, optional) - Online store configuration for real-time access
Glue Database
Data catalog database for schema management and metadata storage
services:
  analytics_db:
    type: glue_database
    configuration:
      database_name: "analytics_${parameters.environment}"
      description: "Analytics data warehouse database"
services:
  data_catalog:
    type: glue_database
    configuration:
      database_name: "data_catalog_${parameters.environment}"
      description: "Centralized data catalog for ML features"
      location_uri: "s3://${services.data_lake.outputs.bucket_name}/catalog/"
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Glue database name (lowercase, underscore-separated)
Service Configuration
- description (string, optional) - Database description
- location_uri (string, optional) - Default storage location S3 URI
Parameter Validation
General Validation: All parameters are validated according to AWS service limits and naming conventions.
- Names: Must follow AWS naming conventions (alphanumeric, hyphens, underscores)
- Required Parameters: All required parameters must be provided
- Type Validation: Parameters must match expected data types
- Range Validation: Numeric parameters must be within allowed ranges
Database Management
- Schema Registry: Central location for table schemas and metadata
- Table Organization: Tables are organized under databases
- Integration: Works with Athena, EMR, and other analytics services
Glue Table
Data catalog table definitions for structured data schema management
Simplified Column Format
Glue tables use a simplified column definition format: column_name: "type". This keeps schema definitions clean and readable while supporting all standard data types.
services:
  events_table:
    type: glue_table
    configuration:
      table_name: "events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/events/"
      columns:
        event_id: "string"
        user_id: "string"
        event_type: "string"
        timestamp: "bigint"
        amount: "double"
        is_premium: "boolean"
services:
  partitioned_events:
    type: glue_table
    configuration:
      table_name: "partitioned_events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/partitioned-events/"
      columns:
        event_id: "string"
        user_id: "string"
        event_type: "string"
        timestamp: "bigint"
        year: "int"
        month: "int"
        day: "int"
      partition_keys: ["year", "month", "day"]
      description: "Events table partitioned by date for efficient querying"
services:
  json_table:
    type: glue_table
    configuration:
      table_name: "json_events"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/json-events/"
      columns:
        event_id: "string"
        payload: "struct<user_id:string,action:string,metadata:map<string,string>>"
        timestamp: "timestamp"
      serde_library: "org.apache.hive.hcatalog.data.JsonSerDe"
      serde_parameters:
        "serialization.format": "1"
      input_format: "org.apache.hadoop.mapred.TextInputFormat"
      output_format: "org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat"
services:
  complex_table:
    type: glue_table
    configuration:
      table_name: "complex_data"
      database_name: "${services.analytics_db.outputs.database_name}"
      storage_location: "s3://${services.data_lake.outputs.bucket_name}/complex-data/"
      columns:
        id: "string"
        tags: "array<string>"
        metadata: "map<string,string>"
        user_profile: "struct<name:string,age:int,preferences:array<string>>"
        scores: "array<double>"
        created_at: "timestamp"
        is_active: "boolean"
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Glue database name (reference to database service)
- table_name (string, optional) - Glue table name (lowercase, underscore-separated)
- table_type (string, optional) - Table type ("EXTERNAL_TABLE" or "VIRTUAL_VIEW", default: "EXTERNAL_TABLE")
Service Configuration
- description (string, optional) - Table description
- input_format (string, optional) - Input format class name
- output_format (string, optional) - Output format class name
- parameters (map, optional) - Additional table parameters
- partition_keys (list, optional) - List of column names to use as partitions
- serde_library (string, optional) - SerDe library class name
- serde_parameters (map, optional) - SerDe configuration parameters
- storage_location (string, optional) - S3 URI where table data is stored
Parameter Validation
General Validation: All parameters are validated according to AWS service limits and naming conventions.
- Names: Must follow AWS naming conventions (alphanumeric, hyphens, underscores)
- Required Parameters: All required parameters must be provided
- Type Validation: Parameters must match expected data types
- Range Validation: Numeric parameters must be within allowed ranges
Supported Data Types
- Primitive Types: string, int, bigint, double, float, boolean, timestamp, date, binary, decimal
- Complex Types: array<type>, map<key_type,value_type>, struct<field:type,field:type>
- Examples: array<string>, map<string,int>, struct<name:string,age:int>
Partitioning
- Partition Keys: List of column names that define table partitions
- Performance: Partitioning improves query performance by limiting data scanned
- Common Patterns: Date-based partitioning (year, month, day) or categorical partitioning
- Validation: Partition key columns must be defined in the columns section
SerDe Configuration
- JSON SerDe: org.apache.hive.hcatalog.data.JsonSerDe for JSON data
- Parquet: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
- CSV/TSV: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
- Custom Parameters: Configure SerDe behavior through serde_parameters
Configuration Parameters
Basic Configuration
- database_name (string, optional) - Parent Glue database name
- table_name (string, optional) - Glue table name
Service Configuration
- columns (map, optional) - Table column definitions in simplified format (column_name: type)
- input_format (string, optional) - Data input format class
- output_format (string, optional) - Data output format class
- partition_keys (list, optional) - Column names to use as partition keys
- serde_library (string, optional) - Serialization/deserialization library
- serde_parameters (map, optional) - SerDe-specific parameters
- storage_location (string, optional) - S3 location for table data
Simplified Configuration
- Columns: Use simple key-value format: column_name: "type"
- Partition Keys: List column names that exist in the columns definition
- Convention: Follows ModelKnife's "less configuration" principle
- Deployer Mapping: Automatically converts to AWS Glue API format
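To make the deployer mapping concrete, here is a hedged sketch of the conversion from the simplified `column_name: type` format to the AWS Glue `CreateTable` shape (`Columns` and `PartitionKeys` lists of `{"Name", "Type"}` dicts). The function name is illustrative, not a ModelKnife internal.

```python
# Sketch: convert simplified column definitions into the AWS Glue API format.
# Partition key columns are removed from Columns and emitted as PartitionKeys,
# matching Glue's requirement that the two lists do not overlap.

def columns_to_glue(columns: dict, partition_keys: list = ()) -> dict:
    """Split simplified columns into Glue Columns and PartitionKeys."""
    def as_glue(name, col_type):
        return {"Name": name, "Type": col_type}

    return {
        "Columns": [
            as_glue(name, col_type)
            for name, col_type in columns.items()
            if name not in partition_keys
        ],
        "PartitionKeys": [as_glue(name, columns[name]) for name in partition_keys],
    }

mapping = columns_to_glue(
    {"id": "string", "created_at": "timestamp", "year": "string"},
    partition_keys=["year"],
)
```

Note that partition key columns must appear in the simplified `columns` map so their types can be carried over, which is consistent with the validation rule above.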
Search Service
Business-oriented search and analytics with automatic infrastructure optimization
services:
  product_search:
    type: search_service
    configuration:
      service_name: "product-search-${parameters.environment}"
      search_type: "product_search"
      environment: "production"
      performance_tier: "high_performance"
      access_level: "private"
      indices:
        - name: "products"
          fields:
            - name: "title"
              type: "text"
              analyzer: "english"
              copy_to: ["search_all"]
            - name: "category"
              type: "keyword"
              facetable: true
            - name: "price"
              type: "number"
              facetable: true
            - name: "brand"
              type: "keyword"
              facetable: true
            - name: "search_all"
              type: "search_as_you_type"
              max_shingle_size: 3
      languages: ["english"]

services:
  document_search:
    type: search_service
    configuration:
      service_name: "doc-search-${parameters.environment}"
      search_type: "vector_search"
      environment: "production"
      performance_tier: "balanced"
      access_level: "team_access"
      embedding_config:
        model_id: "amazon.titan-embed-text-v1"
        service: "bedrock"
        batch_size: 25
        auto_vectorize: true
      indices:
        - name: "documents"
          fields:
            - name: "content"
              type: "text"
              analyzer: "english"
            - name: "title"
              type: "text"
              analyzer: "english"
            - name: "embedding"
              type: "vector"
              dimensions: 1536
              similarity_function: "cosine"
            - name: "document_type"
              type: "keyword"
              facetable: true
            - name: "created_date"
              type: "date"
              facetable: true
      languages: ["english"]

services:
  hybrid_search:
    type: search_service
    configuration:
      service_name: "hybrid-search-${parameters.environment}"
      search_type: "hybrid_search"
      environment: "production"
      performance_tier: "high_performance"
      access_level: "public"
      embedding_config:
        model_id: "amazon.titan-embed-text-v1"
        service: "bedrock"
        batch_size: 50
        auto_vectorize: true
      indices:
        - name: "content"
          fields:
            - name: "title"
              type: "text"
              analyzer: "english"
              searchable: true
            - name: "content"
              type: "text"
              analyzer: "english"
              searchable: true
            - name: "embedding"
              type: "vector"
              dimensions: 1536
              similarity_function: "cosine"
            - name: "tags"
              type: "keyword"
              facetable: true
            - name: "category"
              type: "keyword"
              facetable: true
      languages: ["english", "spanish"]
Business-Oriented Search Types
- full_text_search - Optimized for text search and analysis with advanced text processing
- vector_search - Optimized for semantic/vector search with embedding integration
- product_search - Optimized for e-commerce search with faceting and filtering
- log_search - Optimized for time-series log analysis and monitoring
- document_search - Optimized for document content search and retrieval
- hybrid_search - Combines full-text and vector search for comprehensive results
Configuration Parameters
Basic Configuration
- search_type (string, required) - Business-oriented search type (see above)
- service_name (string, required) - Search service name (1-32 characters, lowercase, alphanumeric and hyphens)
Service Configuration
- access_level (string, optional) - Access level ("public", "private", "team_access")
- data_sources (list, optional) - Data ingestion source configurations
- embedding_config (map, optional) - Embedding model configuration for vector search
- environment (string, optional) - Deployment environment ("development", "staging", "production")
- indices (list, optional) - Index configurations with field definitions
- languages (list, optional) - Supported languages for text analysis
- performance_tier (string, optional) - Performance tier ("development", "balanced", "high_performance", "cost_optimized")
Field Types
- text - Full-text searchable content with analyzers
- keyword - Exact match, faceting, and filtering
- number - Numeric values for range queries and faceting
- date - Date/timestamp fields for temporal filtering
- vector - Vector embeddings for semantic search
- search_as_you_type - Auto-complete and search suggestions
Performance Tiers
- development - Cost-optimized for development and testing
- balanced - Balanced performance and cost for most use cases
- high_performance - Performance-optimized for production workloads
- cost_optimized - Minimum cost configuration for light usage
Access Levels
- public - Public internet access with authentication
- private - Private VPC access only
- team_access - Team-based access control with IAM integration
Embedding Configuration
- model_id - Embedding model identifier (e.g., "amazon.titan-embed-text-v1")
- service - Embedding service ("bedrock", "openai", "huggingface")
- batch_size - Batch size for embedding generation (1-100)
- timeout_seconds - Request timeout for embedding service
- auto_vectorize - Automatically generate embeddings for text fields
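The embedding configuration above can be pictured at runtime roughly as follows: text fields are grouped into chunks of `batch_size`, then each text is embedded via the configured service. The batching helper is generic; the `invoke_model` call shown is the standard Bedrock API for `amazon.titan-embed-text-v1` (ModelKnife's actual internals may differ).

```python
# Sketch of auto_vectorize behavior: batch texts by batch_size, then
# request embeddings from Amazon Bedrock. Titan text embeddings accept
# one inputText per request, so embed_batch loops within each batch.
import json

def batched(texts, batch_size):
    """Yield successive chunks no larger than batch_size (1-100)."""
    for i in range(0, len(texts), batch_size):
        yield texts[i:i + batch_size]

def embed_batch(client, texts, model_id="amazon.titan-embed-text-v1"):
    """Return one embedding vector per input text."""
    vectors = []
    for text in texts:
        resp = client.invoke_model(
            modelId=model_id,
            body=json.dumps({"inputText": text}),
        )
        vectors.append(json.loads(resp["body"].read())["embedding"])
    return vectors

# Usage (requires AWS credentials; left commented so the sketch stays runnable):
# client = boto3.client("bedrock-runtime")
# for chunk in batched(documents, batch_size=25):
#     vectors = embed_batch(client, chunk)
```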
Implementation Notes
- Search service automatically selects optimal infrastructure (serverless vs managed) based on configuration
- Vector search types require embedding_config for automatic vectorization
- Field configurations determine index mappings and query capabilities
- Performance tiers affect underlying infrastructure provisioning and costs
- Access levels control network access patterns and authentication requirements
- Multi-language support affects text analyzers and tokenization strategies
- Data sources enable automatic ingestion and transformation pipelines
API Gateway (REST)
REST API management for serverless applications and microservices
Recommendation: Use API Gateway v2 (HTTP APIs)
For new projects, we recommend API Gateway v2 (HTTP APIs) instead of REST APIs. HTTP APIs are up to 70% cheaper and 60% faster, and they use the same configuration format. Simply use type: api_gateway_v2 instead of type: api_gateway.
services:
  ml_api:
    type: api_gateway
    configuration:
      api_name: "ml-api-${parameters.environment}"
      stage_name: "prod"
      resources:
        - path: "/predict"
          method: "POST"
          integration_type: "lambda"
          lambda_function: "${services.inference_lambda.function_name}"
        - path: "/health"
          method: "GET"
          integration_type: "mock"
Configuration Parameters
Basic Configuration
- api_name (string, optional) - API Gateway REST API name
- stage_name (string, optional) - Deployment stage name (e.g., dev, prod)
Service Configuration
- resources (list, optional) - API resource definitions with paths and methods
Network Configuration
- cors_enabled (boolean, optional) - Enable CORS for cross-origin requests
API Gateway v2 (HTTP APIs)
Modern HTTP APIs with JWT/REQUEST authorizers, 70% cheaper and 60% faster than REST APIs
services:
  my_api:
    type: api_gateway_v2
    configuration:
      api_name: "my-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        # Lambda integration
        - path: "/users"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
        # HTTP integration (public endpoints)
        - path: "/external"
          methods: ["GET"]
          integration_type: "http"
          integration_uri: "https://api.external-service.com/data"
        # Mock integration (testing)
        - path: "/health"
          methods: ["GET"]
          integration_type: "mock"
          mock_response:
            status_code: 200
            response_body: '{"status": "healthy"}'
Private Integration
The integration_type: "private" option provides a simplified and more intuitive way to configure private resource integrations. It automatically handles path extraction, parameter conversion, and VPC Link setup, reducing configuration complexity and errors.
Why Private Integration?
Connect API Gateway to internal services (ECS, EKS, EC2) without exposing them to the internet. Ideal for security, compliance, and enterprise architectures requiring private backend connectivity.
services:
  enrichment_api:
    type: api_gateway_v2
    configuration:
      api_name: "enrichment-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        - path: "/users"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/net/my-nlb/abcd1234/efgh5678#/v1/users"
          vpc_link_id: "abc123"
        - path: "/users/{id}"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "arn:aws:elasticloadbalancing:eu-west-1:123456789012:listener/net/my-nlb/abcd1234/efgh5678#/v1/users/{id}"
          vpc_link_id: "abc123"

services:
  internal_api:
    type: api_gateway_v2
    configuration:
      api_name: "internal-api-${parameters.environment}"
      stage_name: "dev"
      resources:
        - path: "/api/data"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "https://internal-service.vpc.local/v1/data"
          vpc_link_id: "abc123"
        - path: "/api/data/{category}"
          methods: ["GET"]
          integration_type: "private"
          integration_uri: "https://internal-service.vpc.local/v1/data/{category}"
          vpc_link_id: "abc123"
Private Integration URI Formats
Supported URI Formats
- ELB Listener ARN: arn:aws:elasticloadbalancing:region:account:listener/net/nlb-name/id/listener-id#/path
- Cloud Map Service ARN: arn:aws:servicediscovery:region:account:service/service-id#/path
- HTTP/HTTPS URL: https://internal-service.vpc.local/path
- Path Parameters: Use {paramName} format (automatically converted to AWS format)
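The path-extraction rule behind these URI formats can be sketched simply: everything after `#` is the route path, and `{param}` placeholders become API Gateway path parameters. The function below is a hypothetical illustration of that split, not ModelKnife's actual parser.

```python
# Sketch: split a private integration URI of the form "<target>#<path>"
# into the backend target, the route path, and its path parameter names.
import re

def split_private_uri(integration_uri: str):
    """Return (target, path, param_names) for a private integration URI."""
    target, _, path = integration_uri.partition("#")
    params = re.findall(r"\{(\w+)\}", path)  # {id} -> "id"
    return target, path or "/", params

target, path, params = split_private_uri(
    "arn:aws:elasticloadbalancing:eu-west-1:123456789012:"
    "listener/net/my-nlb/abcd1234/efgh5678#/v1/users/{id}"
)
```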
Authentication
services:
  secure_api:
    type: api_gateway_v2
    configuration:
      api_name: "secure-api-${parameters.environment}"
      stage_name: "prod"
      authorizers:
        - name: "jwt-auth"
          type: "JWT"
          configuration:
            # Cognito User Pool
            issuer: "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_XXXXXXXXX"
            audience: ["your-app-client-id"]
            # Google OAuth (alternative)
            # issuer: "https://accounts.google.com"
            # audience: ["your-google-client-id.apps.googleusercontent.com"]
            # Auth0 (alternative)
            # issuer: "https://your-domain.auth0.com/"
            # audience: ["your-auth0-api-identifier"]
      resources:
        - path: "/protected"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
          authorizer: "jwt-auth"

services:
  auth_api:
    type: api_gateway_v2
    configuration:
      api_name: "auth-api-${parameters.environment}"
      stage_name: "prod"
      authorizers:
        - name: "custom-auth"
          type: "REQUEST"
          configuration:
            authorizer_uri: "${services.auth_lambda.function_arn}"
            identity_sources: ["$request.header.Authorization"]
            authorizer_result_ttl_in_seconds: 300
            # Simplified format (alternative)
            # lambda_function: "custom-authorizer"
            # cache_ttl: 300
            # identity_sources: ["$request.header.Authorization", "$request.header.X-API-Key"]
      resources:
        - path: "/protected"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
          authorizer: "custom-auth"
Advanced Features
services:
  production_api:
    type: api_gateway_v2
    configuration:
      api_name: "production-api-${parameters.environment}"
      stage_name: "prod"
      # CORS configuration
      cors:
        allow_credentials: true
        allow_headers: ["Content-Type", "Authorization"]
        allow_methods: ["GET", "POST", "PUT", "DELETE"]
        allow_origins: ["https://myapp.com"]
        max_age: 3600
      # Rate limiting
      throttling:
        burst_limit: 5000
        rate_limit: 2000.0
      resources:
        - path: "/api/{proxy+}"
          methods: ["ANY"]
          integration_type: "lambda"
          lambda_function: "${services.api_lambda.function_name}"
Configuration Reference
- api_name - HTTP API name (supports variables)
- stage_name - Deployment stage (default: "dev")
- description - API description
- resources - Route definitions with paths and methods
- authorizers - JWT and REQUEST authorizer definitions
- cors - Cross-origin resource sharing settings
- throttling - Rate limiting configuration
Integration Types
HTTP API v2 supports four integration types for connecting routes to backend services:
- lambda - AWS Lambda proxy integration for serverless functions
  integration_type: "lambda"
  lambda_function: "my-function-name"
- http - HTTP proxy integration for public HTTP/HTTPS endpoints
  integration_type: "http"
  integration_uri: "https://api.external.com/endpoint"
- private - Simplified private resource integration
  integration_type: "private"
  integration_uri: "arn:aws:elasticloadbalancing:region:account:listener/net/nlb/id/listener-id#/v1/path/{id}"
  vpc_link_id: "vpc-link-id"
- mock - Mock responses for testing and prototyping
  integration_type: "mock"
  mock_response:
    status_code: 200
    response_body: '{"status": "ok"}'
Recommendation: Use Private Integration for VPC Resources
For private resources behind VPC Links, use integration_type: "private" instead of integration_type: "http". The private integration type automatically handles path extraction, parameter conversion, and VPC Link configuration.
Authorizer Types
- JWT Authorizers: Support Cognito User Pools and OIDC providers (Auth0, Google, etc.)
- REQUEST Authorizers: Custom Lambda-based authorization with configurable identity sources
- Route-level Authorization: Apply different authorizers to different routes
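For REQUEST authorizers, the Lambda function receives the configured identity sources and returns an authorization decision. The sketch below uses the HTTP API v2 "simple response" format (`{"isAuthorized": bool}`); the header name and key value are hypothetical, and a real deployment should compare against a secret store, not a constant.

```python
# Minimal sketch of a Lambda REQUEST authorizer for HTTP APIs (simple
# response format). HTTP APIs lowercase header names in the event.
EXPECTED_KEY = "example-secret"  # illustrative only

def lambda_handler(event, context):
    # identity_sources such as $request.header.X-API-Key arrive in headers
    api_key = (event.get("headers") or {}).get("x-api-key", "")
    return {
        "isAuthorized": api_key == EXPECTED_KEY,
        # Optional context is forwarded to the integration
        "context": {"principal": "api-key-client"},
    }

decision = lambda_handler({"headers": {"x-api-key": "example-secret"}}, None)
```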
CORS & Throttling
- cors.allow_credentials - Allow cookies and authorization headers
- cors.allow_headers/methods/origins - Configure allowed headers, methods, and origins
- throttling.burst_limit - Maximum concurrent requests
- throttling.rate_limit - Steady-state requests per second
Key Benefits
- Cost: Up to 70% cheaper than REST APIs
- Performance: Up to 60% faster response times
- Configuration: Simple and intuitive resource definitions
- Features: Modern JWT/REQUEST authorizers, enhanced CORS, better throttling
services:
  complete_api:
    type: api_gateway_v2
    configuration:
      api_name: "complete-api-${parameters.environment}"
      stage_name: "prod"
      description: "Complete HTTP API with all features"
      # JWT and REQUEST authorizers
      authorizers:
        - name: "jwt-auth"
          type: "JWT"
          configuration:
            issuer: "https://cognito-idp.us-east-1.amazonaws.com/us-east-1_XXXXXXXXX"
            audience: ["client-id"]
        - name: "custom-auth"
          type: "REQUEST"
          configuration:
            authorizer_uri: "${services.auth_lambda.function_arn}"
            identity_sources: ["$request.header.X-API-Key"]
            authorizer_result_ttl_in_seconds: 300
      # CORS configuration
      cors:
        allow_credentials: true
        allow_headers: ["Content-Type", "Authorization"]
        allow_methods: ["GET", "POST", "PUT", "DELETE"]
        allow_origins: ["https://myapp.com"]
        max_age: 3600
      # Throttling limits
      throttling:
        burst_limit: 5000
        rate_limit: 2000.0
      resources:
        # Public endpoints
        - path: "/health"
          methods: ["GET"]
          integration_type: "mock"
          mock_response:
            status_code: 200
            response_body: '{"status": "healthy"}'
        # JWT protected endpoints
        - path: "/users"
          methods: ["GET", "POST"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
          authorizer: "jwt-auth"
        - path: "/users/{id}"
          methods: ["GET", "PUT", "DELETE"]
          integration_type: "lambda"
          lambda_function: "${services.user_lambda.function_name}"
          authorizer: "jwt-auth"
        # Custom auth protected endpoints
        - path: "/admin/stats"
          methods: ["GET"]
          integration_type: "lambda"
          lambda_function: "${services.admin_lambda.function_name}"
          authorizer: "custom-auth"
        # HTTP proxy integration
        - path: "/external/{proxy+}"
          methods: ["GET", "POST"]
          integration_type: "http"
          integration_uri: "https://api.external.com/{proxy}"
      # Stage-specific configuration
      stage_configuration:
        auto_deploy: true
        throttling_burst_limit: 4000
        throttling_rate_limit: 1500.0
Configuration Options
Resource Configuration: Define API endpoints using the resources format with paths, methods, and integrations.
Route-based Configuration: HTTP APIs also support the native routes format for advanced use cases.
Kinesis Stream
Real-time data streaming for event processing and analytics
Real-time Data Streaming
Kinesis Data Streams enables real-time processing of streaming data at massive scale with automatic scaling and built-in durability.
services:
  event_stream:
    type: kinesis_stream
    configuration:
      stream_name: "events-${parameters.environment}"
      shard_count: 2
      retention_period: 48

services:
  encrypted_stream:
    type: kinesis_stream
    configuration:
      stream_name: "secure-events-${parameters.environment}"
      shard_count: 4
      retention_period: 168 # 7 days
      encryption_type: "KMS"
      kms_key_id: "alias/aws/kinesis"
Configuration Parameters
Basic Configuration
- encryption_type (string, optional) - Encryption type ("KMS" for server-side encryption)
- stream_name (string, optional) - Kinesis stream name (supports environment variables)
Service Configuration
- retention_period (integer, optional) - Data retention in hours (24-8760)
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- shard_count (integer, optional) - Number of shards (1-1000), affects throughput capacity
Security Configuration
- kms_key_id (string, optional) - KMS key for encryption (alias or key ID)
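A producer writing to a stream defined above typically uses the standard Kinesis `put_record` API. The helper below builds the call arguments; the stream name and payload are illustrative. The partition key determines shard placement, so a high-cardinality key (such as a user ID) spreads load evenly across shards.

```python
# Sketch: build put_record arguments for a Kinesis stream. The actual
# client call is left commented since it requires AWS credentials.
import json

def make_record(stream_name, payload: dict, partition_key: str):
    """Build kwargs for kinesis.put_record from a JSON-serializable payload."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(payload).encode("utf-8"),  # Kinesis Data is bytes
        "PartitionKey": partition_key,                # determines the shard
    }

record = make_record("events-dev", {"event_type": "click", "user_id": "u-42"}, "u-42")
# kinesis = boto3.client("kinesis")
# kinesis.put_record(**record)
```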
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- stream_name: 1-128 characters; alphanumeric, hyphens, underscores, and periods only. Examples: event-stream, user_activity, log.stream
- shard_count: 1-1000 shards. Examples: 1, 5, 100
Common Validation Errors
- Parameter name contains invalid characters: Use only alphanumeric characters, hyphens, underscores, and periods. Example: Change "my bucket" to "my-bucket"
- Parameter value exceeds maximum length: Reduce the parameter value to within the allowed range. Example: Function names must be 64 characters or less
- Service reference not found: Ensure the referenced service exists and is properly named. Example: Check that ${services.my-service.outputs.arn} references an existing service
IAM Integration
- Default: Uses global kinesis_stream_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role
- Validation: Custom roles are validated for correct permissions
Kinesis Firehose
Managed data delivery service for streaming data to data lakes and analytics services
Data Delivery Modes
Kinesis Data Firehose supports two delivery modes: Direct PUT (applications write directly to Firehose) and Kinesis Data Streams as source (Firehose reads from an existing stream). Both modes support buffering, compression, and format conversion.
Source Configuration Options
For stream-to-S3 delivery, you can specify the source Kinesis stream using either: source_service (recommended - references service name) or source_stream_arn (direct ARN reference). The source_service approach is more maintainable and follows ModelKnife service reference patterns.
services:
  data_pipeline:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "events-firehose-${parameters.environment}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "events/"
      buffer_size: 5
      buffer_interval: 300
      compression_format: "GZIP"

services:
  stream_to_s3:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "stream-to-s3-${parameters.environment}"
      source_service: "event_stream" # Reference to Kinesis stream service
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "processed-events/"
      buffer_size: 10
      buffer_interval: 60
      compression_format: "GZIP"

services:
  stream_to_s3_alt:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "stream-alt-${parameters.environment}"
      source_stream_arn: "${services.event_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "processed-events/"
      buffer_size: 10
      buffer_interval: 60
      compression_format: "GZIP"

services:
  advanced_firehose:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "advanced-firehose-${parameters.environment}"
      source_stream_arn: "${services.event_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "events/year=!{timestamp:yyyy}/month=!{timestamp:MM}/day=!{timestamp:dd}/"
      error_output_prefix: "errors/"
      buffer_size: 64
      buffer_interval: 300
      compression_format: "GZIP"
      # Convert JSON to Parquet format
      data_format_conversion:
        enabled: true
        output_format: "parquet"
        schema_database: "${services.analytics_db.outputs.database_name}"
        schema_table: "events"
      # Custom IAM role for cross-account access
      role_arn: "arn:aws:iam::123456789012:role/FirehoseDeliveryRole"

services:
  high_throughput_firehose:
    type: kinesis_firehose
    configuration:
      delivery_stream_name: "high-throughput-${parameters.environment}"
      source_stream_arn: "${services.high_volume_stream.outputs.stream_arn}"
      destination_s3_bucket: "${services.data_lake.outputs.bucket_name}"
      destination_s3_prefix: "high-volume-data/"
      # Optimize for high throughput
      buffer_size: 128 # Maximum buffer size
      buffer_interval: 60 # Minimum buffer interval
      compression_format: "GZIP"
      tags:
        Environment: "${parameters.environment}"
        DataType: "HighVolume"
        CostCenter: "Analytics"
Configuration Parameters
Basic Configuration
- delivery_stream_name (string, optional) - Firehose delivery stream name (1-64 characters, alphanumeric, hyphens, underscores)
Service Configuration
- buffer_interval (integer, optional) - Buffer interval in seconds (60-900, default: 60)
- buffer_size (integer, optional) - Buffer size in MB (1-128, default: 64)
- compression_format (string, optional) - Data compression format (GZIP, ZIP, Snappy, HADOOP_SNAPPY, UNCOMPRESSED, default: GZIP)
- data_format_conversion (map, optional) - Convert data format (JSON to Parquet/ORC)
- destination_s3_bucket (string, required) - Target S3 bucket for data delivery
- destination_s3_prefix (string, optional) - S3 key prefix for data organization (supports dynamic partitioning)
- error_output_prefix (string, optional) - S3 prefix for error records
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- source_service (string, optional) - Source Kinesis stream service name (alternative to source_stream_arn)
- source_stream_arn (string, optional) - Source Kinesis stream ARN (alternative to source_service, for stream-to-S3 mode)
Advanced Options
- tags (map, optional) - Resource tags
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- delivery_stream_name: 1-64 characters; alphanumeric, hyphens, underscores, and periods only. Examples: events-firehose, stream-to-s3-prod
- buffer_size: 1-128 MB. Examples: 5, 64, 128
- buffer_interval: 60-900 seconds. Examples: 60, 300, 900
Data Format Conversion
- Supported Formats: Convert JSON input to Parquet or ORC format
- Schema Integration: Uses AWS Glue Data Catalog for schema information
- Configuration:
  - enabled - Enable format conversion (true/false)
  - output_format - Target format ("parquet" or "orc")
  - schema_database - Glue database name for schema
  - schema_table - Glue table name for schema
- Benefits: Improved query performance and reduced storage costs
Dynamic Partitioning
- Time-based Partitioning: Use timestamp expressions in S3 prefix
  - year=!{timestamp:yyyy}/month=!{timestamp:MM}/ - Year/month partitions
  - !{timestamp:yyyy/MM/dd/HH}/ - Hourly partitions
- Content-based Partitioning: Extract values from record content
  - event_type=!{partitionKeyFromQuery:event_type}/ - Partition by event type
- Performance: Improves query performance by reducing data scanned
Buffering and Delivery
- Buffer Size: Amount of data (1-128 MB) to buffer before delivery
- Buffer Interval: Maximum time (60-900 seconds) to wait before delivery
- Delivery Trigger: Whichever condition is met first triggers delivery
- Optimization: Larger buffers reduce S3 PUT costs but increase latency
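The first-condition-wins rule can be made concrete with a small worked example: at a steady ingest rate, whichever of buffer size or buffer interval is reached first triggers delivery. This sketch estimates the trigger for illustration; real ingest rates are rarely constant.

```python
# Worked example of Firehose buffering: given a steady ingest rate,
# determine whether buffer_size or buffer_interval triggers the flush.
def flush_trigger(ingest_mb_per_s: float, buffer_size_mb: int, buffer_interval_s: int):
    """Return (trigger, seconds_until_flush) for a steady ingest rate."""
    seconds_to_fill = (
        buffer_size_mb / ingest_mb_per_s if ingest_mb_per_s else float("inf")
    )
    if seconds_to_fill <= buffer_interval_s:
        return "buffer_size", seconds_to_fill
    return "buffer_interval", float(buffer_interval_s)

# At 2 MB/s, a 64 MB buffer fills in 32 s, well before a 300 s interval.
trigger, wait = flush_trigger(2.0, 64, 300)
```

This is why high-throughput streams favor large buffers (fewer, bigger S3 objects), while low-throughput streams are governed mostly by the interval.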
Compression Options
- GZIP: Good compression ratio, widely supported (recommended)
- ZIP: Compatible with many tools, moderate compression
- Snappy: Fast compression/decompression, lower ratio
- HADOOP_SNAPPY: Hadoop-compatible Snappy format
- UNCOMPRESSED: No compression, fastest delivery
Delivery Modes
- Direct PUT: Applications write directly to Firehose (no source configuration needed)
- Kinesis Data Streams Source: Firehose reads from an existing stream
  - source_service - Reference to Kinesis stream service name (recommended)
  - source_stream_arn - Direct ARN reference (alternative approach)
- Use Cases: Direct PUT for simple ingestion, stream source for complex processing pipelines
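For the Direct PUT mode, applications call the Firehose API themselves. The standard `put_record_batch` call accepts at most 500 records per request, so a chunking helper keeps batches within the limit; the delivery stream name below is illustrative.

```python
# Sketch: prepare Direct PUT batches for firehose.put_record_batch,
# which accepts at most 500 records per call. A trailing newline per
# record keeps JSON records separable in the delivered S3 objects.
import json

def to_batches(events, max_records=500):
    """Convert events to Firehose Record dicts, chunked to the API limit."""
    records = [{"Data": (json.dumps(e) + "\n").encode("utf-8")} for e in events]
    return [records[i:i + max_records] for i in range(0, len(records), max_records)]

batches = to_batches([{"id": n} for n in range(1200)])
# firehose = boto3.client("firehose")
# for batch in batches:
#     firehose.put_record_batch(DeliveryStreamName="events-firehose-dev", Records=batch)
```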
Error Handling
- Error Records: Failed records are delivered to error output prefix
- Retry Logic: Automatic retries for transient failures
- Monitoring: CloudWatch metrics for delivery success/failure rates
IAM Integration
- Default: Uses global kinesis_firehose_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role
- Permissions: Role needs S3 write permissions and Kinesis read permissions (if using source stream)
- Cross-Account: Custom roles enable cross-account S3 delivery
EventBridge Pipe
Event-driven integrations with filtering, transformation, and routing
Simplified Service Configuration
EventBridge Pipes support both simplified service name references (source_service, target_service) and direct ARN specification. Service names are automatically resolved to ARNs during deployment, following ModelKnife's "less configuration" principle.
services:
  event_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "event-processor-${parameters.environment}"
      source_service: "event_stream"
      target_service: "processor_lambda"
      description: "Process events from stream to Lambda"

services:
  external_pipe:
    type: eventbridge_pipe
    configuration:
      pipe_name: "external-pipe-${parameters.environment}"
      source_arn: "arn:aws:kinesis:us-east-1:123456789012:stream/external-stream"
      target_arn: "arn:aws:lambda:us-east-1:123456789012:function:external-processor"
      description: "Process events from external Kinesis stream"

services:
  filtered_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "filtered-processor-${parameters.environment}"
      source_service: "event_stream"
      target_service: "processor_lambda"
      description: "Process only high-priority events"
      source_parameters:
        kinesis_stream_parameters:
          batch_size: 10
          starting_position: "LATEST"
          maximum_batching_window_in_seconds: 5
      filter_criteria:
        filters:
          - pattern: '{"event_type": ["click", "purchase", "signup"]}'
          - pattern: '{"priority": ["high", "critical"]}'

services:
  sqs_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "sqs-processor-${parameters.environment}"
      source_service: "message_queue"
      target_service: "message_processor"
      description: "Process SQS messages asynchronously"
      source_parameters:
        sqs_queue_parameters:
          batch_size: 5
          maximum_batching_window_in_seconds: 10
      target_parameters:
        lambda_function_parameters:
          invocation_type: "FIRE_AND_FORGET"

services:
  # Primary processing pipe
  db_stream_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "db-stream-processor-${parameters.environment}"
      source_arn: "${services.user_table.outputs.stream_arn}"
      target_service: "user_change_processor"
      description: "Process user table changes"
      source_parameters:
        dynamodb_stream_parameters:
          batch_size: 5
          starting_position: "LATEST"
      filter_criteria:
        filters:
          - pattern: '{"eventName": ["INSERT", "MODIFY"]}'
  # Analytics pipe
  db_analytics_pipe:
    type: eventbridge_pipe
    configuration:
      pipe_name: "db-analytics-pipe-${parameters.environment}"
      source_arn: "${services.user_table.outputs.stream_arn}"
      target_service: "analytics_processor"
      description: "Send user changes to analytics"
      filter_criteria:
        filters:
          - pattern: '{"eventName": ["INSERT", "REMOVE"]}'

services:
  # Multi-stage processing with enrichment
  enriched_processor:
    type: eventbridge_pipe
    configuration:
      pipe_name: "enriched-processor-${parameters.environment}"
      source_service: "raw_events_stream"
      target_service: "enriched_processor_lambda"
      description: "Enrich and process raw events"
      source_parameters:
        kinesis_stream_parameters:
          batch_size: 25
          starting_position: "TRIM_HORIZON"
          maximum_batching_window_in_seconds: 5
      target_parameters:
        lambda_function_parameters:
          invocation_type: "REQUEST_RESPONSE"
      filter_criteria:
        filters:
          - pattern: '{"source": ["web", "mobile"]}'
          - pattern: '{"event_type": {"exists": true}}'
          - pattern: '{"user_id": {"exists": true}}'
      # Custom IAM role for cross-account access
      role_arn: "arn:aws:iam::123456789012:role/EventBridgePipeRole"
Configuration Parameters
Basic Configuration
- pipe_name (string, optional) - EventBridge pipe name (1-64 characters, alphanumeric, periods, hyphens, underscores)
Service Configuration
- description (string, optional) - Pipe description
- enrichment (map, optional) - Event enrichment configuration
- filter_criteria (map, optional) - Event filtering patterns using JSON pattern matching
- role_arn (string, optional) - Custom IAM role ARN (uses global default if not specified)
- source_arn (string, optional) - Source ARN (alternative to source_service, for external or direct ARN specification)
- source_parameters (map, optional) - Source-specific configuration parameters
- source_service (string, optional) - Name of source service (simplified approach, mutually exclusive with source_arn)
- target_arn (string, optional) - Target ARN (alternative to target_service, for external or direct ARN specification)
- target_parameters (map, optional) - Target-specific configuration parameters
- target_service (string, optional) - Name of target service (simplified approach, mutually exclusive with target_arn)
Advanced Options
- tags (map, optional) - Resource tags
Parameter Validation
Validation Rules: Parameters are validated according to AWS service limits and naming conventions.
- pipe_name: 1-64 characters; alphanumeric, periods, hyphens, and underscores only. Examples: event-processor, db-stream-processor, sqs_processor
- source_service/target_service: Must reference services defined in your configuration, and are mutually exclusive with source_arn/target_arn
Source Parameters
- Kinesis Stream Parameters:
  - batch_size - Number of records per batch (1-10000, default varies by source)
  - starting_position - Where to start reading (TRIM_HORIZON, LATEST, AT_TIMESTAMP)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch (0-300 seconds)
- DynamoDB Stream Parameters:
  - batch_size - Number of records per batch (1-1000)
  - starting_position - Where to start reading (TRIM_HORIZON, LATEST)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch
- SQS Parameters:
  - batch_size - Number of messages per batch (1-10)
  - maximum_batching_window_in_seconds - Maximum time to wait for batch
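As a sketch of how the Kinesis source options could be supplied (the service name and surrounding pipe definition are hypothetical):

```yaml
source_service: "order_stream"
source_parameters:
  batch_size: 100                         # 1-10000 for Kinesis
  starting_position: "LATEST"             # or TRIM_HORIZON, AT_TIMESTAMP
  maximum_batching_window_in_seconds: 30  # wait up to 30s to fill a batch
```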
Target Parameters
- Lambda Function Parameters:
  - invocation_type - How to invoke Lambda (REQUEST_RESPONSE, FIRE_AND_FORGET)
- SQS Parameters:
  - message_group_id - Message group ID for FIFO queues
  - message_deduplication_id - Deduplication ID for FIFO queues
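For example, a FIFO queue target might carry a message group ID; the queue name here is a hypothetical placeholder, and the key casing is assumed to follow the list above:

```yaml
target_service: "order_queue"  # a FIFO queue defined elsewhere in the configuration
target_parameters:
  message_group_id: "orders"   # preserves ordering within the group
```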
Event Sources & Targets
- Supported Sources:
  - Kinesis Data Streams
  - Amazon SQS queues
  - DynamoDB Streams
- Supported Targets:
  - AWS Lambda functions
  - Amazon SQS queues
  - Amazon SNS topics
  - EventBridge event buses
  - Kinesis Data Streams
  - Kinesis Data Firehose
  - AWS Step Functions state machines
Event Filtering
- JSON Pattern Matching: Use JSON patterns to filter events based on content
- Multiple Filters: Apply multiple filter patterns (OR logic between filters)
- Pattern Examples:
  - {"event_type": ["purchase", "signup"]} - Match specific event types
  - {"amount": [{"numeric": [">", 100]}]} - Numeric comparisons
  - {"user_id": [{"exists": true}]} - Check field existence
  - {"source": [{"prefix": "web-"}]} - String prefix matching
- Performance: Filtering reduces downstream processing and costs
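A sketch of attaching such patterns through filter_criteria; the nesting under filters/pattern mirrors the AWS FilterCriteria shape and is an assumption about ModelKnife's YAML, not confirmed by this reference:

```yaml
filter_criteria:
  filters:                                              # assumed key names
    - pattern: '{"event_type": ["purchase", "signup"]}'
    - pattern: '{"amount": [{"numeric": [">", 100]}]}'  # OR logic between filters
```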
Simplified Configuration Benefits
- Service Names: Use source_service and target_service instead of complex ARNs
- Automatic ARN Resolution: Deployer automatically resolves service names to ARNs during deployment
- Convention over Configuration: Follows ModelKnife's principle of minimal configuration
- Flexibility: Can still use source_arn and target_arn for external resources or direct ARN specification
- Validation: Service names are validated against defined services in the same configuration
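The two styles side by side, as a sketch (service names, account IDs, and ARNs are placeholders):

```yaml
# Simplified: names of services defined in the same configuration
source_service: "order_stream"
target_service: "order_processor"

# Direct ARNs: for resources living outside this configuration
# source_arn: "arn:aws:kinesis:us-east-1:123456789012:stream/external-stream"
# target_arn: "arn:aws:lambda:us-east-1:123456789012:function:external-fn"
```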
IAM Integration
- Default: Uses global eventbridge_pipe_role with appropriate permissions
- Custom: Provide role_arn parameter for custom IAM role (useful for cross-account access)
- Permissions: Role needs permissions to read from source and write to target
- Validation: Custom roles are validated for correct permissions during deployment
Lambda Service Dependency Management
Installing Python dependencies for Lambda functions with native binaries
Overview
ModelKnife provides optimized dependency management for Lambda functions, especially for packages with native binaries like pandas, numpy, and scikit-learn that require Linux-compatible compilation.
Native Binary Challenge
Pure pip installation fails for native packages because they contain binaries compiled for your host OS (macOS/Windows). AWS Lambda requires Linux-compatible binaries. ModelKnife solves this with Docker-based builds using official Lambda base images.
Configuration Parameters
services:
  iris_inference:
    type: lambda_function
    repository: "./src"
    handler: iris_inference_with_model.lambda_handler
    runtime: python3.9
    code_path: "lambda_functions/"
    requirements_file: "requirements.txt"
    # Dependency build configuration
    build_layer: true        # Create separate dependency layer
    build_strategy: "auto"   # auto|local|docker
    # Lambda configuration
    timeout: 300
    memory_size: 512
    environment_variables:
      MODEL_S3_PATH: "s3://my-bucket/models/"
Build Configuration Options
build_layer
- true - Package dependencies as a Lambda Layer (recommended)
- false - Bundle dependencies with function code
- Benefits: Faster deployments, shared across functions, 50MB+ size limit
build_strategy
- auto - Auto-detect based on requirements (default)
- local - Use local pip (pure Python packages only)
- docker - Force Docker build (native binaries)
Build Strategy Details
AUTO Strategy (Recommended)
Automatically detects whether your requirements contain native packages and chooses the appropriate build method:
- Pure Python packages: Uses fast local pip installation
- Native packages detected: Switches to Docker build automatically
- Known native packages: pandas, numpy, scikit-learn, scipy, pillow, lxml, psycopg2
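Complementing the LOCAL and DOCKER examples that follow, an AUTO configuration needs no extra flags; the service and handler names here are hypothetical:

```yaml
# AUTO Strategy - ModelKnife inspects requirements.txt
services:
  mixed_lambda:
    type: lambda_function
    handler: app.main
    requirements_file: "requirements.txt"  # scanned for known native packages
    build_strategy: "auto"                 # local pip if pure Python, docker otherwise
```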
# LOCAL Strategy - Fast, pure Python only
services:
  simple_lambda:
    type: lambda_function
    handler: handler.main
    requirements_file: "requirements.txt"  # requests, boto3, etc.
    build_strategy: "local"                # Fast pip install

# DOCKER Strategy - Native binaries supported
services:
  ml_lambda:
    type: lambda_function
    handler: inference.predict
    requirements_file: "requirements.txt"  # pandas, numpy, sklearn
    build_strategy: "docker"               # Linux-compatible build
    build_layer: true                      # Recommended for large deps
Performance Optimizations
Smart Caching & Build Skipping
ModelKnife includes several optimizations to minimize build times:
- Build Hash Validation: Skips dependency builds when requirements.txt unchanged
- Layer Reuse: Checks AWS for existing layers before building
- Source-Only Updates: ~50% faster deployments when only function code changes
- Incremental Builds: Separate source and dependency lifecycles
# First deployment - full build
🔧 Building dependencies with DOCKER strategy
📦 Creating Lambda layer ZIP: 42.3 MB
🚀 Publishing layer to AWS: mlknife-layer-76daf7e3
✅ Function deployed: 35.2 seconds
# Source-only change - optimized
🚀 Skipping dependency ZIP build - using existing layer
🎯 Layer already has ARN, skipping layer publishing
✅ Function updated: 16.1 seconds (50% faster!)
Docker Build Process
Docker Requirements
Docker build strategy requires Docker installed and running on your system. ModelKnife uses official AWS Lambda base images to ensure compatibility.
# ModelKnife runs these commands automatically:
docker run --platform linux/amd64 \
--entrypoint "" \
-v /local/requirements:/var/task \
-v /output:/var/runtime \
public.ecr.aws/lambda/python:3.9 \
pip install -r /var/task/requirements.txt -t /var/runtime
# Benefits:
# ✅ Linux-compatible binaries
# ✅ Exact Lambda runtime environment
# ✅ Handles complex native dependencies
# ✅ Automatic platform detection (arm64/amd64)
Common Use Cases
🔬 ML Inference Functions
pandas==2.0.3
numpy==1.24.3
scikit-learn==1.3.0
joblib==1.3.1
Strategy: AUTO (→ DOCKER)
Layer: Recommended
Build Time: ~45-60s first, ~15s updates
⚡ API Gateway Functions
requests==2.31.0
boto3==1.28.17
pydantic==2.1.1
Strategy: AUTO (→ LOCAL)
Layer: Optional
Build Time: ~10-15s first, ~8s updates
Requirements File Best Practices
requirements.txt Tips
- Pin versions: pandas==2.0.3 instead of pandas>=2.0
- Minimize dependencies: Only include packages you actually import
- Check compatibility: Ensure all packages support your Python runtime
- Consider alternatives: Use boto3 (already in Lambda) instead of requests when possible
Troubleshooting
Common Issues
- Docker not running: Ensure Docker is installed and started
- Platform mismatch: ModelKnife automatically handles ARM64 vs AMD64
- Large layers: Layers >250MB use S3 upload (automatic)
- Build failures: Check requirements.txt for version conflicts
# Deploy with detailed logging
mk s deploy --detail
# Check deployment logs
mk s status iris_inference
# Validate function
mk s test iris_inference
Layer Management
ModelKnife automatically manages Lambda layers:
- Naming: mlknife-layer-{hash} based on requirements content
- Versioning: New layer versions created only when requirements change
- Cleanup: Old layers remain until manually deleted (AWS best practice)
- Sharing: Same layer hash can be reused across multiple functions