GridTelligence Documentation

v1.0.0

GridTelligence provides secure AI inference for critical energy infrastructure through temporally isolated GPU sharing, dedicated instances, or on-premise hardware appliances. Access is provided through VPN-secured connections to ensure complete data sovereignty.

📚 What We Provide

  • Secure LLM Inference - Access to 20B and 120B parameter models
  • VPN-Only Access - Custom .ovpn profile for secure connectivity
  • Simple REST API - OpenAI-compatible endpoints
  • Flexible Deployment - Shared cloud, dedicated cloud, or on-premise box

Quick Start

Step 1: Connect to VPN

After contract signing, we provide you with a custom .ovpn profile for secure access to your dedicated VPC.

bash
# Connect using your organization's .ovpn profile
sudo openvpn --config your-company-gridtelligence.ovpn

# Verify connection is established
ping 10.8.0.1

Step 2: Test API Access

bash
# Test connectivity with your API key
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.gridtelligence.secure/v1/health

# Expected response
{"status":"healthy","model":"ready","tier":"shared-silver"}

Step 3: Make Your First Inference

python
import requests

# Set up headers with your API key
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Make inference request
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={
        "prompt": "Analyze voltage stability in a 138kV transmission system",
        "model": "20B",
        "temperature": 0.7,
        "max_tokens": 500
    }
)

# Fail fast on HTTP errors before reading the body
response.raise_for_status()
print(response.json()["response"])

Deployment Options

Shared Cloud GPU

Cost-effective option with temporal isolation for NERC CIP compliance.

  • Pricing: $200-1,200/month based on usage tier
  • Access: VPN + API key
  • Isolation: Complete temporal separation between customers
  • GPU Utilization: 90%+ through continuous processing

Dedicated Cloud GPU

Exclusive GPU instance for your organization.

  • Pricing: $2,000-5,000/month based on GPU size
  • Access: VPN + API key
  • Availability: 24/7 exclusive access
  • Customization: Model fine-tuning options available

On-Premise Hardware Box

Physical appliance deployed within your secure facility.

  • Pricing: One-time purchase (contact sales)
  • Access: Direct network connection within trusted zone
  • Connectivity: None required - completely air-gapped, no cloud or VPN
  • Updates: Delivered via secure physical media

Authentication

GridTelligence uses bearer token authentication. Your API key is provided after contract signing.

Using Your API Key

http
POST /v1/inference
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json

Python Example

python
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}

cURL Example

bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
     -H "Content-Type: application/json" \
     -X POST https://api.gridtelligence.secure/v1/inference \
     -d '{"prompt":"Your prompt here","model":"20B"}'

Inference API

The inference API provides access to GridTelligence's AI models for grid analysis and operations support.

Endpoint

http
POST https://api.gridtelligence.secure/v1/inference

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The input text for the model |
| model | string | No | "20B" or "120B" (default: "20B") |
| temperature | float | No | Sampling temperature, 0.0-1.0 (default: 0.7) |
| max_tokens | integer | No | Maximum tokens to generate (default: 500) |
| stream | boolean | No | Stream response tokens (default: false) |

Example Request

json
{
    "prompt": "Analyze the impact of a 50MW solar farm connection to a 138kV substation",
    "model": "20B",
    "temperature": 0.7,
    "max_tokens": 1000,
    "stream": false
}

Response Format

json
{
    "request_id": "req_abc123",
    "timestamp": "2025-10-15T14:30:00Z",
    "model": "20B",
    "response": "Analysis of 50MW solar farm connection:\n\n1. Voltage Impact...",
    "tokens_used": 342,
    "inference_time_ms": 127
}
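The fields above map cleanly onto a small typed record. The sketch below is illustrative only: the InferenceResult class and parse_result helper are not part of the API, just one way a client might structure the documented response body.

```python
# Sketch: parse the documented response body into a typed record.
# InferenceResult and parse_result are hypothetical client-side names.
from dataclasses import dataclass

@dataclass
class InferenceResult:
    request_id: str
    timestamp: str
    model: str
    response: str
    tokens_used: int
    inference_time_ms: int

def parse_result(body: dict) -> InferenceResult:
    # Field names match the Response Format example above
    return InferenceResult(**body)
```

A typed record makes downstream code (logging tokens_used, tracking inference_time_ms) less error-prone than passing raw dicts around.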

Streaming Responses

For real-time token streaming, set stream: true in your request:

python
import requests
import json

response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={"prompt": "...", "stream": True},
    stream=True
)

for line in response.iter_lines():
    if line:
        data = json.loads(line.decode('utf-8'))
        print(data["token"], end="")

Available Models

| Model | Parameters | Best For | Latency | Context Window |
|---|---|---|---|---|
| 20B | 20 billion | General analysis, routine operations | 50-100ms | 4,096 tokens |
| 120B MoE | 120 billion (Mixture of Experts) | Complex reasoning, critical decisions | 200-500ms | 32,768 tokens |

Model Selection

Choose the appropriate model based on your use case:

  • 20B Model: Routine SCADA analysis, alarm processing, standard reporting
  • 120B Model: Contingency analysis, root cause investigation, complex grid modeling
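As a rough illustration, this guidance could be encoded in a small routing helper. The task names and the select_model function below are illustrative assumptions, not part of the API:

```python
# Hypothetical routing helper mirroring the model selection guidance above.
# Task labels are illustrative, not an API contract.
COMPLEX_TASKS = {"contingency_analysis", "root_cause", "grid_modeling"}

def select_model(task: str) -> str:
    """Return "120B" for complex reasoning tasks, else the faster "20B"."""
    if task in COMPLEX_TASKS:
        return "120B"
    # Routine tasks (SCADA analysis, alarm processing, reporting) and any
    # unrecognized task default to the lower-latency 20B model
    return "20B"
```

Defaulting unknown tasks to the 20B model keeps latency and cost down; escalate to 120B only where the extra reasoning depth is needed.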

Rate Limits

Request Limits by Tier

| Tier | Requests/Min | Requests/Hour | Concurrent |
|---|---|---|---|
| Shared Bronze | 60 | 3,000 | 10 |
| Shared Silver | 120 | 6,000 | 25 |
| Shared Gold | 300 | 15,000 | 50 |
| Dedicated/On-Premise | Unlimited | Unlimited | Unlimited |

Rate Limit Headers

Every response includes rate limit information:

http
HTTP/1.1 200 OK
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1760457600
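Clients can use these headers to self-throttle before hitting the limit. The sketch below (seconds_until_reset is an illustrative helper, not part of any SDK) reads the documented headers from a response and computes how long to pause once the window is exhausted; a dict stands in for response.headers from the requests library:

```python
# Sketch: compute a client-side pause from the X-RateLimit-* headers.
# seconds_until_reset is a hypothetical helper; pass response.headers to it.
import time

def seconds_until_reset(headers, now=None):
    """Return seconds to wait when no requests remain in the window."""
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0  # budget left, no need to wait
    # X-RateLimit-Reset is a Unix timestamp marking the window reset
    reset_epoch = int(headers["X-RateLimit-Reset"])
    now = time.time() if now is None else now
    return max(0.0, reset_epoch - now)
```

Sleeping for the returned duration before the next request avoids 429 responses entirely rather than reacting to them after the fact.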

Troubleshooting

Common Issues

🔴 VPN Connection Failed

Error: "Unable to establish VPN connection"

Solutions:

bash
# Check VPN configuration file exists
ls -la *.ovpn

# Test with verbose mode
sudo openvpn --config your-company-gridtelligence.ovpn --verb 3

# Check whether another process is already bound to UDP port 1194
sudo netstat -an | grep 1194

# Check firewall rules that may block OpenVPN traffic on port 1194
sudo iptables -L -n | grep 1194

🔴 Authentication Failed

Error: "401 Unauthorized"

Checklist:

  • Verify API key is correct and active
  • Ensure "Bearer " prefix is included
  • Check VPN connection is established
  • Confirm your IP is whitelisted (if applicable)

🔴 High Latency

Symptoms: Slow API responses

Solutions:

  • Check VPN connection quality
  • Consider upgrading to dedicated tier
  • Use 20B model for faster responses
  • Enable response streaming

Debug Mode

Enable verbose logging for diagnostics:

python
import logging
import requests

# Enable debug logging
logging.basicConfig(level=logging.DEBUG)

# Make request with detailed logging (headers and payload as defined in
# the earlier examples)
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json=payload
)

print(f"Status: {response.status_code}")
print(f"Headers: {response.headers}")

Frequently Asked Questions

Q: How do I get my VPN profile?

We generate a custom .ovpn profile for your organization after contract signing. This is sent securely via encrypted email or secure file transfer.

Q: Can GridTelligence run completely air-gapped?

Yes, the on-premise hardware appliance requires no external connectivity. It operates entirely within your trusted network without any cloud or VPN requirements.

Q: What's the difference between 20B and 120B models?

The 20B model handles routine operations with 50-100ms latency. The 120B MoE model provides advanced reasoning for complex analysis with 200-500ms latency and a much larger context window.

Q: How do you ensure NERC CIP compliance?

GridTelligence uses temporal isolation: only one customer can access GPU resources at any moment. This is combined with IDS/IPS monitoring, automated patching, anti-malware scanning, and regular vulnerability assessments.

Q: Can I switch between shared and dedicated tiers?

Yes, you can upgrade or downgrade with 30 days notice. Contact [email protected] to initiate the change.

Q: What happens if my API rate limit is exceeded?

You'll receive a 429 status code with a Retry-After header indicating when you can resume requests. Consider upgrading your tier for higher limits.
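A simple client-side pattern is to honor Retry-After before resending. The sketch below is illustrative (post_with_retry is not part of any SDK); post_fn stands in for any callable with the signature of requests.post, so the retry logic can be shown without a live endpoint:

```python
# Sketch: retry on 429 while honoring the documented Retry-After header.
# post_with_retry is a hypothetical helper; post_fn is any callable with
# the signature of requests.post (url, headers=..., json=...).
import time

def post_with_retry(post_fn, url, headers, payload, max_attempts=3):
    response = None
    for _ in range(max_attempts):
        response = post_fn(url, headers=headers, json=payload)
        if response.status_code != 429:
            return response
        # Server indicates when to resume; assume 1s if the header is absent
        time.sleep(int(response.headers.get("Retry-After", 1)))
    return response  # last 429 after exhausting attempts
```

In practice you would pass requests.post as post_fn and the inference endpoint shown earlier as url.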

Support Resources

📧 Email Support

All customers

[email protected]

📞 Priority Support

Enterprise customers

[email protected]

💬 Support Portal

Ticket system

portal.gridtelligence.com

📚 Documentation

You are here

docs.gridtelligence.com

Response Times

| Tier | Initial Response | Resolution Target |
|---|---|---|
| Shared Bronze/Silver | 24 hours | Best effort |
| Shared Gold | 8 hours | 48 hours |
| Dedicated/On-Premise | 4 hours | 24 hours |