# GridTelligence Documentation

GridTelligence provides secure AI inference for critical energy infrastructure through temporally isolated GPU sharing, dedicated instances, or on-premise hardware appliances. Access is provided through VPN-secured connections to ensure complete data sovereignty.
## 📚 What We Provide
- Secure LLM Inference - Access to 20B and 120B parameter models
- VPN-Only Access - Custom .ovpn profile for secure connectivity
- Simple REST API - OpenAI-compatible endpoints
- Flexible Deployment - Shared cloud, dedicated cloud, or on-premise box
## Quick Start

### Step 1: Connect to VPN
After contract signing, we provide you with a custom .ovpn profile for secure access to your dedicated VPC.
```bash
# Connect using your organization's .ovpn profile
sudo openvpn --config your-company-gridtelligence.ovpn

# Verify connection is established
ping 10.8.0.1
```
### Step 2: Test API Access
```bash
# Test connectivity with your API key
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.gridtelligence.secure/v1/health

# Expected response
{"status":"healthy","model":"ready","tier":"shared-silver"}
```
### Step 3: Make Your First Inference
```python
import requests

# Set up headers with your API key
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Make inference request
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={
        "prompt": "Analyze voltage stability in a 138kV transmission system",
        "model": "20B",
        "temperature": 0.7,
        "max_tokens": 500
    }
)

print(response.json()["response"])
```
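The example above assumes the call succeeds. In practice you should check the HTTP status before reading the body. A minimal sketch of mapping error statuses to actionable hints (the helper name and messages are illustrative, drawn from the Troubleshooting and Rate Limits sections, not part of the API):

```python
# Map error statuses to actionable hints (illustrative helper, not part
# of any official GridTelligence SDK).
STATUS_HINTS = {
    401: "Authentication failed: check your API key and VPN connection",
    429: "Rate limit exceeded: honor the Retry-After header or upgrade your tier",
}

def check_response(status_code: int) -> str:
    """Return 'ok' on success, otherwise an actionable hint."""
    if 200 <= status_code < 300:
        return "ok"
    return STATUS_HINTS.get(status_code, f"Unexpected HTTP status {status_code}")

# Usage with the request above:
#   status = check_response(response.status_code)
#   if status != "ok":
#       raise RuntimeError(status)
```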
## Deployment Options

### Shared Cloud GPU
Cost-effective option with temporal isolation for NERC CIP compliance.
- Pricing: $200-1,200/month based on usage tier
- Access: VPN + API key
- Isolation: Complete temporal separation between customers
- GPU Utilization: 90%+ through continuous processing
### Dedicated Cloud GPU
Exclusive GPU instance for your organization.
- Pricing: $2,000-5,000/month based on GPU size
- Access: VPN + API key
- Availability: 24/7 exclusive access
- Customization: Model fine-tuning options available
### On-Premise Hardware Box
Physical appliance deployed within your secure facility.
- Pricing: One-time purchase (contact sales)
- Access: Direct network connection within trusted zone
- Connectivity: No cloud or VPN needed; completely air-gapped
- Updates: Delivered via secure physical media
## Authentication

GridTelligence uses bearer token authentication. Your API key is provided after contract signing.

### Using Your API Key
```http
POST /v1/inference HTTP/1.1
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
```
### Python Example
```python
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}
```
### cURL Example
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -X POST https://api.gridtelligence.secure/v1/inference \
  -d '{"prompt":"Your prompt here","model":"20B"}'
```
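Hardcoding an API key in scripts risks leaking it into source control. A common pattern is to read the key from an environment variable at runtime; a sketch (the variable name `GRIDTELLIGENCE_API_KEY` is a suggestion, not part of the product):

```python
import os

def auth_headers(api_key=None):
    """Build request headers, reading the API key from the environment
    by default so it never lands in source control.

    GRIDTELLIGENCE_API_KEY is a suggested variable name, not a
    requirement of the service.
    """
    key = api_key or os.environ.get("GRIDTELLIGENCE_API_KEY")
    if not key:
        raise RuntimeError("Set GRIDTELLIGENCE_API_KEY before making requests")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```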
## Inference API

The inference API provides access to GridTelligence's AI models for grid analysis and operations support.

### Endpoint

```
POST https://api.gridtelligence.secure/v1/inference
```

### Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `prompt` | string | Yes | The input text for the model |
| `model` | string | No | `"20B"` or `"120B"` (default: `"20B"`) |
| `temperature` | float | No | Sampling temperature, 0.0-1.0 (default: 0.7) |
| `max_tokens` | integer | No | Maximum tokens to generate (default: 500) |
| `stream` | boolean | No | Stream response tokens (default: false) |
### Example Request
```json
{
  "prompt": "Analyze the impact of a 50MW solar farm connection to a 138kV substation",
  "model": "20B",
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
```
### Response Format
```json
{
  "request_id": "req_abc123",
  "timestamp": "2025-10-15T14:30:00Z",
  "model": "20B",
  "response": "Analysis of 50MW solar farm connection:\n\n1. Voltage Impact...",
  "tokens_used": 342,
  "inference_time_ms": 127
}
```
### Streaming Responses

For real-time token streaming, set `"stream": true` in your request:
```python
import requests
import json

response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={"prompt": "...", "stream": True},
    stream=True  # keep the connection open and read tokens as they arrive
)

# Each line of the stream is a JSON object carrying one token
for line in response.iter_lines():
    if line:
        data = json.loads(line.decode('utf-8'))
        print(data["token"], end="")
```
## Available Models
| Model | Parameters | Best For | Latency | Context Window |
|---|---|---|---|---|
| 20B | 20 billion | General analysis, routine operations | 50-100ms | 4,096 tokens |
| 120B MoE | 120 billion (Mixture of Experts) | Complex reasoning, critical decisions | 200-500ms | 32,768 tokens |
### Model Selection
Choose the appropriate model based on your use case:
- 20B Model: Routine SCADA analysis, alarm processing, standard reporting
- 120B Model: Contingency analysis, root cause investigation, complex grid modeling
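One way to encode the guidance above in client code is a small routing helper. The keyword sets below are illustrative, taken from the examples in this section; adapt them to your own workloads:

```python
# Task categories are illustrative, taken from the guidance above.
ROUTINE_TASKS = {"scada analysis", "alarm processing", "standard reporting"}
COMPLEX_TASKS = {"contingency analysis", "root cause investigation", "grid modeling"}

def select_model(task: str) -> str:
    """Route routine work to the fast 20B model and complex reasoning
    to the 120B MoE model, defaulting to 20B for unknown tasks."""
    task = task.lower()
    if task in COMPLEX_TASKS:
        return "120B"
    if task in ROUTINE_TASKS:
        return "20B"
    return "20B"  # default to the faster, cheaper model
```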
## Rate Limits

### Request Limits by Tier
| Tier | Requests/Min | Requests/Hour | Concurrent |
|---|---|---|---|
| Shared Bronze | 60 | 3,000 | 10 |
| Shared Silver | 120 | 6,000 | 25 |
| Shared Gold | 300 | 15,000 | 50 |
| Dedicated/On-Premise | Unlimited | Unlimited | Unlimited |
### Rate Limit Headers
Every response includes rate limit information:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1760457600
```
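Since `X-RateLimit-Reset` is a Unix epoch timestamp, a client can compute how long to pause when `X-RateLimit-Remaining` runs low, instead of hitting a 429. A sketch (helper names and the safety margin are illustrative):

```python
def seconds_until_reset(headers: dict, now: float) -> float:
    """Seconds until the rate-limit window resets.

    X-RateLimit-Reset is a Unix epoch timestamp; pass time.time() as now.
    """
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)

def should_throttle(headers: dict, min_remaining: int = 5) -> bool:
    """True when the remaining request budget drops below a safety margin."""
    remaining = int(headers.get("X-RateLimit-Remaining", min_remaining))
    return remaining < min_remaining
```

In practice, sleep for `seconds_until_reset(response.headers, time.time())` when `should_throttle` fires, and always honor the `Retry-After` header if you do receive a 429.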
## Troubleshooting

### Common Issues

#### 🔴 VPN Connection Failed

Error: "Unable to establish VPN connection"

Solutions:
```bash
# Check the VPN configuration file exists
ls -la *.ovpn

# Test with verbose output
sudo openvpn --config your-company-gridtelligence.ovpn --verb 3

# Ensure UDP port 1194 is not blocked
sudo netstat -an | grep 1194

# Check firewall settings (-n shows numeric ports so grep can match)
sudo iptables -L -n | grep 1194
```
#### 🔴 Authentication Failed
Error: "401 Unauthorized"
Checklist:
- Verify API key is correct and active
- Ensure "Bearer " prefix is included
- Check VPN connection is established
- Confirm your IP is whitelisted (if applicable)
#### 🔴 High Latency
Symptoms: Slow API responses
Solutions:
- Check VPN connection quality
- Consider upgrading to dedicated tier
- Use 20B model for faster responses
- Enable response streaming
### Debug Mode
Enable verbose logging for diagnostics:
```python
import logging
import requests

# Enable debug logging (shows connection and header details from urllib3)
logging.basicConfig(level=logging.DEBUG)

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {"prompt": "Test prompt", "model": "20B"}

# Make request with detailed logging
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json=payload
)

print(f"Status: {response.status_code}")
print(f"Headers: {response.headers}")
```
## Frequently Asked Questions

**Q: How do I get my VPN profile?**
We generate a custom .ovpn profile for your organization after contract signing. This is sent securely via encrypted email or secure file transfer.
**Q: Can GridTelligence run completely air-gapped?**
Yes, the on-premise hardware appliance requires no external connectivity. It operates entirely within your trusted network without any cloud or VPN requirements.
**Q: What's the difference between 20B and 120B models?**
The 20B model handles routine operations with 50-100ms latency. The 120B MoE model provides advanced reasoning for complex analysis with 200-500ms latency and a much larger context window.
**Q: How do you ensure NERC CIP compliance?**

GridTelligence uses temporal isolation: only one customer's workload can access GPU resources at any moment. This is combined with IDS/IPS monitoring, automated patching, anti-malware scanning, and regular vulnerability assessments.
**Q: Can I switch between shared and dedicated tiers?**
Yes, you can upgrade or downgrade with 30 days notice. Contact [email protected] to initiate the change.
**Q: What happens if my API rate limit is exceeded?**
You'll receive a 429 status code with a Retry-After header indicating when you can resume requests. Consider upgrading your tier for higher limits.
## Support Resources

- 💬 Support Portal (ticket system): portal.gridtelligence.com
- 📚 Documentation (you are here): docs.gridtelligence.com
### Response Times

| Tier | Initial Response | Resolution Target |
|---|---|---|
| Shared Bronze/Silver | 24 hours | Best effort |
| Shared Gold | 8 hours | 48 hours |
| Dedicated/On-Premise | 4 hours | 24 hours |