# GridTelligence Documentation

GridTelligence provides secure AI inference for critical energy infrastructure through temporally isolated GPU sharing, dedicated instances, or on-premise hardware appliances. Access is provided through VPN-secured connections to ensure complete data sovereignty.
## 📚 What We Provide
- Secure LLM Inference - Access to 20B and 120B parameter models
- VPN-Only Access - Custom .ovpn profile for secure connectivity
- Simple REST API - OpenAI-compatible endpoints
- Flexible Deployment - Shared cloud, dedicated cloud, or on-premise box
## Quick Start

### Step 1: Connect to VPN
After contract signing, we provide you with a custom .ovpn profile for secure access to your dedicated VPC.
```bash
# Connect using your organization's .ovpn profile
sudo openvpn --config your-company-gridtelligence.ovpn

# Verify connection is established
ping 10.8.0.1
```
### Step 2: Test API Access
```bash
# Test connectivity with your API key
curl -H "Authorization: Bearer YOUR_API_KEY" \
  https://api.gridtelligence.secure/v1/health

# Expected response
{"status":"healthy","model":"ready","tier":"shared-silver"}
```
### Step 3: Make Your First Inference
```python
import requests

# Set up headers with your API key
headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}

# Make inference request
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={
        "prompt": "Analyze voltage stability in a 138kV transmission system",
        "model": "20B",
        "temperature": 0.7,
        "max_tokens": 500
    }
)

print(response.json()["response"])
```
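The example above assumes the call succeeds. In practice you should check the HTTP status before reading the body. A minimal sketch of mapping error statuses to actionable hints (the helper name and messages are illustrative, drawn from the Troubleshooting and Rate Limits sections, not part of the API):

```python
# Map error statuses to actionable hints (illustrative helper, not part
# of any official GridTelligence SDK).
STATUS_HINTS = {
    401: "Authentication failed: check your API key and VPN connection",
    429: "Rate limit exceeded: honor the Retry-After header or upgrade your tier",
}

def check_response(status_code: int) -> str:
    """Return 'ok' on success, otherwise an actionable hint."""
    if 200 <= status_code < 300:
        return "ok"
    return STATUS_HINTS.get(status_code, f"Unexpected HTTP status {status_code}")

# Usage with the request above:
#   status = check_response(response.status_code)
#   if status != "ok":
#       raise RuntimeError(status)
```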
## Deployment Options

### Shared Cloud GPU
Cost-effective option with temporal isolation for NERC CIP compliance.
- Pricing: $200-1,200/month based on usage tier
- Access: VPN + API key
- Isolation: Complete temporal separation between customers
- GPU Utilization: 90%+ through continuous processing
### Dedicated Cloud GPU
Exclusive GPU instance for your organization.
- Pricing: $2,000-5,000/month based on GPU size
- Access: VPN + API key
- Availability: 24/7 exclusive access
- Customization: Model fine-tuning options available
### On-Premise Hardware Box
Physical appliance deployed within your secure facility.
- Pricing: One-time purchase (contact sales)
- Access: Direct network connection within trusted zone
- Connectivity: No cloud or VPN needed; completely air-gapped
- Updates: Delivered via secure physical media
## Authentication

GridTelligence uses bearer token authentication. Your API key is provided after contract signing.

### Using Your API Key
```http
POST /v1/inference HTTP/1.1
Authorization: Bearer YOUR_API_KEY
Content-Type: application/json
```
### Python Example
```python
headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}
```
### cURL Example
```bash
curl -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -X POST https://api.gridtelligence.secure/v1/inference \
  -d '{"prompt":"Your prompt here","model":"20B"}'
```
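Hardcoding an API key in scripts risks leaking it into source control. A common pattern is to read the key from an environment variable at runtime; a sketch (the variable name `GRIDTELLIGENCE_API_KEY` is a suggestion, not part of the product):

```python
import os

def auth_headers(api_key=None):
    """Build request headers, reading the API key from the environment
    by default so it never lands in source control.

    GRIDTELLIGENCE_API_KEY is a suggested variable name, not a
    requirement of the service.
    """
    key = api_key or os.environ.get("GRIDTELLIGENCE_API_KEY")
    if not key:
        raise RuntimeError("Set GRIDTELLIGENCE_API_KEY before making requests")
    return {
        "Authorization": f"Bearer {key}",
        "Content-Type": "application/json",
    }
```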
## Inference API

The inference API provides access to GridTelligence's AI models for grid analysis and operations support.

### Endpoint

```
POST https://api.gridtelligence.secure/v1/inference
```

### Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `prompt` | string | Yes | The input text for the model |
| `model` | string | No | `"20B"` or `"120B"` (default: `"20B"`) |
| `temperature` | float | No | Sampling temperature, 0.0-1.0 (default: 0.7) |
| `max_tokens` | integer | No | Maximum tokens to generate (default: 500) |
| `stream` | boolean | No | Stream response tokens (default: false) |
### Example Request
```json
{
  "prompt": "Analyze the impact of a 50MW solar farm connection to a 138kV substation",
  "model": "20B",
  "temperature": 0.7,
  "max_tokens": 1000,
  "stream": false
}
```
### Response Format
```json
{
  "request_id": "req_abc123",
  "timestamp": "2025-10-15T14:30:00Z",
  "model": "20B",
  "response": "Analysis of 50MW solar farm connection:\n\n1. Voltage Impact...",
  "tokens_used": 342,
  "inference_time_ms": 127
}
```
### Streaming Responses

For real-time token streaming, set `"stream": true` in your request:
```python
import requests
import json

response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json={"prompt": "...", "stream": True},
    stream=True  # keep the connection open and read tokens as they arrive
)

# Each line of the stream is a JSON object carrying one token
for line in response.iter_lines():
    if line:
        data = json.loads(line.decode('utf-8'))
        print(data["token"], end="")
```
## Available Models
| Model | Parameters | Best For | Latency | Context Window |
|---|---|---|---|---|
| 20B | 20 billion | General analysis, routine operations | 50-100ms | 4,096 tokens |
| 120B MoE | 120 billion (Mixture of Experts) | Complex reasoning, critical decisions | 200-500ms | 32,768 tokens |
### Model Selection
Choose the appropriate model based on your use case:
- 20B Model: Routine SCADA analysis, alarm processing, standard reporting
- 120B Model: Contingency analysis, root cause investigation, complex grid modeling
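One way to encode the guidance above in client code is a small routing helper. The keyword sets below are illustrative, taken from the examples in this section; adapt them to your own workloads:

```python
# Task categories are illustrative, taken from the guidance above.
ROUTINE_TASKS = {"scada analysis", "alarm processing", "standard reporting"}
COMPLEX_TASKS = {"contingency analysis", "root cause investigation", "grid modeling"}

def select_model(task: str) -> str:
    """Route routine work to the fast 20B model and complex reasoning
    to the 120B MoE model, defaulting to 20B for unknown tasks."""
    task = task.lower()
    if task in COMPLEX_TASKS:
        return "120B"
    if task in ROUTINE_TASKS:
        return "20B"
    return "20B"  # default to the faster, cheaper model
```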
## Rate Limits

### Request Limits by Tier
| Tier | Requests/Min | Requests/Hour | Concurrent |
|---|---|---|---|
| Shared Bronze | 60 | 3,000 | 10 |
| Shared Silver | 120 | 6,000 | 25 |
| Shared Gold | 300 | 15,000 | 50 |
| Dedicated/On-Premise | Unlimited | Unlimited | Unlimited |
### Rate Limit Headers
Every response includes rate limit information:
```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 118
X-RateLimit-Reset: 1760457600
```
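Since `X-RateLimit-Reset` is a Unix epoch timestamp, a client can compute how long to pause when `X-RateLimit-Remaining` runs low, instead of hitting a 429. A sketch (helper names and the safety margin are illustrative):

```python
def seconds_until_reset(headers: dict, now: float) -> float:
    """Seconds until the rate-limit window resets.

    X-RateLimit-Reset is a Unix epoch timestamp; pass time.time() as now.
    """
    reset = float(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset - now)

def should_throttle(headers: dict, min_remaining: int = 5) -> bool:
    """True when the remaining request budget drops below a safety margin."""
    remaining = int(headers.get("X-RateLimit-Remaining", min_remaining))
    return remaining < min_remaining
```

In practice, sleep for `seconds_until_reset(response.headers, time.time())` when `should_throttle` fires, and always honor the `Retry-After` header if you do receive a 429.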
## Troubleshooting

### Common Issues

#### 🔴 VPN Connection Failed

Error: "Unable to establish VPN connection"

Solutions:
```bash
# Check the VPN configuration file exists
ls -la *.ovpn

# Test with verbose output
sudo openvpn --config your-company-gridtelligence.ovpn --verb 3

# Ensure UDP port 1194 is not blocked
sudo netstat -an | grep 1194

# Check firewall settings (-n shows numeric ports so grep can match)
sudo iptables -L -n | grep 1194
```
#### 🔴 Authentication Failed
Error: "401 Unauthorized"
Checklist:
- Verify API key is correct and active
- Ensure "Bearer " prefix is included
- Check VPN connection is established
- Confirm your IP is whitelisted (if applicable)
#### 🔴 High Latency
Symptoms: Slow API responses
Solutions:
- Check VPN connection quality
- Consider upgrading to dedicated tier
- Use 20B model for faster responses
- Enable response streaming
### Debug Mode
Enable verbose logging for diagnostics:
```python
import logging
import requests

# Enable debug logging (shows connection and header details from urllib3)
logging.basicConfig(level=logging.DEBUG)

headers = {
    "Authorization": "Bearer YOUR_API_KEY",
    "Content-Type": "application/json"
}
payload = {"prompt": "Test prompt", "model": "20B"}

# Make request with detailed logging
response = requests.post(
    "https://api.gridtelligence.secure/v1/inference",
    headers=headers,
    json=payload
)

print(f"Status: {response.status_code}")
print(f"Headers: {response.headers}")
```
## Frequently Asked Questions

**Q: How do I get my VPN profile?**
We generate a custom .ovpn profile for your organization after contract signing. This is sent securely via encrypted email or secure file transfer.
**Q: Can GridTelligence run completely air-gapped?**
Yes, the on-premise hardware appliance requires no external connectivity. It operates entirely within your trusted network without any cloud or VPN requirements.
**Q: What's the difference between 20B and 120B models?**
The 20B model handles routine operations with 50-100ms latency. The 120B MoE model provides advanced reasoning for complex analysis with 200-500ms latency and a much larger context window.
**Q: How do you ensure NERC CIP compliance?**

GridTelligence uses temporal isolation: only one customer's workload can access GPU resources at any moment. This is combined with IDS/IPS monitoring, automated patching, anti-malware scanning, and regular vulnerability assessments.
**Q: Can I switch between shared and dedicated tiers?**
Yes, you can upgrade or downgrade with 30 days notice. Contact [email protected] to initiate the change.
**Q: What happens if my API rate limit is exceeded?**
You'll receive a 429 status code with a Retry-After header indicating when you can resume requests. Consider upgrading your tier for higher limits.
## Support Resources

- 💬 Support Portal (ticket system): portal.gridtelligence.com
- 📚 Documentation (you are here): docs.gridtelligence.com
### Response Times

| Tier | Initial Response | Resolution Target |
|---|---|---|
| Shared Bronze/Silver | 24 hours | Best effort |
| Shared Gold | 8 hours | 48 hours |
| Dedicated/On-Premise | 4 hours | 24 hours |