# Investigate API user guide

{% hint style="info" %}
**Please Note:**

This only applies to Vectra RUX deployments.
{% endhint %}

{% hint style="warning" %}

#### Investigate API is currently in PRIVATE PREVIEW. Please provide feedback to your respective account teams.

{% endhint %}

## Overview

The Vectra AI Investigations API (also called the Metadata API) enables you to run SQL-like queries directly against network metadata and cloud logs captured by your Vectra sensors. This is the same data that powers Vectra's built-in detections, giving you full access to underlying evidence for threat hunting, incident response, and ad-hoc investigation.

**Key capabilities:**

* Query up to **14 days** of raw network session data (connections, DNS, HTTP, TLS, SMB, Kerberos, LDAP, RDP, SSH, beacons, and more)
* Query cloud audit logs: **AWS CloudTrail**, **Azure ARM**, **Microsoft 365** (Entra ID, SharePoint, Exchange, Teams), and **Entra ID sign-ins**
* Filter by host entity IDs, IP addresses, timestamps, protocol fields, MITRE technique indicators, and any field documented in the schema reference
* Retrieve up to **10,000 rows** per query with full pagination support
* Use aggregation functions (COUNT, SUM, AVG, etc.) and grouping for statistical analysis

The API is asynchronous: you **submit** a query, receive a `request_id`, then **poll** for completion and **retrieve** paginated results.

## Prerequisites & Authentication

#### What You Need

| Item                       | Details                                                                       |
| -------------------------- | ----------------------------------------------------------------------------- |
| **Vectra Brain URL**       | Your Vectra deployment URL, e.g., `https://your-brain.vectra.ai`              |
| **API Client Credentials** | OAuth 2.0 Client ID and Client Secret (created in Vectra Admin → API Clients) |
| **Access Token**           | Bearer token obtained via the OAuth token endpoint                            |
| **Postman** (optional)     | Version 9+ recommended; import the collection or build requests manually      |

#### Obtaining an Access Token

The Investigations API uses OAuth 2.0 Bearer Token authentication. Obtain a token before making API calls.

**Token endpoint:**

```
POST https://{your-brain}/oauth2/token
```

**Postman setup:**

1. Create a new request → set method to `POST`
2. URL: `https://{your-brain}/oauth2/token`
3. Body → select `x-www-form-urlencoded` and add:

| Key             | Value                  |
| --------------- | ---------------------- |
| `grant_type`    | `client_credentials`   |
| `client_id`     | `{your_client_id}`     |
| `client_secret` | `{your_client_secret}` |

4. Send the request. Copy the `access_token` value from the response.

**Token response example:**

json

```json
{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "Bearer",
  "expires_in": 3600
}
```

**Using the token in subsequent requests:**

In Postman, go to the **Authorization** tab → select **Bearer Token** → paste the access token. Alternatively, add a header manually:

```
Authorization: Bearer {access_token}
```

> **Tip:** In Postman, store the token as a collection variable (`{{access_token}}`) and reference it with `Bearer {{access_token}}`. You can automate token refresh using a pre-request script.

## How It Works — Submit, Poll, Retrieve

The Investigations API uses an **asynchronous** pattern. Queries may take several seconds or longer depending on data volume and time range.

```
Step 1: Submit Query
  POST /api/v3.4/investigations/
  Body: { "query": "SELECT ...", "version": "1.0" }
  → Response: { "request_id": "abc-123", "searchable_range": {...} }

Step 2: Poll for Completion
  GET /api/v3.4/investigations/abc-123/
  → Response: { "meta": { "query_status": "RUNNING", "data": [] } }   ← keep polling
  → Response: { "meta": { "query_status": "SUCCESS" }, "data": [...] }  ← done

Step 3: Retrieve Additional Pages (if needed)
  GET /api/v3.4/investigations/abc-123/?page=2&page_size=50
  → Response: { "status": "completed", "results": [...], "next_page": 3 }
```

**Status values:**

| `meta.query_status` | Meaning                                                                                |
| ------------------- | -------------------------------------------------------------------------------------- |
| `RUNNING`           | Query is actively executing — keep polling                                             |
| `SUCCESS`           | Query finished — `data` array contains your results                                    |
| `FAILED`            | Query failed — check the response body for error details (exact structure unconfirmed) |

**Recommended polling interval:** 1–3 seconds. Most queries complete within 5–15 seconds.

> **Postman Automation Tip:** Add this snippet to the **Tests** tab of your POST request to automatically capture the `request_id` into a collection variable:
>
> javascript
>
> ```javascript
> var json = pm.response.json();
> pm.collectionVariables.set("request_id", json.request_id);
> ```
>
> Your GET request can then reference `{{request_id}}` in the URL path automatically — no manual copy/paste needed. See Appendix B for the full script.

## API Reference

#### POST /api/v3.4/investigations/ — Submit a Query

**Full URL:** `https://{your-brain}/api/v3.4/investigations/`

**Method:** `POST`

**Headers:**

| Header          | Value                   |
| --------------- | ----------------------- |
| `Authorization` | `Bearer {access_token}` |
| `Content-Type`  | `application/json`      |

**Request body (JSON):**

| Field     | Type   | Required | Description                                          |
| --------- | ------ | -------- | ---------------------------------------------------- |
| `query`   | string | Yes      | SQL-like query string. See Query Language Reference. |
| `version` | string | No       | Query language version. Use `"1.0"` (default).       |

**Example request body:**

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, id.resp_p FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false LIMIT 100",
  "version": "1.0"
}
```

**Success response — 200 OK:**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "searchable_range": {
    "searchable_days_allowed": 14
  }
}
```

**Error responses:**

| HTTP Status             | Cause                                                                             |
| ----------------------- | --------------------------------------------------------------------------------- |
| `400 Bad Request`       | Invalid query syntax, unsupported clause, or missing `._all` suffix on table name |
| `401 Unauthorized`      | Missing or expired Bearer token                                                   |
| `403 Forbidden`         | Token does not have Investigations API permission                                 |
| `429 Too Many Requests` | Rate limit exceeded (5 POST requests/minute)                                      |

The API surfaces two distinct error patterns depending on when the error is detected.

**Syntax errors — returned directly in the POST response** (query fails to parse before execution):

json

```json
{
  "error": {
    "errorCode": "SYNTAX_ERROR",
    "errorId": "SAA23Q",
    "extra": [
      {
        "line": 1,
        "column": 96,
        "offending_symbol": "WHERE",
        "message": "query parsing failed: mismatched input 'WHERE' expecting '.'"
      }
    ]
  }
}
```

| Error field                      | Description                                              |
| -------------------------------- | -------------------------------------------------------- |
| `error.errorCode`                | `SYNTAX_ERROR` for parse failures                        |
| `error.errorId`                  | Unique ID for this error — include when reporting issues |
| `error.extra[].line`             | Line number in the query where the error occurred        |
| `error.extra[].column`           | Character position of the error                          |
| `error.extra[].offending_symbol` | The token that triggered the parse failure               |
| `error.extra[].message`          | Human-readable description of the parse error            |

> **Common cause:** Missing `._all` suffix on a table name produces a `SYNTAX_ERROR` because the parser expects a `.` after the table name and encounters `WHERE` instead.

**Runtime errors — returned in the GET response** (query parses successfully but fails during execution):

json

```json
{
  "error": {
    "errorCode": "DATABASE_ERROR",
    "errorId": "OWHVWH",
    "extra": [
      {
        "column": "orig_bytes",
        "error_name": "COLUMN_NOT_FOUND",
        "error_type": "USER_ERROR"
      }
    ]
  }
}
```

| Error field                | Description                                              |
| -------------------------- | -------------------------------------------------------- |
| `error.errorCode`          | `DATABASE_ERROR` for runtime failures                    |
| `error.errorId`            | Unique ID for this error — include when reporting issues |
| `error.extra[].column`     | The field name that caused the error                     |
| `error.extra[].error_name` | Specific error type (e.g. `COLUMN_NOT_FOUND`)            |
| `error.extra[].error_type` | `USER_ERROR` indicates the query itself is the problem   |

Common `error_name` values for runtime errors:

| `error_name`       | `column` present                | Meaning                                                                                                                                                                     |
| ------------------ | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `COLUMN_NOT_FOUND` | Yes — names the offending field | A field name in SELECT or WHERE does not exist in the table — check the schema reference for correct field names                                                            |
| `TYPE_MISMATCH`    | No                              | A value, cast, or function argument has an incompatible type — check comparisons (e.g. string field vs. numeric literal), `CAST` expressions, and aggregate function inputs |

#### GET /api/v3.4/investigations/{request\_id}/ — Get Results

**Full URL:** `https://{your-brain}/api/v3.4/investigations/{request_id}/`

**Method:** `GET`

**Headers:**

| Header          | Value                   |
| --------------- | ----------------------- |
| `Authorization` | `Bearer {access_token}` |

**Query parameters:**

| Parameter   | Type    | Default | Description                |
| ----------- | ------- | ------- | -------------------------- |
| `page`      | integer | 1       | Page number (1-indexed)    |
| `page_size` | integer | 50      | Rows per page (max 10,000) |

**Success response — 200 OK (completed):**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "data": [
    {
      "timestamp": "2026-03-10T14:23:01.000Z",
      "orig_h": "10.0.1.45",
      "resp_h": "93.184.216.34",
      "resp_p": 443
    }
  ],
  "meta": {
    "page": 1,
    "page_size": 50,
    "num_rows_available": 342,
    "estimated_file_size_bytes": 18400,
    "query_status": "SUCCESS",
    "columns": [
      ["timestamp", [{"type": "string"}, ""]],
      ["orig_h",    [{"type": "string"}, ""]],
      ["resp_h",    [{"type": "string"}, ""]],
      ["resp_p",    [{"type": "number"}, ""]]
    ]
  }
}
```

**Response fields:**

| Field                            | Location  | Description                                                                  |
| -------------------------------- | --------- | ---------------------------------------------------------------------------- |
| `request_id`                     | top level | Unique ID for this query job                                                 |
| `data`                           | top level | Array of result rows — your query results are here                           |
| `meta.query_status`              | `meta`    | `"RUNNING"` while executing, `"SUCCESS"` when complete                       |
| `meta.num_rows_available`        | `meta`    | Total rows returned by the query                                             |
| `meta.page`                      | `meta`    | Current page number                                                          |
| `meta.page_size`                 | `meta`    | Rows returned on this page                                                   |
| `meta.estimated_file_size_bytes` | `meta`    | Estimated size of the full result set in bytes                               |
| `meta.columns`                   | `meta`    | Array of `[column_name, [{type}, hint]]` tuples describing the result schema |

> **Checking for more pages:** Compare `meta.num_rows_available` against `meta.page_size`. If `num_rows_available > page_size`, there are additional pages to retrieve. Increment the `page` parameter until you have collected all rows.

**Still running — 202 OK:**

json

```json
{
  "data": [],
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "meta": {
    "page": 1,
    "page_size": 50,
    "estimated_file_size_bytes": null,
    "num_rows_available": 0,
    "query_status": "RUNNING",
    "columns": []
  }
}
```

> **Polling tip:** The response structure is identical whether the query is running or complete — `data` and `columns` are simply empty arrays while the query is in progress. Poll until `meta.query_status` equals `"SUCCESS"` before reading `data`.

**Failed query response (expected structure):**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "meta": {
    "query_status": "FAILED",
    "error": "Syntax error near 'GROUPBY' at position 42"
  }
}
```

## Query Language Reference

#### Table Naming Convention

All table names **must** end with the `._all` suffix:

sql

```sql
-- Correct
FROM network.isession._all
FROM network.dns._all
FROM aws.cloudtrail._all

-- Incorrect — will return 400 Bad Request
FROM network.isession
FROM network.dns
```

#### Supported Clauses

| Clause                         | Supported | Notes                                   |
| ------------------------------ | --------- | --------------------------------------- |
| `SELECT`                       | ✅         | Use `*` or specific fields              |
| `FROM`                         | ✅         | One table per query (no JOIN)           |
| `WHERE`                        | ✅         | Standard filters + functions            |
| `ORDER BY`                     | ✅         | `ASC` or `DESC`                         |
| `LIMIT`                        | ✅         | Recommended; maximum 10,000             |
| `GROUP BY`                     | ✅         | For aggregation queries                 |
| `HAVING`                       | ✅         | Filter on aggregate results             |
| `UNION ALL`                    | ✅         | Combine results from multiple queries   |
| Subqueries                     | ✅         | Supported                               |
| `JOIN`                         | ❌         | Not supported — use `UNION ALL` instead |
| Plain `UNION`                  | ❌         | Use `UNION ALL`                         |
| `INSERT` / `UPDATE` / `DELETE` | ❌         | Read-only API                           |

#### Field Notation Rules

Struct fields **must** use dot-notation in `WHERE` and `ORDER BY` clauses. Using a flat field name in a filter position returns `400 Bad Request`.

sql

```sql
-- Correct
WHERE id.resp_h = '1.2.3.4'
WHERE orig_hostname.id = 7832
ORDER BY id.resp_p ASC

-- Incorrect — returns 400 Bad Request
WHERE resp_h = '1.2.3.4'
WHERE id = 7832
```

> **Note:** In `SELECT`, either form works — `SELECT id.orig_h` returns the column as `orig_h` in results.

#### Supported Functions

**Aggregate functions:** `COUNT`, `MAX`, `MIN`, `SUM`, `AVG`, `STDDEV`, `STDDEV_SAMP`, `STDDEV_POP`

**String functions:** `LOWER`, `UPPER`, `LENGTH`, `ABS`, `CONCAT`, `CONTAINS`, `COALESCE`

**Time functions:** `DATE`, `NOW`, `DATE_ADD`, `DATE_DIFF`, `FROM_ISO8601_TIMESTAMP`, `FROM_UNIXTIME`, `TO_UNIXTIME`

**Regex functions:** `REGEXP_COUNT`, `REGEXP_EXTRACT_ALL`, `REGEXP_EXTRACT`, `REGEXP_LIKE`, `REGEXP_POSITION`, `REGEXP_REPLACE`, `REGEXP_SPLIT`

**Type casting:** `TRY_CAST`, `CAST`

**Array / predicate functions:** `ANY_MATCH`, `ALL_MATCH`, `DISTINCT`, `ARRAY_AGG`, `CARDINALITY`

#### Common Time Filter Pattern

sql

```sql
-- Last 24 hours
WHERE timestamp > date_add('hour', -24, now())

-- Last 7 days
WHERE timestamp > date_add('day', -7, now())

-- Specific time window
WHERE timestamp >= '2026-03-10T00:00:00Z'
  AND timestamp <= '2026-03-11T00:00:00Z'
```

#### Common Fields (All Network Tables)

| Field                | Type      | Description                                   |
| -------------------- | --------- | --------------------------------------------- |
| `timestamp`          | timestamp | Record timestamp — always use this (not `ts`) |
| `id.orig_h`          | string    | Source (originator) IP address                |
| `id.orig_p`          | integer   | Source port                                   |
| `id.resp_h`          | string    | Destination (responder) IP address            |
| `id.resp_p`          | integer   | Destination port                              |
| `local_orig`         | boolean   | `true` if source IP is internal               |
| `local_resp`         | boolean   | `true` if destination IP is internal          |
| `orig_hostname.id`   | integer   | Vectra host entity ID for source host         |
| `orig_hostname.name` | string    | Display name of source host                   |
| `uid`                | string    | Unique session ID                             |

> For complete field listings by table, see the Table Schema Reference.

## Sample Queries

All samples below are formatted as complete Postman request bodies (JSON). Submit each as the body of `POST /api/v3.4/investigations/`.

#### Basic Queries

**1. Recent Outbound Connections (Last 24 Hours)**

Returns outbound sessions from internal hosts to external destinations.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, id.resp_p, orig_ip_bytes, resp_ip_bytes FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false ORDER BY timestamp DESC LIMIT 100",
  "version": "1.0"
}
```

**2. DNS Queries from a Specific Host**

Returns all DNS lookups made by a host identified by its IP address.

json

```json
{ 
  "query": "SELECT timestamp, uid, id.orig_h, orig_hostname, id.resp_h, id.resp_p, qtype_name, query, answers, total_answers, rejected, sensor_uid FROM network.dns._all WHERE id.orig_h = '1.2.3.4' AND timestamp > date_add('hour', -24, now()) ORDER BY timestamp DESC LIMIT 100", 
  "version": "1.0" }
```

> Replace `1.2.3.4` with the actual Vectra host IP.

**3. HTTP Activity to a Suspicious Domain**

Finds all HTTP requests to a specific host header value.

json

```json
{ 
  "query": "SELECT timestamp, id.orig_h, host, uri, method, status_code, user_agent FROM network.http._all WHERE timestamp > date_add('day', -3, now()) AND host = 'suspicious-domain.com' ORDER BY timestamp DESC LIMIT 100",
  "version": "1.0" }
```

**4. Failed Kerberos Authentication Attempts**

Returns failed Kerberos events which may indicate brute force or password spraying.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, client, service, error_msg FROM network.kerberos._all WHERE timestamp > date_add('hour', -6, now()) AND success = false ORDER BY timestamp DESC LIMIT 500",
  "version": "1.0"
}
```

**5. Large Outbound Data Transfers**

Identifies sessions where an internal host sent more than 10 MB to an external destination.

json

```json
{
  "query": "SELECT id.orig_h, id.resp_h, id.resp_p, COUNT(*) AS sessions, SUM(orig_ip_bytes) AS bytes_sent, SUM(resp_ip_bytes) AS bytes_received FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false GROUP BY id.orig_h, id.resp_h, id.resp_p HAVING SUM(orig_ip_bytes) > 10000000 ORDER BY bytes_sent DESC LIMIT 100",
  "version": "1.0"
}
```

#### Advanced Queries

**6. Top External Destinations by Connection Count**

Aggregates outbound connections grouped by destination IP, ordered by volume. Useful for identifying beaconing or bulk data transfer patterns.

json

```json
{
  "query": "SELECT id.resp_h, COUNT(*) AS connection_count, SUM(orig_ip_bytes) AS total_bytes_sent FROM network.isession._all WHERE timestamp > date_add('day', -1, now()) AND local_orig = true AND local_resp = false GROUP BY id.resp_h ORDER BY connection_count DESC LIMIT 50",
  "version": "1.0"
}
```

**7. DNS Tunneling Indicators**

Flags suspiciously long DNS query names (potential data exfiltration via DNS tunneling) combined with TXT record lookups.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, orig_hostname.name, query, qtype_name, LENGTH(query) AS query_length FROM network.dns._all WHERE timestamp > date_add('hour', -24, now()) AND (LENGTH(query) > 50 OR qtype_name = 'TXT') ORDER BY query_length DESC LIMIT 200",
  "version": "1.0"
}
```

**8. Lateral Movement — Internal RDP Sessions**

Finds RDP connections between internal hosts, a primary lateral movement indicator.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, orig_hostname.name AS src_host, resp_hostname.name AS dst_host, client_name FROM network.rdp._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = true ORDER BY timestamp DESC LIMIT 200",
  "version": "1.0"
}
```

**9. AWS IAM Changes in the Last 7 Days**

Identifies mutating IAM API calls (user/role/policy changes) in AWS CloudTrail — a key persistence and privilege escalation indicator.

json

```json
{
  "query": "SELECT timestamp, event_name, user_identity.arn, source_ip_address, error_code FROM aws.cloudtrail._all WHERE timestamp > date_add('day', -7, now()) AND event_source = 'iam.amazonaws.com' AND read_only = 'false' ORDER BY timestamp DESC LIMIT 200",
  "version": "1.0"
}
```

**10. Entra ID Sign-ins from High-Risk Countries (UNION ALL)**

Combines sign-in events from multiple high-risk geographic locations using `UNION ALL`.

json

```json
{
  "query": "SELECT timestamp, user_principal_name, ip_address, location.country_or_region AS country, app_display_name, status.error_code FROM entra.signins._all WHERE timestamp > date_add('day', -7, now()) AND location.country_or_region = 'RU' UNION ALL SELECT timestamp, user_principal_name, ip_address, location.country_or_region AS country, app_display_name, status.error_code FROM entra.signins._all WHERE timestamp > date_add('day', -7, now()) AND location.country_or_region = 'KP' ORDER BY timestamp DESC LIMIT 500",
  "version": "1.0"
}
```

## Pagination — Retrieving Large Result Sets

When a query matches more rows than your `page_size`, results are split across multiple pages. Use the `next_page` field in the response to retrieve subsequent pages.

#### Step-by-Step Pagination in Postman

**Step 1 — Submit the query** (POST):

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h FROM network.isession._all WHERE timestamp > date_add('day', -1, now()) AND local_orig = true LIMIT 10000",
  "version": "1.0"
}
```

**Step 2 — Poll for completion** (GET):

```
GET /api/v3.4/investigations/{request_id}/
```

**Step 3 — Retrieve page 1** (GET):

```
GET /api/v3.4/investigations/{request_id}/?page=1&page_size=500
```

**Step 4 — Continue if `next_page` is not null:**

```
GET /api/v3.4/investigations/{request_id}/?page=2&page_size=500
GET /api/v3.4/investigations/{request_id}/?page=3&page_size=500
...
```

**Stop when you've collected all rows** — compare `meta.num_rows_available` (total rows) against `meta.page * meta.page_size`. When the rows you've collected equals `meta.num_rows_available`, you're done.

> **Performance tip:** Use a `page_size` of 500–1,000 for large result sets to balance payload size and number of round trips. Your results will be in the `data` array of each response.

## Troubleshooting

| Symptom                                | Likely Cause                                                            | Resolution                                                                                                                                                                  |
| -------------------------------------- | ----------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `SYNTAX_ERROR` in POST response        | Missing `._all` suffix — parser expects `.` before `WHERE`              | Change `FROM network.isession` to `FROM network.isession._all`                                                                                                              |
| `SYNTAX_ERROR` in POST response        | Malformed SQL — bad clause order, unmatched quotes, unsupported keyword | Check `extra[].offending_symbol` and `extra[].message` for the exact location                                                                                               |
| `COLUMN_NOT_FOUND` in GET response     | Field name doesn't exist in the table                                   | Check the schema reference for correct field names — e.g. `orig_ip_bytes` not `orig_bytes`                                                                                  |
| `COLUMN_NOT_FOUND` in GET response     | Flat field name used in WHERE clause                                    | Change `WHERE resp_h = '...'` to `WHERE id.resp_h = '...'`                                                                                                                  |
| `TYPE_MISMATCH` in GET response        | Incompatible types in a comparison, cast, or function                   | Check that field types match their usage — e.g. wrap array fields with `CAST(... AS VARCHAR)` before using `REGEXP_LIKE`, avoid comparing string fields to numeric literals |
| `400 Bad Request`                      | Unsupported clause (e.g., `JOIN`)                                       | Remove JOIN; use `UNION ALL` to combine results from multiple queries                                                                                                       |
| `401 Unauthorized`                     | Missing or expired token                                                | Re-authenticate via `POST /oauth2/token` and update the Bearer token                                                                                                        |
| `403 Forbidden`                        | Insufficient API client permissions                                     | Ensure the API client has Investigations API access in Vectra Admin → API Clients                                                                                           |
| `429 Too Many Requests`                | Rate limit exceeded                                                     | Wait 60 seconds before retrying; the limit is 5 POST requests/minute                                                                                                        |
| `error` object present in GET response | Runtime error (e.g. bad field name)                                     | Check `error.extra[].error_name` and `error.extra[].column` for the specific cause                                                                                          |
| Query returns 0 results                | Time range too narrow or wrong table                                    | Widen the time range; verify the correct table name in the schema reference                                                                                                 |
| Query times out                        | Query spans too much data                                               | Add a more specific `WHERE` clause to narrow the result set; reduce time range                                                                                              |
| Struct field returns `null`            | External IP has no Vectra host record                                   | `orig_hostname` / `resp_hostname` fields are `null` for external IPs — this is expected                                                                                     |
| `UNION ALL` returns duplicate columns  | Column alias mismatch between branches                                  | Ensure both SELECT branches have identical column count and matching aliases                                                                                                |

## Rate Limits & Constraints

| Constraint                        | Value                                          |
| --------------------------------- | ---------------------------------------------- |
| **POST requests (submit query)**  | 5 per minute                                   |
| **Maximum rows per query**        | 10,000                                         |
| **Maximum lookback period**       | 14 days                                        |
| **Minimum polling interval**      | 1 second (recommended: 2–3 seconds)            |
| **Authentication token lifetime** | 3,600 seconds (1 hour) — refresh before expiry |

> **Bulk data retrieval:** If you need more than 10,000 rows, split your query into multiple time windows and submit them sequentially. For example, query 4-hour windows over a 24-hour period to retrieve up to 60,000 rows total.

### Appendix A — Where to Find Table Schemas

This guide covers the query mechanics and provides sample queries, but does not document every available table and field. Full schema documentation — including all available tables, field names, types, and example values — is maintained separately.

> **📄 See the Vectra Investigate API Query Schema Reference** for complete field listings:&#x20;
>
> [investigate-api-metadata-schema-reference](https://docs.vectra.ai/operations/investigate/investigate-api-metadata-schema-reference "mention")

### Appendix B — Quick Reference: Endpoint Summary

| Action              | Method | Endpoint                                                                     |
| ------------------- | ------ | ---------------------------------------------------------------------------- |
| Obtain access token | POST   | `https://{brain}/oauth2/token`                                               |
| Submit a query      | POST   | `https://{brain}/api/v3.4/investigations/`                                   |
| Poll for results    | GET    | `https://{brain}/api/v3.4/investigations/{request_id}/`                      |
| Get page 2+         | GET    | `https://{brain}/api/v3.4/investigations/{request_id}/?page=2&page_size=500` |

### Appendix C — Postman Tests Script (Auto-Capture Request ID)

Paste the following into the **Tests** tab of your POST request in Postman to automatically store the returned `request_id` as a collection variable. This eliminates the need to manually copy the ID between requests.

javascript

```javascript
pm.test("Status 200", function() {
  pm.response.to.have.status(200);
});

var json = pm.response.json();
if (json.request_id) {
  pm.collectionVariables.set("request_id", json.request_id);
  console.log("Captured request_id:", json.request_id);
}
```

**How to set this up in Postman:**

1. Open your POST `/api/v3.4/investigations/` request
2. Click the **Tests** tab (next to Headers, Body, etc.)
3. Paste the script above
4. In your GET request URL, use: `https://{{VECTRA_BRAIN}}/api/v3.4/investigations/{{request_id}}/`
5. Every time you run the POST request, `request_id` is automatically updated — then just send the GET request to poll for results

**Collection variables to configure:**

| Variable       | Example Value                     | Description                               |
| -------------- | --------------------------------- | ----------------------------------------- |
| `VECTRA_BRAIN` | `your-brain.vectra.ai`            | Your Vectra Brain hostname                |
| `VECTRA_TOKEN` | `eyJhbGci...`                     | OAuth Bearer token (refresh every hour)   |
| `request_id`   | *(auto-populated by test script)* | Captured from POST response automatically |

> **Token auto-refresh tip:** Add a pre-request script to your collection to check token expiry and re-authenticate automatically if needed. Store the token expiry timestamp in a `TOKEN_EXPIRY` collection variable and compare against `Date.now()` before each request.

***

*Vectra AI — Investigations API User Guide | Version 1.0 | March 2026*
