# Investigate API user guide

{% hint style="info" %}
**Please Note:**

This only applies to Vectra RUX deployments.
{% endhint %}

{% hint style="warning" %}

#### Investigate API is currently in PRIVATE PREVIEW. Please provide feedback to your respective account teams.

{% endhint %}

## Overview

The Vectra AI Investigations API (also called the Metadata API) enables you to run SQL-like queries directly against network metadata and cloud logs captured by your Vectra sensors. This is the same data that powers Vectra's built-in detections, giving you full access to underlying evidence for threat hunting, incident response, and ad-hoc investigation.

**Key capabilities:**

* Query up to **14 days** of raw network session data (connections, DNS, HTTP, TLS, SMB, Kerberos, LDAP, RDP, SSH, beacons, and more)
* Query cloud audit logs: **AWS CloudTrail**, **Azure ARM**, **Microsoft 365** (Entra ID, SharePoint, Exchange, Teams), and **Entra ID sign-ins**
* Filter by host entity IDs, IP addresses, timestamps, protocol fields, MITRE technique indicators, and any field documented in the schema reference
* Retrieve up to **10,000 rows** per query with full pagination support
* Use aggregation functions (COUNT, SUM, AVG, etc.) and grouping for statistical analysis

The API is asynchronous: you **submit** a query, receive a `request_id`, then **poll** for completion and **retrieve** paginated results.

## Prerequisites & Authentication

#### What You Need

| Item                       | Details                                                                       |
| -------------------------- | ----------------------------------------------------------------------------- |
| **Vectra Brain URL**       | Your Vectra deployment URL, e.g., `https://your-brain.vectra.ai`              |
| **API Client Credentials** | OAuth 2.0 Client ID and Client Secret (created in Vectra Admin → API Clients) |
| **Access Token**           | Bearer token obtained via the OAuth token endpoint                            |
| **Postman** (optional)     | Version 9+ recommended; import the collection or build requests manually      |

#### Obtaining an Access Token

The Investigations API uses OAuth 2.0 Bearer Token authentication. Obtain a token before making API calls.

**Token endpoint:**

```
POST https://{your-brain}/oauth2/token
```

**Postman setup:**

1. Create a new request → set method to `POST`
2. URL: `https://{your-brain}/oauth2/token`
3. Body → select `x-www-form-urlencoded` and add:

| Key             | Value                  |
| --------------- | ---------------------- |
| `grant_type`    | `client_credentials`   |
| `client_id`     | `{your_client_id}`     |
| `client_secret` | `{your_client_secret}` |

4. Send the request. Copy the `access_token` value from the response.

**Token response example:**

json

```json
{
  "access_token": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9...",
  "token_type": "Bearer",
  "expires_in": 3600
}
```

**Using the token in subsequent requests:**

In Postman, go to the **Authorization** tab → select **Bearer Token** → paste the access token. Alternatively, add a header manually:

```
Authorization: Bearer {access_token}
```

> **Tip:** In Postman, store the token as a collection variable (`{{access_token}}`) and reference it with `Bearer {{access_token}}`. You can automate token refresh using a pre-request script.

## How It Works — Submit, Poll, Retrieve

The Investigations API uses an **asynchronous** pattern. Queries may take several seconds or longer depending on data volume and time range.

```
Step 1: Submit Query
  POST /api/v3.4/investigations/
  Body: { "query": "SELECT ...", "version": "1.0" }
  → Response: { "request_id": "abc-123", "searchable_range": {...} }

Step 2: Poll for Completion
  GET /api/v3.4/investigations/abc-123/
  → Response: { "meta": { "query_status": "RUNNING", "data": [] } }   ← keep polling
  → Response: { "meta": { "query_status": "SUCCESS" }, "data": [...] }  ← done

Step 3: Retrieve Additional Pages (if needed)
  GET /api/v3.4/investigations/abc-123/?page=2&page_size=50
  → Response: { "status": "completed", "results": [...], "next_page": 3 }
```

**Status values:**

| `meta.query_status` | Meaning                                                                                |
| ------------------- | -------------------------------------------------------------------------------------- |
| `RUNNING`           | Query is actively executing — keep polling                                             |
| `SUCCESS`           | Query finished — `data` array contains your results                                    |
| `FAILED`            | Query failed — check the response body for error details (exact structure unconfirmed) |

**Recommended polling interval:** 1–3 seconds. Most queries complete within 5–15 seconds.

> **Postman Automation Tip:** Add this snippet to the **Tests** tab of your POST request to automatically capture the `request_id` into a collection variable:
>
> javascript
>
> ```javascript
> var json = pm.response.json();
> pm.collectionVariables.set("request_id", json.request_id);
> ```
>
> Your GET request can then reference `{{request_id}}` in the URL path automatically — no manual copy/paste needed. See Appendix B for the full script.

## API Reference

#### POST /api/v3.4/investigations/ — Submit a Query

**Full URL:** `https://{your-brain}/api/v3.4/investigations/`

**Method:** `POST`

**Headers:**

| Header          | Value                   |
| --------------- | ----------------------- |
| `Authorization` | `Bearer {access_token}` |
| `Content-Type`  | `application/json`      |

**Request body (JSON):**

| Field     | Type   | Required | Description                                          |
| --------- | ------ | -------- | ---------------------------------------------------- |
| `query`   | string | Yes      | SQL-like query string. See Query Language Reference. |
| `version` | string | No       | Query language version. Use `"1.0"` (default).       |

**Example request body:**

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, id.resp_p FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false LIMIT 100",
  "version": "1.0"
}
```

**Success response — 200 OK:**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "searchable_range": {
    "searchable_days_allowed": 14
  }
}
```

**Error responses:**

| HTTP Status             | Cause                                                                             |
| ----------------------- | --------------------------------------------------------------------------------- |
| `400 Bad Request`       | Invalid query syntax, unsupported clause, or missing `._all` suffix on table name |
| `401 Unauthorized`      | Missing or expired Bearer token                                                   |
| `403 Forbidden`         | Token does not have Investigations API permission                                 |
| `429 Too Many Requests` | Rate limit exceeded (5 POST requests/minute)                                      |

The API surfaces two distinct error patterns depending on when the error is detected.

**Syntax errors — returned directly in the POST response** (query fails to parse before execution):

json

```json
{
  "error": {
    "errorCode": "SYNTAX_ERROR",
    "errorId": "SAA23Q",
    "extra": [
      {
        "line": 1,
        "column": 96,
        "offending_symbol": "WHERE",
        "message": "query parsing failed: mismatched input 'WHERE' expecting '.'"
      }
    ]
  }
}
```

| Error field                      | Description                                              |
| -------------------------------- | -------------------------------------------------------- |
| `error.errorCode`                | `SYNTAX_ERROR` for parse failures                        |
| `error.errorId`                  | Unique ID for this error — include when reporting issues |
| `error.extra[].line`             | Line number in the query where the error occurred        |
| `error.extra[].column`           | Character position of the error                          |
| `error.extra[].offending_symbol` | The token that triggered the parse failure               |
| `error.extra[].message`          | Human-readable description of the parse error            |

> **Common cause:** Missing `._all` suffix on a table name produces a `SYNTAX_ERROR` because the parser expects a `.` after the table name and encounters `WHERE` instead.

**Runtime errors — returned in the GET response** (query parses successfully but fails during execution):

json

```json
{
  "error": {
    "errorCode": "DATABASE_ERROR",
    "errorId": "OWHVWH",
    "extra": [
      {
        "column": "orig_bytes",
        "error_name": "COLUMN_NOT_FOUND",
        "error_type": "USER_ERROR"
      }
    ]
  }
}
```

| Error field                | Description                                              |
| -------------------------- | -------------------------------------------------------- |
| `error.errorCode`          | `DATABASE_ERROR` for runtime failures                    |
| `error.errorId`            | Unique ID for this error — include when reporting issues |
| `error.extra[].column`     | The field name that caused the error                     |
| `error.extra[].error_name` | Specific error type (e.g. `COLUMN_NOT_FOUND`)            |
| `error.extra[].error_type` | `USER_ERROR` indicates the query itself is the problem   |

Common `error_name` values for runtime errors:

| `error_name`       | `column` present                | Meaning                                                                                                                                                                     |
| ------------------ | ------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `COLUMN_NOT_FOUND` | Yes — names the offending field | A field name in SELECT or WHERE does not exist in the table — check the schema reference for correct field names                                                            |
| `TYPE_MISMATCH`    | No                              | A value, cast, or function argument has an incompatible type — check comparisons (e.g. string field vs. numeric literal), `CAST` expressions, and aggregate function inputs |

#### GET /api/v3.4/investigations/{request\_id}/ — Get Results

**Full URL:** `https://{your-brain}/api/v3.4/investigations/{request_id}/`

**Method:** `GET`

**Headers:**

| Header          | Value                   |
| --------------- | ----------------------- |
| `Authorization` | `Bearer {access_token}` |

**Query parameters:**

| Parameter   | Type    | Default | Description                |
| ----------- | ------- | ------- | -------------------------- |
| `page`      | integer | 1       | Page number (1-indexed)    |
| `page_size` | integer | 50      | Rows per page (max 10,000) |

**Success response — 200 OK (completed):**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "data": [
    {
      "timestamp": "2026-03-10T14:23:01.000Z",
      "orig_h": "10.0.1.45",
      "resp_h": "93.184.216.34",
      "resp_p": 443
    }
  ],
  "meta": {
    "page": 1,
    "page_size": 50,
    "num_rows_available": 342,
    "estimated_file_size_bytes": 18400,
    "query_status": "SUCCESS",
    "columns": [
      ["timestamp", [{"type": "string"}, ""]],
      ["orig_h",    [{"type": "string"}, ""]],
      ["resp_h",    [{"type": "string"}, ""]],
      ["resp_p",    [{"type": "number"}, ""]]
    ]
  }
}
```

**Response fields:**

| Field                            | Location  | Description                                                                  |
| -------------------------------- | --------- | ---------------------------------------------------------------------------- |
| `request_id`                     | top level | Unique ID for this query job                                                 |
| `data`                           | top level | Array of result rows — your query results are here                           |
| `meta.query_status`              | `meta`    | `"RUNNING"` while executing, `"SUCCESS"` when complete                       |
| `meta.num_rows_available`        | `meta`    | Total rows returned by the query                                             |
| `meta.page`                      | `meta`    | Current page number                                                          |
| `meta.page_size`                 | `meta`    | Rows returned on this page                                                   |
| `meta.estimated_file_size_bytes` | `meta`    | Estimated size of the full result set in bytes                               |
| `meta.columns`                   | `meta`    | Array of `[column_name, [{type}, hint]]` tuples describing the result schema |

> **Checking for more pages:** Compare `meta.num_rows_available` against `meta.page_size`. If `num_rows_available > page_size`, there are additional pages to retrieve. Increment the `page` parameter until you have collected all rows.

**Still running — 202 OK:**

json

```json
{
  "data": [],
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "meta": {
    "page": 1,
    "page_size": 50,
    "estimated_file_size_bytes": null,
    "num_rows_available": 0,
    "query_status": "RUNNING",
    "columns": []
  }
}
```

> **Polling tip:** The response structure is identical whether the query is running or complete — `data` and `columns` are simply empty arrays while the query is in progress. Poll until `meta.query_status` equals `"SUCCESS"` before reading `data`.

**Failed query response (expected structure):**

json

```json
{
  "request_id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
  "meta": {
    "query_status": "FAILED",
    "error": "Syntax error near 'GROUPBY' at position 42"
  }
}
```

## Query Language Reference

#### Table Naming Convention

All table names **must** end with the `._all` suffix:

sql

```sql
-- Correct
FROM network.isession._all
FROM network.dns._all
FROM aws.cloudtrail._all

-- Incorrect — will return 400 Bad Request
FROM network.isession
FROM network.dns
```

#### Supported Clauses

| Clause                         | Supported | Notes                                   |
| ------------------------------ | --------- | --------------------------------------- |
| `SELECT`                       | ✅         | Use `*` or specific fields              |
| `FROM`                         | ✅         | One table per query (no JOIN)           |
| `WHERE`                        | ✅         | Standard filters + functions            |
| `ORDER BY`                     | ✅         | `ASC` or `DESC`                         |
| `LIMIT`                        | ✅         | Recommended; maximum 10,000             |
| `GROUP BY`                     | ✅         | For aggregation queries                 |
| `HAVING`                       | ✅         | Filter on aggregate results             |
| `UNION ALL`                    | ✅         | Combine results from multiple queries   |
| Subqueries                     | ✅         | Supported                               |
| `JOIN`                         | ❌         | Not supported — use `UNION ALL` instead |
| Plain `UNION`                  | ❌         | Use `UNION ALL`                         |
| `INSERT` / `UPDATE` / `DELETE` | ❌         | Read-only API                           |

#### Field Notation Rules

Struct fields **must** use dot-notation in `WHERE` and `ORDER BY` clauses. Using a flat field name in a filter position returns `400 Bad Request`.

sql

```sql
-- Correct
WHERE id.resp_h = '1.2.3.4'
WHERE orig_hostname.id = 7832
ORDER BY id.resp_p ASC

-- Incorrect — returns 400 Bad Request
WHERE resp_h = '1.2.3.4'
WHERE id = 7832
```

> **Note:** In `SELECT`, either form works — `SELECT id.orig_h` returns the column as `orig_h` in results.

#### Supported Functions

**Aggregate functions:** `COUNT`, `MAX`, `MIN`, `SUM`, `AVG`, `STDDEV`, `STDDEV_SAMP`, `STDDEV_POP`

**String functions:** `LOWER`, `UPPER`, `LENGTH`, `ABS`, `CONCAT`, `CONTAINS`, `COALESCE`

**Time functions:** `DATE`, `NOW`, `DATE_ADD`, `DATE_DIFF`, `FROM_ISO8601_TIMESTAMP`, `FROM_UNIXTIME`, `TO_UNIXTIME`

**Regex functions:** `REGEXP_COUNT`, `REGEXP_EXTRACT_ALL`, `REGEXP_EXTRACT`, `REGEXP_LIKE`, `REGEXP_POSITION`, `REGEXP_REPLACE`, `REGEXP_SPLIT`

**Type casting:** `TRY_CAST`, `CAST`

**Array / predicate functions:** `ANY_MATCH`, `ALL_MATCH`, `DISTINCT`, `ARRAY_AGG`, `CARDINALITY`

#### Common Time Filter Pattern

sql

```sql
-- Last 24 hours
WHERE timestamp > date_add('hour', -24, now())

-- Last 7 days
WHERE timestamp > date_add('day', -7, now())

-- Specific time window
WHERE timestamp >= '2026-03-10T00:00:00Z'
  AND timestamp <= '2026-03-11T00:00:00Z'
```

#### Common Fields (All Network Tables)

| Field                | Type      | Description                                   |
| -------------------- | --------- | --------------------------------------------- |
| `timestamp`          | timestamp | Record timestamp — always use this (not `ts`) |
| `id.orig_h`          | string    | Source (originator) IP address                |
| `id.orig_p`          | integer   | Source port                                   |
| `id.resp_h`          | string    | Destination (responder) IP address            |
| `id.resp_p`          | integer   | Destination port                              |
| `local_orig`         | boolean   | `true` if source IP is internal               |
| `local_resp`         | boolean   | `true` if destination IP is internal          |
| `orig_hostname.id`   | integer   | Vectra host entity ID for source host         |
| `orig_hostname.name` | string    | Display name of source host                   |
| `uid`                | string    | Unique session ID                             |

> For complete field listings by table, see the Table Schema Reference.

## Sample Queries

All samples below are formatted as complete Postman request bodies (JSON). Submit each as the body of `POST /api/v3.4/investigations/`.

#### Basic Queries

**1. Recent Outbound Connections (Last 24 Hours)**

Returns outbound sessions from internal hosts to external destinations.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, id.resp_p, orig_ip_bytes, resp_ip_bytes FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false ORDER BY timestamp DESC LIMIT 100",
  "version": "1.0"
}
```

**2. DNS Queries from a Specific Host**

Returns all DNS lookups made by a host identified by its IP address.

json

```json
{ 
  "query": "SELECT timestamp, uid, id.orig_h, orig_hostname, id.resp_h, id.resp_p, qtype_name, query, answers, total_answers, rejected, sensor_uid FROM network.dns._all WHERE id.orig_h = '1.2.3.4' AND timestamp > date_add('hour', -24, now()) ORDER BY timestamp DESC LIMIT 100", 
  "version": "1.0" }
```

> Replace `1.2.3.4` with the actual Vectra host IP.

**3. HTTP Activity to a Suspicious Domain**

Finds all HTTP requests to a specific host header value.

json

```json
{ 
  "query": "SELECT timestamp, id.orig_h, host, uri, method, status_code, user_agent FROM network.http._all WHERE timestamp > date_add('day', -3, now()) AND host = 'suspicious-domain.com' ORDER BY timestamp DESC LIMIT 100",
  "version": "1.0" }
```

**4. Failed Kerberos Authentication Attempts**

Returns failed Kerberos events which may indicate brute force or password spraying.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, client, service, error_msg FROM network.kerberos._all WHERE timestamp > date_add('hour', -6, now()) AND success = false ORDER BY timestamp DESC LIMIT 500",
  "version": "1.0"
}
```

**5. Large Outbound Data Transfers**

Identifies sessions where an internal host sent more than 10 MB to an external destination.

json

```json
{
  "query": "SELECT id.orig_h, id.resp_h, id.resp_p, COUNT(*) AS sessions, SUM(orig_ip_bytes) AS bytes_sent, SUM(resp_ip_bytes) AS bytes_received FROM network.isession._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = false GROUP BY id.orig_h, id.resp_h, id.resp_p HAVING SUM(orig_ip_bytes) > 10000000 ORDER BY bytes_sent DESC LIMIT 100",
  "version": "1.0"
}
```

#### Advanced Queries

**6. Top External Destinations by Connection Count**

Aggregates outbound connections grouped by destination IP, ordered by volume. Useful for identifying beaconing or bulk data transfer patterns.

json

```json
{
  "query": "SELECT id.resp_h, COUNT(*) AS connection_count, SUM(orig_ip_bytes) AS total_bytes_sent FROM network.isession._all WHERE timestamp > date_add('day', -1, now()) AND local_orig = true AND local_resp = false GROUP BY id.resp_h ORDER BY connection_count DESC LIMIT 50",
  "version": "1.0"
}
```

**7. DNS Tunneling Indicators**

Flags suspiciously long DNS query names (potential data exfiltration via DNS tunneling) combined with TXT record lookups.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, orig_hostname.name, query, qtype_name, LENGTH(query) AS query_length FROM network.dns._all WHERE timestamp > date_add('hour', -24, now()) AND (LENGTH(query) > 50 OR qtype_name = 'TXT') ORDER BY query_length DESC LIMIT 200",
  "version": "1.0"
}
```

**8. Lateral Movement — Internal RDP Sessions**

Finds RDP connections between internal hosts, a primary lateral movement indicator.

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h, orig_hostname.name AS src_host, resp_hostname.name AS dst_host, client_name FROM network.rdp._all WHERE timestamp > date_add('hour', -24, now()) AND local_orig = true AND local_resp = true ORDER BY timestamp DESC LIMIT 200",
  "version": "1.0"
}
```

**9. AWS IAM Changes in the Last 7 Days**

Identifies mutating IAM API calls (user/role/policy changes) in AWS CloudTrail — a key persistence and privilege escalation indicator.

json

```json
{
  "query": "SELECT timestamp, event_name, user_identity.arn, source_ip_address, error_code FROM aws.cloudtrail._all WHERE timestamp > date_add('day', -7, now()) AND event_source = 'iam.amazonaws.com' AND read_only = 'false' ORDER BY timestamp DESC LIMIT 200",
  "version": "1.0"
}
```

**10. Entra ID Sign-ins from High-Risk Countries (UNION ALL)**

Combines sign-in events from multiple high-risk geographic locations using `UNION ALL`.

json

```json
{
  "query": "SELECT timestamp, user_principal_name, ip_address, location.country_or_region AS country, app_display_name, status.error_code FROM entra.signins._all WHERE timestamp > date_add('day', -7, now()) AND location.country_or_region = 'RU' UNION ALL SELECT timestamp, user_principal_name, ip_address, location.country_or_region AS country, app_display_name, status.error_code FROM entra.signins._all WHERE timestamp > date_add('day', -7, now()) AND location.country_or_region = 'KP' ORDER BY timestamp DESC LIMIT 500",
  "version": "1.0"
}
```

## Pagination — Retrieving Large Result Sets

When a query matches more rows than your `page_size`, results are split across multiple pages. Use the `next_page` field in the response to retrieve subsequent pages.

#### Step-by-Step Pagination in Postman

**Step 1 — Submit the query** (POST):

json

```json
{
  "query": "SELECT timestamp, id.orig_h, id.resp_h FROM network.isession._all WHERE timestamp > date_add('day', -1, now()) AND local_orig = true LIMIT 10000",
  "version": "1.0"
}
```

**Step 2 — Poll for completion** (GET):

```
GET /api/v3.4/investigations/{request_id}/
```

**Step 3 — Retrieve page 1** (GET):

```
GET /api/v3.4/investigations/{request_id}/?page=1&page_size=500
```

**Step 4 — Continue if `next_page` is not null:**

```
GET /api/v3.4/investigations/{request_id}/?page=2&page_size=500
GET /api/v3.4/investigations/{request_id}/?page=3&page_size=500
...
```

**Stop when you've collected all rows** — compare `meta.num_rows_available` (total rows) against `meta.page * meta.page_size`. When the rows you've collected equals `meta.num_rows_available`, you're done.

> **Performance tip:** Use a `page_size` of 500–1,000 for large result sets to balance payload size and number of round trips. Your results will be in the `data` array of each response.

## Troubleshooting

| Symptom                                | Likely Cause                                                            | Resolution                                                                                                                                                                  |
| -------------------------------------- | ----------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `SYNTAX_ERROR` in POST response        | Missing `._all` suffix — parser expects `.` before `WHERE`              | Change `FROM network.isession` to `FROM network.isession._all`                                                                                                              |
| `SYNTAX_ERROR` in POST response        | Malformed SQL — bad clause order, unmatched quotes, unsupported keyword | Check `extra[].offending_symbol` and `extra[].message` for the exact location                                                                                               |
| `COLUMN_NOT_FOUND` in GET response     | Field name doesn't exist in the table                                   | Check the schema reference for correct field names — e.g. `orig_ip_bytes` not `orig_bytes`                                                                                  |
| `COLUMN_NOT_FOUND` in GET response     | Flat field name used in WHERE clause                                    | Change `WHERE resp_h = '...'` to `WHERE id.resp_h = '...'`                                                                                                                  |
| `TYPE_MISMATCH` in GET response        | Incompatible types in a comparison, cast, or function                   | Check that field types match their usage — e.g. wrap array fields with `CAST(... AS VARCHAR)` before using `REGEXP_LIKE`, avoid comparing string fields to numeric literals |
| `400 Bad Request`                      | Unsupported clause (e.g., `JOIN`)                                       | Remove JOIN; use `UNION ALL` to combine results from multiple queries                                                                                                       |
| `401 Unauthorized`                     | Missing or expired token                                                | Re-authenticate via `POST /oauth2/token` and update the Bearer token                                                                                                        |
| `403 Forbidden`                        | Insufficient API client permissions                                     | Ensure the API client has Investigations API access in Vectra Admin → API Clients                                                                                           |
| `429 Too Many Requests`                | Rate limit exceeded                                                     | Wait 60 seconds before retrying; the limit is 5 POST requests/minute                                                                                                        |
| `error` object present in GET response | Runtime error (e.g. bad field name)                                     | Check `error.extra[].error_name` and `error.extra[].column` for the specific cause                                                                                          |
| Query returns 0 results                | Time range too narrow or wrong table                                    | Widen the time range; verify the correct table name in the schema reference                                                                                                 |
| Query times out                        | Query spans too much data                                               | Add a more specific `WHERE` clause to narrow the result set; reduce time range                                                                                              |
| Struct field returns `null`            | External IP has no Vectra host record                                   | `orig_hostname` / `resp_hostname` fields are `null` for external IPs — this is expected                                                                                     |
| `UNION ALL` returns duplicate columns  | Column alias mismatch between branches                                  | Ensure both SELECT branches have identical column count and matching aliases                                                                                                |

## Rate Limits & Constraints

| Constraint                        | Value                                          |
| --------------------------------- | ---------------------------------------------- |
| **POST requests (submit query)**  | 5 per minute                                   |
| **Maximum rows per query**        | 10,000                                         |
| **Maximum lookback period**       | 14 days                                        |
| **Minimum polling interval**      | 1 second (recommended: 2–3 seconds)            |
| **Authentication token lifetime** | 3,600 seconds (1 hour) — refresh before expiry |

> **Bulk data retrieval:** If you need more than 10,000 rows, split your query into multiple time windows and submit them sequentially. For example, query 4-hour windows over a 24-hour period to retrieve up to 60,000 rows total.

### Appendix A — Where to Find Table Schemas

This guide covers the query mechanics and provides sample queries, but does not document every available table and field. Full schema documentation — including all available tables, field names, types, and example values — is maintained separately.

> **📄 See the Vectra Investigate API Query Schema Reference** for complete field listings:&#x20;
>
> [Investigate API metadata schema reference](/operations/investigate/investigate-api-metadata-schema-reference.md)

### Appendix B — Quick Reference: Endpoint Summary

| Action              | Method | Endpoint                                                                     |
| ------------------- | ------ | ---------------------------------------------------------------------------- |
| Obtain access token | POST   | `https://{brain}/oauth2/token`                                               |
| Submit a query      | POST   | `https://{brain}/api/v3.4/investigations/`                                   |
| Poll for results    | GET    | `https://{brain}/api/v3.4/investigations/{request_id}/`                      |
| Get page 2+         | GET    | `https://{brain}/api/v3.4/investigations/{request_id}/?page=2&page_size=500` |

### Appendix C — Postman Tests Script (Auto-Capture Request ID)

Paste the following into the **Tests** tab of your POST request in Postman to automatically store the returned `request_id` as a collection variable. This eliminates the need to manually copy the ID between requests.

javascript

```javascript
pm.test("Status 200", function() {
  pm.response.to.have.status(200);
});

var json = pm.response.json();
if (json.request_id) {
  pm.collectionVariables.set("request_id", json.request_id);
  console.log("Captured request_id:", json.request_id);
}
```

**How to set this up in Postman:**

1. Open your POST `/api/v3.4/investigations/` request
2. Click the **Tests** tab (next to Headers, Body, etc.)
3. Paste the script above
4. In your GET request URL, use: `https://{{VECTRA_BRAIN}}/api/v3.4/investigations/{{request_id}}/`
5. Every time you run the POST request, `request_id` is automatically updated — then just send the GET request to poll for results

**Collection variables to configure:**

| Variable       | Example Value                     | Description                               |
| -------------- | --------------------------------- | ----------------------------------------- |
| `VECTRA_BRAIN` | `your-brain.vectra.ai`            | Your Vectra Brain hostname                |
| `VECTRA_TOKEN` | `eyJhbGci...`                     | OAuth Bearer token (refresh every hour)   |
| `request_id`   | *(auto-populated by test script)* | Captured from POST response automatically |

> **Token auto-refresh tip:** Add a pre-request script to your collection to check token expiry and re-authenticate automatically if needed. Store the token expiry timestamp in a `TOKEN_EXPIRY` collection variable and compare against `Date.now()` before each request.

***

*Vectra AI — Investigations API User Guide | Version 1.0 | March 2026*


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.vectra.ai/operations/investigate/investigate-api-user-guide.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
