Analytics Dashboard (Grafana) ============================= The Analytics Dashboard provides visual insights into vulnerability trends, host statistics, and future security posture predictions using Grafana. Overview -------- Access the dashboard at ``/grafana/d/vulnerability_reports`` (requires login). The dashboard focuses on three key areas: 1. **Host-Level Metrics**: Top vulnerable hosts and their risk profiles. 2. **Historical Trends**: Vulnerability counts over time. 3. **Predictive Analytics**: Forecasting future vulnerability counts based on historical trends. Dashboard Panels ---------------- Top 10 Vulnerable Hosts ~~~~~~~~~~~~~~~~~~~~~~~ Displays the most critical hosts sorted by: 1. Average Vulnerability Severity 2. Total Vulnerability Count **Features:** - Filterable by ``Selected Host`` variable. - Shows unique Products, Sources, and Locations for context. Vulnerability Trend over Time (Forecast) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This is the centerpiece of the analytics dashboard. It visualizes: 1. **Historical Data** (Yellow): Actual daily vulnerability counts from scan reports. 2. **Forecast Data** (Green Dashed): A projection of future vulnerability counts for the next 30 days. **Methodology:** The forecast is generated using a **Momentum-Based Linear Regression** combined with **Simulated Seasonality**. **1. Data Selection (The Momentum)** Instead of calculating the trend over the entire history (which often results in a static, flat average), the model focuses on **Momentum**. It selects only the **Last 7 Days** of historical data to calculate the slope. This ensures the forecast reacts quickly to recent changes (e.g., a sudden spike in vulnerabilities or a rapid remediation effort). **2. Linear Regression (The Direction)** A "Least Squares" Linear Regression is performed on the normalized 7-day window. - **X-Axis**: Normalized to Days (0, 1, 2...) to ensure mathematical stability and avoid integer overflow errors. - **Slope ($m$)**: Represents the daily rate of change (e.g., +5 vulnerabilities/day). - **Intercept ($b$)**: The baseline starting point. **3. Seasonality (The Movement)** Real-world data is rarely a straight line. To better simulate daily variance and uncertainty, a **Sine Wave** component is added to the projection. - **Formula**: $y_{pred} = (mx + b) + A \cdot \sin(x)$ - **Effect**: The forecast line "wobbles" slightly, providing a visual indication that the prediction is an estimate with expected variance, rather than a rigid target. **4. Bridge** The forecast series explicitly starts at the exact timestamp and value of the last available historical data point. This ensures a seamless visual transition from "History" to "Future". Filters and Variables --------------------- The dashboard supports dynamic filtering: - **Time Range**: Defaults to **Last 7 Days** (essential for the momentum model). - **Forecast Window**: Select the historical range for the regression model (7, 30, 90, or 365 Days). **Default: 365 Days**. - **Institution**: Filter by Faculty or Department (e.g., "Fakultät für Informatik"). - **Host**: Select specific IPs or Hostnames. - **Severity**: Filter by risk level (High/Medium/Low). - **Vulnerability**: Focus on a specific CVE or NVT name. - **OS**: Filter by Operating System (e.g., "Ubuntu 22.04 LTS"). Configuration ------------- The dashboard is provisioned via JSON at ``src/grafana/dashboards/vulnerability_reports.json``. Modifications to the SQL queries should be made in this file and applied by restarting the Grafana container. .. code-block:: bash docker compose restart grafana Technical Implementation Details -------------------------------- The forecast is calculated dynamically using a multi-stage Common Table Expression (CTE) SQL query. Query Breakdown ~~~~~~~~~~~~~~~ 1. ``historical_data`` * **Purpose**: Aggregates raw scan results into daily counts. * **Logic**: joins ``scan_results``, ``scans``, ``hosts``, etc., and groups by ``DATE(scan_date)``. 2. ``recent_stats`` * **Purpose**: Defines the time window for the momentum calculation. * **Logic**: Selects the MIN and MAX date from the **Last 7 Days**. 3. ``recent_data_normalized`` * **Purpose**: Prepares data for regression by normalizing the X-axis. * **Logic**: Converts dates into integer days ($0, 1, 2...$) using ``DATEDIFF(time, start_date)``. This is crucial for numerical stability (avoiding integer overflow). 4. ``regression`` * **Purpose**: Calculates the Slope ($m$) and Intercept ($b$) using the **Least Squares Method**. * **Formula - Slope ($m$)**: .. math:: m = \frac{N \sum(xy) - \sum x \sum y}{N \sum(x^2) - (\sum x)^2} * **Formula - Intercept ($b$)**: .. math:: b = \bar{y} - m\bar{x} * **SQL Implementation**: Uses ``SUM()``, ``COUNT()``, and ``AVG()`` aggregate functions on the normalized data. 5. ``forecast_days`` * **Purpose**: Generates the future timeline. * **Logic**: Creates a sequence of integers (1, 2, 3... 30) representing "Days into the future". 6. ``Final SELECT (Forecast Union)`` * **Purpose**: Projects the regression line into the future and adds seasonality. * **Formula**: .. math:: y_{pred} = b + m \cdot (x_{current} + n) + 5 \cdot \sin(n) * ``x_{current}``: The total days from the start of the window to "today". * ``n``: The days into the future. * ``5 * SIN(n)``: The sine wave component adding "wobble" (seasonality).