Day 04 - Testing
Learning Objectives
This module covers comprehensive testing strategies essential for enterprise geospatial systems, emphasizing quality engineering practices that ensure reliability, performance, and maintainability at scale. By the end of this module, you will understand:
- Test-Driven Development (TDD) and Behavior-Driven Development (BDD) for geospatial domain modeling
- Advanced pytest patterns including fixtures, parametrization, plugins, and custom test discovery
- Property-based testing with Hypothesis for robust geospatial algorithm validation
- Integration testing strategies for distributed geospatial services and external API dependencies
- Performance testing methodologies including load, stress, spike, and endurance testing
- Mutation testing and fuzzing for ensuring test quality and edge case coverage
- Test automation in CI/CD pipelines with geographic data and spatial analysis workflows
- Chaos engineering principles for testing system resilience under failure conditions
Theoretical Foundation
Quality Engineering in Geospatial Systems
Spatial Data Complexity:
Geospatial systems face unique testing challenges:
- Coordinate precision: Floating-point arithmetic and transformation accuracy
- Topology validation: Ensuring geometric integrity across operations
- Scale variability: Testing from millimeter precision to global datasets
- Temporal dimensions: Time-series spatial data and change detection
- Multi-format support: Ensuring consistency across GeoJSON, WKT, MVT, etc.
Distributed System Testing:
Modern geospatial architectures require sophisticated testing approaches:
- Service mesh testing: Inter-service communication and failure scenarios
- Event-driven architecture validation: Async message processing and ordering
- Spatial indexing correctness: R-tree, quadtree, and H3 index validation
- Cache coherence testing: Multi-level caching with spatial invalidation
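The coordinate-precision challenge listed above is the one that most often breaks naive assertions. The following is a minimal sketch, assuming a hypothetical `coords_close` helper and an illustrative tolerance of 1e-9 degrees; neither comes from this repo, and the right tolerance depends on your data.

```python
import math


def coords_close(a, b, abs_tol=1e-9):
    """Compare two (lon, lat) pairs within an absolute tolerance in degrees."""
    return all(
        math.isclose(x, y, rel_tol=0.0, abs_tol=abs_tol)
        for x, y in zip(a, b)
    )


# A latitude offset computed in two steps picks up floating-point error,
# so exact equality is unreliable even for "obviously equal" values.
lat = 0.1 + 0.2
exact_match = (lat == 0.3)
tolerant_match = coords_close((0.0, lat), (0.0, 0.3))
```

In pytest-based suites, `pytest.approx` serves the same purpose inside an `assert`.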
Testing Pyramid for Geospatial Applications
              /\
             /  \  Manual Exploratory Testing
            /____\  (Map visual validation, UX testing)
           /      \
          /        \  End-to-End Integration Tests
         /__________\  (Full pipeline, external services)
        /            \
       /              \  Contract/API Tests
      /________________\  (Service boundaries, protocols)
     /                  \
    /                    \  Component/Service Tests
   /______________________\  (Business logic, spatial operations)
  /                        \
 /                          \  Unit Tests
/____________________________\  (Pure functions, algorithms, data structures)
Suggested Distribution (Geospatial Systems):
- Unit Tests (70%): Spatial algorithms, coordinate transformations, geometric operations
- Component Tests (20%): Provider integrations, spatial queries, cache behavior
- Integration Tests (8%): API contracts, database transactions, external service mocks
- E2E Tests (2%): Critical user workflows, visual map rendering validation
Core Concepts
1) TDD Cycle
Red → Green → Refactor. Target small, observable units and keep test names behavior-focused.
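One pass through the cycle might look like the sketch below. `validate_bbox` is a hypothetical function invented for illustration, not part of this repo. Plain `assert` and `try/except` keep the example dependency-free; under pytest you would normally use `pytest.raises`.

```python
# Step 1 (Red): write the tests first. Running them before validate_bbox
# exists fails with NameError, which is the expected "red" state.
def test_validate_bbox_rejects_inverted_latitudes():
    try:
        validate_bbox(38.0, -122.0, 37.0, -121.0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for min_lat > max_lat")


def test_validate_bbox_accepts_ordered_bounds():
    assert validate_bbox(37.0, -122.0, 38.0, -121.0) == (37.0, -122.0, 38.0, -121.0)


# Step 2 (Green): the smallest implementation that makes both tests pass.
def validate_bbox(min_lat, min_lon, max_lat, max_lon):
    if min_lat > max_lat or min_lon > max_lon:
        raise ValueError("bbox minimums must not exceed maximums")
    return (min_lat, min_lon, max_lat, max_lon)


# Step 3 (Refactor): restructure freely, rerunning the tests after each change.
test_validate_bbox_rejects_inverted_latitudes()
test_validate_bbox_accepts_ordered_bounds()
```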
2) pytest Essentials
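pytest discovers `test_*` functions automatically, uses plain `assert` statements, and supports fixtures and parametrization:
import pytest

@pytest.fixture
def sample_data():
    # Minimal illustrative data so test_with_fixture can run
    return [1, 2, 3]

def test_simple():
    assert 2 + 2 == 4

def test_with_fixture(sample_data):
    assert len(sample_data) > 0

# "value" avoids shadowing the built-in input()
@pytest.mark.parametrize("value,expected", [
    (1, 2),
    (2, 4),
    (3, 6),
])
def test_doubling(value, expected):
    assert value * 2 == expected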
3) Testing Pyramid
    /\
   /  \  E2E Tests (Few, Slow)
  /____\
 /      \  Integration Tests (Some, Medium)
/________\  Unit Tests (Many, Fast)
- Unit Tests: Test individual functions/methods in isolation
- Integration Tests: Test how components work together
- End-to-End Tests: Test complete user workflows
Code Walkthrough (this repo)
1) API Smoke Tests
src/day04_testing/tests/test_smoke.py verifies the Day 3 API routes.
from fastapi.testclient import TestClient
from src.day03_api.app import app

def test_tiles_smoke():
    client = TestClient(app)
    resp = client.get("/tiles/0/0/0.mvt")
    assert resp.status_code == 200

def test_stream_features_invalid_bbox():
    client = TestClient(app)
    # min_lat > max_lat, so the API should reject the request
    resp = client.get("/stream-features", params={
        "min_lat": 1, "min_lon": 0,
        "max_lat": 0, "max_lon": 1
    })
    assert resp.status_code == 400
Key Points:
- TestClient: FastAPI's testing utility that simulates HTTP requests
- Arrange-Act-Assert: Clear test structure
- Descriptive names: Test names explain what they're testing
2) Test Client Usage
from fastapi.testclient import TestClient
# Create a test client
client = TestClient(app)
# Test GET requests
response = client.get("/endpoint")
response = client.get("/endpoint", params={"param": "value"})
# Test POST requests
response = client.post("/endpoint", json={"data": "value"})
# Test headers
response = client.get("/endpoint", headers={"Authorization": "Bearer token"})
# Assertions
assert response.status_code == 200
assert response.json()["key"] == "expected_value"
assert "expected_header" in response.headers
Running Tests
# Activate virtual environment
source .venv/bin/activate
# Run all tests
pytest
# Run with verbose output
pytest -v
# Run specific test file
pytest src/day04_testing/tests/test_smoke.py
# Run specific test function
pytest src/day04_testing/tests/test_smoke.py::test_tiles_smoke
Options
# Run tests and show coverage (requires pytest-cov)
pytest --cov=src
# Run tests in parallel (requires pytest-xdist)
pytest -n auto
# Stop on first failure
pytest -x
# Show local variables on failure
pytest -l
# Run only tests matching a pattern
pytest -k "bbox"
Exercises
1) Unit Tests for RoadNetwork
Create test_road_network.py to test the core data processing:
import pytest
from src.day07_mock.mock_test.road_network import RoadNetwork
from unittest.mock import patch, mock_open
import csv

class TestRoadNetwork:
    @pytest.fixture
    def sample_csv_data(self):
        return """road_id,name,geometry,speed_limit,road_type,last_updated
R001,El Camino Real,"LINESTRING(-122.123 37.456, -122.124 37.457)",50,arterial,2024-01-15T10:00:00Z
R002,Page Mill Rd,"LINESTRING(-122.124 37.457, -122.125 37.458)",40,residential,2024-01-15T10:00:00Z"""

    @pytest.fixture
    def road_network(self, sample_csv_data):
        with patch("builtins.open", mock_open(read_data=sample_csv_data)):
            with patch("csv.DictReader") as mock_reader:
                mock_reader.return_value = [
                    {
                        "road_id": "R001",
                        "name": "El Camino Real",
                        "geometry": "LINESTRING(-122.123 37.456, -122.124 37.457)",
                        "speed_limit": "50",
                        "road_type": "arterial",
                        "last_updated": "2024-01-15T10:00:00Z"
                    },
                    {
                        "road_id": "R002",
                        "name": "Page Mill Rd",
                        "geometry": "LINESTRING(-122.124 37.457, -122.125 37.458)",
                        "speed_limit": "40",
                        "road_type": "residential",
                        "last_updated": "2024-01-15T10:00:00Z"
                    }
                ]
                network = RoadNetwork("dummy_path.csv")
                return network

    def test_loads_roads_correctly(self, road_network):
        assert len(road_network._roads) == 2
        assert "R001" in road_network._roads
        assert "R002" in road_network._roads

    def test_find_roads_in_bbox(self, road_network):
        roads = road_network.find_roads_in_bbox(37.45, -122.13, 37.46, -122.12)
        assert len(roads) == 2

    def test_get_connected_roads(self, road_network):
        connected = road_network.get_connected_roads("R001")
        assert "R002" in connected

    def test_update_road(self, road_network):
        updated = road_network.update_road("R001", speed_limit=60)
        assert updated.speed_limit == 60
        assert updated.road_type == "arterial"  # Unchanged
2) API Tests
Create comprehensive API tests:
import pytest
from fastapi.testclient import TestClient
from src.day07_mock.mock_test.api import app

class TestAPI:
    @pytest.fixture
    def client(self):
        return TestClient(app)

    def test_roads_bbox_valid(self, client):
        response = client.get("/roads/bbox", params={
            "min_lat": 37.0, "min_lon": -122.0,
            "max_lat": 38.0, "max_lon": -121.0
        })
        assert response.status_code == 200
        data = response.json()
        assert data["type"] == "FeatureCollection"
        assert "features" in data

    def test_roads_bbox_invalid(self, client):
        response = client.get("/roads/bbox", params={
            "min_lat": 38.0, "min_lon": -122.0,
            "max_lat": 37.0, "max_lon": -121.0  # Invalid bbox
        })
        assert response.status_code == 400

    def test_roads_bbox_pagination(self, client):
        response = client.get("/roads/bbox", params={
            "min_lat": 37.0, "min_lon": -122.0,
            "max_lat": 38.0, "max_lon": -121.0,
            "limit": 5, "offset": 0
        })
        assert response.status_code == 200

    def test_connected_roads(self, client):
        response = client.get("/roads/R001/connected")
        assert response.status_code == 200
        assert isinstance(response.json(), list)

    def test_update_road(self, client):
        update_data = {"speed_limit": 55}
        response = client.post("/roads/R001/update", json=update_data)
        assert response.status_code == 200
        data = response.json()
        assert data["properties"]["speed_limit"] == 55

    def test_update_road_invalid(self, client):
        update_data = {"speed_limit": -5}  # Invalid speed
        response = client.post("/roads/R001/update", json=update_data)
        assert response.status_code == 422  # Validation error
3) Property-Based Testing with Hypothesis
from hypothesis import given, strategies as st
from src.day07_mock.mock_test.road_network import RoadNetwork

class TestRoadNetworkProperties:
    @given(
        min_lat=st.floats(min_value=-90, max_value=90),
        min_lon=st.floats(min_value=-180, max_value=180),
        max_lat=st.floats(min_value=-90, max_value=90),
        max_lon=st.floats(min_value=-180, max_value=180)
    )
    def test_bbox_query_always_returns_list(self, min_lat, min_lon, max_lat, max_lon):
        # Order the bounds instead of silently returning on inverted draws,
        # so every generated example actually exercises the query
        min_lat, max_lat = sorted((min_lat, max_lat))
        min_lon, max_lon = sorted((min_lon, max_lon))
        # Create a minimal network for testing
        network = RoadNetwork("dummy_path.csv")
        result = network.find_roads_in_bbox(min_lat, min_lon, max_lat, max_lon)
        assert isinstance(result, list)

    @given(
        road_id=st.text(min_size=1, max_size=10)
    )
    def test_connected_roads_always_returns_list(self, road_id):
        network = RoadNetwork("dummy_path.csv")
        result = network.get_connected_roads(road_id)
        assert isinstance(result, list)
        assert all(isinstance(x, str) for x in result)
Advanced Techniques
1) Fixtures and Dependency Injection
import pytest
from unittest.mock import Mock

@pytest.fixture
def mock_database():
    """Mock database connection for testing."""
    db = Mock()
    db.query.return_value.filter.return_value.all.return_value = [
        {"id": 1, "name": "Test Road"}
    ]
    return db

@pytest.fixture
def sample_geojson():
    """Sample GeoJSON data for testing."""
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {"id": "R001", "name": "Test Road"},
                "geometry": {
                    "type": "LineString",
                    "coordinates": [[-122.123, 37.456], [-122.124, 37.457]]
                }
            }
        ]
    }

def test_with_fixtures(mock_database, sample_geojson):
    # Exercise the mocked query chain first; query.called is False until then
    rows = mock_database.query("roads").filter("id = 1").all()
    assert rows == [{"id": 1, "name": "Test Road"}]
    assert mock_database.query.called
    assert len(sample_geojson["features"]) == 1
2) Parametrized Tests
import pytest
from src.day07_mock.mock_test.road_network import RoadNetwork

@pytest.mark.parametrize("bbox,expected_count", [
    ((37.0, -122.0, 38.0, -121.0), 2),  # Large bbox
    ((37.45, -122.13, 37.46, -122.12), 2),  # Small bbox
    ((0.0, 0.0, 1.0, 1.0), 0),  # No roads in this area
])
def test_bbox_queries(bbox, expected_count):
    min_lat, min_lon, max_lat, max_lon = bbox
    network = RoadNetwork("test_data.csv")
    roads = network.find_roads_in_bbox(min_lat, min_lon, max_lat, max_lon)
    assert len(roads) == expected_count

@pytest.mark.parametrize("invalid_bbox", [
    (38.0, -122.0, 37.0, -121.0),  # min_lat > max_lat
    (37.0, -121.0, 38.0, -122.0),  # min_lon > max_lon
])
def test_invalid_bbox_raises_error(invalid_bbox):
    min_lat, min_lon, max_lat, max_lon = invalid_bbox
    network = RoadNetwork("test_data.csv")
    with pytest.raises(ValueError):
        network.find_roads_in_bbox(min_lat, min_lon, max_lat, max_lon)
3) Mocking External Dependencies
import asyncio
from unittest.mock import AsyncMock, MagicMock

from src.day01_concurrency.tile_fetcher import fetch_tile

def test_tile_fetching_with_mock():
    # Set up the mock response
    mock_response = MagicMock()
    mock_response.content = b"fake_tile_data"
    mock_response.raise_for_status.return_value = None

    # Assuming fetch_tile awaits client.get(...), the mock must be awaitable,
    # so use AsyncMock; the result of a plain MagicMock call cannot be awaited
    mock_client = MagicMock()
    mock_client.get = AsyncMock(return_value=mock_response)

    # The client is passed in directly, so httpx itself needs no patching
    result = asyncio.run(fetch_tile(mock_client, 5, 10, 12))
    assert result == b"fake_tile_data"
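Mocking an async client only works if the mocked method can be awaited. The sketch below shows the difference between `MagicMock` and `AsyncMock` using a hypothetical stand-in for `fetch_tile`; the real function in `src.day01_concurrency.tile_fetcher` may have a different body and URL scheme.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


# Hypothetical stand-in; assumes the fetcher awaits client.get(...)
async def fetch_tile(client, z, x, y):
    response = await client.get(f"/tiles/{z}/{x}/{y}.mvt")
    response.raise_for_status()
    return response.content


def fetch_with_async_mock():
    response = MagicMock()
    response.content = b"fake_tile_data"
    client = MagicMock()
    client.get = AsyncMock(return_value=response)  # awaitable
    result = asyncio.run(fetch_tile(client, 5, 10, 12))
    client.get.assert_awaited_once_with("/tiles/5/10/12.mvt")
    return result


def plain_magicmock_is_not_awaitable():
    client = MagicMock()  # client.get(...) returns a non-awaitable MagicMock
    try:
        asyncio.run(fetch_tile(client, 5, 10, 12))
    except TypeError:
        return True
    return False


tile = fetch_with_async_mock()
magicmock_failed = plain_magicmock_is_not_awaitable()
```

`assert_awaited_once_with` also checks the arguments, which catches wrong tile coordinates as well as a missing `await`.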
Load Testing (lightweight)
1) pytest-benchmark
import pytest
from fastapi.testclient import TestClient
from src.day07_mock.mock_test.api import app

def test_api_performance(benchmark):
    client = TestClient(app)

    def make_request():
        return client.get("/roads/bbox", params={
            "min_lat": 37.0, "min_lon": -122.0,
            "max_lat": 38.0, "max_lon": -121.0
        })

    result = benchmark(make_request)
    assert result.status_code == 200

def test_concurrent_requests(benchmark):
    import asyncio
    import httpx

    async def make_concurrent_requests():
        async with httpx.AsyncClient() as client:
            tasks = [
                client.get("http://localhost:8000/roads/bbox", params={
                    "min_lat": 37.0, "min_lon": -122.0,
                    "max_lat": 38.0, "max_lon": -121.0
                })
                for _ in range(10)
            ]
            responses = await asyncio.gather(*tasks)
            return responses

    # This would need a running server
    # result = benchmark(lambda: asyncio.run(make_concurrent_requests()))
2) Load Testing with Locust
Create locustfile.py for more sophisticated load testing:
from locust import HttpUser, task, between

class RoadNetworkUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def test_bbox_query(self):
        self.client.get("/roads/bbox", params={
            "min_lat": 37.0, "min_lon": -122.0,
            "max_lat": 38.0, "max_lon": -121.0
        })

    @task(1)
    def test_connected_roads(self):
        self.client.get("/roads/R001/connected")

    @task(1)
    def test_update_road(self):
        self.client.post("/roads/R001/update", json={"speed_limit": 55})
Test Coverage
1. Measuring Coverage
# Install coverage tools
pip install pytest-cov
# Run tests with coverage
pytest --cov=src --cov-report=html
# Generate detailed coverage report
pytest --cov=src --cov-report=term-missing
2. Coverage Configuration
Create .coveragerc file:
[run]
source = src
omit =
    */tests/*
    */__init__.py

[report]
exclude_lines =
    pragma: no cover
    def __repr__
    raise AssertionError
    raise NotImplementedError
Best Practices
1. Test Organization
tests/
├── unit/                 # Unit tests
│   ├── test_road_network.py
│   └── test_api.py
├── integration/          # Integration tests
│   └── test_end_to_end.py
└── conftest.py           # Shared fixtures
2. Test Naming
# ✅ Descriptive test names
def test_find_roads_in_bbox_returns_empty_list_for_area_with_no_roads():
    pass

def test_update_road_speed_limit_updates_correct_field():
    pass

# ❌ Unclear test names
def test_bbox():
    pass

def test_update():
    pass
3. Test Data Management
import pytest
from unittest.mock import mock_open, patch

@pytest.fixture(scope="session")
def sample_roads_data():
    """Load test data once for all tests."""
    return load_test_data_from_file("sample_roads.csv")

@pytest.fixture
def road_network(sample_roads_data):
    """Create fresh network for each test."""
    with patch("builtins.open", mock_open(read_data=sample_roads_data)):
        return RoadNetwork("dummy_path.csv")
Common Pitfalls
1. Testing Implementation Details
# ❌ Testing internal state
def test_internal_index():
    network = RoadNetwork("data.csv")
    assert "_index" in network.__dict__  # Too specific

# ✅ Testing public behavior
def test_can_find_road_by_id():
    network = RoadNetwork("data.csv")
    road = network.get_road_by_id("R001")
    assert road is not None
2. Over-Mocking
# ❌ Mocking everything
@patch("pathlib.Path")
@patch("csv.DictReader")
@patch("shapely.wkt.loads")
def test_over_mocked(mock_loads, mock_reader, mock_path):
    # Test becomes brittle and hard to maintain
    ...

# ✅ Mock only external dependencies
def test_with_minimal_mocking():
    with patch("builtins.open", mock_open(read_data=sample_csv)):
        network = RoadNetwork("data.csv")
    # Test the actual logic
3. Ignoring Edge Cases
# ❌ Only testing happy path
def test_bbox_query():
    roads = network.find_roads_in_bbox(37.0, -122.0, 38.0, -121.0)
    assert len(roads) > 0

# ✅ Test edge cases too
def test_bbox_query_empty_area():
    roads = network.find_roads_in_bbox(0.0, 0.0, 1.0, 1.0)
    assert len(roads) == 0

def test_bbox_query_invalid_coordinates():
    with pytest.raises(ValueError):
        network.find_roads_in_bbox(38.0, -122.0, 37.0, -121.0)
Next Steps
After completing this day:
1. Achieve >80% test coverage on core functionality
2. Add integration tests with real data
3. Implement performance benchmarks
4. Set up continuous integration (CI) pipeline
5. Add property-based tests for complex logic
Advanced Testing Patterns
Chaos Engineering for Geospatial Systems
import asyncio
import random
from typing import Tuple

class SpatialChaosExperiment:
    """Chaos engineering for testing spatial system resilience."""

    # "SpatialService" is quoted because the service class is defined elsewhere
    def __init__(self, spatial_service: "SpatialService"):
        self.service = spatial_service
        self.failures_injected = []

    async def inject_coordinate_drift(self, drift_factor: float = 0.001):
        """Inject small coordinate drifts to test precision handling."""
        original_transform = self.service.coordinate_transformer.transform

        def drifted_transform(x: float, y: float) -> Tuple[float, float]:
            drift_x = x + (random.random() - 0.5) * drift_factor
            drift_y = y + (random.random() - 0.5) * drift_factor
            return original_transform(drift_x, drift_y)

        self.service.coordinate_transformer.transform = drifted_transform
        self.failures_injected.append("coordinate_drift")

    async def simulate_spatial_index_corruption(self):
        """Simulate partial spatial index corruption."""
        # Randomly remove some entries from spatial index
        if hasattr(self.service, '_spatial_index'):
            index = self.service._spatial_index
            entries = list(index.intersection(index.bounds))
            corrupted_entries = random.sample(entries, len(entries) // 10)
            for entry_id in corrupted_entries:
                try:
                    index.delete(entry_id, index.get_bounds(entry_id))
                except Exception:
                    pass  # Index might already be corrupted
        self.failures_injected.append("index_corruption")

    async def induce_memory_pressure(self, target_mb: int = 100):
        """Create memory pressure to test resource handling."""
        memory_hog = []
        try:
            # Allocate memory in chunks
            for _ in range(target_mb):
                memory_hog.append(b'x' * (1024 * 1024))  # 1MB chunks
            # Hold memory for test duration
            await asyncio.sleep(1)
        finally:
            del memory_hog
        self.failures_injected.append("memory_pressure")
import pytest_asyncio

# pytest-asyncio's strict mode needs its own decorator for async fixtures
@pytest_asyncio.fixture
async def chaos_experiment(spatial_service):
    experiment = SpatialChaosExperiment(spatial_service)
    yield experiment
    # Cleanup: restore original state if possible
    if hasattr(spatial_service, '_reset_to_clean_state'):
        await spatial_service._reset_to_clean_state()
class TestSpatialResilience:
    """Test suite for spatial system resilience under chaos conditions."""

    @pytest.mark.asyncio
    async def test_bbox_query_under_coordinate_drift(self, chaos_experiment):
        """Test that bbox queries remain stable under coordinate drift."""
        bbox = (37.0, -122.0, 38.0, -121.0)

        # Baseline query, taken before any fault is injected
        results_before = await chaos_experiment.service.query_bbox(bbox)

        # Inject coordinate drift, then repeat the same query
        await chaos_experiment.inject_coordinate_drift(drift_factor=0.0001)
        results_after = await chaos_experiment.service.query_bbox(bbox)

        # Results should be similar (allowing for some drift tolerance)
        assert abs(len(results_before) - len(results_after)) <= 2

    @pytest.mark.asyncio
    async def test_spatial_operations_during_memory_pressure(self, chaos_experiment):
        """Test spatial operations continue working under memory pressure."""
        # Create memory pressure
        memory_task = asyncio.create_task(
            chaos_experiment.induce_memory_pressure(target_mb=50)
        )
        try:
            # Perform spatial operations during memory pressure
            bbox = (37.0, -122.0, 38.0, -121.0)
            results = await chaos_experiment.service.query_bbox(bbox)
            # Should not fail completely, might have reduced performance
            assert isinstance(results, list)
        finally:
            await memory_task
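Fault injection is easier to reason about when each fault is undone automatically. The sketch below wraps coordinate drift in a context manager that restores the original transform even if the test body raises. `Transformer` is a hypothetical stand-in for the service's coordinate transformer, and the seeded generator is an added assumption that makes failures reproducible.

```python
import random
from contextlib import contextmanager


class Transformer:
    """Hypothetical identity transformer used only for this sketch."""

    def transform(self, x, y):
        return (x, y)


@contextmanager
def coordinate_drift(transformer, drift_factor=0.001, seed=0):
    """Temporarily wrap transformer.transform with bounded random drift."""
    rng = random.Random(seed)  # local, seeded RNG for reproducible drift
    original = transformer.transform

    def drifted(x, y):
        dx = (rng.random() - 0.5) * drift_factor
        dy = (rng.random() - 0.5) * drift_factor
        return original(x + dx, y + dy)

    transformer.transform = drifted
    try:
        yield transformer
    finally:
        transformer.transform = original  # always undo the injected fault


t = Transformer()
with coordinate_drift(t, drift_factor=0.001):
    x, y = t.transform(-122.0, 37.0)
    max_error = max(abs(x + 122.0), abs(y - 37.0))
restored = t.transform(-122.0, 37.0)
```

Each axis drifts by at most half of `drift_factor`, which gives a concrete bound to assert against.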
Mutation Testing for Spatial Algorithms
# Install: pip install mutmut
# Usage: mutmut run --paths-to-mutate=src/spatial_algorithms/
class TestSpatialAlgorithmRobustness:
    """Test suite designed to catch mutations in spatial algorithms."""

    def test_distance_calculation_edge_cases(self):
        """Comprehensive tests to catch distance calculation mutations."""
        from src.spatial_algorithms import calculate_distance
        # Test identical points
        assert calculate_distance(0, 0, 0, 0) == 0
        # Test symmetric property
        d1 = calculate_distance(1, 1, 2, 2)
        d2 = calculate_distance(2, 2, 1, 1)
        assert abs(d1 - d2) < 1e-10
        # Test triangle inequality
        p1, p2, p3 = (0, 0), (1, 0), (1, 1)
        d12 = calculate_distance(*p1, *p2)
        d23 = calculate_distance(*p2, *p3)
        d13 = calculate_distance(*p1, *p3)
        assert d12 + d23 >= d13 - 1e-10
        # Test known distances
        assert abs(calculate_distance(0, 0, 3, 4) - 5.0) < 1e-10
        # Test extreme values
        huge_val = 1e10
        assert calculate_distance(0, 0, huge_val, 0) == huge_val

    def test_bbox_intersection_mutations(self):
        """Tests designed to catch bbox intersection logic mutations."""
        from src.spatial_algorithms import bbox_intersects
        # Test partial overlap
        bbox1 = (0, 0, 2, 2)
        bbox2 = (1, 1, 3, 3)
        assert bbox_intersects(bbox1, bbox2)
        # Test no overlap
        bbox1 = (0, 0, 1, 1)
        bbox2 = (2, 2, 3, 3)
        assert not bbox_intersects(bbox1, bbox2)
        # Test edge touching (depends on implementation)
        bbox1 = (0, 0, 1, 1)
        bbox2 = (1, 1, 2, 2)
        result = bbox_intersects(bbox1, bbox2)
        assert isinstance(result, bool)  # Should not crash
        # Test degenerate bboxes
        bbox1 = (0, 0, 0, 0)  # Point
        bbox2 = (0, 0, 1, 1)
        result = bbox_intersects(bbox1, bbox2)
        assert isinstance(result, bool)
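To see why boundary assertions matter for mutation testing, consider the hand-written mutant below. It is a sketch under two assumptions: `bbox_intersects` here is an illustrative implementation (not the one in `src.spatial_algorithms`), and touching boxes are defined to intersect. A suite that only checks the return type on the touching case lets the mutant survive; pinning the expected value kills it.

```python
def bbox_intersects(a, b):
    """Closed-interval overlap test for (min_x, min_y, max_x, max_y) boxes."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def bbox_intersects_mutant(a, b):
    """Same logic with one `<=` changed to `<`, as a mutation tool might do."""
    return a[0] <= b[2] and b[0] < a[2] and a[1] <= b[3] and b[1] <= a[3]


def suite_passes(fn, pin_touching_case):
    """Return True if fn passes the checks (a surviving mutant also passes)."""
    checks = [
        fn((0, 0, 2, 2), (1, 1, 3, 3)) is True,   # partial overlap
        fn((0, 0, 1, 1), (2, 2, 3, 3)) is False,  # disjoint
    ]
    if pin_touching_case:
        checks.append(fn((0, 0, 1, 1), (1, 1, 2, 2)) is True)  # corner touch
    return all(checks)


mutant_survives_weak_suite = suite_passes(bbox_intersects_mutant, False)
mutant_killed_by_strong_suite = not suite_passes(bbox_intersects_mutant, True)
original_passes_strong_suite = suite_passes(bbox_intersects, True)
```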
Fuzzing for Geospatial Input Validation
import hypothesis.strategies as st
from hypothesis import given, settings, HealthCheck

class TestGeospatialInputFuzzing:
    """Fuzz testing for geospatial input handling."""

    @given(
        lat=st.floats(min_value=-90, max_value=90, allow_nan=False),
        lon=st.floats(min_value=-180, max_value=180, allow_nan=False)
    )
    def test_coordinate_validation_never_crashes(self, lat, lon):
        """Coordinate validation should never crash, regardless of input."""
        from src.spatial_validation import validate_coordinate
        # Should either return a bool or raise a specific exception
        try:
            result = validate_coordinate(lat, lon)
            assert isinstance(result, bool)
        except (ValueError, TypeError) as e:
            # Acceptable to raise validation errors
            assert str(e)  # Error message should not be empty

    @given(
        coords=st.lists(
            st.tuples(
                st.floats(min_value=-180, max_value=180, allow_nan=False),
                st.floats(min_value=-90, max_value=90, allow_nan=False)
            ),
            min_size=3,
            max_size=1000
        )
    )
    @settings(suppress_health_check=[HealthCheck.too_slow], deadline=5000)
    def test_polygon_validation_fuzzing(self, coords):
        """Polygon validation should handle arbitrary coordinate sequences."""
        from src.spatial_validation import validate_polygon
        # Ensure polygon is closed
        if coords and coords[0] != coords[-1]:
            coords.append(coords[0])
        try:
            result = validate_polygon(coords)
            assert isinstance(result, bool)
            # If valid, polygon should have basic properties
            if result:
                assert len(coords) >= 4  # Minimum for closed polygon
                assert coords[0] == coords[-1]  # Must be closed
        except (ValueError, TypeError, OverflowError):
            # Acceptable exceptions for invalid input
            pass

    @given(
        geojson_like=st.recursive(
            st.one_of(
                st.floats(allow_nan=False, allow_infinity=False),
                st.text(),
                st.booleans(),
                st.none()
            ),
            lambda children: st.one_of(
                st.lists(children, max_size=20),
                st.dictionaries(st.text(max_size=50), children, max_size=10)
            ),
            max_leaves=100
        )
    )
    def test_geojson_parser_robustness(self, geojson_like):
        """GeoJSON parser should handle arbitrary JSON-like structures."""
        from src.geojson_parser import parse_geojson
        try:
            result = parse_geojson(geojson_like)
            # If parsing succeeds, result should be valid
            assert hasattr(result, 'type') or result is None
        except (ValueError, TypeError, KeyError):
            # Expected for invalid GeoJSON
            pass
Performance and Load Testing Framework
import asyncio
import random
import time
from typing import Tuple

import httpx

class SpatialLoadTestFramework:
    """Framework for load testing spatial services."""

    def __init__(self, base_url: str):
        self.base_url = base_url
        self.session = httpx.AsyncClient()
        self.metrics = {
            'requests_sent': 0,
            'requests_succeeded': 0,
            'requests_failed': 0,
            'total_latency': 0,
            'min_latency': float('inf'),
            'max_latency': 0,
            'latencies': []
        }

    def _record_success(self, latency: float) -> None:
        """Minimal bookkeeping sketch for a successful request."""
        m = self.metrics
        m['requests_sent'] += 1
        m['requests_succeeded'] += 1
        m['total_latency'] += latency
        m['min_latency'] = min(m['min_latency'], latency)
        m['max_latency'] = max(m['max_latency'], latency)
        m['latencies'].append(latency)

    def _record_failure(self, reason: str) -> None:
        """Minimal bookkeeping sketch for a failed request."""
        self.metrics['requests_sent'] += 1
        self.metrics['requests_failed'] += 1

    async def generate_spatial_workload(
        self,
        concurrent_users: int = 50,
        requests_per_user: int = 100,
        test_duration_seconds: int = 300
    ):
        """Generate realistic spatial query workload."""

        async def user_session(user_id: int):
            """Simulate individual user behavior."""
            requests_made = 0
            start_time = time.time()
            while (requests_made < requests_per_user and
                   time.time() - start_time < test_duration_seconds):
                # Generate realistic spatial query
                bbox = self._generate_realistic_bbox()
                request_start = time.time()
                try:
                    response = await self.session.get(
                        f"{self.base_url}/api/features/bbox",
                        params={
                            'min_lat': bbox[0], 'min_lon': bbox[1],
                            'max_lat': bbox[2], 'max_lon': bbox[3]
                        },
                        timeout=30.0
                    )
                    latency = time.time() - request_start
                    # Count each request once: success only for HTTP 200
                    if response.status_code == 200:
                        self._record_success(latency)
                    else:
                        self._record_failure(f"HTTP {response.status_code}")
                except Exception as e:
                    self._record_failure(str(e))
                requests_made += 1
                # Realistic user think time
                await asyncio.sleep(random.uniform(0.1, 2.0))

        # Run concurrent user sessions
        tasks = [
            asyncio.create_task(user_session(i))
            for i in range(concurrent_users)
        ]
        await asyncio.gather(*tasks, return_exceptions=True)
        # _generate_report() is not shown here; the test that consumes it
        # expects success_rate, p95_latency, and avg_latency keys
        return self._generate_report()

    def _generate_realistic_bbox(self) -> Tuple[float, float, float, float]:
        """Generate realistic bounding boxes based on typical usage patterns."""
        # Focus on populated areas with varying zoom levels
        city_centers = [
            (37.7749, -122.4194),  # San Francisco
            (40.7128, -74.0060),   # New York
            (51.5074, -0.1278),    # London
            (35.6762, 139.6503),   # Tokyo
        ]
        center = random.choice(city_centers)
        # Random zoom level affects bbox size
        zoom_level = random.randint(10, 18)
        size = 1.0 / (2 ** (zoom_level - 10))  # Smaller at higher zoom
        return (
            center[0] - size,
            center[1] - size,
            center[0] + size,
            center[1] + size
        )
@pytest.mark.asyncio
@pytest.mark.load_test
async def test_spatial_api_under_load():
    """Load test for spatial API endpoints."""
    framework = SpatialLoadTestFramework("http://localhost:8000")
    report = await framework.generate_spatial_workload(
        concurrent_users=25,
        requests_per_user=50,
        test_duration_seconds=60
    )
    # Performance assertions
    assert report['success_rate'] > 0.95  # 95% success rate
    assert report['p95_latency'] < 2.0  # 95th percentile under 2 seconds
    assert report['avg_latency'] < 0.5  # Average under 500ms
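The assertions above read `success_rate`, `p95_latency`, and `avg_latency` from the report. One way such a report could be derived from the collected metrics is sketched below. `generate_report` and `percentile` are hypothetical helpers, and the nearest-rank percentile is only one of several common definitions.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending list; pct in (0, 100]."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def generate_report(metrics):
    """Summarize a metrics dict shaped like the framework's self.metrics."""
    latencies = sorted(metrics["latencies"])
    sent = metrics["requests_sent"]
    return {
        "success_rate": metrics["requests_succeeded"] / sent if sent else 0.0,
        "avg_latency": sum(latencies) / len(latencies) if latencies else 0.0,
        "p95_latency": percentile(latencies, 95) if latencies else 0.0,
    }


sample = {
    "requests_sent": 20,
    "requests_succeeded": 19,
    "latencies": [0.1 * i for i in range(1, 21)],  # 0.1 s through 2.0 s
}
report = generate_report(sample)
```

Percentiles need the full latency list, which is why the metrics keep `latencies` alongside the running totals.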
CI/CD Integration Patterns
# pytest.ini configuration for geospatial testing
# (pytest.ini uses a [pytest] section; [tool:pytest] is only for setup.cfg)
"""
[pytest]
minversion = 6.0
addopts =
    -ra
    -q
    --strict-markers
    --strict-config
    --cov=src
    --cov-branch
    --cov-report=term-missing
    --cov-report=html
    --cov-fail-under=80
    --hypothesis-show-statistics
markers =
    unit: Unit tests
    integration: Integration tests
    load_test: Load and performance tests
    chaos: Chaos engineering tests
    slow: Tests that take more than 5 seconds
    spatial: Tests that require spatial data/operations
    external: Tests that require external services
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
filterwarnings =
    error
    ignore::UserWarning
    ignore::DeprecationWarning
"""
import random

class GeospatialTestConfig:
    """Configuration for geospatial test environments."""

    @staticmethod
    def setup_test_database():
        """Set up PostGIS test database."""
        import psycopg2
        from psycopg2.extensions import ISOLATION_LEVEL_AUTOCOMMIT
        # Create test database with PostGIS
        # (CREATE DATABASE cannot run inside a transaction, hence autocommit)
        conn = psycopg2.connect(
            host="localhost",
            user="postgres",
            password="password"
        )
        conn.set_isolation_level(ISOLATION_LEVEL_AUTOCOMMIT)
        cursor = conn.cursor()
        cursor.execute("DROP DATABASE IF EXISTS test_geospatial")
        cursor.execute("CREATE DATABASE test_geospatial")
        conn.close()
        # Connect to test database and enable PostGIS
        test_conn = psycopg2.connect(
            host="localhost",
            user="postgres",
            password="password",
            database="test_geospatial"
        )
        test_cursor = test_conn.cursor()
        test_cursor.execute("CREATE EXTENSION IF NOT EXISTS postgis")
        test_conn.commit()
        test_conn.close()
        return "postgresql://postgres:password@localhost/test_geospatial"

    @staticmethod
    def setup_test_data():
        """Generate test spatial datasets."""
        import geopandas as gpd
        from shapely.geometry import Point
        # Generate synthetic spatial data
        test_features = []
        for i in range(1000):
            # Random points around San Francisco
            lat = 37.7749 + random.uniform(-0.1, 0.1)
            lon = -122.4194 + random.uniform(-0.1, 0.1)
            # Attributes are kept as flat columns; a nested "properties" dict
            # column is not a field type that GeoJSON output can store
            feature = {
                'id': f'feature_{i}',
                'name': f'Test Feature {i}',
                'category': random.choice(['restaurant', 'shop', 'park']),
                'rating': random.uniform(1.0, 5.0),
                'geometry': Point(lon, lat),
            }
            test_features.append(feature)
        # Create GeoDataFrame (WGS 84 lon/lat) and save as test file
        gdf = gpd.GeoDataFrame(test_features, geometry='geometry', crs='EPSG:4326')
        gdf.to_file('test_data/synthetic_features.geojson', driver='GeoJSON')
        return gdf
# GitHub Actions workflow for geospatial testing
"""
name: Geospatial Testing Pipeline

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgis/postgis:13-3.1
        env:
          POSTGRES_PASSWORD: password
          POSTGRES_DB: test_geospatial
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432

    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install GDAL
        run: |
          sudo apt-get update
          sudo apt-get install -y gdal-bin libgdal-dev
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install -r requirements-test.txt
      - name: Run unit tests
        run: pytest tests/ -m "unit" --cov=src --cov-report=xml
      - name: Run integration tests
        run: pytest tests/ -m "integration"
        env:
          DATABASE_URL: postgresql://postgres:password@localhost:5432/test_geospatial
      - name: Run spatial accuracy tests
        run: pytest tests/ -m "spatial" --hypothesis-show-statistics
      - name: Upload coverage reports
        uses: codecov/codecov-action@v3
        with:
          file: ./coverage.xml
          flags: unittests
          name: codecov-umbrella
"""
Data-Driven Testing for Geospatial Systems
Test Data Management
import hashlib
import json
import math
import pickle
import random
from pathlib import Path
from typing import Any, Callable, Optional, Tuple

class SpatialTestDataManager:
    """Manages test datasets for geospatial testing."""

    def __init__(self, data_dir: Path):
        self.data_dir = Path(data_dir)
        self.datasets = {}

    def register_dataset(self, name: str, generator_func: Callable):
        """Register a test dataset generator."""
        self.datasets[name] = generator_func

    def get_dataset(self, name: str, **kwargs) -> Any:
        """Get or generate a test dataset."""
        # hash() of a str changes between interpreter runs, so the cache key
        # is derived from a stable digest of the arguments instead
        args_digest = hashlib.sha256(
            json.dumps(kwargs, sort_keys=True, default=str).encode("utf-8")
        ).hexdigest()[:16]
        cache_key = f"{name}_{args_digest}"
        cache_file = self.data_dir / f"{cache_key}.pkl"
        if cache_file.exists():
            with open(cache_file, 'rb') as f:
                return pickle.load(f)
        # Generate dataset
        dataset = self.datasets[name](**kwargs)
        # Cache for future use
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        with open(cache_file, 'wb') as f:
            pickle.dump(dataset, f)
        return dataset

# Test data generators
def generate_road_network(num_roads: int = 100, area_bounds: Optional[Tuple] = None):
    """Generate synthetic road network for testing."""
    if area_bounds is None:
        area_bounds = (37.7, -122.5, 37.8, -122.4)  # SF area
    roads = []
    for i in range(num_roads):
        # Generate random road segments
        start_lat = random.uniform(area_bounds[0], area_bounds[2])
        start_lon = random.uniform(area_bounds[1], area_bounds[3])
        # Random direction and length
        bearing = random.uniform(0, 360)
        length = random.uniform(0.001, 0.01)  # degrees
        end_lat = start_lat + length * math.cos(math.radians(bearing))
        end_lon = start_lon + length * math.sin(math.radians(bearing))
        road = {
            'id': f'road_{i}',
            'geometry': f'LINESTRING({start_lon} {start_lat}, {end_lon} {end_lat})',
            'properties': {
                'name': f'Test Road {i}',
                'speed_limit': random.choice([25, 35, 45, 55]),
                'road_type': random.choice(['residential', 'arterial', 'highway'])
            }
        }
        roads.append(road)
    return roads
# Usage in tests
@pytest.fixture(scope="session")
def spatial_test_data():
    manager = SpatialTestDataManager(Path("test_data"))
    manager.register_dataset("road_network", generate_road_network)
    return manager

class TestWithSpatialData:
    def test_road_network_loading(self, spatial_test_data):
        """Test road network loading with different dataset sizes."""
        small_network = spatial_test_data.get_dataset("road_network", num_roads=10)
        large_network = spatial_test_data.get_dataset("road_network", num_roads=1000)
        assert len(small_network) == 10
        assert len(large_network) == 1000
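Generated datasets are only useful for debugging if the same inputs reproduce the same data. A minimal sketch of a seeded generator follows. `generate_points` and its `seed` parameter are hypothetical additions, not part of the generators above, and the bounds are an illustrative San Francisco-area box.

```python
import random


def generate_points(num_points=10, seed=42):
    """Generate reproducible (lat, lon) pairs inside an illustrative box."""
    rng = random.Random(seed)  # local RNG; leaves the global random state alone
    return [
        (rng.uniform(37.7, 37.8), rng.uniform(-122.5, -122.4))
        for _ in range(num_points)
    ]


first_run = generate_points(5, seed=1)
second_run = generate_points(5, seed=1)
other_seed = generate_points(5, seed=2)
```

Passing the seed as a keyword argument also means it can take part in any cache key built from the generator's arguments.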
Resources
Testing Frameworks and Tools
- pytest Documentation
- pytest-cov Coverage Plugin
- Hypothesis Property-Based Testing
- mutmut Mutation Testing
- locust Load Testing
Quality Engineering
- FastAPI Testing Guide
- Test-Driven Development by Example
- Growing Object-Oriented Software, Guided by Tests
- Building Quality Software by NASA
The testing practices in this module aim to keep geospatial systems robust, reliable, and performant, from normal operation through deliberately injected failure scenarios.