Summary of "Database Testing Strategy" (webinar)
Webinar overview
This summary covers a webinar by Alexander (database engineer / team lead at Bestos Technologies) about database testing strategies for large schemas and migration projects (source → target DB). The talk focused on types of database tests, migration-specific challenges, test generation and automation, running tests at scale, measuring code coverage, and recommended processes and tooling.
Key concepts and testing types
- Test categories:
- Unit tests: functions and procedures (see the unit-test sketch after this list).
- CRUD tests: insert/update/select/delete.
- Integration tests: database ↔ other components.
- Stress / load tests and performance tests.
- Permission and configuration tests.
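A minimal sketch of the unit-test category: call a stored function with fixed inputs and compare against the expected value. The function calc_discount and its semantics are hypothetical, not from the webinar.

```sql
-- Hypothetical unit test for a stored function calc_discount:
-- call it with known inputs and assert on the returned value.
-- (On Oracle, append "FROM dual" to the SELECT.)
SELECT CASE
         WHEN calc_discount(100, 'GOLD') = 15 THEN 'PASS'
         ELSE 'FAIL'
       END AS result;
```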
- Test structure:
- Single test scripts (SQL) with parameterization (a minimal sketch follows this list).
- Grouping of tests into projects or suites for individual or batch runs.
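A sketch of one parameterized single-test script of the kind described above. The table, columns, and bind-parameter names are hypothetical; the test runner is assumed to substitute :run_id and :test_name at execution time.

```sql
-- Hypothetical single-test script: a CRUD-style check on one table.
-- :run_id and :test_name are bind parameters supplied by the runner.
INSERT INTO customers (id, name) VALUES (:run_id, :test_name);

UPDATE customers SET name = :test_name || '_upd' WHERE id = :run_id;

-- Expected result: exactly one row carries the updated value.
SELECT COUNT(*) AS rows_found
FROM customers
WHERE id = :run_id AND name = :test_name || '_upd';

-- Clean up so the test is repeatable and does not disturb other tests.
DELETE FROM customers WHERE id = :run_id;
```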
- Migration-specific checks:
- Object presence and validity after migration.
- Data comparisons, including type and whitespace/format issues.
- Behavioral equivalence of stored code (procedures/functions).
- Preservation and formatting of user-facing error messages.
- Differences between DBMS dialects and transaction modes.
Test-first approach recommended: write tests for the source and expected target behavior before migration where possible.
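As a hedged illustration of the data-comparison checks above, here is a sketch comparing one table between assumed src (source) and tgt (target) schemas. The schema and table names are hypothetical, and EXCEPT is spelled MINUS on Oracle.

```sql
-- Row counts should match after migration.
SELECT (SELECT COUNT(*) FROM src.orders) AS src_rows,
       (SELECT COUNT(*) FROM tgt.orders) AS tgt_rows;

-- Rows present in the source but missing or altered in the target.
-- TRIM guards against the whitespace/format mismatches noted above.
SELECT id, TRIM(status) AS status FROM src.orders
EXCEPT
SELECT id, TRIM(status) AS status FROM tgt.orders;
```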
Large-schema / migration project realities
- Example project scale cited: ~6,000 tables/views, ~5,000 triggers, ~100,000 stored procedures/functions across layers.
- Team and environment constraints:
- Small test teams (2–5 people) vs huge object counts.
- Limited test environments and many customer configurations.
- Repository size limits (e.g., Bitbucket 2 GB) and VPN/access constraints.
- Maintenance cost: tests must be updated as refactorings and conversion rules change; this is a recurring project expense.
Test coverage strategies and scoping
- Scoping choices to decide with stakeholders:
- Test only objects actually called by applications (top-layer).
- Cover layer-by-layer (progressively deeper).
- Cover only migrated objects.
- Attempt whole-database coverage (very costly).
- Prioritization guidance:
- Start with objects exercised by applications (triggers, tables, views) to get early value.
- Expand deeper by layer as time and budget allow.
- Decide up front whether to include auxiliary objects (DB-admin, third-party).
Test creation approaches and automation
- Creation approaches:
- Manual authoring (slow; ~15 minutes per test on average).
- Template-based project generation.
- Intelligent automated generation using schema metadata.
- Automation practices described:
- Inspect the data dictionary (column types, defaults, constraints) to generate inputs (see the generation sketch after this list).
- Use customer-provided test data and application trace logs to derive parameters and sequences.
- Use Perl scripts to generate project/test XML templates; store individual tests as editable JSON for quick fixes.
- Use templates so mass changes (e.g., conversion-rule changes) can be applied programmatically.
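A sketch of the dictionary-driven idea, assuming an information_schema-style dictionary (PostgreSQL/SQL Server style; ALL_TABLES/ALL_TAB_COLUMNS would be the Oracle equivalents). It emits one trivial smoke test per table, which a generation script could then expand into full templates; the webinar's actual Perl templates are not reproduced here.

```sql
-- Generate a minimal SELECT smoke test for every base table.
SELECT 'SELECT COUNT(*) AS cnt FROM '
       || table_schema || '.' || table_name || ';' AS generated_test
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN ('pg_catalog', 'information_schema');

-- Column types, defaults, and nullability drive the generated inputs.
SELECT column_name, data_type, column_default, is_nullable
FROM information_schema.columns
WHERE table_name = 'orders';  -- hypothetical table
```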
- Benefits:
- Faster ramp-up and less tedious than fully manual work.
- Easier bulk adaptation after schema or type changes.
Test execution at scale
- Orchestration:
- Jenkins runs test projects across cloud VMs/instances with multiple parallel runners to reduce wall time.
- Performance example:
- Serial run on one 16 GB / 8-core machine: ~15–17 hours.
- Parallel runs (2 machines) roughly halve run time; 4+ parallel instances can bring runs under a working day.
- Test design goal: minimize test mutual interference (locking) so concurrency is effective.
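One common way to reduce such interference, sketched under the assumption that each Jenkins executor can be pointed at its own schema. CREATE SCHEMA is PostgreSQL/SQL Server syntax (Oracle creates schemas via CREATE USER), and all names are hypothetical.

```sql
-- Give each parallel runner a private schema and a private copy of the
-- tables it mutates, so runners never contend for the same row locks.
CREATE SCHEMA runner_01;
CREATE TABLE runner_01.customers AS
  SELECT * FROM app.customers WHERE 1 = 0;  -- structure only, no rows
```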
Code coverage: measuring and reporting
- Coverage metrics:
- Count of covered objects.
- Line-level vs basic code-block coverage.
- Coverage ratio = covered blocks / total blocks.
- Best practices:
- Start collecting coverage from day 0 to measure progress.
- Define what “100%” means before collecting statistics.
- Exclude unreachable code using pragmas/annotations so coverage stays meaningful (see the pragma sketch below).
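A sketch of the exclusion idea using Oracle 12.2's COVERAGE pragma, which marks blocks as infeasible so they are not counted as uncovered. The function and its defensive branch are hypothetical.

```sql
CREATE OR REPLACE FUNCTION order_state(p_code NUMBER) RETURN VARCHAR2 AS
BEGIN
  IF p_code = 0 THEN
    RETURN 'NEW';
  ELSIF p_code = 1 THEN
    RETURN 'SHIPPED';
  ELSE
    -- Defensive branch that callers can never reach; mark it infeasible
    -- instead of chasing unreachable coverage.
    PRAGMA COVERAGE ('NOT_FEASIBLE_START');
    RAISE_APPLICATION_ERROR(-20001, 'unknown state code: ' || p_code);
    PRAGMA COVERAGE ('NOT_FEASIBLE_END');
  END IF;
END order_state;
/
```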
- Instrumentation and tooling examples:
- Oracle: recent versions (12.2 and later) ship built-in coverage collection: enable collection, run the tests, disable collection, and store the statistics (see the sketch after this list).
- Piggly (Ruby; targets PL/pgSQL): recompiles functions/procedures with embedded trace calls to log execution and produce HTML coverage reports. The speaker’s team adapted it to write the trace data to database tables.
- Custom pipeline: collect logs/traces, filter by test-run sources (IP addresses) or use separate schemas to distinguish test vs application traffic, then merge coverage results.
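A sketch of the Oracle flow, based on the documented DBMS_PLSQL_CODE_COVERAGE package shipped with 12.2. The run comment and the reporting query are ours, in SQL*Plus style; the DBMSPCC_* tables are created by the setup call.

```sql
-- One-time setup: create the DBMSPCC_* collection tables.
EXEC DBMS_PLSQL_CODE_COVERAGE.CREATE_COVERAGE_TABLES;

-- Start a run, execute the test projects, then stop collection.
VARIABLE run_id NUMBER
EXEC :run_id := DBMS_PLSQL_CODE_COVERAGE.START_COVERAGE('nightly-tests');
-- ... run the test projects here ...
EXEC DBMS_PLSQL_CODE_COVERAGE.STOP_COVERAGE;

-- Coverage ratio = covered blocks / total blocks, per program unit,
-- ignoring blocks marked infeasible by the COVERAGE pragma.
SELECT u.name,
       SUM(b.covered) AS covered_blocks,
       COUNT(*)       AS total_blocks,
       ROUND(SUM(b.covered) / COUNT(*), 2) AS ratio
FROM   dbmspcc_units  u
JOIN   dbmspcc_blocks b
       ON b.run_id = u.run_id AND b.object_id = u.object_id
WHERE  u.run_id = :run_id
  AND  b.not_feasible = 0
GROUP  BY u.name;
```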
- Reporting:
- Jenkins + test-runner output provides pass/fail counts and code-coverage per schema/project.
- Example reported scale: ~24k CRUD tests for ~3.5k tables, ~25k unit tests for ~15k functions, 39 test projects; fail rate ~0.2% for “core” tests in their setup.
Operational recommendations and best practices
- Define measurement scope and what “100%” means before collecting statistics.
- Decide scope with stakeholders early to avoid wasted effort.
- Automate test generation and project creation from day 1 where possible.
- Use metadata (dictionary) and application traces to generate realistic test inputs.
- Keep tests independent to enable parallel runs and reduce runtime.
- Plan and budget for test maintenance — updates are required as conversion rules or DB logic change.
- Separate test traffic from application traffic using separate schemas/databases, session tagging, or filtering, so that parallel execution and coverage merging stay clean (see the sketch below).
- Store tests in editable formats (XML/JSON) so individual tests can be quickly inspected and edited without reloading whole projects.
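For the traffic-separation point, a hedged Oracle-flavoured sketch: tag runner sessions with DBMS_APPLICATION_INFO, then filter on the tag. The module and action names are hypothetical; IP-based filtering, as described in the webinar, is an alternative.

```sql
-- Executed by the test runner once per connection.
BEGIN
  DBMS_APPLICATION_INFO.SET_MODULE(module_name => 'db-test-runner',
                                   action_name => 'crud-suite');
END;
/

-- Later: isolate test sessions (and their trace/coverage rows) by tag.
SELECT sid, username, machine
FROM   v$session
WHERE  module = 'db-test-runner';
```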
Tools and product mentions
- Bestos Technologies internal “test organizer” module (free mode available on their website).
- Jenkins for orchestration and parallel execution.
- Perl scripts for project/test generation from dictionary metadata.
- Piggly (open-source Ruby project) for instrumentation and coverage reporting (adapted).
- Bitbucket (mentioned for repository-size constraints).
- General use of cloud VMs / CI runners.
Practical outcomes and numbers from the presented project
- Object counts: thousands of DB objects (6k tables/views, 5k triggers, ~100k SPs/functions).
- Tests created: ~24k CRUD tests (covering ~3.5k tables), ~25k unit tests (covering ~15k functions), 39 projects.
- Test execution times: single-server serial runs took ~15–17 hours; parallelization reduces runtime roughly proportionally.
- Coverage behavior: adding new code increased the count of covered objects but lowered the block-level percentage, illustrating the need for a consistent measurement baseline.
Challenges highlighted
- Limited staff and environments for very large projects.
- Repository size and access constraints (e.g., Bitbucket 2 GB cap).
- Mass refactorings requiring sweeping test updates unless templated/automated.
- Sparse initial documentation; tests often become the best living documentation.
- Difficulty convincing customers to invest in coverage and ongoing maintenance.
Actionable steps and tutorial-style guide
- Define scope: decide which schemas and objects to include.
- Prefer tests-first for new or changed objects when possible.
- Use dictionary metadata and production traces to auto-generate realistic tests.
- Store tests in editable formats (JSON/XML) to support quick edits and templated mass changes.
- Automate project/test generation and CI execution early in the project.
- Collect coverage metrics from day 0 and agree on what is measured.
- Use pragmas/annotations to exclude unreachable code blocks from coverage calculations.
Speakers and sources
- Alexander — database engineer and team lead, Bestos Technologies (primary presenter).
- Bestos Technologies — provider of migration and testing services; maintains an internal test/organizer product.
- Other mentions: Katya (answered a Q&A question), Piggly, Jenkins, Bitbucket.