mirror of
https://github.com/EstrellaXD/Auto_Bangumi.git
synced 2026-02-03 02:04:06 +08:00
docs(dev): add database developer guide
Comprehensive documentation covering: - Database architecture and components - Model schemas (Bangumi, RSSItem, Torrent, User) - Common CRUD operations for each sub-database - Caching strategy and invalidation - Migration system and how to add new migrations - Performance patterns (batch queries, regex matching, indexes) - Testing setup with factories - Common issues and solutions Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This commit is contained in:
394
docs/dev/database.md
Normal file
394
docs/dev/database.md
Normal file
@@ -0,0 +1,394 @@
|
||||
# Database Developer Guide
|
||||
|
||||
This guide covers the database architecture, models, and operations in AutoBangumi.
|
||||
|
||||
## Overview
|
||||
|
||||
AutoBangumi uses **SQLite** as its database with **SQLModel** (Pydantic + SQLAlchemy hybrid) for ORM. The database file is located at `data/data.db`.
|
||||
|
||||
### Architecture
|
||||
|
||||
```
|
||||
module/database/
|
||||
├── engine.py # SQLAlchemy engine configuration
|
||||
├── combine.py # Database class, migrations, session management
|
||||
├── bangumi.py # Bangumi (anime subscription) operations
|
||||
├── rss.py # RSS feed operations
|
||||
├── torrent.py # Torrent tracking operations
|
||||
└── user.py # User authentication operations
|
||||
```
|
||||
|
||||
## Core Components
|
||||
|
||||
### Database Class
|
||||
|
||||
The `Database` class in `combine.py` is the main entry point. It inherits from SQLModel's `Session` and provides access to all sub-databases:
|
||||
|
||||
```python
|
||||
from module.database import Database
|
||||
|
||||
with Database() as db:
|
||||
# Access sub-databases
|
||||
bangumis = db.bangumi.search_all()
|
||||
rss_items = db.rss.search_active()
|
||||
torrents = db.torrent.search_all()
|
||||
```
|
||||
|
||||
### Sub-Database Classes
|
||||
|
||||
| Class | Model | Purpose |
|
||||
|-------|-------|---------|
|
||||
| `BangumiDatabase` | `Bangumi` | Anime subscription rules |
|
||||
| `RSSDatabase` | `RSSItem` | RSS feed sources |
|
||||
| `TorrentDatabase` | `Torrent` | Downloaded torrent tracking |
|
||||
| `UserDatabase` | `User` | Authentication |
|
||||
|
||||
## Models
|
||||
|
||||
### Bangumi Model
|
||||
|
||||
Core model for anime subscriptions:
|
||||
|
||||
```python
|
||||
class Bangumi(SQLModel, table=True):
|
||||
id: int # Primary key
|
||||
official_title: str # Display name (e.g., "Mushoku Tensei")
|
||||
title_raw: str # Raw title for torrent matching (indexed)
|
||||
season: int = 1 # Season number
|
||||
episode_offset: int = 0 # Episode numbering adjustment
|
||||
season_offset: int = 0 # Season numbering adjustment
|
||||
rss_link: str # Comma-separated RSS feed URLs
|
||||
filter: str # Exclusion filter (e.g., "720,\\d+-\\d+")
|
||||
poster_link: str # TMDB poster URL
|
||||
save_path: str # Download destination path
|
||||
rule_name: str # qBittorrent RSS rule name
|
||||
added: bool = False # Whether rule is added to downloader
|
||||
deleted: bool = False # Soft delete flag (indexed)
|
||||
archived: bool = False # For completed series (indexed)
|
||||
needs_review: bool = False # Offset mismatch detected
|
||||
needs_review_reason: str # Reason for review
|
||||
suggested_season_offset: int # Suggested season offset
|
||||
suggested_episode_offset: int # Suggested episode offset
|
||||
air_weekday: int # Airing day (0=Sunday, 6=Saturday)
|
||||
```
|
||||
|
||||
### RSSItem Model
|
||||
|
||||
RSS feed subscriptions:
|
||||
|
||||
```python
|
||||
class RSSItem(SQLModel, table=True):
|
||||
id: int # Primary key
|
||||
name: str # Display name
|
||||
url: str # Feed URL (unique, indexed)
|
||||
aggregate: bool = True # Whether to parse torrents
|
||||
parser: str = "mikan" # Parser type: mikan, dmhy, nyaa
|
||||
enabled: bool = True # Active flag
|
||||
connection_status: str # "healthy" or "error"
|
||||
last_checked_at: str # ISO timestamp
|
||||
last_error: str # Last error message
|
||||
```
|
||||
|
||||
### Torrent Model
|
||||
|
||||
Tracks downloaded torrents:
|
||||
|
||||
```python
|
||||
class Torrent(SQLModel, table=True):
|
||||
id: int # Primary key
|
||||
name: str # Torrent name (indexed)
|
||||
url: str # Torrent/magnet URL (unique, indexed)
|
||||
rss_id: int # Source RSS feed ID
|
||||
bangumi_id: int # Linked Bangumi ID (nullable)
|
||||
qb_hash: str # qBittorrent info hash (indexed)
|
||||
downloaded: bool = False # Download completed
|
||||
```
|
||||
|
||||
## Common Operations
|
||||
|
||||
### BangumiDatabase
|
||||
|
||||
```python
|
||||
with Database() as db:
|
||||
# Create
|
||||
db.bangumi.add(bangumi) # Single insert
|
||||
db.bangumi.add_all(bangumi_list) # Batch insert (deduplicates)
|
||||
|
||||
# Read
|
||||
db.bangumi.search_all() # All records (cached, 5min TTL)
|
||||
db.bangumi.search_id(123) # By ID
|
||||
db.bangumi.match_torrent("torrent name") # Find by title_raw match
|
||||
db.bangumi.not_complete() # Incomplete series
|
||||
db.bangumi.get_needs_review() # Flagged for review
|
||||
|
||||
# Update
|
||||
db.bangumi.update(bangumi) # Update single record
|
||||
db.bangumi.update_all(bangumi_list) # Batch update
|
||||
|
||||
# Delete
|
||||
db.bangumi.delete_one(123) # Hard delete
|
||||
db.bangumi.disable_rule(123) # Soft delete (deleted=True)
|
||||
```
|
||||
|
||||
### RSSDatabase
|
||||
|
||||
```python
|
||||
with Database() as db:
|
||||
# Create
|
||||
db.rss.add(rss_item) # Single insert
|
||||
db.rss.add_all(rss_items) # Batch insert (deduplicates)
|
||||
|
||||
# Read
|
||||
db.rss.search_all() # All feeds
|
||||
db.rss.search_active() # Enabled feeds only
|
||||
db.rss.search_aggregate() # Enabled + aggregate=True
|
||||
|
||||
# Update
|
||||
db.rss.update(id, rss_update) # Partial update
|
||||
db.rss.enable(id) # Enable feed
|
||||
db.rss.disable(id) # Disable feed
|
||||
db.rss.enable_batch([1, 2, 3]) # Batch enable
|
||||
db.rss.disable_batch([1, 2, 3]) # Batch disable
|
||||
```
|
||||
|
||||
### TorrentDatabase
|
||||
|
||||
```python
|
||||
with Database() as db:
|
||||
# Create
|
||||
db.torrent.add(torrent) # Single insert
|
||||
db.torrent.add_all(torrents) # Batch insert
|
||||
|
||||
# Read
|
||||
db.torrent.search_all() # All torrents
|
||||
db.torrent.search_by_qb_hash(hash) # By qBittorrent hash
|
||||
db.torrent.search_by_url(url) # By URL
|
||||
db.torrent.check_new(torrents) # Filter out existing
|
||||
|
||||
# Update
|
||||
db.torrent.update_qb_hash(id, hash) # Set qb_hash
|
||||
```
|
||||
|
||||
## Caching
|
||||
|
||||
### Bangumi Cache
|
||||
|
||||
`search_all()` results are cached at the module level with a 5-minute TTL:
|
||||
|
||||
```python
|
||||
# Module-level cache in bangumi.py
|
||||
_bangumi_cache: list[Bangumi] | None = None
|
||||
_bangumi_cache_time: float = 0
|
||||
_BANGUMI_CACHE_TTL: float = 300.0 # 5 minutes
|
||||
|
||||
# Cache invalidation
|
||||
def _invalidate_bangumi_cache():
|
||||
global _bangumi_cache, _bangumi_cache_time
|
||||
_bangumi_cache = None
|
||||
_bangumi_cache_time = 0
|
||||
```
|
||||
|
||||
**Important:** The cache is automatically invalidated on:
|
||||
- `add()`, `add_all()`
|
||||
- `update()`, `update_all()`
|
||||
- `delete_one()`, `delete_all()`
|
||||
- `archive_one()`, `unarchive_one()`
|
||||
- Any RSS link update operations
|
||||
|
||||
### Session Expunge
|
||||
|
||||
Cached objects are **expunged** from the session to prevent `DetachedInstanceError`:
|
||||
|
||||
```python
|
||||
for b in bangumis:
|
||||
self.session.expunge(b) # Detach from session
|
||||
```
|
||||
|
||||
## Migration System
|
||||
|
||||
### Schema Versioning
|
||||
|
||||
Migrations are tracked via a `schema_version` table:
|
||||
|
||||
```python
|
||||
CURRENT_SCHEMA_VERSION = 7
|
||||
|
||||
# Each migration: (version, description, [SQL statements])
|
||||
MIGRATIONS = [
|
||||
(1, "add air_weekday column", [...]),
|
||||
(2, "add connection status columns", [...]),
|
||||
(3, "create passkey table", [...]),
|
||||
(4, "add archived column", [...]),
|
||||
(5, "rename offset to episode_offset", [...]),
|
||||
(6, "add qb_hash column", [...]),
|
||||
(7, "add suggested offset columns", [...]),
|
||||
]
|
||||
```
|
||||
|
||||
### Adding a New Migration
|
||||
|
||||
1. Increment `CURRENT_SCHEMA_VERSION` in `combine.py`
|
||||
2. Add migration tuple to `MIGRATIONS` list:
|
||||
|
||||
```python
|
||||
MIGRATIONS = [
|
||||
# ... existing migrations ...
|
||||
(
|
||||
8,
|
||||
"add my_new_column to bangumi",
|
||||
[
|
||||
"ALTER TABLE bangumi ADD COLUMN my_new_column TEXT DEFAULT NULL",
|
||||
],
|
||||
),
|
||||
]
|
||||
```
|
||||
|
||||
3. Add idempotency check in `run_migrations()`:
|
||||
|
||||
```python
|
||||
if "bangumi" in tables and version == 8:
|
||||
columns = [col["name"] for col in inspector.get_columns("bangumi")]
|
||||
if "my_new_column" in columns:
|
||||
needs_run = False
|
||||
```
|
||||
|
||||
4. Update the corresponding Pydantic model in `module/models/`
|
||||
|
||||
### Default Value Backfill
|
||||
|
||||
After migrations, `_fill_null_with_defaults()` automatically fills NULL values based on model defaults:
|
||||
|
||||
```python
|
||||
# If model defines:
|
||||
class Bangumi(SQLModel, table=True):
|
||||
my_field: bool = False
|
||||
|
||||
# Then existing rows with NULL will be updated to False
|
||||
```
|
||||
|
||||
## Performance Patterns
|
||||
|
||||
### Batch Queries
|
||||
|
||||
`add_all()` uses a single query to check for duplicates instead of N queries:
|
||||
|
||||
```python
|
||||
# Efficient: single SELECT
|
||||
keys_to_check = [(d.title_raw, d.group_name) for d in datas]
|
||||
conditions = [
|
||||
and_(Bangumi.title_raw == tr, Bangumi.group_name == gn)
|
||||
for tr, gn in keys_to_check
|
||||
]
|
||||
statement = select(Bangumi.title_raw, Bangumi.group_name).where(or_(*conditions))
|
||||
```
|
||||
|
||||
### Regex Matching
|
||||
|
||||
`match_list()` compiles a single regex pattern for all title matches:
|
||||
|
||||
```python
|
||||
# Compile once, match many
|
||||
sorted_titles = sorted(title_index.keys(), key=len, reverse=True)
|
||||
pattern = "|".join(re.escape(title) for title in sorted_titles)
|
||||
title_regex = re.compile(pattern)
|
||||
|
||||
# O(1) lookup per torrent instead of O(n)
|
||||
for torrent in torrent_list:
|
||||
match = title_regex.search(torrent.name)
|
||||
```
|
||||
|
||||
### Indexed Columns
|
||||
|
||||
The following columns have indexes for fast lookups:
|
||||
|
||||
| Table | Column | Index Type |
|
||||
|-------|--------|------------|
|
||||
| `bangumi` | `title_raw` | Regular |
|
||||
| `bangumi` | `deleted` | Regular |
|
||||
| `bangumi` | `archived` | Regular |
|
||||
| `rssitem` | `url` | Unique |
|
||||
| `torrent` | `name` | Regular |
|
||||
| `torrent` | `url` | Unique |
|
||||
| `torrent` | `qb_hash` | Regular |
|
||||
|
||||
## Testing
|
||||
|
||||
### Test Database Setup
|
||||
|
||||
Tests use an in-memory SQLite database:
|
||||
|
||||
```python
|
||||
# conftest.py
|
||||
@pytest.fixture
|
||||
def db_engine():
|
||||
engine = create_engine("sqlite:///:memory:")
|
||||
SQLModel.metadata.create_all(engine)
|
||||
yield engine
|
||||
engine.dispose()
|
||||
|
||||
@pytest.fixture
|
||||
def db_session(db_engine):
|
||||
with Session(db_engine) as session:
|
||||
yield session
|
||||
```
|
||||
|
||||
### Factory Functions
|
||||
|
||||
Use factory functions for creating test data:
|
||||
|
||||
```python
|
||||
from test.factories import make_bangumi, make_torrent, make_rss_item
|
||||
|
||||
def test_bangumi_search():
|
||||
bangumi = make_bangumi(title_raw="Test Title", season=2)
|
||||
# ... test logic
|
||||
```
|
||||
|
||||
## Design Notes
|
||||
|
||||
### No Foreign Keys
|
||||
|
||||
SQLite foreign key enforcement is disabled by default. Relationships (like `Torrent.bangumi_id`) are managed in application logic rather than database constraints.
|
||||
|
||||
### Soft Deletes
|
||||
|
||||
The `Bangumi.deleted` flag enables soft deletes. Queries should filter by `deleted=False` for user-facing data:
|
||||
|
||||
```python
|
||||
statement = select(Bangumi).where(Bangumi.deleted == false())
|
||||
```
|
||||
|
||||
### Torrent Tagging
|
||||
|
||||
Torrents are tagged in qBittorrent with `ab:{bangumi_id}` for offset lookup during rename operations. This enables fast bangumi identification without database queries.
|
||||
|
||||
## Common Issues
|
||||
|
||||
### DetachedInstanceError
|
||||
|
||||
If you access cached objects from a different session:
|
||||
|
||||
```python
|
||||
# Wrong: accessing cached object in new session
|
||||
bangumis = db.bangumi.search_all() # Cached
|
||||
with Database() as new_db:
|
||||
new_db.session.add(bangumis[0]) # Error!
|
||||
|
||||
# Right: objects are expunged, work independently
|
||||
bangumis = db.bangumi.search_all()
|
||||
bangumis[0].title_raw = "New Title" # OK, but won't persist
|
||||
```
|
||||
|
||||
### Cache Staleness
|
||||
|
||||
If manual SQL updates bypass the ORM, invalidate the cache:
|
||||
|
||||
```python
|
||||
from module.database.bangumi import _invalidate_bangumi_cache
|
||||
|
||||
with engine.connect() as conn:
|
||||
conn.execute(text("UPDATE bangumi SET ..."))
|
||||
conn.commit()
|
||||
|
||||
_invalidate_bangumi_cache() # Important!
|
||||
```
|
||||
Reference in New Issue
Block a user