AdSlicerProXP is a fully automated advertisement detection and removal system built for long-form VHS captures, analog transfers, and TV recordings. It analyzes black slugs between program segments, determines likely commercial blocks, and outputs:
- A clean show file (ads removed)
- A folder of isolated commercial clips
- Full diagnostic logs in JSON, CSV, and EDL
- ffmeta chapter markers for media player navigation
- A structured ML dataset (dataset.jsonl) with 74 per-segment feature columns
- A run manifest recording all parameters and detection statistics
The pipeline is self-contained (ffmpeg bundled), batch-friendly, and designed for noisy analog sources where black slugs vary in length and clarity.
Topics
| Topic | Covers |
|---|---|
| How It Works | Detection pipeline, confidence scoring, export modes |
| Quick Start | Get from launch to a clean show master |
| All Parameters | Complete parameter reference |
| Preset System | Built-in presets, saving, and file format |
| Tuning Guide | Fixing false positives, missed breaks, noisy tape |
| ML Dataset Output | The 74-column dataset.jsonl format |
| Use Cases | Archiving, batch processing, ML, compilations |
Quick Start
1. Choose your input
Select a single video file, or switch to Batch Folder mode and select a directory. All matching video files inside will be queued.
2. Set your output folder
A subfolder is created per input file, so you'll never lose track of which output came from which tape. Results are never overwritten; re-processing appends _1, _2, etc.
3. Pick a preset
Open the Presets menu and choose the one closest to your source material. See Preset System for descriptions.
4. Dry run first
Enable Dry Run before committing to a full export. This runs the full detection pipeline and writes all logs, but cuts no media. Review detect.json and check dataset.jsonl for cuts where only sig_black_boundary fired and confidence is below 0.90. Those are the weakest detections and worth inspecting first.
5. Final export
Disable Dry Run. Enable Re-encode for frame-accurate archival cuts (H.264/AAC). Run.
Recommended workflow:
1. Enable Dry Run
2. Review the activity log and detect.json
3. Adjust parameters if needed, re-run dry
4. Disable Dry Run, enable Re-encode
5. Run final export
Without Re-encode, cuts use -c copy for speed. Stream copy snaps to the nearest keyframe, which leaves a few frames of error at each boundary. Enable reencode for surgical precision.
Output Structure
Embedding chapter markers
Keep segments become Content N chapters; commercial blocks become Advertisement N. Embed into the show file:
ffmpeg -i show.mp4 -i logs/chapters.ffmeta \
  -map_metadata 1 -c copy show_with_chapters.mp4
The chapters are recognized by VLC, mpv, Kodi, and any other player that reads embedded chapter metadata.
How It Works
Each commercial candidate passes through four independent detection passes in sequence. Every signal that fires is recorded against the interval and reflected in its confidence score.
Detection Passes
1. Black frames: uses ffmpeg blackdetect to locate near-black frames. Segments shorter than blackMinDur are discarded. Segments within mergeGap seconds are merged.
2. Silence: uses ffmpeg silencedetect. Candidates overlapping a silence segment by ≥ 0.5 s receive silence_overlap and a +0.05 confidence boost. Set silenceNoiseDb ≥ 0 to disable.
3. Uniform frames: uses ffmpeg showinfo to compute per-frame luma stddev. Frames with stddev ≤ uniformMaxStddev are classified as uniform slates. Candidates overlapping ≥ 0.3 s receive uniform_overlap and a +0.04 boost. Set uniformMaxStddev to 0 to disable.
4. Scene rate: uses ffmpeg select=scene. Blocks exceeding 1.3× the file average scene rate receive high_scene_rate and a +0.04 boost. Set sceneThreshold to 0 to disable.
Scoring Guards
| Guard | Parameter | Comskip equivalent |
|---|---|---|
| Uncorroborated penalty | automatic | punish_modifier |
| Minimum show segment | minShowSegment | min_show_segment_length |
| Edge protection | alwaysKeepFirst / alwaysKeepLast | always_keep_first/last_seconds |
| 30s boundary snapping | requireDiv5 | require_div5 |
| Asymmetric trim | removeBefore / removeAfter | remove_before / remove_after |
Confidence Score
Every CutInterval carries a confidence float (0.0–1.0) and a signals list. The activity log renders confidence as a star rating:
| Score | Display | Meaning |
|---|---|---|
| ≥ 1.0 | ★★★ | Multiple corroborating signals |
| ≥ 0.8 | ★★☆ | At least one corroborating signal |
| < 0.8 | ★☆☆ | Black boundary only, no corroboration |
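The scoring described above can be sketched roughly as follows. The boost values and star thresholds come from this document; the starting score passed as `base`, the clamp to 1.0, and the helper names (`overlap_s`, `score`, `stars`) are assumptions for illustration, since the base value is not stated here.

```python
# Boosts documented for the three corroborating passes.
BOOSTS = {
    "silence_overlap": 0.05,
    "uniform_overlap": 0.04,
    "high_scene_rate": 0.04,
}


def overlap_s(a, b):
    """Seconds of overlap between two (start, end) intervals, 0 if disjoint."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def score(base, signals):
    """Add the documented boost for each fired signal to an assumed base score."""
    total = base + sum(BOOSTS.get(name, 0.0) for name in signals)
    return min(1.0, round(total, 2))  # clamp to the documented 0.0-1.0 range


def stars(confidence):
    """Map a confidence value to the star display used in the activity log."""
    if confidence >= 1.0:
        return "★★★"
    if confidence >= 0.8:
        return "★★☆"
    return "★☆☆"


# A candidate overlapping a silence segment by at least 0.5 s earns silence_overlap.
candidate, silence = (100.0, 130.0), (99.0, 101.0)
signals = ["silence_overlap"] if overlap_s(candidate, silence) >= 0.5 else []
print(signals, stars(score(0.5, signals)))
```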
All Parameters
Input / Output
| Parameter | Type | Description |
|---|---|---|
| inputMode | singleFile or batchDir | Process one file or a whole folder |
| inputPath | path | Input file or folder path |
| glob | pattern | Comma-separated globs for batch mode (e.g. *.mp4,*.mov,*.dv) |
| outdir | path | Base output directory; a subfolder is created per input file |
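As an illustration of how a comma-separated glob value could select files in batch mode, here is a small sketch. The helper names are hypothetical, and the case-insensitive matching is an assumption rather than documented behaviour.

```python
from fnmatch import fnmatch
from pathlib import Path


def parse_globs(spec):
    """Split a comma-separated glob string into individual patterns."""
    return [part.strip() for part in spec.split(",") if part.strip()]


def matches(name, patterns):
    """True if the filename matches any pattern (case-insensitive by assumption)."""
    return any(fnmatch(name.lower(), pattern.lower()) for pattern in patterns)


def queue_inputs(folder, spec):
    """List the files directly inside `folder` that match the glob spec, sorted."""
    patterns = parse_globs(spec)
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and matches(path.name, patterns)
    )


print(parse_globs("*.mp4,*.mov,*.dv"))
```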
Black Frame Detection
| Parameter | Default | Description |
|---|---|---|
| blackMinDur | 0.10 s | Minimum black segment duration. Shorter flashes are discarded. |
| pixTh | 0.08 | Pixel luma threshold for blackdetect. Lower = stricter black definition. |
| picTh | 0.98 | Fraction of pixels per frame that must be below pixTh. |
| mergeGap | 1.5 s | Merge black segments separated by ≤ this gap. Prevents flickering slugs from splitting boundaries. |
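To make the interaction of blackMinDur and mergeGap concrete, the following sketch filters and merges a list of (start, end) black segments in the order the first detection pass describes. The function name and sample values are hypothetical; the real pipeline may apply these steps inside ffmpeg or in a different order.

```python
def merge_black(segments, black_min_dur=0.10, merge_gap=1.5):
    """Drop black segments shorter than black_min_dur, then merge near neighbours."""
    kept = sorted(seg for seg in segments if seg[1] - seg[0] >= black_min_dur)
    merged = []
    for start, end in kept:
        if merged and start - merged[-1][1] <= merge_gap:
            # Close enough to the previous slug: extend it instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Made-up example: one flash that is too short, one flickering slug, one clean slug.
raw = [(10.0, 10.05), (100.0, 100.4), (101.0, 101.5), (300.0, 300.3)]
slugs = merge_black(raw)
print(slugs)
```

With the defaults, the 0.05 s flash is discarded and the two segments around 100 s collapse into a single slug.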
Cut Behaviour
| Parameter | Default | Description |
|---|---|---|
| edgePadPre | 0.20 s | Padding added before each cut boundary. |
| edgePadPost | 0.06 s | Padding added after each cut boundary. |
| minCommercial | 5 s | Minimum gap to classify as a commercial break. |
| maxCommercial | 240 s | Maximum gap to classify as a commercial break. |
| includeBlack | false | Include surrounding black frames inside exported commercial clips. |
| reencode | false | Re-encode output with H.264/AAC for frame-accurate cuts. |
| dryRun | false | Write logs only; no media files are created. |
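A simplified view of how minCommercial and maxCommercial might gate candidates: treat the gap between two consecutive black slugs as a candidate and keep it only if its duration falls within the range. This is an assumed simplification with hypothetical names; it ignores edge padding and any grouping of adjacent ads into blocks.

```python
def commercial_candidates(slugs, min_commercial=5.0, max_commercial=240.0):
    """Return gaps between consecutive black slugs whose length is in range."""
    candidates = []
    for (_, left_end), (right_start, _) in zip(slugs, slugs[1:]):
        duration = right_start - left_end
        if min_commercial <= duration <= max_commercial:
            candidates.append((left_end, right_start))
    return candidates


# Made-up slugs: a 30 s gap (candidate) followed by a 768 s gap (programme content).
slugs = [(100.0, 101.5), (131.5, 132.0), (900.0, 900.5)]
found = commercial_candidates(slugs)
print(found)
```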
Advanced Detection (Comskip-derived)
| Parameter | Default | Comskip equiv. | Description |
|---|---|---|---|
| silenceNoiseDb | -40 dB | max_silence | Audio noise floor. Set ≥ 0 to disable silence detection. |
| silenceMinDur | 0.5 s | min_silence | Minimum silence duration to register as a segment. |
| minShowSegment | 30 s | min_show_segment_length | Minimum keep-segment length. Cuts that would leave shorter keeps are demoted. |
| alwaysKeepFirst | 0 s | always_keep_first_seconds | Hard-protect first N seconds from being cut. |
| alwaysKeepLast | 0 s | always_keep_last_seconds | Hard-protect last N seconds from being cut. |
| uniformMaxStddev | 8.0 | non_uniformity | Luma stddev ceiling for uniform frame detection. Set to 0 to disable. |
| sceneThreshold | 0.4 | schange_threshold | Scene change sensitivity. Set to 0 to disable. |
| removeBefore | 0 s | remove_before | Trim from the content side of each cut. |
| removeAfter | 0 s | remove_after | Trim from the ad side of each cut. |
| requireDiv5 | false | require_div5 | Snap or drop candidates not within 3 s of a 30-second multiple. |
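The requireDiv5 rule can be illustrated with a short sketch. The table does not say whether the 3 s tolerance applies to block durations or to boundary timestamps; this example assumes durations, and the function name is hypothetical.

```python
def snap_to_30s(duration_s, unit=30.0, tolerance=3.0):
    """Snap a duration to the nearest 30 s multiple, or return None to drop it."""
    nearest = round(duration_s / unit) * unit
    if nearest > 0 and abs(duration_s - nearest) <= tolerance:
        return nearest
    return None  # too far from any 30 s multiple


for candidate in (28.0, 61.2, 45.0):
    print(candidate, "->", snap_to_30s(candidate))
```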
Verbosity
| Value | Output |
|---|---|
| 0 | Errors only |
| 1 | Milestones + raw blackdetect/silencedetect logs written to logs/ |
| 2 | Full step-by-step + all raw filter logs written to logs/ |
Preset System
AdSlicerProXP ships with three built-in presets and a full save/load system.
Built-in Presets
| File | Purpose |
|---|---|
| default.json | Balanced starting point for typical VHS |
| vhs_noisy.json | Loose thresholds for degraded/worn tape |
| broadcast_strict.json | Strict thresholds with 30s snapping for clean off-air captures |
Preset Menu
Presets
── BUILT-IN ──────────────
Broadcast strict
Default
VHS noisy
── MY PRESETS ────────────
my_custom_settings
──────────────────────────
Save Current as Preset…
──────────────────────────
Open User Presets Folder…
Reload Presets
User Preset Locations
| Platform | Path |
|---|---|
| macOS | ~/Library/Application Support/net.schwwaaa.adslicerproxp/presets/ |
| Windows | %APPDATA%\net.schwwaaa.adslicerproxp\presets\ |
| Linux | ~/.config/net.schwwaaa.adslicerproxp/presets/ |
Use Presets → Open User Presets Folder… to open this location. Drop any .json file there and use Reload Presets to make it appear in the menu.
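For scripts that need to drop preset files into place, the table above can be turned into a small lookup. This is only a sketch: the function name is hypothetical, and it mirrors the listed paths literally without consulting XDG_CONFIG_HOME or other overrides.

```python
import os
import sys
from pathlib import Path

APP_ID = "net.schwwaaa.adslicerproxp"


def user_presets_dir(platform=sys.platform, env=os.environ, home=None):
    """Return the user preset folder for a platform, following the table above."""
    home = Path(home) if home else Path.home()
    if platform == "darwin":
        return home / "Library" / "Application Support" / APP_ID / "presets"
    if platform.startswith("win"):
        return Path(env["APPDATA"]) / APP_ID / "presets"
    return home / ".config" / APP_ID / "presets"


print(user_presets_dir())
```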
Preset File Format
Plain JSON. _preset sets the menu label; _description sets the tooltip. Unrecognized keys are silently ignored.
{
  "_preset": "My custom VHS settings",
  "_description": "Tuned for my specific deck and capture card.",
  "blackMinDur": 0.10,
  "pixTh": 0.08,
  "picTh": 0.98,
  "mergeGap": 1.5,
  "minCommercial": 5,
  "maxCommercial": 240,
  "silenceNoiseDb": -40,
  "requireDiv5": false
}
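One way to read such a file outside the app is sketched below. The defaults are copied from the parameter tables in this document; the function name and the fallback to defaults for missing keys are assumptions, not confirmed behaviour of the application's own loader.

```python
import json

# Defaults taken from the parameter tables (subset shown in the preset example).
DEFAULTS = {
    "blackMinDur": 0.10, "pixTh": 0.08, "picTh": 0.98, "mergeGap": 1.5,
    "minCommercial": 5, "maxCommercial": 240,
    "silenceNoiseDb": -40, "requireDiv5": False,
}


def load_preset(text, defaults=DEFAULTS):
    """Parse preset JSON into (label, description, params), ignoring unknown keys."""
    raw = json.loads(text)
    params = dict(defaults)
    params.update({key: value for key, value in raw.items() if key in defaults})
    return raw.get("_preset", ""), raw.get("_description", ""), params


label, description, params = load_preset(
    '{"_preset": "Noisy deck", "pixTh": 0.12, "notARealKey": 1}'
)
print(label, params["pixTh"], params["mergeGap"])
```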
Adding a Built-in Preset to the Build
Drop a .json file into src-tauri/presets/ and rebuild. The tauri.conf.json resources glob picks it up; no code changes are needed.
Tuning Guide
Thresholds may require tuning for darker or noisier analog captures. Always Dry Run first and review dataset.jsonl before committing to export.
Too many false positives (content being cut)
- Raise blackMinDur (0.15–0.25) to require longer slugs
- Raise picTh (0.99) to require nearly pure black frames
- Increase minShowSegment (60–120 s) to prevent short content being consumed
- Enable requireDiv5 for clean broadcast; break lengths that are not multiples of 30 s are unlikely to be real ad breaks
- Raise minCommercial to filter out breaks too short to be real commercials
- Check dataset.jsonl for cuts where only sig_black_boundary fired; these are the weakest detections
Missed commercials (breaks not detected)
- Lower blackMinDur (0.06–0.08) to accept shorter slugs
- Raise pixTh (0.10–0.14) for a more permissive black definition
- Lower picTh (0.90–0.95) to allow noisier black frames
- Increase mergeGap for flickering VHS slug patterns
- Lower sceneThreshold (0.25–0.35) to catch more cuts within blocks
Noisy or degraded VHS
- Raise pixTh and lower picTh; this is the standard analog adjustment
- Raise uniformMaxStddev (12–18), since VHS black slugs are never truly uniform
- Set removeBefore to 0.1 to recover content clipped by ambiguous slug entry points
- Disable requireDiv5, since VHS timing is irregular
- Raise silenceNoiseDb to -35 dB (from the -40 dB default), since the VHS audio floor is noisier
Clean off-air broadcast
- Enable requireDiv5; US TV commercials are exact 15/30/60/90 s units
- Set alwaysKeepFirst to 15 and alwaysKeepLast to 15 to protect cold opens and credits
- Lower uniformMaxStddev to 5–6; broadcast slates are near-perfect
- Raise sceneThreshold to 0.45 to count hard cuts only and avoid dissolve false positives
Recommended workflow
- Run with dryRun enabled, then review detect.json and the activity log
- Check dataset.jsonl; any cut with confidence below 0.90 or with only sig_black_boundary firing is worth inspecting
- Adjust parameters and re-run dry until the plan is correct
- Remove dryRun and enable reencode for final archival export
ML Dataset Output
Every run writes logs/dataset.jsonl: one JSON object per line, one line per segment, 74 columns. Load it directly:
import pandas as pd
df = pd.read_json("logs/dataset.jsonl", lines=True)
Column Groups
Identity (4 cols)
| Column | Type | Description |
|---|---|---|
| run_id | string | ISO-8601 UTC timestamp of the processing run |
| source_file | string | Input filename stem |
| segment_index | int | Index within this label type |
| timeline_position | int | Sequential position in the overall file timeline |
Timing (9 cols)
| Column | Description |
|---|---|
| start_s, end_s, dur_s | Absolute timestamps and duration in seconds |
| start_norm, end_norm, dur_norm | Position and duration as fraction of file length (0–1) |
| offset_from_start_s | Seconds from start of recording |
| offset_from_end_s | Seconds from end of recording |
Signal Indicators, all 0.0 or 1.0 (11 cols)
| Column | Fires when |
|---|---|
| sig_black_boundary | Interval is bracketed by a black slug |
| sig_within_commercial_range | Duration within [min_commercial, max_commercial] |
| sig_silence_overlap | Silence corroboration fired |
| sig_uniform_overlap | Uniform frame corroboration fired |
| sig_high_scene_rate | Scene rate exceeds 1.3× file average |
| sig_demoted_min_show_segment | Was a commercial candidate, demoted by show guard |
| sig_always_keep_first/last | Interval falls within the always-keep window |
| sig_content_between_commercials | Standard keep between two commercial blocks |
| sig_div5_snapped | Boundary was snapped to a 30 s multiple |
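The signal columns make it straightforward to isolate the "black boundary only" cuts that the Quick Start and Tuning Guide recommend reviewing. The sketch below uses only the standard library; the sample rows are made up and carry just a few of the 74 columns, and the helper names are hypothetical.

```python
import io
import json

CORROBORATING = ("sig_silence_overlap", "sig_uniform_overlap", "sig_high_scene_rate")


def read_jsonl(handle):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


def black_only(rows):
    """Commercial rows where sig_black_boundary fired and no corroborating signal did."""
    return [
        row for row in rows
        if row.get("label") == "commercial"
        and row.get("sig_black_boundary") == 1.0
        and not any(row.get(column) for column in CORROBORATING)
    ]


# Made-up sample; in practice, open logs/dataset.jsonl instead.
sample = io.StringIO(
    '{"segment_index": 0, "label": "commercial", "sig_black_boundary": 1.0, '
    '"sig_silence_overlap": 1.0, "sig_uniform_overlap": 0.0, "sig_high_scene_rate": 0.0}\n'
    '{"segment_index": 1, "label": "commercial", "sig_black_boundary": 1.0, '
    '"sig_silence_overlap": 0.0, "sig_uniform_overlap": 0.0, "sig_high_scene_rate": 0.0}\n'
    '{"segment_index": 0, "label": "keep", "sig_black_boundary": 0.0, '
    '"sig_silence_overlap": 0.0, "sig_uniform_overlap": 0.0, "sig_high_scene_rate": 0.0}\n'
)
weak = black_only(read_jsonl(sample))
print([row["segment_index"] for row in weak])
```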
Classification (3 cols)
| Column | Values |
|---|---|
| label | "commercial" or "keep" |
| label_int | 1 = commercial, 0 = keep |
| confidence | Detection confidence score 0.0–1.0 |
Usage Examples
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

df = pd.read_json("logs/dataset.jsonl", lines=True)

# Feature matrix and integer labels (1 = commercial, 0 = keep)
X = df[[
    "dur_s", "dur_norm", "start_norm",
    "black_left_dur_s", "black_right_dur_s",
    "silence_coverage", "has_silence_overlap",
    "scene_change_rate", "scene_change_rate_vs_avg",
    "sig_black_boundary", "sig_silence_overlap",
]]
y = df["label_int"]

# Fit a baseline classifier on the selected features
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Compare across parameter tuning runs
runs = pd.concat([
    pd.read_json("run1/logs/dataset.jsonl", lines=True),
    pd.read_json("run2/logs/dataset.jsonl", lines=True),
])
runs.groupby("param_scene_threshold")["run_commercial_ratio"].mean()

# Inspect low-confidence cuts
df[(df["label"] == "commercial") & (df["confidence"] < 0.9)]
param_* columns are fully denormalised. Individual files can be concatenated across runs and remain independently queryable.
Use Cases
Archiving Full Broadcasts With and Without Commercials
A collector digitizes a 1992 NBC Sunday Night Movie from VHS. They want to preserve the entire broadcast, including vintage promos and ads, but also need a clean version for watching. AdSlicerProXP outputs both automatically: show/ (clean movie) and commercials/ (all ad blocks), plus detect.json for reproducible archive metadata.
Digitizing VHS Tapes With Automatic Cleanup
A preservation group receives 300 home-recorded VHS tapes spanning 1986–2004. Batch-mode processing handles entire shelves at once. Threshold tuning ensures detection works across varied analog sources. Hundreds of hours are segmented, cleaned, exported into uniform directory structures, and logged for verification, eliminating months of manual editing.
Preparing Footage for YouTube or Streaming
A creator uploading 1990s cartoons needs commercial breaks removed to avoid Content ID strikes. AdSlicerProXP with reencode produces frame-accurate clean masters with no leftover partial-commercial frames.
Building ML Training Sets
A research lab training a commercial-boundary detection model needs ground-truth timestamps for black slugs, silence regions, ad gaps, and keep segments. AdSlicerProXP's dataset.jsonl provides a complete labeled dataset, with 74 feature columns per segment, without manual annotation.
Creating Commercial Compilations
An editor wants all McDonald's commercials from 1997 ABC broadcasts. Commercials are already cleanly extracted into individual files in commercials/, ready to drop into a timeline or sort by brand via captioning or logo detection.
High-Volume TV Archive Processing
A university lab processes 1,200 Betacam SP and VHS tapes from 1980–2002. AdSlicerProXP's structured run_manifest.json and detect.json provide the audit trail. Every tape gets a uniform directory layout with consistent metadata ready for digital asset management ingestion.
Building from Source
First-time setup
Download static ffmpeg/ffprobe builds into src-tauri/binaries/. Run once before first build.
./build.sh setup-bins
Dev mode
cd src-tauri
cargo tauri dev
Release builds
| Command | Target |
|---|---|
| ./build.sh | Auto-detect current OS |
| ./build.sh mac-universal | macOS arm64 + x86_64 fat binary |
| ./build.sh mac-arm | macOS Apple Silicon only |
| ./build.sh mac-x86 | macOS Intel only |
| ./build.sh windows | Windows x86_64 |
ffmpeg and ffprobe are bundled automatically. Users need no external dependencies.
Adding a built-in preset
Drop a .json file into src-tauri/presets/ and run ./build.sh. The tauri.conf.json resources glob picks it up; no code changes are needed.