S3 Integration - Episodes

Pod Engine can sync episode data via an S3 file-based integration. This is available to our enterprise customers.

How Schema Versioning Works

All documents will always include the following fields:

Property | Type | Description
schema_version | number | The numeric version of the schema. Starts at 1 and increments with each new version.
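
A consumer can check schema_version before reading version-specific fields. The sketch below is one possible approach, not an official client. It assumes each document arrives as a JSON object (the file format is not stated on this page), and the names LATEST_KNOWN_VERSION, read_schema_version, and is_newer_than_known are hypothetical.

```python
import json

# Highest schema version this consumer was written against.
# Assumption: 4, the latest version listed on this page.
LATEST_KNOWN_VERSION = 4


def read_schema_version(raw: str) -> int:
    """Parse one document and return its schema_version.

    Assumes the document is a JSON object; raises ValueError otherwise.
    """
    doc = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(doc, dict):
        raise ValueError("expected a JSON object")
    version = doc.get("schema_version")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(version, bool) or not isinstance(version, int) or version < 1:
        raise ValueError(f"invalid schema_version: {version!r}")
    return version


def is_newer_than_known(version: int) -> bool:
    """True when the document uses a schema this consumer has not seen."""
    return version > LATEST_KNOWN_VERSION
```

Because versions only increment, a consumer can choose to keep processing newer documents and ignore fields it does not recognize, or to stop and alert; which policy is appropriate depends on the pipeline.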

Schema Versions

Version | Release Date | Description
4 | 3/4/2026 | Added sentence-level VTT content
3 | 10/6/2025 | Added denormalized podcast title and external IDs
2 | 9/28/2025 | Added sponsors and guest data
1 | 7/2/2025 | Initial release with basic metadata

Schema Version Details

Version 4 (Latest)
Released on 3/4/2026

Changelog

  • Added transcript_sentence_vtt

Property | Type | Description
schema_version | number | The numeric version of the schema. Starts at 1 and increments with each new version.
episode_db_created_at | date | The date the episode was created in the Pod Engine database. This is when we first saw the episode; it is typically close to episode_rss_published_at but can diverge, especially when a podcast is first created or seen by Pod Engine.
episode_description_generated_short | string (nullable) | The generated short description of the episode.
episode_duration_seconds | number (nullable) | The duration of the episode in seconds.
episode_enclosure_url | string | The URL of the episode enclosure.
episode_guests_and_hosts | array of object (nullable) | Guests and hosts data. Null means it hasn't been analyzed.
episode_guests_and_hosts[].image_url | string (nullable; constraint: url) | The URL of the guest image, if available.
episode_guests_and_hosts[].name | string | The name of the guest.
episode_guests_and_hosts[].organizations | array of string | An array of organizations the guest is affiliated with.
episode_guests_and_hosts[].roles | array of string | An array of possible job roles and titles.
episode_guests_and_hosts[].type | enum (allowed values: host, guest, unknown, mentioned) | The type of guest, either a person or a company.
episode_has_transcript | boolean | Whether the episode has a transcript.
episode_image_url | string (nullable) | The URL of the episode image. Null means the episode doesn't have a specific image.
episode_rss_description | string | The description of the episode.
episode_rss_published_at | date | The published date of the episode.
episode_rss_title | string | The title of the episode.
episode_sponsors_and_advertisers | array of object (nullable) | Sponsors and advertisers data. Null means it hasn't been analyzed.
episode_sponsors_and_advertisers[].coupon_codes | string (nullable) | Any coupon codes provided by the sponsor, if applicable.
episode_sponsors_and_advertisers[].name | string | The name of the sponsor.
episode_sponsors_and_advertisers[].snippet | string | A brief snippet or description of the sponsor.
episode_sponsors_and_advertisers[].urls | array of string (constraint: url) | An array of URLs associated with the sponsor.
podcast_apple_id | number | The Apple Podcasts ID of the podcast.
podcast_is_auto_transcribed | boolean | Whether the episode will be automatically transcribed. This is a podcast-level setting. Note that the document will not be updated if this value changes before the transcript is available.
podcast_rss_title_latest | string | The latest title of the podcast from the RSS feed. This can change over time, and a title change on the podcast won't trigger an episode document update. This denormalized field is included so if the podcast isn't already created at episode ingestion it can be
podcast_rss_title_original | string | The original title of the podcast from the RSS feed. This is the first podcast title Pod Engine ever detected and will not change. This denormalized field is included so if the podcast isn't already created at episode ingestion it can be
podcast_spotify_id | string (nullable) | The Spotify ID of the podcast. This is used to link to the Spotify page for the podcast and can be used to fetch additional metadata about the podcast. Null means the podcast is not on Spotify or Pod Engine does not have a valid Spotify ID for it.
podengine_episode_id | string (constraint: uuid) | The Pod Engine unique ID for the individual episode.
podengine_episode_slug | string | The Pod Engine unique slug for the individual episode.
podengine_podcast_id | string (constraint: uuid) | The Pod Engine unique ID for the podcast.
podengine_podcast_slug | string | The Pod Engine unique slug for the podcast.
transcript_sentence_vtt | string (nullable) | The sentence-level VTT content. Null means no sentence VTT is available (e.g. old transcripts or episodes without transcripts).
transcript_text | string (nullable) | The text of the transcript. Null means the episode doesn't have a transcript.
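
To show how transcript_sentence_vtt might be consumed, here is a minimal sketch that turns the string into timed cues. It assumes the field holds standard WebVTT text with one cue per sentence, which this page does not spell out. Cue and parse_sentence_vtt are hypothetical names, and the parser ignores cue settings, NOTE blocks, and styling.

```python
import re
from typing import NamedTuple, Optional

# Matches a WebVTT cue timing line such as "00:00:01.000 --> 00:00:04.500".
# The hours component is optional in WebVTT.
_TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2}\.\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2}\.\d{3})"
)


class Cue(NamedTuple):
    start_seconds: float
    end_seconds: float
    text: str


def _to_seconds(timestamp: str) -> float:
    """Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds."""
    parts = timestamp.split(":")
    seconds = float(parts[-1])
    minutes = int(parts[-2])
    hours = int(parts[0]) if len(parts) == 3 else 0
    return hours * 3600 + minutes * 60 + seconds


def parse_sentence_vtt(vtt: Optional[str]) -> list[Cue]:
    """Return the cues in a VTT string, or an empty list when the field is null."""
    if vtt is None:
        return []
    cues: list[Cue] = []
    # Blocks (header, cues) are separated by blank lines.
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                # Everything after the timing line is the cue payload.
                text = " ".join(part.strip() for part in lines[i + 1:] if part.strip())
                cues.append(Cue(_to_seconds(match[1]), _to_seconds(match[2]), text))
                break
    return cues
```

Returning an empty list for null keeps callers simple, but it hides the difference between "no sentence VTT available" and "VTT with no cues"; callers that care should check the field for null first.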

Version 3
Released on 10/6/2025

Changelog

  • Added podcast_rss_title_latest
  • Added podcast_rss_title_original
  • Added podcast_apple_id
  • Added podcast_spotify_id

Property | Type | Description
schema_version | number | The numeric version of the schema. Starts at 1 and increments with each new version.
episode_db_created_at | date | The date the episode was created in the Pod Engine database. This is when we first saw the episode; it is typically close to episode_rss_published_at but can diverge, especially when a podcast is first created or seen by Pod Engine.
episode_description_generated_short | string (nullable) | The generated short description of the episode.
episode_duration_seconds | number (nullable) | The duration of the episode in seconds.
episode_enclosure_url | string | The URL of the episode enclosure.
episode_guests_and_hosts | array of object (nullable) | Guests and hosts data. Null means it hasn't been analyzed.
episode_guests_and_hosts[].image_url | string (nullable; constraint: url) | The URL of the guest image, if available.
episode_guests_and_hosts[].name | string | The name of the guest.
episode_guests_and_hosts[].organizations | array of string | An array of organizations the guest is affiliated with.
episode_guests_and_hosts[].roles | array of string | An array of possible job roles and titles.
episode_guests_and_hosts[].type | enum (allowed values: host, guest, unknown, mentioned) | The type of guest, either a person or a company.
episode_has_transcript | boolean | Whether the episode has a transcript.
episode_image_url | string (nullable) | The URL of the episode image. Null means the episode doesn't have a specific image.
episode_rss_description | string | The description of the episode.
episode_rss_published_at | date | The published date of the episode.
episode_rss_title | string | The title of the episode.
episode_sponsors_and_advertisers | array of object (nullable) | Sponsors and advertisers data. Null means it hasn't been analyzed.
episode_sponsors_and_advertisers[].coupon_codes | string (nullable) | Any coupon codes provided by the sponsor, if applicable.
episode_sponsors_and_advertisers[].name | string | The name of the sponsor.
episode_sponsors_and_advertisers[].snippet | string | A brief snippet or description of the sponsor.
episode_sponsors_and_advertisers[].urls | array of string (constraint: url) | An array of URLs associated with the sponsor.
podcast_apple_id | number | The Apple Podcasts ID of the podcast.
podcast_is_auto_transcribed | boolean | Whether the episode will be automatically transcribed. This is a podcast-level setting. Note that the document will not be updated if this value changes before the transcript is available.
podcast_rss_title_latest | string | The latest title of the podcast from the RSS feed. This can change over time, and a title change on the podcast won't trigger an episode document update. This denormalized field is included so if the podcast isn't already created at episode ingestion it can be
podcast_rss_title_original | string | The original title of the podcast from the RSS feed. This is the first podcast title Pod Engine ever detected and will not change. This denormalized field is included so if the podcast isn't already created at episode ingestion it can be
podcast_spotify_id | string (nullable) | The Spotify ID of the podcast. This is used to link to the Spotify page for the podcast and can be used to fetch additional metadata about the podcast. Null means the podcast is not on Spotify or Pod Engine does not have a valid Spotify ID for it.
podengine_episode_id | string (constraint: uuid) | The Pod Engine unique ID for the individual episode.
podengine_episode_slug | string | The Pod Engine unique slug for the individual episode.
podengine_podcast_id | string (constraint: uuid) | The Pod Engine unique ID for the podcast.
podengine_podcast_slug | string | The Pod Engine unique slug for the podcast.
transcript_text | string (nullable) | The text of the transcript. Null means the episode doesn't have a transcript.
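
The external IDs added in version 3 can be turned into links. The sketch below is illustrative only: the URL patterns are assumptions based on how Apple Podcasts and Spotify show pages are commonly addressed, not something this page specifies, and the function names are hypothetical.

```python
from typing import Optional


def apple_podcasts_url(podcast_apple_id: int) -> str:
    """Build an Apple Podcasts link (assumed URL pattern)."""
    return f"https://podcasts.apple.com/podcast/id{podcast_apple_id}"


def spotify_show_url(podcast_spotify_id: Optional[str]) -> Optional[str]:
    """Build a Spotify show link (assumed URL pattern).

    podcast_spotify_id is nullable, so None is passed through instead of
    producing a broken link.
    """
    if not podcast_spotify_id:
        return None
    return f"https://open.spotify.com/show/{podcast_spotify_id}"
```

For example, with placeholder IDs, apple_podcasts_url(123456789) yields a link ending in /id123456789, and spotify_show_url(None) yields None.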

Version 2
Released on 9/28/2025

Changelog

  • Added episode_sponsors_and_advertisers
  • Added episode_guests_and_hosts
  • Added podcast_is_auto_transcribed

Property | Type | Description
schema_version | number | The numeric version of the schema. Starts at 1 and increments with each new version.
episode_db_created_at | date | The date the episode was created in the Pod Engine database. This is when we first saw the episode; it is typically close to episode_rss_published_at but can diverge, especially when a podcast is first created or seen by Pod Engine.
episode_description_generated_short | string (nullable) | The generated short description of the episode.
episode_duration_seconds | number (nullable) | The duration of the episode in seconds.
episode_enclosure_url | string | The URL of the episode enclosure.
episode_guests_and_hosts | array of object (nullable) | Guests and hosts data. Null means it hasn't been analyzed.
episode_guests_and_hosts[].image_url | string (nullable; constraint: url) | The URL of the guest image, if available.
episode_guests_and_hosts[].name | string | The name of the guest.
episode_guests_and_hosts[].organizations | array of string | An array of organizations the guest is affiliated with.
episode_guests_and_hosts[].roles | array of string | An array of possible job roles and titles.
episode_guests_and_hosts[].type | enum (allowed values: host, guest, unknown, mentioned) | The type of guest, either a person or a company.
episode_has_transcript | boolean | Whether the episode has a transcript.
episode_image_url | string (nullable) | The URL of the episode image. Null means the episode doesn't have a specific image.
episode_rss_description | string | The description of the episode.
episode_rss_published_at | date | The published date of the episode.
episode_rss_title | string | The title of the episode.
episode_sponsors_and_advertisers | array of object (nullable) | Sponsors and advertisers data. Null means it hasn't been analyzed.
episode_sponsors_and_advertisers[].coupon_codes | string (nullable) | Any coupon codes provided by the sponsor, if applicable.
episode_sponsors_and_advertisers[].name | string | The name of the sponsor.
episode_sponsors_and_advertisers[].snippet | string | A brief snippet or description of the sponsor.
episode_sponsors_and_advertisers[].urls | array of string (constraint: url) | An array of URLs associated with the sponsor.
podcast_is_auto_transcribed | boolean | Whether the episode will be automatically transcribed. This is a podcast-level setting. Note that the document will not be updated if this value changes before the transcript is available.
podengine_episode_id | string (constraint: uuid) | The Pod Engine unique ID for the individual episode.
podengine_episode_slug | string | The Pod Engine unique slug for the individual episode.
podengine_podcast_id | string (constraint: uuid) | The Pod Engine unique ID for the podcast.
podengine_podcast_slug | string | The Pod Engine unique slug for the podcast.
transcript_text | string (nullable) | The text of the transcript. Null means the episode doesn't have a transcript.
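
For the guest and sponsor arrays, null ("not analyzed") and an empty array ("analyzed, nothing found") mean different things. The sketch below shows one way to keep that distinction when reading a document that has already been parsed into a Python dict. The helper names are hypothetical.

```python
from typing import Optional


def host_names(doc: dict) -> Optional[list[str]]:
    """Names of entries whose type is "host".

    Returns None when episode_guests_and_hosts is null or absent, so callers
    can tell "not analyzed" apart from "analyzed, no hosts found" ([]).
    """
    people = doc.get("episode_guests_and_hosts")
    if people is None:
        return None
    return [person["name"] for person in people if person.get("type") == "host"]


def sponsor_urls(doc: dict) -> list[str]:
    """All sponsor URLs in document order, without duplicates.

    A null or absent sponsors array is treated as "no URLs" here.
    """
    sponsors = doc.get("episode_sponsors_and_advertisers") or []
    seen: dict[str, None] = {}  # dict preserves insertion order
    for sponsor in sponsors:
        for url in sponsor.get("urls", []):
            seen.setdefault(url, None)
    return list(seen)
```

Using doc.get also lets the same helpers run against version 1 documents, which do not contain these fields.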

Version 1
Released on 7/2/2025

Property | Type | Description
schema_version | number | The numeric version of the schema. Starts at 1 and increments with each new version.
episode_db_created_at | date | The date the episode was created in the Pod Engine database. This is when we first saw the episode; it is typically close to episode_rss_published_at but can diverge, especially when a podcast is first created or seen by Pod Engine.
episode_description_generated_short | string (nullable) | The generated short description of the episode.
episode_duration_seconds | number (nullable) | The duration of the episode in seconds.
episode_enclosure_url | string | The URL of the episode enclosure.
episode_has_transcript | boolean | Whether the episode has a transcript.
episode_image_url | string (nullable) | The URL of the episode image. Null means the episode doesn't have a specific image.
episode_rss_description | string | The description of the episode.
episode_rss_published_at | date | The published date of the episode.
episode_rss_title | string | The title of the episode.
podcast_is_auto_transcribed | boolean | Whether the episode will be automatically transcribed. This is a podcast-level setting. Note that the document will not be updated if this value changes before the transcript is available.
podengine_episode_id | string (constraint: uuid) | The Pod Engine unique ID for the individual episode.
podengine_episode_slug | string | The Pod Engine unique slug for the individual episode.
podengine_podcast_id | string (constraint: uuid) | The Pod Engine unique ID for the podcast.
podengine_podcast_slug | string | The Pod Engine unique slug for the podcast.
transcript_text | string (nullable) | The text of the transcript. Null means the episode doesn't have a transcript.
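
A consumer may want a basic sanity check before loading documents. The sketch below lists the version 1 fields that are not marked nullable above and checks the two uuid-constrained IDs with the standard library. It assumes the document has already been parsed into a Python dict. REQUIRED_V1_FIELDS, missing_required_fields, and is_uuid are hypothetical names.

```python
import uuid

# Fields listed for version 1 that are not marked nullable on this page.
REQUIRED_V1_FIELDS = (
    "schema_version",
    "episode_db_created_at",
    "episode_enclosure_url",
    "episode_has_transcript",
    "episode_rss_description",
    "episode_rss_published_at",
    "episode_rss_title",
    "podcast_is_auto_transcribed",
    "podengine_episode_id",
    "podengine_episode_slug",
    "podengine_podcast_id",
    "podengine_podcast_slug",
)


def missing_required_fields(doc: dict) -> list[str]:
    """Return required field names that are absent or null in the document."""
    return [name for name in REQUIRED_V1_FIELDS if doc.get(name) is None]


def is_uuid(value: object) -> bool:
    """True when the value parses as a UUID.

    uuid.UUID is lenient about formatting (for example, it accepts strings
    without hyphens), so this is a loose check rather than a strict one.
    """
    try:
        uuid.UUID(str(value))
    except ValueError:
        return False
    return True
```

Later versions only add fields in the tables above, so the same check can serve as a baseline for version 2 and later documents.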
