A ChatGPT Prompt to Format YouTube Transcripts for Readability

ChatGPT Prompt for Transcript Formatting

Format those pesky YouTube video transcripts with the help of your AI assistant for better readability and easier reading along.

Bastian Moritz
Mar 2024

There are so many great things to learn on YouTube, like cooking.

The problem is that the transcriptions of YouTube videos are usually not formatted very well. In fact, they are not formatted at all, unless you listen to the Huberman Lab podcast or one of the few others that have professional transcriptions.

But thanks to AI, you now have an assistant to turn any unformatted transcription into a first formatted version fairly quickly.

And let’s be honest, a formatted transcript is more enjoyable to follow along as you watch a lecture, speech, or a video podcast.

Here is the right prompt sequence that

  1. Makes it easy to fill the rather complex prompt with the essential information
  2. Gets you a prompt that beautifully formats your transcription.

As always, the disclaimer: AI can misinterpret things. Most transcription services use AI but offer services with human review.

And as always, we follow the 3-part structure:

  1. Considerations on why the prompt is designed the way it is: AI Assistant for formatting Video transcripts.
  2. Jump right to the prompt sequence “Prompting for Formatting a Transcription.”
  3. A sample application demonstrating the ChatGPT transcription formatting prompt.

Considerations to Effectively Format a Transcription

I start by asking myself what would help an assistant understand exactly what kind of formatted transcription I want. To format a transcription effectively, certain details can significantly enhance the quality and usability of the output.

What information and guidance would help ChatGPT provide me with the best possible formatting for a bunch of text?

Some key pieces of information that we should consider are:

Speaker Identification

If possible, provide names or identifiers for each speaker.

This helps in clearly attributing dialogue to the correct person, making the transcript more readable and useful.

The number of speakers involved

Knowing the number of speakers allows for accurate speaker differentiation.

Not that a speaker could be clearly identified without indications in the transcript (of which there are none in the standard YouTube-generated transcripts), but it still helps my assistant to know the conversation structure and to anticipate how sentences are formed.

Names or identifiers for each speaker

Names could be mentioned by other speakers, even if the transcript does not differentiate between speakers.

Gender and pitch of voice might help if you transcribe using one of the multimodal capabilities like recording via voice or uploading a video (in the future).

Audio Quality Notes

Include any notes on the quality of the audio, or on particular sections where the audio is unclear, has background noise, or is low in volume.

Even just saying that there will be overlapping speech helps in preparing for potential ambiguities in the audio.

Overall, while we only have textual input, this information might help the AI identify confusing segments of the transcription, where the auto-transcription will have made lots of mistakes and generated word sequences with little to no sense.

Pronunciation Guides for Unusual Terms

If your audio includes technical jargon, foreign languages, or names that have specific pronunciations, providing a pronunciation guide can improve accuracy.

Maybe you want to start creating a Library of Terms that you provide to the AI as well.

Context and Kind

Is this a lecture or a talk? A discussion or an Interview? A speech followed by a Q&A?

This context helps my assistant to anticipate what kind of sentences can be expected and what kind of sentences they need to form.

Is this an interview and are there many questions followed by answers? Or is this a discussion with lots of interruptions because the speakers are so engaged?


Subject Matter

Similar to the pronunciation guide and the Library of Terms, just stating the topic or context of the conversation can aid in understanding the subject matter, which can be crucial for correctly interpreting and transcribing industry-specific terminology, jargon, or colloquialisms.


Timestamps

Including timestamps can be very helpful, especially for longer recordings. They allow you to locate specific parts of the audio quickly.

Timestamps can be particularly useful for referencing, editing, or reviewing specific segments of the conversation. But they make the input much more crowded and can potentially mislead the AI.

You need to experiment with this if timestamps are a thing you need. It would be lovely if you share your experiences with us!
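To run that experiment, it helps to have the same transcript with and without timestamps. The following is a minimal Python sketch, assuming the pasted transcript puts a timestamp such as `0:00` or `1:02:03` either alone on a line or at the start of a line. That layout, the function name `strip_timestamps`, and the sample text are assumptions made for illustration, not something the transcript tool guarantees.

```python
import re

# Assumed timestamp shapes: "0:00", "12:34" or "1:02:03", followed by
# whitespace or the end of the line.
TIMESTAMP = re.compile(r"^\s*(?:\d{1,2}:)?\d{1,2}:\d{2}(?:\s+|$)")


def strip_timestamps(raw: str) -> str:
    """Return the transcript as one block of text without leading timestamps."""
    kept = []
    for line in raw.splitlines():
        text = TIMESTAMP.sub("", line).strip()
        if text:  # skip lines that held nothing but a timestamp
            kept.append(text)
    return " ".join(kept)


# Made-up sample in the assumed layout.
sample = "0:00\nwelcome back to the channel today we\n0:03\ncook a 30-minute pasta"
print(strip_timestamps(sample))
```

You can then feed both versions to the same prompt and compare which one formats better.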

Chapters, Section Markers or Highlights

If the audio contains distinct sections or topics, markers or notes indicating these transitions can help organize the transcription more effectively.

For example, some creators, while not having their own transcripts, have created chapters to divide their YouTube videos into relevant chunks. These chapters appear in the transcript.

Preferred Formatting Style

Now we come to the desired output.

Information about your preferred formatting style (e.g., verbatim vs. non-verbatim, paragraph breaks, how to handle non-verbal cues) ensures that the transcript meets your specific needs.

If the transcript should adhere to specific layout guidelines, mention these requirements.

Special Instructions

Add any specific instructions, such as how to handle unclear speech, pauses, laughter, or other non-verbal elements in the audio.

Prompt for Formatting a Transcription

This is a comprehensive prompt you can use when asking ChatGPT to format a transcription.

You can fill in the details relevant to your specific transcription.

Providing comprehensive information leads to a more accurate and useful final product. The more detailed and clear your instructions, the easier it is to tailor the transcription to your specific needs.

The following prompt is designed to provide ChatGPT with comprehensive information, ensuring that the formatting of the transcription is tailored to meet your specific needs.

Hello ChatGPT, I have a transcription from a video that I need formatted.

Here are the specific details and requirements for the formatting:

  1. Number of Speakers (to aid in differentiation): [Indicate the number of speakers in the video]
  2. Speaker Identification (identifiers): [Specify if the speakers are identified in the transcription and if you want them labeled by name or identifier]
  3. Type of Recording: [Specify the nature of the recording (e.g., lecture, talk, discussion, interview) to set expectations for the dialogue structure]
  4. Context/Subject Matter: [Briefly describe the topic or context of the video]
  5. Timestamps and Structure: [Mention if timestamps are included and if you want them retained or formatted in a specific way. Indicate if there are chapters, section markers]
  6. Library of Terms: [Provide a library of terms to assist in interpreting and transcribing subject-specific terminology accurately]
  7. Special Formatting Requests: [Any specific formatting instructions not covered above, like paragraph breaks, handling of non-verbal cues (pauses, laughter, etc.) and unclear speech, etc.]
  8. Any Additional Notes or Instructions: [Include any other specific requirements or instructions for the formatting]

I will start posting the transcript in the next query.
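If you format transcripts regularly, filling in the brackets by hand gets tedious. Here is a minimal Python sketch that assembles the prompt above from a handful of fields. The class `TranscriptDetails`, the function `build_prompt`, the default values, and the short label used for item 3 are illustrative choices, and the example values about timestamps are an assumption about how you prepared the transcript.

```python
from dataclasses import dataclass


@dataclass
class TranscriptDetails:
    """Hypothetical container for the eight fields of the prompt."""
    number_of_speakers: str
    speaker_identification: str
    recording_type: str
    subject_matter: str
    timestamps_and_structure: str
    library_of_terms: str = "None provided"
    special_formatting: str = "None"
    additional_notes: str = "None"


def build_prompt(d: TranscriptDetails) -> str:
    """Assemble the formatting prompt as plain text."""
    fields = [
        ("Number of Speakers (to aid in differentiation)", d.number_of_speakers),
        ("Speaker Identification (identifiers)", d.speaker_identification),
        ("Type of Recording", d.recording_type),
        ("Context/Subject Matter", d.subject_matter),
        ("Timestamps and Structure", d.timestamps_and_structure),
        ("Library of Terms", d.library_of_terms),
        ("Special Formatting Requests", d.special_formatting),
        ("Any Additional Notes or Instructions", d.additional_notes),
    ]
    lines = [
        "Hello ChatGPT, I have a transcription from a video that I need formatted.",
        "",
        "Here are the specific details and requirements for the formatting:",
        "",
    ]
    for number, (label, value) in enumerate(fields, start=1):
        lines.append(f"  {number}. {label}: {value}")
    lines += ["", "I will start posting the transcript in the next query."]
    return "\n".join(lines)


# Example values are illustrative; adapt them to your own video.
example = TranscriptDetails(
    number_of_speakers="1",
    speaker_identification="Not labeled in the transcript; the speaker is Andrej Karpathy",
    recording_type="Talk",
    subject_matter="Introduction to large language models for a general audience",
    timestamps_and_structure="Timestamps removed; no chapter markers",
    library_of_terms="LLM, ChatGPT, Claude, Bard",
)
print(build_prompt(example))
```

Keeping the fields in one place also makes it easy to reuse the same settings for every episode of a series.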

Example for a Formatting Task of YouTube Transcripts

I am using this 1-hour talk “Intro to Large Language Models” by AI researcher Andrej Karpathy.

It is the busy person's intro to LLMs, meant for a general audience, and a great introduction for anyone who wants to understand LLMs, “the core technical component behind systems like ChatGPT, Claude, and Bard.” And of course there is a busy businessperson’s executive summary to LLMs available as well.
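A 1-hour talk produces a long transcript, and since the prompt announces that the transcript follows in the next query, you may need to post it in several parts. The sketch below splits the text into consecutive chunks by word count and labels each part. The function names and the default of 1500 words per chunk are assumptions for illustration; the right size depends on the model and is something to experiment with.

```python
def split_transcript(text: str, max_words: int = 1500) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[start:start + max_words])
        for start in range(0, len(words), max_words)
    ]


def label_chunks(chunks: list[str]) -> list[str]:
    """Prefix each chunk so the assistant knows which part it is reading."""
    total = len(chunks)
    return [
        f"Transcript part {number} of {total}:\n\n{chunk}"
        for number, chunk in enumerate(chunks, start=1)
    ]


# Tiny made-up input to show the shape of the result.
for message in label_chunks(split_transcript("one two three four five", max_words=2)):
    print(message)
    print("---")
```

Splitting on word boundaries can cut a sentence in half, so you may want to tell the assistant that a part can start or end mid-sentence.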
