M4A to Text: Transcribing iPhone Voice Memos (and other M4A files)

iOS 18's built-in transcription, plus three other ways — ranked by what your recording actually needs.

M4A is AAC audio inside an MP4 container — the default format for iPhone Voice Memos and most Mac recording apps. iOS 18 and later transcribes Voice Memos on the device for free, but only in English, only for one speaker, and without exportable formats. For everything else — longer recordings, multi-speaker meetings, non-English audio, or files you need to share as DOCX or SRT — DeluxeScribe transcribes M4A files in 99 languages with speaker labels in 5–10 minutes per hour of audio. Free tier is 60 minutes; paid plans start at $10/month.
  • 60 minutes free
  • No credit card
  • 99 languages
  • Speaker labels

Last verified June 23, 2026

First: get the M4A off your iPhone (3 methods that work)

All three methods below preserve the M4A format — none of them silently convert to a lossier format on the way over.

AirDrop to a Mac (fastest)

Open Voice Memos → tap the three dots on a recording → Share → AirDrop → pick your Mac. The file arrives in Downloads as the original M4A. This is the only zero-friction method if you live in the Apple ecosystem.

Save to Files app → iCloud Drive (works on Windows too)

Same Share menu → Save to Files → pick iCloud Drive. From a Mac or Windows PC with iCloud, the file appears in the iCloud Drive folder within a minute or two. Best when you need the file on a non-Apple machine.

Email or Messages to yourself (no extra apps)

Share → Mail or Messages → send to your own address. Slowest for long recordings (email providers cap attachments at ~25 MB), but requires no setup. Useful in a pinch when you don’t have AirDrop or iCloud handy.

Does iPhone already transcribe Voice Memos? (Yes, with limits)

iOS 18 added on-device transcription to the Voice Memos app. Open any recording, tap the transcript icon (looks like a text bubble), and you get a synced transcript next to the waveform. It’s genuinely useful and runs without an internet connection.

What works:

  • Short clips, clean audio, one speaker
  • English (officially the only supported language at launch)
  • Recordings made elsewhere, imported into Voice Memos
  • Tap-to-jump from transcript text to audio position

What doesn’t:

  • Recordings longer than ~30 minutes (transcript stops generating)
  • Multi-speaker meetings — there’s no speaker labelling
  • Most non-English audio
  • Exporting the transcript as a file (you can copy-paste only)

When to use it: a 3-minute note to yourself in English. When to switch: anything you need to share, edit, or process, which includes most of the recordings you’d actually want a transcript of.

The 4 ways to transcribe M4A files

Method | Cost | Max length or size | Languages | Speaker labels
iPhone Voice Memos built-in | Free | ~30 min | English | No
DeluxeScribe (or similar service) | Free trial → $10/mo | 5 GB | 99 | Yes
Self-hosted Whisper | Free | Unlimited (your machine) | 99 | Not by default
Apple Live Captions | Free | Real-time only | English (+ some) | No

1. iPhone Voice Memos built-in (free, short English clips)

Covered above. Best for short personal notes; not useful for anything you need to share or process.

2. DeluxeScribe or similar AI service

Upload the M4A, pick the language (or leave on auto-detect), wait 5–10 minutes per hour of audio. Returns a clickable transcript with speaker labels, exportable as TXT, DOCX, PDF, SRT, VTT, or JSON. The free tier (60 minutes, no card) covers one hour-long interview, or a few shorter ones, without paying.

3. Self-hosted Whisper

Free, private, accurate. Requires Python and ffmpeg, plus patience on a CPU; fast on a recent GPU. M4A is supported directly, because Whisper decodes audio through ffmpeg:

pip install openai-whisper
whisper voice-memo.m4a --model large-v3 --language English
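The same run can be scripted from Python. The sketch below assumes the openai-whisper package from the command above and ffmpeg on the PATH. The helper names (format_timestamp, segments_to_srt, transcribe_to_srt) are illustrative, not part of Whisper. It relies on model.transcribe returning a "segments" list whose items carry start, end (both in seconds), and text.

```python
def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Turn Whisper-style segments into numbered SRT cue blocks."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        cues.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # A blank line separates consecutive cues.
    return "\n".join(cues)


def transcribe_to_srt(path: str, model_name: str = "large-v3") -> str:
    """Transcribe an audio file with Whisper and return SRT text."""
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(path, language="en")
    return segments_to_srt(result["segments"])
```

Calling transcribe_to_srt("voice-memo.m4a") returns the subtitle text as a string. If all you need is the file, the command-line tool can also write SRT directly with --output_format srt; the Python route is mainly useful when you want to post-process segments first.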

For speaker labels, pair Whisper with whisper-diarization — it adds a setup step but produces output comparable to commercial services.

4. Apple’s Live Captions accessibility feature

On iPhone (iOS 16+) and Mac (macOS Ventura+), Live Captions transcribes any audio playing on the device in real time — including audio from another app or a video call. Useful for getting captions from a meeting you’re watching live, not for transcribing existing recordings (it doesn’t save the transcript by default).

What is an M4A file? (Quick explainer)

M4A is the audio-only variant of MP4. The container is the same as MP4 video; it just contains an audio track (usually AAC, sometimes Apple Lossless) and no video track. The .m4a extension is a convention to signal “this MP4 has audio only”; internally it’s identical to an MP4 with the video stripped out.

Why iPhone uses it: AAC at 128 kbps sounds roughly equivalent to MP3 at 192 kbps, so files are about a third smaller for the same perceived quality. Apple made AAC the default encoder in iTunes in 2003, carried it over to the iPhone, and has stuck with it.

Why renaming .m4a to .mp3 doesn’t work: the extension lies but the file bytes don’t. The file is still AAC in an MP4 container, which an MP3 decoder cannot parse. To actually convert, re-encode with ffmpeg:

ffmpeg -i input.m4a -acodec libmp3lame -q:a 4 output.mp3

But for transcription, don’t bother — upload the M4A directly.

Accuracy by recording scenario

Scenario | Realistic accuracy | Notes
Voice memo, phone on desk, one speaker | 92–97% | iPhone’s built-in mic is good at close range.
Interview with lavalier mic | 95–98% | Best case, same as a podcast setup.
Multi-person meeting, phone in centre of table | 80–90% | Speaker labels are correct ~85% of the time.
Pocket / muffled / wind | 60–80% | Often faster to re-record than to transcribe.

Common M4A errors and what they mean

“Unsupported file”

Often the file is actually .m4p (DRM-protected, from old iTunes Store purchases), or the extension is .m4a but the contents are something else. Check with ffprobe yourfile.m4a.
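If ffprobe isn’t installed, a rough first check is to read the file header yourself. MP4-family files normally begin with an ftyp box: a 4-byte size, the ASCII tag ftyp, then a 4-byte major brand. The sketch below is a heuristic under that assumption (the ftyp box is usually first, but that is not guaranteed), and the function names are illustrative. Treat ffprobe as the authoritative answer.

```python
from typing import Optional


def major_brand(header: bytes) -> Optional[str]:
    """Return the MP4 major brand from a file's first 12 bytes, or None."""
    if len(header) < 12 or header[4:8] != b"ftyp":
        return None
    return header[8:12].decode("ascii", errors="replace")


def describe_file(path: str) -> str:
    """Give a one-line guess at what the file really contains."""
    with open(path, "rb") as handle:
        header = handle.read(12)
    brand = major_brand(header)
    if brand is not None:
        return f"MP4-family container, major brand {brand!r}"
    if header[:3] == b"ID3":
        return "starts with an ID3 tag: probably an MP3, not an MP4 container"
    return "no ftyp box at the start: not a typical MP4-family file"
```

Audio-only files written by Apple software commonly report the brand "M4A " (with a trailing space). A different brand, or no ftyp box at all, is a hint that the extension and the contents disagree.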

File won’t upload

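Either the file is over the service’s size limit, or the MIME type is being misreported by the browser. Try uploading from a desktop browser instead of mobile. If that still fails because the file is too large, check Settings → Voice Memos → Audio Quality: Lossless recordings are far larger than Compressed ones. That setting only affects new recordings, so an existing file has to be re-encoded at a lower bitrate (for example with ffmpeg) to shrink it.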

Transcription returns empty text

The audio track is silent, or the container holds no decodable audio stream. Play the file locally first: if you hear nothing, the recording itself failed.
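One way to check for silence without listening is ffmpeg’s volumedetect filter, which reports the peak level on stderr. This is a minimal sketch, assuming ffmpeg is on the PATH. The function names and the -60 dB threshold are illustrative assumptions, not values taken from any transcription service.

```python
import re
import subprocess
from typing import Optional


def parse_max_volume(ffmpeg_log: str) -> Optional[float]:
    """Extract the 'max_volume' figure (in dB) from volumedetect output."""
    match = re.search(r"max_volume:\s*(-?(?:inf|\d+(?:\.\d+)?)) dB", ffmpeg_log)
    return float(match.group(1)) if match else None


def looks_silent(path: str, threshold_db: float = -60.0) -> bool:
    """Return True if the file has no audio stream or its peak is below the threshold."""
    cmd = [
        "ffmpeg", "-hide_banner", "-i", path,
        "-vn",                  # ignore any video or cover-art stream
        "-af", "volumedetect",  # measure mean and peak volume
        "-f", "null", "-",      # decode only, write no output file
    ]
    log = subprocess.run(cmd, capture_output=True, text=True).stderr
    peak = parse_max_volume(log)
    # No max_volume line means ffmpeg found no audio it could decode.
    return peak is None or peak <= threshold_db
```

A result of True for a recording you expected to contain speech points to a failed recording rather than a transcription problem.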

Wrong language detected

Voice Memos sometimes tags the file with the device’s UI language rather than the spoken language. Set the language manually in the transcription settings.

When DeluxeScribe is the right fit

The right answer is us when you have a long recording, multiple speakers, non-English audio, or need anything beyond “view in Notes” (DOCX, SRT, JSON export). Most of the M4A files people actually want transcribed fit one of those four criteria.

The right answer isn’t us when it’s a single short English voice memo and you have an iPhone with iOS 18+. The built-in transcription is free, instant, and on-device. Don’t pay for what Apple already gives you.

Transcribe iPhone recordings in 99 languages

60 minutes free, no credit card. Speaker labels, six export formats, 5 GB file size.

How this page was verified

Tested across 18 source files: solo voice memos, two-person interviews, four-person meetings, and lecture recordings. iPhone Voice Memos transcription was verified on an iPhone running iOS 18.5; behaviour may vary on later versions. Format-spec claims (AAC, MP4 container) reference RFC 6381 and ISO/IEC 14496-14. Apple Voice Memos behaviour is from Apple’s Voice Memos User Guide. Accuracy figures use word error rate (WER) against human-corrected transcripts.
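For readers who want to reproduce this kind of figure on their own recordings, word error rate is the word-level edit distance between the reference and the machine transcript, divided by the number of reference words. The sketch below is a generic implementation. The normalisation (lowercasing, splitting on whitespace, punctuation left in place) is an assumption, since the exact normalisation used for the figures on this page is not stated.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # previous[j] = edit distance between the reference words processed
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

Accuracy is then commonly quoted as 1 minus WER, so one wrong word in a four-word reference gives a WER of 0.25, or 75% accuracy. Because insertions count as errors, WER can exceed 1.0 on very poor transcripts.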

Frequently Asked Questions

Why is my iPhone recording .m4a and not .mp3?

iPhone uses AAC (Advanced Audio Coding) inside an MP4 container, with the .m4a extension. AAC produces smaller files at the same perceived quality as MP3, which matters when you're storing recordings on a phone. Apple has used this default since the original iPhone.

Can I convert M4A to MP3 first and get better accuracy?

No — re-encoding adds artefacts without adding information, so MP3 conversion can slightly reduce accuracy. Upload the M4A directly. The only time conversion helps is if a specific tool refuses to accept M4A; even then, prefer WAV over MP3 to avoid double compression.
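If a tool does force a conversion, a WAV re-encode can be scripted as below. This is a sketch, assuming ffmpeg is on the PATH. The 16 kHz mono target is an assumption that suits speech models such as Whisper (which resamples to 16 kHz mono internally); other tools may prefer the original sample rate. The function names are illustrative.

```python
import subprocess
from pathlib import Path
from typing import List


def wav_command(source: str) -> List[str]:
    """Build an ffmpeg command that decodes `source` to 16-bit PCM WAV."""
    target = str(Path(source).with_suffix(".wav"))
    return [
        "ffmpeg", "-i", source,
        "-vn",               # drop cover art or video streams
        "-ac", "1",          # mono
        "-ar", "16000",      # 16 kHz sample rate
        "-c:a", "pcm_s16le", # uncompressed 16-bit PCM, no second lossy pass
        target,
    ]


def convert_to_wav(source: str) -> str:
    """Run the conversion and return the path of the new WAV file."""
    cmd = wav_command(source)
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

The resulting WAV is larger than the M4A, so check the tool’s upload limit before converting long recordings.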

Does this work with Android voice recorder files?

Yes. Most Android recorders produce M4A (AAC), AMR, or 3GP. All three are widely supported. If your file is .amr and your tool rejects it, convert to M4A with ffmpeg: ffmpeg -i in.amr -c:a aac out.m4a.

Will iPhone's built-in transcription work on an old recording?

Yes. iOS 18+ transcribes any recording in the Voice Memos app, including ones you imported. Open the recording, tap the transcript icon, and it processes on-device for English. For other languages or longer recordings, use a dedicated service.

What's the maximum M4A file size I can upload?

DeluxeScribe accepts files up to 5 GB. For a typical M4A (AAC at 128 kbps), that's roughly 90 hours of audio. Most other services cap at 1–2 GB; check before splitting unnecessarily.
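The “roughly 90 hours” figure follows from the bitrate. A quick sketch of the arithmetic, ignoring container overhead and assuming a constant bitrate; whether the 5 GB limit is counted in decimal or binary units is not stated, so both are shown:

```python
def hours_of_audio(size_bytes: int, bitrate_kbps: float) -> float:
    """Hours of audio that fit in `size_bytes` at a constant bitrate."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return size_bytes / bytes_per_second / 3600


GB = 1000 ** 3   # decimal gigabyte
GIB = 1024 ** 3  # binary gibibyte

decimal_hours = hours_of_audio(5 * GB, 128)   # about 86.8 hours
binary_hours = hours_of_audio(5 * GIB, 128)   # about 93.2 hours
```

Either way the limit sits close to 90 hours at 128 kbps, and recordings made at a lower bitrate fit proportionally more.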

Are my voice memos kept private?

DeluxeScribe uses encrypted infrastructure and doesn't train on customer audio. Files can be deleted at any time. For maximum privacy, run Whisper locally — your audio never leaves your device. Apple's on-device Voice Memos transcription is also fully local.

Can it identify different speakers in a meeting recording?

Yes — speaker diarization is automatic on DeluxeScribe and most modern services. iOS 18's built-in transcription does not currently label speakers. For multi-person meetings, a dedicated service is the right choice.