16 comments

  • blopker 2 hours ago
    Nice! I really like how many variations on this idea are coming out. MacWhisper used to be great, but is kind of a buggy mess now.

    I'm making my own, for personal use. I did a survey of many and they all (that I could find) skip the fundamentals.

    The major issues that I've run into:

    - Crash recovery. Most of these apps are incredibly buggy and crash all the time, taking the recorded audio with them. MacWhisper is incredibly bad at this.

    - Disk space. Many of these apps save wav files to disk. After a few hours of meetings, you may end up with gigabytes eaten.

    - Microphone bleed. People don't always use headphones, so the system mic picks up the speaker output, causing (approximately) duplicate transcriptions.

    I've yet to find a solution that handles all of these correctly, let alone one with high-quality transcriptions.

    Anyway, most of these apps are built around https://github.com/FluidInference/FluidAudio, if anyone is curious. Their readme has a big list of similar apps as well.

    • jv22222 2 hours ago
      Nice tip on FluidAudio. That's the kind of thing I've been looking for. Thanks!
    • highmastdon 1 hour ago
      I’m using MacParakeet these days. If your language is supported, definitely give it a try. It’s much faster and has a lower footprint.
  • Myrmornis 23 minutes ago
    I will be happy to spend £10 on this. One feature question though -- does it continue transcribing the meeting even if I've turned my volume down / muted it?
  • robertkarl 1 hour ago
    This looks sick. I was going to download it, but for $10 I am more willing to try asking Claude to implement something like it than to purchase it.

    I would be more willing to purchase if it was open source and I could build from source to try it first.

    • addozhang 21 minutes ago
      I don't really recommend it. If the software is a one-time purchase, there's no need to rewrite it with an LLM. The tokens for a rewrite could cost more than just $10.
      • anonymouse008 15 minutes ago
        * full price tokens, yes

        Not the subsidized subs

    • satvikpendem 39 minutes ago
      It's kinda funny how frontier LLMs change the game when it comes to software. If it becomes so easy to make whatever little utility you want, why would I pay 10 dollars when an AI subscription is 20 bucks and I can build way more in a month for that $20? Especially since it's very likely people on Show HN have simply used AI anyway, so why would I pay for your prompts?
  • denbyc 3 hours ago
    I'd love to have a purchase option not tied to the App Store if possible. I don't use an Apple account with my Mac, but I would love to try Trace.
    • addozhang 20 minutes ago
      Agreed, no need to tie it into Apple either.
  • addozhang 18 minutes ago
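    A very nice product, and one I've wanted for a long time. However, most of my meetings happen on my company Mac, which definitely wouldn't allow me to install this kind of software, even though I'd be willing to pay for it myself.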
  • mushufasa 3 hours ago
    This looks like a good approach, though I would expect this to be a native macOS feature within 12 months -- this seems like it totally fits into their product roadmap.
  • nkmnz 3 hours ago
    Which Speech-to-Text is used? Is it possible to configure it? This might be crucial for supporting languages other than English - the model that comes built-in with macOS fails completely for German.
  • frabia 3 hours ago
    Super interesting! How accurate is the local model to transcribe audio compared to other cloud services? E.g. Google Meet, Otter, Granola, etc.
    • watchlight 2 hours ago
      A lot of the available models are Whisper or Faster-Whisper derived and shared across multiple apps. The tier names are often funny... "tiny", "base", "small", "medium", "large", "large-v2", "large-v3", "large-v3-turbo", plus English-only ".en" variants, etc.

      In my experience, medium is often the sweet spot for English accuracy vs speed, especially if following-up with a post-processing pass. The large options are all fine, but can severely slow it down. There are some speed checks on my website if you're curious (link not posted because I don't want to hijack another post's app).

  • watchlight 7 hours ago
    Agreed with JohnBiz, the moment flagging is interesting and unusual, and a nice contrast to passive transcription. I only recently learned about MacWhisper (I'm Windows primarily) and was floored to learn how expensive the Pro option is. Nowadays it's not so hard to have some level of DIY transcription, so it's crazy that it's priced at a premium.

    What's your diarization pipeline? Pyannote?

    I'd taken a different approach that used an LLM clean-up pass to summarize and progressively compress the transcript for ultra-long content, but I like the idea of targeted "pay attention here" flags.

  • nazca 3 hours ago
    I've been looking for this exact thing!
  • overflowy 3 hours ago
    Does it support multiple languages?
  • ipotapov 18 hours ago
    [dead]
  • JohnBizBiz 18 hours ago
    [flagged]
    • ZoneZealot 1 hour ago
      HN is not the place for LLM generated advertisements
  • satvikpendem 3 hours ago
    I don't see how this is different to literally the dozens of other offline transcription apps, many of them even open source, unlike this one.