Abspeckgeflüster – Forum for People with Weight(ing)

Free. Ad-free. Human. Your weight-loss forum.

We've built our own text-to-speech system with an initial English language model we trained ourselves with fully open source data.

  • grapheneos@grapheneos.social

    We've built our own text-to-speech system with an initial English language model we trained ourselves with fully open source data. It will be added to our App Store soon and then included in GrapheneOS as a default enabled TTS backend once some more improvements are made to it.

    terminaltilt@climatejustice.social wrote (#8):

    @GrapheneOS

    This is great news! Thank you!

    • grapheneos@grapheneos.social

      We're going to build our own speech-to-text implementation to go along with this too. We're starting with an English model for both but we can add other languages which have high quality training data available. English and Mandarin have by far the most training data available.

      grapheneos@grapheneos.social wrote (#9):

      Existing implementations of text-to-speech and speech-to-text didn't meet our functionality or usability requirements. We want at least very high quality, low latency and robust implementations of both for English included in the OS. It will help make GrapheneOS more accessible.

        rodirik@norden.social wrote (#10):

        @GrapheneOS For German there is "Thorsten Voice" with an Open Dataset.

        https://www.thorsten-voice.de/en/datasets-2/

          peet@social.tchncs.de wrote (#11):

          @GrapheneOS very good news 👍. Thank you!!!

            grapheneos@grapheneos.social wrote (#12):

            Our full time developer working on this already built their own Transcribro app for on-device speech-to-text available in the Accrescent app store. For GrapheneOS itself, we want actual open source implementations of these features rather than OpenAI's phony open source though.

              byte@rage.love wrote (#13):

              @rodirik @GrapheneOS oh hell yeah

              I’m learning German so not having that would be mildly annoying

                grapheneos@grapheneos.social wrote (#14):

                Whisper is actually closed source. Open weights is another way of saying permissively licensed closed source. Our implementation of both text-to-speech and speech-to-text will be actual open source which means people can actually fork it and add/change/remove training data, etc.

                  tedstechtips@mas.to wrote (#15):

                  @GrapheneOS You guys are the best 🙌

                    bernard@friends.ravergram.club wrote (#16):

                    @GrapheneOS
                    This is a great addition. I have been using Sherpa TTS https://github.com/woheller69/ttsengine and Futo Keyboard for STT. https://keyboard.futo.org/

                      grapheneos@grapheneos.social wrote (#17):

                      @Bernard We started working on this because Sherpa didn't meet our requirements. Among other issues, its latency is overly high, which makes it unsuitable for blind users to use with TalkBack.

                        notavi10@critter.cafe wrote (#18):

                        @GrapheneOS I could help with Spanish and Esperanto models if needed.

                          breizh@pleroma.breizh.pm wrote (#19):

                          @GrapheneOS@grapheneos.social Well, some time ago, open-source TTS was pretty lacking, but now Kaldi / Sherpa is pretty good. Did you check it? If yes, what was the problem with it?

                            grapheneos@grapheneos.social wrote (#20):

                            @breizh It wasn't quite good enough and has very high latency which makes it unsuitable for use with TalkBack. We're making this because existing options including Sherpa don't meet our requirements. Otherwise, we could have forked those. It made more sense to make our own instead which we'll be able to continue improving long term. It's similar to our network location and geocoding implementations where we want things done a particular way focused on high quality in all areas we care about.

                              tchambers@indieweb.social wrote (#21):

                              @GrapheneOS Fascinating. Are the text-to-speech and speech-to-text models and code you're working on platform-specific?

                                hipsterelectron@circumstances.run wrote (#22):

                                @GrapheneOS i was really impressed with the efficacy and UI of transcribro. no surprise to hear that was the mark of a grapheneos app

                                  hipsterelectron@circumstances.run wrote (#23):

                                  @GrapheneOS the "largeness" of language models is precisely a measure of the difficulty to reproduce them. this methodology has some similarities to something i proposed to huggingface a few years back in a cover letter. no surprise to see they were not interested in reproducibility or the scientific method

                                    grapheneos@grapheneos.social wrote (#24):

                                    @tchambers It's not really platform specific. It currently runs on the CPU but we plan to add TPU support for Tensor and NPU support for Snapdragon in the future. It's made for GrapheneOS and we're not interested in doing any significant work on use outside of GrapheneOS. It will be possible to install it from our App Store on other Android 16+ operating systems but it's not our focus. We're focused on making GrapheneOS better and haven't gotten much out of making stuff available elsewhere.

                                      4a4a0ea6f24fe54ca08a20f5ada65e42efdb692f6b8912ff6c1e521c024afa61@mostr.pub wrote (#25):

                                      That is awesome.

                                      How long, if ever, until we have a stable terminal app that can be run from any user profile?

                                        hipsterelectron@circumstances.run wrote (#26):

                                        @GrapheneOS i have also been trying to find similarly motivated people to collaborate with on a research project to reproduce the fawkes facial recognition poisoner upon a mobile device (ideally as an asynchronous but fully local image postprocessing technique) cc @xyhhx @bunnyhero

                                          hipsterelectron@circumstances.run wrote (#27):

                                          @GrapheneOS @xyhhx @bunnyhero i have been putting it off repeatedly but the fawkes paper itself is very high quality and imo intended to be reproduced. if there are resources your team has developed or considered regarding modern hardware on mobile phones for statistical training and inference (fawkes especially requires a training step with local user input iirc) it would be tremendously helpful for our goals here.

                                          Copyright (c) 2025 abSpecktrum (@abspecklog@fedimonster.de)
