We've built our own text-to-speech system with an initial English language model we trained ourselves with fully open source data.

grapheneos@grapheneos.social

We're going to build our own speech-to-text implementation to go along with this too. We're starting with an English model for both but we can add other languages which have high quality training data available. English and Mandarin have by far the most training data available.

terminaltilt@climatejustice.social

@GrapheneOS

This is great news! Thank you!

grapheneos@grapheneos.social

Existing implementations of text-to-speech and speech-to-text didn't meet our functionality or usability requirements. We want at least very high quality, low latency and robust implementations of both for English included in the OS. It will help make GrapheneOS more accessible.

rodirik@norden.social

@GrapheneOS For German there is "Thorsten Voice" with an Open Dataset.

https://www.thorsten-voice.de/en/datasets-2/

peet@social.tchncs.de

@GrapheneOS Very good news. Thank you!!!

grapheneos@grapheneos.social

Our full time developer working on this already built their own Transcribro app for on-device speech-to-text available in the Accrescent app store. For GrapheneOS itself, we want actual open source implementations of these features rather than OpenAI's phony open source though.

byte@rage.love

@rodirik @GrapheneOS oh hell yeah

I’m learning German so not having that would be mildly annoying

grapheneos@grapheneos.social

Whisper is actually closed source. Open weights is another way of saying permissively licensed closed source. Our implementation of both text-to-speech and speech-to-text will be actual open source which means people can actually fork it and add/change/remove training data, etc.

tedstechtips@mas.to

@GrapheneOS You guys are the best

bernard@friends.ravergram.club

@GrapheneOS
This is a great addition. I have been using Sherpa TTS https://github.com/woheller69/ttsengine and Futo Keyboard for STT. https://keyboard.futo.org/

grapheneos@grapheneos.social

@Bernard We started working on this because Sherpa didn't meet our requirements. In particular, its latency is too high, which makes it unsuitable for blind users to use with TalkBack.

notavi10@critter.cafe

@GrapheneOS i could help with spanish and esperanto models if needed

breizh@pleroma.breizh.pm

@GrapheneOS@grapheneos.social Well, some time ago, open-source TTS was pretty lacking, but now Kaldi / Sherpa is pretty good. Did you check it? If yes, what was the problem with it?

grapheneos@grapheneos.social

@breizh It wasn't quite good enough and has very high latency which makes it unsuitable for use with TalkBack. We're making this because existing options including Sherpa don't meet our requirements. Otherwise, we could have forked those. It made more sense to make our own instead which we'll be able to continue improving long term. It's similar to our network location and geocoding implementations where we want things done a particular way focused on high quality in all areas we care about.

tchambers@indieweb.social

@GrapheneOS Fascinating. Is the text-to-speech (and vice versa) model and code you’re working on platform-specific?

hipsterelectron@circumstances.run

@GrapheneOS i was really impressed with the efficacy and UI of transcribro. no surprise to hear that was the mark of a grapheneos app

hipsterelectron@circumstances.run

@GrapheneOS the "largeness" of language models is precisely a measure of the difficulty to reproduce them. this methodology has some similarities to something i proposed to huggingface a few years back in a cover letter. no surprise to see they were not interested in reproducibility or the scientific method

grapheneos@grapheneos.social

@tchambers It's not really platform specific. It currently runs on the CPU but we plan to add TPU support for Tensor and NPU support for Snapdragon in the future. It's made for GrapheneOS and we're not interested in doing any significant work on use outside of GrapheneOS. It will be possible to install it from our App Store on other Android 16+ operating systems but it's not our focus. We're focused on making GrapheneOS better and haven't gotten much out of making stuff available elsewhere.

grapheneos@grapheneos.social

that is awesome.

How long, if ever, until we have a stable terminal app that can be run from any user profile?

hipsterelectron@circumstances.run

@GrapheneOS i have also been trying to find similarly motivated people to collaborate with on a research project to reproduce the fawkes facial recognition poisoner upon a mobile device (ideally as an asynchronous but fully local image postprocessing technique) cc @xyhhx @bunnyhero
