Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

I'm looking for documentation on transcribing streaming audio from WebRTC with Google Cloud Speech-to-Text. I'm using the aiortc library in Python to handle the video and audio streams coming from a client web app.

Here is a snippet of the class that I'm using to process the audio data.

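import librosa
import numpy as np
from aiortc import MediaStreamTrack


class AudioTransformTrack(MediaStreamTrack):
    kind = "audio"

    def __init__(self, track):
        super().__init__()
        self.track = track

    async def recv(self):
        frame = await self.track.recv()
        # Flatten the frame to a 1-D float32 array (1920 values per frame in my case).
        # Note: if the frame is interleaved stereo, the channels should be
        # separated or downmixed before resampling.
        data_np = frame.to_ndarray().astype("float32").reshape(-1)
        # Resample from 48 kHz to 16 kHz; recent librosa versions require keyword arguments.
        y_16k = librosa.resample(data_np, orig_sr=48000, target_sr=16000)
        # Clip to the int16 range before casting so resampling overshoot cannot wrap around.
        audio_data = np.clip(y_16k, -32768, 32767).astype("int16").tobytes()
        # audio_data (16-bit PCM at 16 kHz) is what I want to send to Speech-to-Text.
        return frame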
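
One possible way to get `audio_data` out of `recv()` and into a streaming recognizer is to push each chunk onto a thread-safe queue and expose that queue as a blocking generator, since streaming clients usually consume an iterator of requests. The sketch below uses only the standard library. `AudioChunkBuffer` and its method names are hypothetical, and the actual call to the Speech-to-Text client is deliberately left out.

```python
import queue


class AudioChunkBuffer:
    """Hypothetical helper: collects PCM chunks and yields them as a blocking generator."""

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, chunk: bytes) -> None:
        # Called from recv() with each block of 16-bit PCM bytes.
        self._queue.put(chunk)

    def close(self) -> None:
        # A None sentinel tells the generator that the stream has ended.
        self._queue.put(None)

    def chunks(self):
        while True:
            # Block until at least one chunk is available.
            chunk = self._queue.get()
            if chunk is None:
                return
            parts = [chunk]
            # Drain anything else already queued so fewer, larger chunks are sent.
            while True:
                try:
                    extra = self._queue.get_nowait()
                except queue.Empty:
                    break
                if extra is None:
                    yield b"".join(parts)
                    return
                parts.append(extra)
            yield b"".join(parts)
```

In this arrangement `recv()` would call `buffer.put(audio_data)`, and a separate worker thread would iterate over `buffer.chunks()`, wrapping each chunk in whatever request object the speech client expects. Running the consumer in its own thread keeps the blocking `get()` call out of the asyncio event loop that aiortc uses.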


1 Answer

Waiting for an expert to reply.
