
I'm trying to decode a video with FFmpeg, convert it to an OpenGL texture, and display it inside the Cocos2d-x engine. I've managed to do that and it displays the video as I wanted; the problem now is performance. I get a sprite update every frame (the game is fixed at 60 fps, the video is 30 fps), so what I did first was decode and convert frames interchangeably. That didn't work well, so now I have a separate thread that decodes in an infinite while loop with a sleep() just so it doesn't hog the CPU. What I currently have set up is two PBO framebuffers and a bool flag that tells my FFmpeg thread loop to decode another frame, since I don't know how to make it wait until the next frame is actually needed. I've searched online for a solution to this kind of problem but didn't manage to find any answers.

I've looked at this: Decoding video directly into a texture in separate thread, but it didn't solve my problem, since it was just about converting YUV to RGB inside OpenGL shaders, which I haven't done yet and which currently isn't an issue.

Additional info that might be useful: I don't need to end the thread until the application exits, and I'm open to using any video format, including lossless.

OK, so the main decoding loop looks like this:

//.. this is inside of a constructor / init
//adding thread to array in order to save the thread    
global::global_pending_futures.push_back(std::async(std::launch::async, [=] {
        while (true) {
            if (isPlaying) {
                this->decodeLoop();
            }
            else {
                std::this_thread::sleep_for(std::chrono::milliseconds(3));
            }
        }
    }));

The reason I use a bool to check whether the frame was used is that the main decoding function takes about 5 ms to finish in debug, and should then wait about 11 ms for the frame to be displayed. I can't know when the frame was displayed, and I also don't know how long decoding took.

Decode function:

void video::decodeLoop() { //this should loop in a separate thread
    frameData* buff = nullptr;
    if (buf1.needsRefill) {
    /// buf1.bufferLock.lock();
        buff = &buf1;
        buf1.needsRefill = false;
        firstBuff = true;
    }
    else if (buf2.needsRefill) {
        ///buf2.bufferLock.lock();
        buff = &buf2;
        buf2.needsRefill = false;
        firstBuff = false;
    }

    if (buff == nullptr) {
        std::this_thread::sleep_for(std::chrono::milliseconds(1));
        return;//error? //wait?
    }

    //pack pixel buffer?

    if (getNextFrame(buff)) {
        getCurrentRBGConvertedFrame(buff);
    }
    else {
        loopedTimes++;
        if (loopedTimes >= repeatTimes) {
            stop();
        }
        else {
            restartVideoPlay(&buf1);//restart both
            restartVideoPlay(&buf2);
            if (getNextFrame(buff)) {
                getCurrentRBGConvertedFrame(buff);
            }
        }
    }
/// buff->bufferLock.unlock();

    return;
}

As you can see, I first check whether the buffer was used via the bool needsRefill, and only then decode another frame.

frameData struct:

    struct frameData {
        frameData() {};
        ~frameData() {};

        AVFrame* frame;
        AVPacket* pkt;
        unsigned char* pdata;
        bool needsRefill = true;
        std::string name = "";

        std::mutex bufferLock;

        ///unsigned int crrFrame
        GLuint pboid = 0;
    };

And this is called every frame:

void video::actualDraw() { //meant for cocos implementation
    if (this->isVisible()) {
        if (this->getOpacity() > 0) {
            if (isPlaying) {
                if (loopedTimes >= repeatTimes) { //ignore -1 because comparing unsgined to signed
                    this->stop();
                }
            }

            if (isPlaying) {
                this->setVisible(true);

                if (!display) { //skip frame
                    ///this->getNextFrame();
                    display = true;
                }
                else if (display) {
                    display = false;
                    auto buff = this->getData();                    
                    width = this->getWidth();
                    height = this->getHeight();
                    if (buff) {
                        if (buff->pdata) {

                            glBindBuffer(GL_PIXEL_UNPACK_BUFFER, buff->pboid);
                            glBufferData(GL_PIXEL_UNPACK_BUFFER, 3 * (width*height), buff->pdata, GL_DYNAMIC_DRAW);


                            glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, 0);///buff->pdata);
                            glBindBuffer(GL_PIXEL_UNPACK_BUFFER, 0);
                        }

                        buff->needsRefill = true;
                    }
                }
            }
            else { this->setVisible(false); }
        }
    }
}

The getData function tells which framebuffer to use:

video::frameData* video::getData() {
    if (firstBuff) {
        if (buf1.needsRefill == false) {
            ///firstBuff = false;
            return &buf1;///.pdata;
        }
    }
    else { //if false
        if (buf2.needsRefill == false) {
            ///firstBuff = true;
            return &buf2;///.pdata;
        }
    }
    return nullptr;
}

I'm not sure what else to include, so I pasted the whole code to Pastebin. video.cpp: https://pastebin.com/cWGT6APn video.h: https://pastebin.com/DswAXwXV

To summarize the problem:

How do I properly implement decoding in a separate thread / how do I optimize the current code?

Currently the video lags whenever some other thread or the main thread gets heavy, and then it does not decode fast enough.


You need:

  • Two buffers, so one can be filled by the decoder while the other is copied to the GPU. Also, use a variable (e.g. bool useFirst) to tell which buffer is used for reading and which one for writing.
  • A worker thread that decodes a frame and fills a buffer. This thread reads useFirst to tell which buffer to fill. It doesn't need to protect the buffer with a mutex.
  • A std::condition_variable that makes the thread wait for a new frame coming from FFmpeg and for the buffer to be writable.
  • A timer that fires every 1/60 seconds and executes a function that transfers data to the GPU (if a buffer is already filled) and then updates useFirst and notifies the condition variable.
  • Perhaps a second thread that reads a frame from FFmpeg but doesn't decode it.

The thread[s] should be detached (instead of joined) so they can live "forever". They must check another flag/condition/notification that tells them to finish and clean up.

You can also use only one buffer, with double (or more) the size, and swap the write/read sections, as explained in Asynchronous Buffer Transfers.

  • I tried using a condition_variable, but it turns out notify_all() just nudges the thread forward, and since I call notify_all() right in the main drawing thread, it ends up hurting performance a lot more than just putting a sleep in the looping thread. I then tried notify_one(), with the same result. I managed to detach() the thread, but it doesn't make a performance difference. I followed the link about asynchronous buffer transfers and went to the said topic, but it doesn't display the actual article? I already use the bool firstBuff to switch between the first and second buffer. – Brigapes Jun 18 at 11:00
