Discord Plays Pokémon

This is a half-baked post I decided to publish because it was clear I wasn’t ever going to finish it. You can find the source code of this project on GitHub.

Idea

Twitch Plays Pokémon, but for Discord
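
As in Twitch Plays Pokémon, the core loop is turning chat messages into emulator button presses. Here is a minimal sketch of that mapping. The button names, the optional repeat count, and the `parseCommand` helper are all assumptions made for illustration, not the project's actual command syntax.

```typescript
// Hypothetical button set for a Game Boy style emulator; the names are assumptions.
const BUTTONS = ["up", "down", "left", "right", "a", "b", "start", "select"] as const;
type Button = (typeof BUTTONS)[number];

interface Command {
  button: Button;
  repeat: number;
}

// Parse a chat message such as "a", "LEFT", or "up 3" into a command.
// Returns undefined for anything that is not a recognised input, so ordinary
// conversation in the channel is ignored.
function parseCommand(message: string, maxRepeat: number = 9): Command | undefined {
  const match = /^([a-z]+)(?:\s+(\d+))?$/.exec(message.trim().toLowerCase());
  if (match === null) return undefined;

  const name = match[1];
  const button = BUTTONS.find((b) => b === name);
  if (button === undefined) return undefined;

  const count = match[2];
  const repeat = count === undefined ? 1 : Number.parseInt(count, 10);
  if (repeat < 1 || repeat > maxRepeat) return undefined;

  return { button, repeat };
}
```

Returning `undefined` rather than throwing keeps the bot quiet when people are simply talking in the channel.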

Demo

Prior Art

Technical goals

  • Low-cost to deploy on AWS.
    • It therefore shouldn’t require many cores or a GPU.
  • Require as few dependencies on the host as possible.
  • 100% automation, no manual action required.

Challenges

  • Discord doesn’t have a way for bots to stream video (there are APIs for audio), so we must use a “selfbot” or a “userbot”. This is against Discord’s terms of service, but it’s the only way forward. There are also no documented APIs for this.

Initial approach - webcam

Use a headful browser (Chrome) running Pokémon with EmulatorJS. Stream the browser’s video to Discord by presenting it as a webcam input, and pipe the audio in as a microphone input.

This would require a host that can install kernel modules, since the approach depends on DKMS.

Result

I was able to get audio working, but not video.

Pivoting - using screen share

I really wanted to get the webcam approach working because it felt relatively elegant, but I knew I could get it done quicker if I compromised. I used a real desktop environment with xvfb and no webcam. I used Discord’s screen share functionality instead.

After about two hours of effort, this yielded a working, albeit slow, proof of concept. xvfb is not hardware accelerated, so everything happens in software. This is a problem when you are encoding video. The application worked, but it was unacceptably slow. Even the audio was significantly delayed. I tried reducing the resolution from 1280x720 to 640x576, but it was still too slow.
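
To put the resolution change in numbers, here is a rough sketch of how much raw pixel data each frame size represents. The 30 frames per second and 4 bytes per pixel used below are assumptions for illustration; the real capture settings aren't recorded here.

```typescript
interface Resolution {
  width: number;
  height: number;
}

// Number of pixels in one frame.
function pixelsPerFrame(r: Resolution): number {
  return r.width * r.height;
}

// Uncompressed bytes per second that a software pipeline has to touch.
// fps and bytesPerPixel are assumed values, not measurements.
function rawBytesPerSecond(r: Resolution, fps: number, bytesPerPixel: number = 4): number {
  return pixelsPerFrame(r) * bytesPerPixel * fps;
}

const original: Resolution = { width: 1280, height: 720 };
const reduced: Resolution = { width: 640, height: 576 };

console.log(pixelsPerFrame(original)); // 921600
console.log(pixelsPerFrame(reduced)); // 368640
console.log(pixelsPerFrame(reduced) / pixelsPerFrame(original)); // 0.4
console.log(rawBytesPerSecond(reduced, 30)); // 44236800
```

Under those assumptions, the smaller frame still carries 40% of the original pixels, or roughly 44 MB of raw data per second, all of which has to be rendered and encoded on the CPU.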

Make it fast - hardware acceleration

Next, I needed to make the application fast.

Attempting GPU acceleration with Nvidia and Chrome

      "--disable-software-rasterizer",
      "--disable-frame-rate-limit",
      "--disable-gpu-driver-bug-workarounds",
      "--disable-gpu-driver-workarounds",
      "--disable-gpu-vsync",
      "--enable-accelerated-2d-canvas",
      "--enable-accelerated-video-decode",
      "--enable-accelerated-mjpeg-decode",
      "--enable-unsafe-webgpu",
      "--enable-features=Vulkan,UseSkiaRenderer,VaapiVideoEncoder,VaapiVideoDecoder,CanvasOopRasterization",
      "--disable-features=UseOzonePlatform,UseChromeOSDirectVideoDecoder",
      "--enable-gpu-compositing",
      "--enable-native-gpu-memory-buffers",
      "--enable-gpu-rasterization",
      "--enable-oop-rasterization",
      "--enable-raw-draw",
      "--enable-zero-copy",
      "--ignore-gpu-blocklist",
      "--use-gl=desktop"
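
A list like this gets unwieldy quickly, especially the comma-separated feature lists. Below is a small sketch of one way to assemble the arguments from structured options. The `GpuFlagOptions` shape and the `buildChromeArgs` helper are hypothetical and only show the idea; they are not code from the project.

```typescript
// Hypothetical option shape; only the flag strings themselves come from the list above.
interface GpuFlagOptions {
  switches: string[]; // bare switch names, without the leading "--"
  enableFeatures: string[];
  disableFeatures: string[];
}

// Build a Chrome argument array, joining feature names into single
// --enable-features / --disable-features flags.
function buildChromeArgs(options: GpuFlagOptions): string[] {
  const args = options.switches.map((name) => `--${name}`);
  if (options.enableFeatures.length > 0) {
    args.push(`--enable-features=${options.enableFeatures.join(",")}`);
  }
  if (options.disableFeatures.length > 0) {
    args.push(`--disable-features=${options.disableFeatures.join(",")}`);
  }
  return args;
}

const gpuArgs = buildChromeArgs({
  switches: ["ignore-gpu-blocklist", "enable-gpu-rasterization", "enable-zero-copy"],
  enableFeatures: ["Vulkan", "VaapiVideoEncoder", "VaapiVideoDecoder"],
  disableFeatures: ["UseOzonePlatform"],
});

// The resulting array can be handed to whatever launches the browser,
// for example the `args` option of Puppeteer's `launch`.
console.log(gpuArgs.join(" "));
```

Keeping the feature names in arrays makes it easier to toggle one at a time while testing which flags actually affect acceleration.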

Selkies: https://github.com/shepherdjerred/discord-plays-pokemon/commit/f510a9f000ea2f143a1a83ec1590d8f84f4a9db5

Chrome didn’t support WebRTC hardware acceleration on Linux, but Firefox did.

Switch to Selenium due to Puppeteer limitations.

Polish

  • Disable screensaver
  • Auto-saving
  • Auto-loading game
  • Auto-loading most recent save
  • Turning bot off during inactivity
  • Web interface
  • Automatically setting Discord preferences for audio/video
  • Fully automating the process
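
Of the items above, turning the bot off during inactivity is the easiest to illustrate. Below is a minimal sketch that assumes a hypothetical `IdleTracker` class and a five-minute timeout; neither is taken from the project. The clock is injectable so the logic can be exercised without waiting.

```typescript
// Hypothetical helper: tracks the time of the last chat command and reports
// when the configured timeout has elapsed without any activity.
class IdleTracker {
  private readonly timeoutMs: number;
  private readonly now: () => number;
  private lastActivityMs: number;

  constructor(timeoutMs: number, now: () => number = () => Date.now()) {
    this.timeoutMs = timeoutMs;
    this.now = now;
    this.lastActivityMs = now();
  }

  // Call whenever a valid command arrives.
  recordActivity(): void {
    this.lastActivityMs = this.now();
  }

  // True once at least timeoutMs has passed since the last activity.
  isIdle(): boolean {
    return this.now() - this.lastActivityMs >= this.timeoutMs;
  }
}

// Usage with a fake clock (times in milliseconds).
let fakeNow = 0;
const tracker = new IdleTracker(5 * 60_000, () => fakeNow);

fakeNow = 4 * 60_000;
console.log(tracker.isIdle()); // false
tracker.recordActivity();

fakeNow = 9 * 60_000;
console.log(tracker.isIdle()); // true
```

A periodic check of `isIdle()` could then stop the stream and save the game, and the next command would start things up again.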

Resources
