BindSight

From Hack Manhattan Wiki

Github Repository | A Bricodash Sub-Project | A Beadsland Creation

Concurrent, extensible, frame-scrubbing webcam gateway.

Web API service to stream doorcam and spacecam to Bricodash and public gateway, respectively, while tracking activity and performance of these and other webcams at the space. Will be more efficient and reliable than spawning PHP and Python processes on an as-they-come basis.

Written in Elixir, BindSight will take advantage of newer features of the language and its ecosystem while building on the strengths of Erlang/OTP. These include Mint (HTTP client), GenStage (back-pressured event pipelines) and, ultimately, mix release (build-time deployment packaging).

Background

Presently, webcam-related features of Bricodash are provided on a two-prong basis.

Spacecam

The public-facing Camera showing current activity in the space is presently relayed via a PHP script. This script is configured to relay various HM cameras, including those for the exterior door, the CR-10 and (when online) the hydroponics lab. In the case of the spacecam, all relayed requests are tracked to feed the sous veil Creepy Eyeball: snapshot requests touch a file identifying the requesting pid; stream requests touch two files, one on initiation of the relay and one at process termination. The Creepy Eyeball feature uses the modification times on all such files to transition the opacity of the animated eyeball gif that indicates observers are viewing the camera feed.
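As an illustration of the touch-file mechanism described above, the following is a minimal sketch of how an opacity value might be derived from file modification times. It is not the Bricodash implementation: the function name, the linear fade and the `fade_seconds` value are assumptions, and the sketch ignores the distinction between snapshot files and the start/end pair written for streams.

```python
import os
import tempfile
import time


def eyeball_opacity(touch_dir, now=None, fade_seconds=60.0):
    """Map the newest touch-file mtime in touch_dir to an opacity in [0.0, 1.0].

    fade_seconds is a hypothetical tuning value, not taken from Bricodash.
    """
    now = time.time() if now is None else now
    mtimes = [e.stat().st_mtime for e in os.scandir(touch_dir) if e.is_file()]
    if not mtimes:
        return 0.0  # nobody has looked at the camera
    age = max(0.0, now - max(mtimes))
    # Linear fade: fully opaque at age 0, transparent after fade_seconds.
    return max(0.0, 1.0 - age / fade_seconds)


# Usage: touch one file, pin its mtime, and sample the fade at two moments.
with tempfile.TemporaryDirectory() as touch_dir:
    marker = os.path.join(touch_dir, "snapshot.1234")  # hypothetical file name
    open(marker, "w").close()
    os.utime(marker, (1000.0, 1000.0))
    half_faded = eyeball_opacity(touch_dir, now=1030.0)
    fully_faded = eyeball_opacity(touch_dir, now=2000.0)
```

Passing `now` explicitly keeps the function deterministic, which makes the fade curve easy to check without waiting on the clock.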

The Apache server hosting the Bricodash back-end imposes a cap on the number of executing PHP processes, returning a 504 Gateway Timeout error when this cap is exceeded. Under normal usage, this isn't an issue, but PHP scripts are meant to assemble dynamic content and deliver it, not to provide a data stream as a continuously running process of indeterminate duration. The current setup is thus not ideal.

A legacy gateway for the same Camera remains on Bo.x0.rs. When users of our Slack type "Who's at the space?", the snapshot that our resident slackbot posts in response is drawn from the Bo.x0.rs gateway. Bo.x0.rs also tracks camera access, exposing it via the sousveillance Chrome notification mechanism. However, the sous veil Creepy Eyeball and the sousveillance Chrome notifications system do not, at present, interact or share information.

Doorcam

Bricodash displays a feed of the exterior door camera. Rather than the PHP script, the doorcam display is drawn from a Python3 CGI script.

Our current setup for the door camera feed is prone to stalling and data corruption. The CGI script inspects individual frames, discarding those that are not valid JPEG or that, while valid, have corruption that renders as a grey field on the lower half of the image. A Slack webhook regime has also been explored for tracking stalling of the feed. However, even after dropping corrupt frames between the webcam and the server, additional frame corruption has appeared between server and client. Thus client-side Javascript has also been implemented to drop bad frames and trigger webhooks on stalls. Performance of the feed was such that the webhooks proved too noisy, and so they are currently disabled.
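The two kinds of check described above can be sketched roughly as follows. This is not the CGI script's code. The first function only tests for the JPEG start-of-image and end-of-image markers, which catches truncation but not every malformed file. The second assumes the frame has already been decoded to rows of 0-255 grayscale values by some image library, and the uniformity test with its `tolerance` is a hypothetical heuristic for the "grey lower half" symptom.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def looks_like_complete_jpeg(frame: bytes) -> bool:
    """Cheap structural check: frame starts with SOI and ends with EOI."""
    return len(frame) >= 4 and frame.startswith(SOI) and frame.endswith(EOI)


def lower_half_is_flat(rows, tolerance=2):
    """Hypothetical grey-field heuristic on decoded grayscale rows.

    Returns True when every pixel in the lower half of the image falls
    within `tolerance` of every other, i.e. the region is one flat shade.
    """
    lower = rows[len(rows) // 2:]
    values = [pixel for row in lower for pixel in row]
    if not values:
        return False
    return max(values) - min(values) <= tolerance


good = SOI + b"entropy-coded data" + EOI
truncated = good[:-5]
print(looks_like_complete_jpeg(good), looks_like_complete_jpeg(truncated))
```

Because the marker check needs no decoding, it can run on every frame, while the pixel inspection is the expensive step worth moving off the main path.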

Due to hardware limitations of both Bricodash clients, the standard webcam MJPEG push stream proved too much of a firehose to accept without risk of the client lagging or freezing entirely. For this reason, the camera feed on both clients is implemented not as a stream but as a throttled frames-per-second snapshot flipbook. Initially, this was managed by the CGI script. However, with the introduction of Javascript-invoked Slack webhooks to detect stalls and corruption client-side, the feed is now an entirely pull-based snapshot flipbook.

These flipbooks are executed as up to 10 snapshot requests per second from each client, all of which are relayed directly to the webcam. It is possible that this has led to greater instability on the external door camera, which could be resolved by extracting frames from a single stream rather than pounding the camera with nearly two dozen requests per second.
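The alternative suggested above, serving every snapshot request from one upstream stream, might look something like the sketch below. The `LatestFrame` class is hypothetical and deliberately small: a single reader publishes each frame it extracts from the camera stream, and any number of snapshot requests read the most recent one without touching the camera.

```python
import threading


class LatestFrame:
    """Hypothetical shared slot holding the newest frame from one camera."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None
        self._seq = 0  # increments per published frame, so readers can spot repeats

    def publish(self, frame: bytes) -> None:
        """Called by the single stream reader for each extracted frame."""
        with self._lock:
            self._frame = frame
            self._seq += 1

    def snapshot(self):
        """Called per client request; never contacts the camera."""
        with self._lock:
            return self._seq, self._frame


doorcam = LatestFrame()
doorcam.publish(b"frame-1")
doorcam.publish(b"frame-2")
# Twenty flipbook requests in a second all resolve locally.
answers = [doorcam.snapshot() for _ in range(20)]
print(answers[0])
```

The sequence number lets a flipbook client, or a monitor, tell a fresh frame from a repeat of the last one, which is one way a stalled camera could be made visible.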

The CGI script also tracks delays on flipbook snapshot requests from the Bricodash clients, reading such delays as evidence of a stalled client. This triggers a restart of the Bricodash client on the Chromecast, and would trigger a webhook alert for the pishop client (if webhooks were not currently disabled). This trigger does not distinguish between delays in flipbook snapshot requests due to stalls of the client and delays due to timeouts on the client while waiting for a stalled doorcam feed. As a result, stalls on the webcam can presently result in a restart of a Chromecast that is not, itself, stalled.
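One way to separate the two causes of delay is to track the camera's own liveness alongside the client's. The sketch below is only an illustration of that idea: the function name, the return labels and the `stall_after` threshold are all assumptions, not part of the current CGI script.

```python
def classify_delay(now, last_client_request, last_camera_frame, stall_after=5.0):
    """Decide which side of the relay has gone quiet.

    All arguments are timestamps in seconds; stall_after is a hypothetical
    threshold. The camera is checked first, because a stalled camera makes
    clients time out and so look stalled as well.
    """
    if now - last_camera_frame > stall_after:
        return "camera_stalled"
    if now - last_client_request > stall_after:
        return "client_stalled"
    return "healthy"


# Camera silent for 30 s, client silent for 8 s: blame the camera, not the client.
print(classify_delay(now=100.0, last_client_request=92.0, last_camera_frame=70.0))
```

Under this scheme only a "client_stalled" result would justify restarting the Chromecast client.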

Functional Specification

BindSight will replace both the PHP and Python gateways, and will be designed to allow for integration between the sous veil Creepy Eyeball observer-tracking system and the Bo.x0.rs sousveillance view-tracking and Chrome notifications system.

Features will include:

  • a GenStage pipeline to broadcast from a single stream from each camera source
  • concurrent frame corruption checking to minimize latency
  • high performance under stream and flipbook stress testing
  • plug-in architecture for performance monitoring and notifications
  • running as an independent system service (daemon)
  • configurable support for using domain-specific SSL cert
  • distributed plug-in for integration with Bo.x0.rs

BindSight will leverage Elixir's GenStage behaviour to convert a single MJPEG stream into a series of frames that will then be served as any number of snapshots or reconstituted MJPEG streams on demand. This will be implemented as two half-pipelines, or spigots. For each webcam, a slurpspigot will consume an MJPEG stream, release it as individual frames, initiate frame corruption checking, track camera performance, and broadcast to sinks. For each client request, a spewspigot will subscribe to the appropriate slurpspigot, scrub out corrupt frames, watermark the last available image in the event of a camera timeout, and serve the resulting snapshot or stream via HTTPS.
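The first job of a slurpspigot, turning a byte stream into discrete frames, can be sketched outside of GenStage as a plain generator. The version below is an assumption-laden simplification rather than BindSight's parser: it scans for JPEG start and end markers and ignores the multipart boundaries and headers an MJPEG-over-HTTP stream carries, and it would be confused by frames that embed a thumbnail with its own markers.

```python
SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_frames(chunks):
    """Yield complete JPEG frames from an iterable of arbitrary byte chunks."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        while True:
            start = buffer.find(SOI)
            if start < 0:
                # No frame start yet: keep only the last byte, in case the
                # two-byte SOI marker is split across chunks.
                del buffer[:max(0, len(buffer) - 1)]
                break
            end = buffer.find(EOI, start + 2)
            if end < 0:
                # Frame started but not finished: drop leading junk and wait.
                del buffer[:start]
                break
            yield bytes(buffer[start:end + 2])
            del buffer[:end + 2]


frame_a = SOI + b"aaa" + EOI
frame_b = SOI + b"bb" + EOI
stream = b"--boundary\r\n" + frame_a + b"\r\n--boundary\r\n" + frame_b
# Feed the stream in awkward 5-byte pieces to exercise the buffering.
pieces = [stream[i:i + 5] for i in range(0, len(stream), 5)]
frames = list(split_frames(pieces))
print(len(frames))
```

A generator pulls data only when asked, which loosely mirrors the demand-driven flow GenStage provides, though without its process isolation or supervision.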

Custom plug-ins (including several local plug-ins) will be installable per camera (slurpspigot) or per client/request (spewspigot). These will operate out-of-band as GenStage processes running concurrently with the throughput of the main pipeline. Likewise, frame corruption checks will run asynchronously, allowing frames to be passed down the pipeline independent of JPEG decompression and pixel inspection. Concurrency is the core strength of the Erlang virtual machine (BEAM), which is designed to obtain exceptional performance by spinning up large numbers of lightweight processes.
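The idea of letting frames travel ahead of their corruption verdicts can be illustrated with futures. This sketch uses a Python thread pool as a stand-in, which is an assumption for illustration only: operating system threads are far heavier than BEAM processes, and the `check_frame` function here is a placeholder marker test, not real pixel inspection.

```python
from concurrent.futures import ThreadPoolExecutor

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def check_frame(frame: bytes) -> str:
    """Placeholder verdict; a real check would decode and inspect pixels."""
    return "ok" if frame.startswith(SOI) and frame.endswith(EOI) else "corrupt"


def tag_frames(frames, pool):
    """Camera side: pass each frame on at once, paired with a pending verdict."""
    for frame in frames:
        yield frame, pool.submit(check_frame, frame)


def scrub(tagged):
    """Client side: wait for a verdict only at the moment of serving."""
    for frame, verdict in tagged:
        if verdict.result() == "ok":
            yield frame


good = SOI + b"image" + EOI
bad = SOI + b"imag"  # truncated frame
with ThreadPoolExecutor(max_workers=2) as pool:
    served = list(scrub(tag_frames([good, bad, good], pool)))
print(len(served))
```

The frame and its pending verdict move together, so the checking cost is only paid by whoever actually needs the answer.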

Erlang, and thus Elixir, is also designed for distributed systems. We will take advantage of this for integration with Bo.x0.rs, relying on Erlang's native node-discovery to manage message passing between sousveillance and sous veil, despite these systems running on different hardware.

Current Project Plan

Fault Tolerance

  • refactor spigot as behaviour
  • insert exit monitor into children
  • exit monitor crashes on exit received; supervisor strategy of rest_for_one
  • webapi exit monitor reconnects to new spew and keeps running

Refactoring

  • no reason to split on doubledash
  • reconfigure profile to run when profile_seconds > 0
  • refactor adhere/hold to own Adhere genstage
  • rebalance chunk load to other genstages
  • digest: refactor handle_events/3 case statement to function clauses
  • digest: drop EOL determination
  • digest: refactor further if possible

Error Recovery

  • deferred spigot spinup in the event the camera is offline at start_link
  • fix EOL check to verify in case of dropped bytes
  • test logging on down camera
  • don't nag when connection down (periodic warn)
  • ensure recovery when source stream fails

Validation

  • review whether batch requires task
  • consumer-producer in spew to filter out :corrupt/:greytoss/fail messages
  • retool Validate stage to send tuple {:ok, binary} or {_status, binary}
  • retool Digest stage to send tuple {:ok, binary} or {:fail, error}
  • retool polling functions to send tuple {:fail, error} when such occur
  • consumer-producer in slurp to introduce :corrupt/:greytoss on :test stream
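The tagged-tuple convention in the items above can be sketched as a simple filter. This is a Python stand-in for what would be a GenStage consumer-producer: the status strings mirror the atoms named in the plan, while the function name and the drop counter are hypothetical additions for illustration.

```python
from collections import Counter


def scrub_events(events):
    """Split (status, payload) events into passed frames and drop counts.

    Only "ok" events pass; every other status ("corrupt", "greytoss",
    "fail", ...) is counted and discarded.
    """
    passed = []
    dropped = Counter()
    for status, payload in events:
        if status == "ok":
            passed.append(payload)
        else:
            dropped[status] += 1
    return passed, dropped


events = [
    ("ok", b"frame-1"),
    ("corrupt", b"frame-2"),
    ("greytoss", b"frame-3"),
    ("ok", b"frame-4"),
    ("fail", "timeout"),
]
frames, drops = scrub_events(events)
print(frames, dict(drops))
```

Keeping a count per status gives the monitoring plug-ins something to report without re-inspecting the frames.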

Deep Validation

  • implement greytoss checking
  • replace async batching with task/agent batching, for JIT evaluation
  • watermark with source timestamp
  • configurable agpl watermark
  • provide for timeout watermarking

Advanced

  • review supervision tree for batch tasks & agents
  • review memory usage to mitigate ProcBin leaks
  • integrate certificate used by Apache
  • configure to launch as daemon
  • bootstrap to obtain dependencies and compile cold

Robustness

  • type guards on get_env
  • @impl on all OTP callbacks
  • type guards all public functions
  • type checks on unstructured opts
  • typespecs and dialyze throughout

Subsystem Integration

  • slurp snoop to track performance on each camera
  • slurp snoop to webhook on corruption/timeouts
  • spew snoop to swap out CGI for upt/chk touch points
  • spew snoop to track client fps performance

Sousveillance Integration

  • spew snoop to swap out PHP for sous veil touch points
  • distributed sous veil client for cross-platform data exchange
  • integration with sousveillance watch-back system

Wrap-up

  • add robots.txt route
  • configure public ipcam as :test
  • flesh out documentation
  • investigate periodic multi-hit events: server severing connections prematurely? badly behaved client device?
  • investigate specific issue of pishop launch causing multi-hit failures on all other clients
  • stress test
  • explore sobelow security checking