Online theatre has had an interesting journey over the past few months. Intent on making their work available to remote audiences, theatre companies have been streaming their current or archived shows, commissioning monologues, or conducting play-readings over videoconferencing services. At Mesmer, we’ve long been involved with immersive and interactive theatre. Companies like Secret Cinema and Dot Dot Dot have given us the opportunity to develop ways of working that allow the audience to interact with the performers and choose their own path around the show. We started thinking: how could we use the technology we already have to create remote, online immersive events?
Party Skills for the End of the World was a 2017 show produced by Manchester International Festival and Shoreditch Town Hall, and created by Nigel Barett and Louise Mari of Shunt. Audiences could freely roam a warehouse of surprises: a huge walkaround party where, in various rooms, they were taught important skills designed to help them with any social situation they might find themselves in at the end of the world. These varied from martial arts workshops to pigeon training 101!
MIF approached us to see if we could help develop a way of moving this huge and immersive show online. Working with the rest of the production team, we came up with a few parameters:
- The show needed to be interactive. Audiences should be able to communicate back to performers at various points.
- We needed the ability to separate the audience into groups, send them off to be taught skills, and then reconvene them again.
- The audience had to be aware of each other, even while performances were happening. They ought to feel like they were all in the same space.
- The result should be on Zoom. There are plenty of other ways of doing conferencing or streaming, but by this point Zoom had become the standard for both audiences and our performers.
Straight off the bat, Zoom already includes a lot of these required features — and for better or worse most audience members have become intimately familiar with its interface over the last few months. In particular it offers the following features that we wanted to exploit:
- Capable of up to 1000 ACTIVE participants
- Works across multiple devices (so less technical support needed at our end)
- Fairly reliable track record
- Spotlighting and Pinning – more about this later
- Security features to control ticketing and moderation
As we set about testing all this and how we could use these features to our advantage, we soon noticed two features almost did what we needed but not quite.
Although we wanted the show to be immersive and interactive, we needed at some points to control what the audience was experiencing. We didn’t want them to just look at whatever Zoom thought was the most important person or, as we found is the case, the loudest person. Spotlighting is Zoom’s way of ‘cutting’ between camera feeds to present you with who it thinks is the most important person. Usually it does this automatically based on who is speaking — but there is also a feature where any ‘host’ (Zoom’s term for the admin of a call) can choose particular feeds to spotlight.
A few lockdown theatre shows have used this quite well; a stage manager or camera director uses the Spotlight feature to cut between performers’ feeds in time with the script, possibly including audience feeds in immersive sections, or screen-sharing their own feed for visuals or graphics. But this leaves you reliant on Zoom to get the timing right, and trying to find the right feed in a list of 1000 people can be quite hard.
Another potential stumbling block is Gallery view. Any user of Zoom can switch away from the default Speaker view to Gallery view and see everyone on the call: 49 camera feeds per page. In theory this is GREAT for building the sense of community — with everyone watching and reacting together. But it makes it impossible to direct the attention of the audience or to spotlight a performer, which risks making the show incomprehensible. If we wanted to avoid people switching over to Gallery view, we needed to recreate its advantages in Speaker view.
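To get a feel for the scale involved, the paging arithmetic can be sketched in a few lines of Python. The 49-feeds-per-page figure comes from the text above; the function names and the idea of a fixed 1-based position in the gallery are assumptions made purely for illustration, since Zoom decides the actual ordering.

```python
import math

FEEDS_PER_PAGE = 49  # Gallery view page size quoted in the text


def gallery_pages(participants: int, per_page: int = FEEDS_PER_PAGE) -> int:
    """Number of Gallery view pages needed to show every participant."""
    return math.ceil(participants / per_page)


def page_of(position: int, per_page: int = FEEDS_PER_PAGE) -> int:
    """1-based page holding the participant at a (hypothetical) 1-based position."""
    return (position - 1) // per_page + 1


print(gallery_pages(1000))  # 21 pages for a full 1000-person call
print(gallery_pages(200))   # 5 pages for an audience of around 200
print(page_of(50))          # 2: the 50th feed spills onto the second page
```

Even a modest audience spreads across several pages, which is part of why Gallery view on its own cannot hold everyone's attention on one performer.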
We had to rethink these features and come up with a way of creating a main show feed in Speaker view that audience members would be happy with for most of the show. This would have to incorporate the performers and the audience gallery, scaled in such a way that it was clear who the focus was on, but also allowing the audience to see the rest of the crowd and anyone reacting to the instructions of the performers.
This Hero Feed (to borrow an immersive theatre term), would need to be a properly curated, designed and controllable video edit, created in a separate piece of video production software and then fed back into the Zoom call and spotlit by the host. There are a few options for video production software that can do the layouts (we used OBS), but we would need to figure out the best way of getting the individual feeds into it.
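One way to picture such a layout is as a large focus tile with a strip of audience tiles beneath it. The sketch below only computes the rectangles. The canvas size, the 75% focus share, the six-tile strip and every name in it are illustrative assumptions, not the layout used in the show, and it does not talk to OBS at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int


def hero_layout(canvas_w=1920, canvas_h=1080, gallery_tiles=6, focus_share=0.75):
    """Return (focus, tiles): one large focus tile above a strip of gallery tiles.

    All defaults are hypothetical values chosen for illustration.
    """
    focus_h = int(canvas_h * focus_share)
    focus = Rect(0, 0, canvas_w, focus_h)
    strip_h = canvas_h - focus_h
    tile_w = canvas_w // gallery_tiles
    tiles = [Rect(i * tile_w, focus_h, tile_w, strip_h) for i in range(gallery_tiles)]
    return focus, tiles


focus, tiles = hero_layout()
print(focus)
for tile in tiles:
    print(tile)
```

The rectangles could then be copied by hand into whatever compositing tool is in use. Note that the gallery tiles here are not 16:9, so the source feeds would need cropping or letterboxing.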
Zoom Rooms and Pinning
Our first attempt to set up this Hero Feed used Zoom Rooms. This is Zoom’s software solution for a permanent corporate meeting room with pre-set cameras and screens. The meeting room computer can log into a Zoom call and display up to three participants at a time on dedicated physical screens. The participants from the call are Pinned to those screens, so the view never switches away from them when other people speak.
For our purposes, it allows us to have a computer which always has specific participants from a Zoom call displayed on its screens, and it lets us do that independently of what is currently spotlit in the Zoom call itself. We set up two Zoom Rooms on two separate computers in our studio and Pinned six performers to these computers’ outputs. We then used NDI to send those six video feeds over our internal network to our video production software.
So we had six independent video signals taken from the Zoom call, which we could lay out and cut between at will in OBS. It worked, but the audio presented a problem. The sound of our performers was being sent straight from their device to the Zoom call, whereas their video stream was being sent to the Zoom call, then Pinned in the Zoom Room, then across our network to our video production system, composited in OBS with everything else, and finally sent out on our Hero Feed on Zoom. As you can imagine, this introduced some delay into the video and pulled it out of sync with the sound. With no way of isolating the performers’ audio from the rest of the Zoom call, we had no way of syncing them back up again.
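The size of the mismatch is simply the difference between the two paths. The sketch below adds up per-stage delays for each route. The millisecond figures are invented placeholders, not measurements from the show, and the stage names merely paraphrase the chain described above.

```python
# Illustrative per-stage delays in milliseconds. These numbers are
# made-up placeholders, NOT measurements from the production.
VIDEO_PATH_MS = {
    "performer device to Zoom call": 150,
    "Zoom call to Pinned Zoom Room": 150,
    "NDI across the studio network": 20,
    "compositing in OBS": 50,
    "Hero Feed back into the Zoom call": 150,
}
AUDIO_PATH_MS = {
    "performer device to Zoom call": 150,
}


def av_offset_ms(video_path: dict, audio_path: dict) -> int:
    """How far the picture lags the sound, in milliseconds."""
    return sum(video_path.values()) - sum(audio_path.values())


offset = av_offset_ms(VIDEO_PATH_MS, AUDIO_PATH_MS)
print(f"Video lags audio by {offset} ms")
```

If the audio could be isolated, delaying it by this offset would bring the two back in line. With the sound locked inside the Zoom call, there is nothing to apply that delay to.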
In addition, Zoom software tries to be a bit too clever when audio is coming from multiple sources, or when people are speaking simultaneously. It has algorithms that determine who can be heard, which make it difficult to balance anyone, especially against music. All this made it extremely challenging for the performers on Zoom since they would be watching the Hero feed to respond to their fellow performers, who are reacting out of sync with the audio and might be badly mixed. Back to the drawing board.
To get around this we separated the main performers out from the Zoom call. Instead of logging onto Zoom, we used OBS.ninja (excellent new open-source tech developed by Steve Seguin) for the performer feeds. This did two things: it created a video stream by sending video frames directly from each performer to our studio, and it created an in-browser video conference call with all the performers together. We called the in-browser conference call ‘the Stage’, and as well as the performers, it contained the showcaller and the mixed Hero Feed.
With every performer feed coming into our OBS system separately, many of the problems from before were solved. We had control over each performer’s audio level, greater control over the quality of each performer’s video feed, and the showcaller could freely cue the actors without being heard on the Hero Feed.
The Final Setup
The setup used for the show was based around a main Zoom call, which consisted of the audience, the skillers (the people who would teach the skills), and a team of moderators who were checking tickets in the waiting room, moving people into skill rooms and watching out for bad behaviour. The Zoom call was set to spotlight a single Hero Feed, piped in from our OBS system. This was the ‘main room’, in which Nigel and Lou would present the show. At various times during the show, the audience would be separated into one of twelve breakout rooms, each containing a skiller and mod, where they would spend ten minutes learning a skill before being returned to the main room. If they didn’t like their breakout room, they could return to the main room and be given a different one.
- In a separate online space, Nigel, Lou and a couple of other main room performers were logged into the Stage, cued by the showcaller, and could also see and hear the Hero feed whenever needed. They performed to the main room, in time with each other and the audio.
- Two separate computers were logged into the call running Zoom Rooms, each of them Pinning either specific gallery pages or audience members who were responding to the performers. We could pick who was Pinned throughout the show.
- Finally, we had a separate Mac Pro to play back the sound design and pre-recorded video. QLab ran the sound while we used Millumin to play back video clips. Video was sent out of this via NDI and audio via Dante to our OBS system.
- All of this (the performers’ individual feeds, audio from QLab, Pinned audience members or gallery pages, and pre-made graphics) fed into OBS and was used to create our Hero Feed. Once in OBS, we could layer, panel and apply effects to it.
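The breakout rotation in this setup (twelve rooms, with the option to swap rooms) can be planned on paper before anyone touches Zoom. The sketch below is only that kind of planning aid. It does not call any Zoom API, the guest names and function names are hypothetical, and in the show itself the moving was done by moderators.

```python
from itertools import cycle


def assign_rooms(audience, room_count=12):
    """Deal audience members round-robin into numbered breakout rooms."""
    rooms = {n: [] for n in range(1, room_count + 1)}
    for member, room in zip(audience, cycle(rooms)):
        rooms[room].append(member)
    return rooms


def reassign(rooms, member, current_room):
    """Move a member who came back to the main room into the emptiest other room."""
    rooms[current_room].remove(member)
    target = min((r for r in rooms if r != current_room), key=lambda r: len(rooms[r]))
    rooms[target].append(member)
    return target


# Hypothetical audience of 200, roughly the size of each performance.
audience = [f"guest-{i:03d}" for i in range(1, 201)]
rooms = assign_rooms(audience)
print({room: len(members) for room, members in rooms.items()})
print(reassign(rooms, "guest-001", current_room=1))
```

Round-robin dealing keeps room sizes within one person of each other, and sending returners to the emptiest remaining room stops any single skiller from being swamped.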
The whole system was set up in the Mesmer studio, which we laid out like a TV control gallery with two rows, socially distanced 2m apart. It comes equipped with a 10 Gbps internal network (necessary when we’re bouncing video between so many computers via NDI) and a reliable 1 Gbps dedicated internet connection. Altogether, including the control and playback systems, we used six of our powerful animation computers and two iPads (to control the Zoom Rooms), as well as a bunch of other ancillary equipment we had, like sound mixers and control surfaces. Blue-i kindly donated a projector so we could keep track of all the feeds on a large projected output.
On the evening of the 22nd May, we did two performances of the show, each with around 200 attendees. Technically, things went really well. There are a few things that could still be developed, but we proved that this is a functional way of mixing multiple performers, allowing them to interact with each other and with an audience in a manner that isn’t too dissimilar to a normal show, whilst keeping the whole thing accessible in Zoom.
Having said that, there were a few issues that we couldn’t solve inside Zoom: we had occasional problems with audience members being kicked out of the call when being moved to breakout rooms, some people who couldn’t access all of Zoom’s features on their device, and there is a long-standing difficulty with anything that requires both the audience and the performers to be in time with music. Looking ahead, it’s possible that a lot of these issues could be solved by moving online performance onto its own dedicated conferencing system. But with two weeks to plan and execute this project, building something from scratch was unfeasible.
We also had to consult on and develop systems for moderating the various rooms, making sure ticket-holders could be identified and regulated, and making sure we could rehearse the project before letting the audience in. All of this is beyond the scope of this case study, but it was an interesting exercise. Again, a bespoke system would give us more control over all of this, but with the time we had we were able to make Zoom’s existing waiting room and co-host features do everything we needed. We used Discord for private admin communication and showcalling.
There’s still a long way to go before the system will manage something quite as complex and audience-heavy as a full scale live immersive show, but if there was ever a time to develop such a system, it’s now!
Salvador Bettancourt Avila & Ian William Galloway, June 2020
Edited by Emily Bagshaw