Introduction
With recent improvements in the capabilities of cloud infrastructure and quality of wireless networks, mobile cloud computing systems have become an attractive choice for hosting computationally complex services. Motivated by the increasing popularity of online games [1], researchers have focused on design and implementation of cloud-assisted architectures for latency-sensitive online games [2]–[5]. In contrast to traditional platforms such as game consoles and desktop computers where most of the computations are conducted in the client device, in cloud-assisted designs, the majority of the complex tasks including game status updating, graphic rendering, and video encoding are run in cloud servers.
Cloud gaming has numerous advantages for both players and game developers [2], [3] including: a) Cost reduction: Migrating the game status computation and graphic rendering to a remote server enables players to use thin clients. b) Platform independence: A greater variety of games can be played on the same device. c) Piracy prevention: The game source code is stored only in the game server. d) Resource enhancement: Servers have greater processing power and memory as compared to mobile devices.
However, designing a cloud gaming system that provides a high quality of experience (QoE) for players involves some difficult challenges. There are two important factors affecting QoE [6]. First, since game updates are streamed back to the thin client in the form of video frames, the design must maintain user-perceived video quality; neither frozen frames nor frame rate changes are desirable. Second, the interactive and real-time nature of a cloud game, particularly its dependence on the online generation of game updates, makes it more delay sensitive.
An important QoE metric in cloud gaming is the response time, which corresponds to the elapsed time between the generation of an action and the displayed result of that action. The response time reflects the accumulated delay in different components, including network and processing delays. In cloud gaming, the sensitivity of users to response-time delay depends on the type of game; a low-latency game such as a first-person shooter, which is the focus of this work, requires a response time on the order of 100 ms [3], [7].
Considerable work has been done to evaluate and improve the quality of bandwidth-efficient video streaming; see, for example, [8], [9]. To reduce the effect of network and server congestion, most video streaming systems employ large playback buffers and adaptive bitrate techniques. In low-latency gaming, however, a large video buffer is a non-starter. For a 30 fps system, three buffered frames contribute 100 ms of delay. This latency constraint also precludes complex video compression methods [10] that code over multiple correlated frames. When image compression is restricted to individual frames, dynamic adjustment of the frame resolution based on client feedback could still be employed to facilitate smooth display. Practical techniques, however, use playback buffer backlog variations to signal rate changes [8] and thus are unsuitable for low-latency gaming.
Furthermore, network-layer QoS metrics such as packet throughput and delay do not precisely capture the viewer experience [2], [3]. Packet delay may only loosely correlate with the timely delivery of a frame and it generally fails to describe the likelihood of missing and frozen frames. In this work, we present a quantified representation of the user-perceived QoE of the cloud gaming system based on novel client-side measurements. The users observe the games through a fixed frame rate video. A user's QoE is degraded when there are imperfections in the sequence of displayed frames. In this work, we try to tackle QoE evaluation by direct examination of the missing frame process.
As the main performance objective of a cloud gaming system is the “timely” update of players regarding the game status, we follow the concept of age of information introduced in [11] to characterize the system performance. This approach measures system performance in terms of the average freshness of status updates. In the context of a mobile gaming system [3], a game player uses a mobile terminal to submit actions through a network to a game server. The game server responds to these actions and the inputs of the artificial intelligence of the game (aka the game AI) to generate responses in the form of video frames for the player. The inputs/actions of the players and the input of the game AI induce changes in the game state. In principle, the game state evolves continuously at a game server. In practice, the game server simulates the game by incorporating all inputs in order to update the game at a certain tick rate. A player observes the game via a full-motion video at a fixed nominal frame rate. Each video frame displayed at the mobile client represents a sample of the game status as provided to the player.
This work examines low-latency (sub-100 ms) gaming as enabled by edge cloud game servers. In this setting, video playback delays of even a handful of frames constitute an outage. Short-duration frame freezes are also undesirable. Status age measures these frame freezes seen by the player when frames go missing. In this setting, we propose a system model for a low-latency edge-cloud gaming system and a simple analytical approximation. Based on ns-3 simulation of single-player and multi-player scenarios, we show that this analytical model provides an accurate performance characterization. In addition, we show how careful synchronization of the game server and the mobile client display can improve performance by roughly 20 percent.
The system model appears in Section II, followed by the analytic model in Section III. Simulation-based performance comparisons are provided in Section IV. In Section V, we describe related work and we conclude in Section VI.
System Model
As depicted in Figure 1, the gaming system has the following networked components:
A mobile client that submits user actions to the game server.
A game server that combines buffered user actions with game AI actions to generate game status updates.
A frame renderer that translates game server updates to video image frames (i.e., images to be displayed on the screen of the mobile client).
A display buffer at the mobile client that displays buffered frames at a fixed frame rate.
The output of the game server is a sequence of instructions for the frame renderer to construct an image frame. These instructions may contain far fewer bytes than the images that they describe. We refer to these instructions as updates since they update the renderer (and ultimately the player) on the status of the game.
In the most general setting, the mobile client, the game server and the frame renderer are each entities connected by networks. In mobile gaming, the connection from the mobile client to the game server will include a wireless access link. In a conventional gaming system, the frame renderer is integrated in a relatively powerful client. In this case, the game server sends status updates across the wireless link to the mobile client. In our thin-client mobile setting, we consider two scenarios:
The frame renderer may be integrated with the game server so that forwarding updates from the game server to the renderer takes negligible time. In this case, image frames are transmitted over the wireless link to the mobile client.
The frame renderer may be a separate entity in the network. For example, the renderer may be located in an edge cloud server that renders image frames for many mobiles in the same local area [5].
Each of the network entities as well as the network interfaces may degrade the game performance. If the access network is slow, a player's actions will suffer queueing delay in the network. As the game server is likely to be processing game updates for multiple mobiles, possibly in multiple games, there may be queueing of actions at the game engine. Game updates, which may process multiple queued actions in a single frame, can have varying execution times as the task complexity can fluctuate. Randomness in game updates, in combination with randomness in the network can result in queueing of updates at the output of the game server. Randomness in the execution time required to render a frame can also result in both queueing at the input to the renderer and queueing of frames at the output of the frame renderer.
To analyze this complex situation, we propose to decompose the system. The “actions” submitted by the mobile are short instructions, possibly just a few bytes each in length. Furthermore, the offered rate of these actions will be low; human responses at a rate of more than 100 actions per second are unlikely [12]. Thus, the traffic generated by the user actions will be small, on the order of 10 kb/s [2], relative to the data rates sustainable on even a moderate-rate access network. Consequently, in a well-designed system, the delays in delivering these actions to the action queue at the game server should be negligible relative to either the delays in human response or the downlink delay of transmitted frames.
Instead, we focus on the timely rendering of game server updates at the mobile display. Specifically, we assume updates are produced at a fixed frame rate
In terms of the game state at the game server, user actions that arrive for processing in the interval
The mobile client employs a time lag
the game server incorporates user inputs and game AI to produce update $k$;
update $k$ is sent, possibly through a network, to the frame renderer;
the frame renderer generates the video frame $k$;
frame $k$ is sent to the mobile for display at time $T_{k}$.
If frame
The mobile client can optimize the lag
In general, the game server sends game status updates via a network to the frame renderer. It is difficult to analyze this without making strong assumptions regarding both the network and the update delivery protocol. In prior status updating work, the disadvantages of queueing status updates have been noted in a variety of contexts [11], [13]. Specifically, status age suffers if a service facility is processing old updates when newer updates are in the system. That is, the delivery of a newer update offers a larger reduction in status age and also obviates the need for sending the older update. In short, the objective of the system is to display the most recently generated frame by avoiding the processing and queueing of old updates and frames at the expense of newer updates and frames.
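The "freshest update wins" principle described above can be illustrated with a single-slot buffer. This is a minimal Python sketch under our own assumptions, not the protocol of this paper: the class name `FreshestUpdateSlot`, its methods, and the `dropped` counter are hypothetical, and updates are assumed to carry an increasing integer index.

```python
from typing import Optional, Tuple


class FreshestUpdateSlot:
    """Single-slot buffer that keeps only the newest update (hypothetical helper)."""

    def __init__(self) -> None:
        self._index: Optional[int] = None
        self._payload: Optional[bytes] = None
        self.dropped = 0  # number of stale updates discarded so far

    def offer(self, index: int, payload: bytes) -> bool:
        """Store update `index` unless a newer update is already waiting."""
        if self._index is not None:
            if index < self._index:
                self.dropped += 1  # the arriving update is already stale
                return False
            self.dropped += 1      # the waiting update is preempted
        self._index, self._payload = index, payload
        return True

    def take(self) -> Optional[Tuple[int, bytes]]:
        """Remove and return (index, payload), or None if the slot is empty."""
        if self._index is None:
            return None
        item = (self._index, self._payload)
        self._index = self._payload = None
        return item


slot = FreshestUpdateSlot()
slot.offer(3, b"update-3")
slot.offer(4, b"update-4")  # preempts update 3
slot.offer(2, b"update-2")  # stale on arrival, ignored
```

In this sketch the consumer (for example, a renderer) only ever sees update 4; updates 2 and 3 are never processed, which is the behavior the status-age argument favors.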
To facilitate this goal, we adopt a form of preemptive stop-and-wait protocol for the forwarding of updates from the game server to the renderer. We describe this as an application-layer protocol, although its logic could be implemented at the transport layer. Specifically, the game server initiates the creation of update
The logic of this preemptive service is assumed to be employed throughout the system. At the input to the frame renderer, update
The Update Age: An Analytic Model
To build a tractable analytic model, we assume that the time required for processing and sending update
At time $T_{k}$, the mobile displays its most recently buffered frame.
If $Y_{k}\leq\tau$, then frame $k$ will be displayed at time $T_{k}$.
If $\tau < Y_{k}\leq T+\tau$, then frame $k$ will be received by the mobile at time $kT+Y_{k}$. It will be displayed at time $T_{k+1}$ unless it has been preempted by frame $k+1$.
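The display rule above can be sketched in a few lines of Python. The function name `display_slot` is hypothetical. We assume display instants $T_{m}=mT+\tau$, which is inferred from the integration limits in (1), and we assume that frame $k+1$ preempts frame $k$ exactly when it is itself on time ($Y_{k+1}\leq\tau$). Treating a frame with $Y_{k} > T+\tau$ as missed is also an assumption of this sketch.

```python
from typing import Optional


def display_slot(k: int, y_k: float, y_next: float, T: float, tau: float) -> Optional[int]:
    """Index m of the display instant T_m = m*T + tau at which frame k is first shown.

    Returns None if frame k is never shown under the assumptions of this sketch.
    y_k and y_next are the delivery times Y_k and Y_{k+1}.
    """
    if y_k <= tau:
        return k  # delivered before T_k
    if y_k <= T + tau:
        # Delivered before T_{k+1}; shown there unless frame k+1 is on time.
        return None if y_next <= tau else k + 1
    return None  # assumption: later arrivals are treated as missed


# Hypothetical example values: 30 fps and a 20 ms lag.
T, tau = 1 / 30, 0.020
on_time = display_slot(0, 0.010, 0.050, T, tau)    # shown at T_0
one_late = display_slot(0, 0.030, 0.050, T, tau)   # shown at T_1
preempted = display_slot(0, 0.030, 0.010, T, tau)  # frame 1 arrives on time
```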
The idea behind this model is that until time
This model is idealized in certain ways. First, if update/frame
Even with this simplified system model, basic tradeoffs between system configurations are not well understood. Where should the frame renderer be located? What frame rate optimizes the user experience? How should the lag
Figure 2. Example change in status update age.
A. Notation
For random variable
B. Status Update Age Analysis
Frame
Thus, the age process
The time-average age of the status updates is the area under the age graph in Figure 2 normalized by the time interval of observation. For simplicity of exposition, the observation interval is chosen to be $(\tau, KT+\tau)$, so that the time-average age is \begin{equation*}
\Delta^{(K)}=\frac{1}{KT}\int_{\tau}^{KT+\tau}\Delta (t)dt.
\tag{1}
\end{equation*}
We decompose the area defined by the integral (1) into a sum of disjoint polygonal areas $Q_{k}$, $k=0,\ldots,K-1$, so that \begin{equation*}
\Delta^{(K)}=\frac{1}{KT}\sum_{k=0}^{K-1}Q_{k}.
\tag{2}
\end{equation*}
The area \begin{equation*}
Q_{k}=T^{2}/2+\tau T+X_{k}T^{2}.
\tag{3}
\end{equation*}
It follows from (2) that \begin{equation*}
\Delta^{(K)}=T\left[\frac{1}{2}+\frac{\tau}{T}+\frac{1}{K}\sum_{k=0}^{K-1}X_{k}\right].
\tag{4}
\end{equation*}
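As a numerical illustration of (2) through (4), the following Python sketch computes the time-average age both by summing the areas $Q_{k}$ from (3) and directly from (4). The function names and the example sequence of $X_{k}$ values are hypothetical, chosen only to show that the two expressions agree.

```python
def polygon_area(x_k: int, T: float, tau: float) -> float:
    """Area Q_k from (3): T^2/2 + tau*T + X_k*T^2."""
    return T * T / 2 + tau * T + x_k * T * T


def time_average_age(x: list, T: float, tau: float) -> float:
    """Time-average age from (4) for a finite sequence X_0, ..., X_{K-1}."""
    K = len(x)
    return T * (0.5 + tau / T + sum(x) / K)


# Hypothetical example: 30 fps (T = 1/30 s), lag tau = 10 ms, K = 6 frames.
T, tau = 1 / 30, 0.010
x = [0, 0, 1, 0, 2, 0]

# Equation (2): sum of the areas, normalized by the observation interval K*T.
age_from_areas = sum(polygon_area(xk, T, tau) for xk in x) / (len(x) * T)
age_from_eq4 = time_average_age(x, T, tau)
print(f"{age_from_eq4 * 1e3:.2f} ms")  # prints "43.33 ms"
```

With these values the bracket in (4) equals $0.5+0.3+0.5=1.3$, so the average age is $1.3\,T\approx 43.3$ ms.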
The time-average age is
We first observe that \begin{equation*}
X_{k}= \min \left\{i\geq 0 \left\vert \begin{array}{l}Y_{k-i} \leq iT+\tau, \\ Y_{k-i}\leq(B-1)T+\tau. \end{array}\ \right.\right\}
\tag{5}
\end{equation*}
We note in (5) that the first condition ensures that frame
C. Missing Updates Markov Chain Analysis
Given that buffering outdated updates is generally a bad idea, we now present a Markov chain analysis of \begin{equation*}
X_{k}= \min\{i\geq 0\vert Y_{k-i}\leq\min(1,\ i)T+\tau\}.
\tag{6}
\end{equation*}
This implies \begin{equation*}
\mathrm{P}[X_{k}=0]=\mathrm{P}[Y_{k}\leq\tau]
\tag{7}
\end{equation*}
\begin{align*}
\mathrm{P}[X_{k}=j]=\mathrm{P}[Y_{k-j} \leq T+\tau,\ Y_{k-j+1} > T+\tau,& \\
\ldots,\ Y_{k-1} > T+\tau,\ Y_{k} > \tau].& \tag{8}
\end{align*}
With the assumption that the \begin{equation*}
P_{X_{k}}(j)=\begin{cases}
F_{Y}(\tau) &j=0,\\
F_{Y}(T+\tau)\overline{F}_{Y}(T+\tau)^{j-1}\overline{F}_{Y}(\tau) &j\geq 1. \end{cases}\tag{9}
\end{equation*}
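The PMF in (9) can be checked by simulation. The sketch below draws i.i.d. delivery times, evaluates $X_{k}$ directly from (6), and compares the empirical frequencies with (9). All parameter values are hypothetical, and the delivery-time distribution (a fixed 5 ms offset plus an exponential time with mean 10 ms) is an assumption made only for this illustration.

```python
import math
import random
from collections import Counter


def missing_count(y, k, T, tau):
    """X_k from (6): smallest i >= 0 with Y_{k-i} <= min(1, i) * T + tau."""
    i = 0
    while k - i >= 0:
        if y[k - i] <= min(1, i) * T + tau:
            return i
        i += 1
    return i  # history exhausted; can only happen near the start of the trace


def pmf_eq9(j, F_tau, F_T_tau):
    """PMF (9), given the values F_Y(tau) and F_Y(T + tau)."""
    if j == 0:
        return F_tau
    return F_T_tau * (1.0 - F_T_tau) ** (j - 1) * (1.0 - F_tau)


# Hypothetical parameters: 30 fps, lag 20 ms, delivery time = 5 ms + Exp(mean 10 ms).
T, tau, mu, y0 = 1 / 30, 0.020, 100.0, 0.005
rng = random.Random(1)
n = 100_000
y = [y0 + rng.expovariate(mu) for _ in range(n)]

warmup = 100  # skip the first frames, which lack a full history
counts = Counter(missing_count(y, k, T, tau) for k in range(warmup, n))
empirical = {j: c / (n - warmup) for j, c in counts.items()}

F_tau = 1.0 - math.exp(-mu * (tau - y0))
F_T_tau = 1.0 - math.exp(-mu * (T + tau - y0))
for j in range(3):
    print(j, round(empirical.get(j, 0.0), 4), round(pmf_eq9(j, F_tau, F_T_tau), 4))
```

The two printed columns should agree to within sampling error; the agreement relies on the delivery times being independent and identically distributed.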
\begin{equation*}
p_{0}=\mathrm{P}[X_{k}=0\vert X_{k-1}=j]=F_{Y}(\tau),
\tag{10}
\end{equation*}
\begin{align*}
p_{1} &= \mathrm{P}[X_{k}=1\vert X_{k-1}=j]=F_{Y}(T+\tau)-F_{Y}(\tau), \tag{11} \\
q &= \mathrm{P}[X_{k}=j+1\vert X_{k-1}=j]=\overline{F}_{Y}(T+\tau). \tag{12}
\end{align*}
The Markov chain is ergodic as long as \begin{equation*}
\pi_{j}=\lim_{k\rightarrow\infty}\mathrm{P}[X_{k}=j]
\tag{13}
\end{equation*}
\begin{align*}
\pi_{0} &= \sum_{i=0}^{\infty}\pi_{i}F_{Y}(\tau)=F_{Y}(\tau), \tag{14} \\
\pi_{1}&=\overline{F}_{Y}(\tau)\pi_{0}+\sum_{i=1}^{\infty}\pi_{i}[F_{Y}(T+\tau)-F_{Y}(\tau)] \\
&= \pi_{0}+(1-\pi_{0})F_{Y}(T+\tau)-F_{Y}(\tau) \\
&=\overline{F}_{Y}(\tau)F_{Y}(T+\tau), \tag{15}
\end{align*}
\begin{align*}
\pi_{j} &= \overline{F}_{Y}(T+\tau)^{j-1}\pi_{1} \\
&= \overline{F}_{Y}(\tau)F_{Y}(T+\tau)\overline{F}_{Y}(T+\tau)^{j-1}. \tag{16}
\end{align*}
As we would expect, (16) is consistent with the PMF of \begin{equation*}
\lim_{K\rightarrow\infty}\frac{1}{K}\sum_{k=0}^{K-1}\mathrm{E}[X_{k}]=\sum_{j=1}^{\infty}j\pi_{j}=\frac{\overline{F}_{Y}(\tau)}{F_{Y}(T+\tau)}.
\tag{17}
\end{equation*}
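For completeness, the closed form in (17) can be obtained from (16) with the standard series $\sum_{j\geq 1}jq^{j-1}=1/(1-q)^{2}$ for $0\leq q < 1$. Writing $q=\overline{F}_{Y}(T+\tau)$, so that $1-q=F_{Y}(T+\tau)$, and assuming $F_{Y}(T+\tau) > 0$, the intermediate steps are:

\begin{align*}
\sum_{j=1}^{\infty}j\pi_{j} &= \overline{F}_{Y}(\tau)F_{Y}(T+\tau)\sum_{j=1}^{\infty}j\,q^{j-1} \\
&= \frac{\overline{F}_{Y}(\tau)F_{Y}(T+\tau)}{(1-q)^{2}} = \frac{\overline{F}_{Y}(\tau)F_{Y}(T+\tau)}{F_{Y}(T+\tau)^{2}} = \frac{\overline{F}_{Y}(\tau)}{F_{Y}(T+\tau)}.
\end{align*}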
The next claim then follows from (4) as
The average age of the system with frame period \begin{equation*}
\Delta_{2}(T,\ \tau)=T\left[\frac{1}{2}+\frac{\tau}{T}+\frac{\overline{F}_{Y}(\tau)}{F_{Y}(T+\tau)}\right].
\end{equation*}
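The expression for $\Delta_{2}(T,\tau)$ is straightforward to evaluate numerically. The following Python sketch implements it for an arbitrary CDF and sweeps the lag $\tau$ over one frame period to locate a minimizing value. The function names are hypothetical, and the exponential delivery time with mean 10 ms is an assumption chosen only for illustration.

```python
import math


def exp_cdf(y: float, mu: float) -> float:
    """CDF of an Exp(mu) delivery time (illustrative assumption)."""
    return 0.0 if y < 0.0 else 1.0 - math.exp(-mu * y)


def average_age(T: float, tau: float, cdf) -> float:
    """Delta_2(T, tau) = T * [1/2 + tau/T + (1 - F_Y(tau)) / F_Y(T + tau)]."""
    F_tau = cdf(tau)
    F_T_tau = cdf(T + tau)
    if F_T_tau <= 0.0:
        return math.inf  # no frame is ever delivered within T + tau
    return T * (0.5 + tau / T + (1.0 - F_tau) / F_T_tau)


# Hypothetical parameters: 30 fps and a mean delivery time of 10 ms.
T, mu = 1 / 30, 100.0
cdf = lambda y: exp_cdf(y, mu)

# Sweep the lag over one frame period and keep the minimizer.
grid = [i * T / 1000 for i in range(1000)]
best_tau = min(grid, key=lambda tau: average_age(T, tau, cdf))
best_age = average_age(T, best_tau, cdf)
print(f"best lag {best_tau * 1e3:.1f} ms, average age {best_age * 1e3:.1f} ms")
```

A grid search is used here only because it makes no assumption about the shape of $F_{Y}$; for a specific distribution, the minimizing lag could instead be found by differentiating the expression.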
Theorem 1 provides a simple characterization of the average age in terms of the distribution of update delivery times. When the system is designed cautiously,
Despite the approximations made by the analytic model, we will see that it provides a surprisingly accurate calculation of the status age in a low-latency edge-cloud gaming system. For fixed
D. Lag Periodicity of the Age
Theorem 1 describes the average age \begin{equation*}
Y_{k-i}\leq(B-1)T+\tau=(B+j-1)T+\tau_{0}.
\tag{18}
\end{equation*}
To summarize, when \begin{equation*}
X_{k}= \min\left\{i\geq 0\left\vert \begin{array}{l} Y_{k-i}\leq iT+\tau_{0}, \\ Y_{k-i}\leq(B+j-1)T+\tau_{0}. \end{array}\ \right.\right\}
\tag{19}
\end{equation*}
Comparing (19) and (5), we see that buffering limit \begin{equation*}
\Delta_{B}(T, jT+\tau_{0})=\Delta_{B+j}(T,\ \tau_{0}).
\tag{20}
\end{equation*}
Evaluation
In this section, we evaluate the average age of the system as a metric that provides richer information for the design and optimization of soft real-time interactive cloud applications. The architecture of our simulated cloud gaming system, illustrated in Figure 1, consists of client and server modules connected through a wireless network.
To evaluate the proposed analytical model, we implemented a mobile cloud gaming scenario in ns-3; the main parameters of this simulation are summarized in Tables I and II. The simulation parameters are chosen to capture the complexity of cloud gaming systems, including bandwidth constraints, rendering, and encoding/decoding delays. We consider both single-player and multi-player games. In the multi-player scenario, different players submit their commands to the same game server, and the game server sends game status update messages to the edge servers responsible for rendering the frames. Each client receives its frames from the edge server with which it is associated. This model corresponds to geographically distributed players, each connected via a separate edge cloud server.
Throughout this section, the term single-server refers to a scenario where processing tasks for both game status update and frame rendering are conducted in the same server. Similarly, the term multi-server refers to the alternative case where game status update is performed in a central server and frame rendering is done in edge cloud servers.
We start with a simple example in which the single-server execution duration has an exponential distribution with expected value \begin{equation*}
F_{Y}(y)=\begin{cases}
0 & y < y_{0},\\
1-e^{-\mu(y-y_{0})} & y\geq y_{0}.
\end{cases}
\tag{21}
\end{equation*}
Similarly, in the multi-server scenario, if both game and rendering servers have exponentially distributed execution times each with expected value \begin{equation*}
F_{Y}(y)=\begin{cases}
0 & y < y_{0},\\
1-e^{-\mu(y-y_{0})}(1+\mu(y-y_{0})) & y\geq y_{0}.
\end{cases}
\tag{22}
\end{equation*}
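The two delivery-time distributions in (21) and (22) are a shifted exponential and a shifted two-stage Erlang, respectively. The sketch below implements both CDFs and checks (22) against a direct simulation of two exponential stages in series. The function names and the numerical values of $\mu$, $y_{0}$, and the evaluation point are hypothetical.

```python
import math
import random


def cdf_single_server(y: float, mu: float, y0: float) -> float:
    """Shifted exponential CDF from (21)."""
    if y < y0:
        return 0.0
    return 1.0 - math.exp(-mu * (y - y0))


def cdf_multi_server(y: float, mu: float, y0: float) -> float:
    """Shifted Erlang-2 CDF from (22): two Exp(mu) stages in series plus offset y0."""
    if y < y0:
        return 0.0
    z = mu * (y - y0)
    return 1.0 - math.exp(-z) * (1.0 + z)


# Monte Carlo sanity check with hypothetical values (mean stage time 10 ms).
rng = random.Random(7)
mu, y0, y = 100.0, 0.005, 0.030
n = 100_000
hits = sum(
    y0 + rng.expovariate(mu) + rng.expovariate(mu) <= y for _ in range(n)
)
empirical = hits / n
analytic = cdf_multi_server(y, mu, y0)
print(round(empirical, 3), round(analytic, 3))
```

Either CDF can be passed to an evaluation of the average age in Theorem 1 to compare the single-server and multi-server cases at the same frame rate.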
A. Analysis of the Average Age of the System
As indicated in Section III-B, the age of the system depends on the target frame rate and the fine-tuning parameter
In all three presented systems, the maximum age corresponds to having
As indicated in Figure 4, the system age ranges over roughly 50–63 ms. Although this range might look small, it is significant for delay-sensitive applications, especially gaming systems. As shown above, the age metric gives the client device a basis for optimizing its synchronization with the server and adapting over time. This is possible for both single-player (Figures 4a and 4b) and multi-player games (Figure 4c).
In Figures 4a and 4b, in addition to the
Figure 5 illustrates the dependency of the average age on the system frame rate. We have compared a lightly loaded channel with a busy channel for both
B. Effect of Channel Load on Age
To investigate the effect of suboptimal network and server resources on the age of the system, we simulated two scenarios where clients are sending at a rate of 30 fps but the available channel is limited to 2 Mbps and 1.2 Mbps. Figures 6a and 6b illustrate the resulting age as a function of
Comparing Figure 6b with Figures 6a and 4a, we can see that in a more heavily loaded scenario, the age of the system has increased. Moreover, since the frames for different clients are queued in the network interface of the server, each game update experiences an exponential service time and an M/M/1 queue. Consequently,
C. Multi-Player Scenario
In this section, we simulate a multi-player scenario with a geographically distributed set of edge servers, where each player is associated with a different edge server with different processing power and edge delay (i.e., transmission time from the edge server to the client terminal). Figure 7 illustrates the variations of age versus
Related Work
Most previous work on mobile cloud gaming falls into two categories: i) optimization of the user-perceived frame rendering of the game state using a variety of predictive techniques [14], [15]; and ii) design, optimization, and evaluation of commercial cloud gaming systems [7], including NVIDIA GRID [16] and PS Now [17]. Outatime [18] is a prediction-based solution implemented on top of commercial games; it reduces latency and compensates for the lack of a buffer by rendering predicted future game frames based on user-specific behavior patterns.
Average age of the system as a function of
Motivated by the popularity of commercial cloud gaming systems, we focus on design and optimization of edge-cloud gaming systems as described in [5]. Perhaps the most closely related system is the StreamMyGame (SMG) system evaluated in [2]. Just as in the experimental model evaluated here, the SMG system deploys a game server attached to the same local network as the client. Unlike this work, however, server processing times on the order of 300–400 ms dominated the response time of the SMG system [2].
It has been shown that user-perceived timeliness can be enhanced at the expense of consistency, a measure of how accurately a video frame describes the true game state. By contrast, we attempt to separate issues related to the depiction of the game state from the timeliness of the game state information at the client. Our approach is based on timeliness of the state information at the mobile client, rather than how that state is depicted for the client. While a game status update generated by the game server at time
In the area of age of information, the analysis of queue-based models has dominated the emerging literature; see, for example, [11], [13], [19]–[21]. However, these analytic models are approximations of real systems. A first effort to evaluate the accuracy of analytic models by simulation was provided in [22]. The ns-3 performance evaluation of the cloud gaming system in Section IV is similar in spirit, in that our analytic model is only an approximation of the gaming system with both networking and processing delays.
Conclusion
In this work, we presented a quantified representation of the user-perceived QoE of latency-sensitive real-time cloud-assisted applications such as gaming based on missing frames. As the objective of a cloud gaming system is the “timely” update of players regarding the status of the game, we employed an age of information metric to characterize the system performance. The age metric incorporates both the lag in response time and the latency associated with periodic framing. The proposed metric therefore captures both interaction delay and stream quality. Based on the proposed timeliness metric, we provide an analytical framework, supported by extensive simulation, for the problem of optimizing the frame rate and the lag synchronization of server and player. Based on the obtained results, age can be applied as a tool to synchronize game sessions based on the current status of the server. Furthermore, it can be used as a parameter in assigning edge servers to clients and designing resource allocation algorithms for cloud gaming systems. We have suggested that user QoE in a low-latency gaming system can be captured by the delayed or missed video frames, and proposed the average age of video frames as a QoE metric. However, we observe that further study is needed to determine whether the average frame age is an effective measure of user-perceived QoE for various types of games with different sensitivity to latency. It may be that other metrics derived from the missing frame process
Appendix A Markov Chain Analysis
The
If $X_{k}=0$, then
  If $Y_{k+1}\leq\tau$, then $X_{k+1}=0$; otherwise, $X_{k+1}=1$.
If $X_{k}=j > 0$, then
  If $Y_{k+1}\leq\tau$, then $X_{k+1}=0$.
  If $\tau < Y_{k+1}\leq T+\tau$, then $X_{k+1}=1$.
  If $Y_{k+1} > \tau$ and $Y_{k} > T+\tau$, then $X_{k+1}=j+1$.
For \begin{equation*}
\mathrm{P}[X_{k+1}=0\vert X_{k}=j]=\mathrm{P}[Y_{k+1}\leq\tau]=F_{Y}(\tau).
\tag{23}
\end{equation*}
Of course, this implies \begin{align*}
&\mathrm{P}[X_{k+1}=j+1] \\
&\qquad =\mathrm{P}[X_{k}=j,\ Y_{k} > T+\tau,\ Y_{k+1} > \tau]. \tag{24}
\end{align*}
Note that \begin{align*}
&\mathrm{P}[X_{k+1}=j+1\vert X_{k}=j] \\
&=\mathrm{P}[Y_{k} > T+\tau,\ Y_{k+1} > \tau\vert X_{k}=j] \\
&=\mathrm{P}[Y_{k} > T+\tau\vert X_{k}=j] \\
&\qquad\ \ \times \mathrm{P}[Y_{k+1} > \tau\vert Y_{k} > T+\tau,\ X_{k}=j] \\
&=\mathrm{P}[Y_{k} > T+\tau\vert X_{k}=j,\ Y_{k} > \tau]\mathrm{P}[Y_{k+1} > \tau] \\
&=\mathrm{P}[Y_{k} > T+\tau\vert Y_{k} > \tau]\mathrm{P}[Y_{k+1} > \tau] \\
&=\frac{\mathrm{P}[Y_{k} > T+\tau]}{\mathrm{P}[Y_{k} > \tau]}\mathrm{P}[Y_{k+1} > \tau] \tag{25} \\
&=\overline{F}_{Y}(T+\tau), \tag{26}
\end{align*}
\begin{align*}
&\mathrm{P}[X_{k+1}=1\vert X_{k}=j] \\
&\qquad =\mathrm{P}[Y_{k}\leq T+\tau,\ Y_{k+1} > \tau\vert X_{k}=j] \\
&\qquad =\mathrm{P}[Y_{k}\leq T+\tau\vert X_{k}=j] \\
&\qquad\qquad \times \mathrm{P}[Y_{k+1} > \tau\vert Y_{k}\leq T+\tau,\ X_{k}=j] \\
&\qquad =\mathrm{P}[Y_{k}\leq T+\tau\vert Y_{k} > \tau]\mathrm{P}[Y_{k+1} > \tau] \\
&\qquad =\mathrm{P}[\tau < Y_{k}\leq T+\tau]=F_{Y}(T+\tau)-F_{Y}(\tau). \tag{27}
\end{align*}
We observe that (23), (26) and (27) verify that








