The ACM Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS) focuses on the measurement and performance evaluation of computer systems and operates in close collaboration with the ACM Special Interest Group SIGMETRICS. All papers in the last three (i.e., this one and the previous two) issues of POMACS will be presented during the ACM SIGMETRICS 2021 virtual conference. Each of the three issues (one per call for papers) contains papers selected by the editorial board via a rigorous review process that follows a hybrid conference and journal model, with reviews conducted by the 86 members of our POMACS editorial board. Each paper was either conditionally accepted (and shepherded), allowed a "one-shot" revision (to be resubmitted to one of the subsequent two deadlines), or rejected (with resubmission allowed after a year). Over the three issues during which we have served as editors, POMACS has published 35 papers out of 315 submissions (84 to the Summer call, 107 to the Fall call, and 124 to the Winter call). All submitted papers received at least 3 reviews and we held an online TPC meeting for each deadline/issue. Based on the indicated primary track, roughly 16% of the submissions were in the Learning track, 29% were in the Measurement & Applied Modeling track, 26% were in the Systems track, and 29% were in the Theory track. Many people contributed to the success of POMACS over the last year. First, we would like to thank the authors, who submitted their best work to SIGMETRICS/POMACS. Second, we would like to thank the TPC members who provided constructive feedback in their reviews to authors and participated in the online discussions and TPC meetings. We also thank the several external reviewers who provided their expert opinion on specific submissions that required additional input. We are also grateful to the SIGMETRICS Board Chair, Giuliano Casale, and to past TPC Chairs, Athina Markopoulou and Y. C. Tay, who provided a wealth of information and guidance. We also thank Adam Wierman for outlining the soft tracking guidelines that were adopted for SIGMETRICS 2021. Finally, we are grateful to the Organization Committee and to the SIGMETRICS Board for their efforts and initiatives in the middle of a pandemic, including making papers and videos available online and enabling online interactions.

A new mobile computing paradigm, dubbed mini-app, has been growing rapidly over the past few years since being introduced by WeChat in 2017. In this paradigm, a host app allows its end-users to install and run mini-apps inside itself, enabling the host app to build an ecosystem around (much like Google Play and Apple AppStore), enrich the host's functionalities, and offer mobile users elevated convenience without leaving the host app. It has been reported that there are over millions of mini-apps in WeChat. However, little information is known about these mini-apps at an aggregated level. In this paper, we present MiniCrawler, the first scalable and open source WeChat mini-app crawler that has indexed over 1,333,308 mini-apps. It leverages a number of reverse engineering techniques to uncover the interfaces and APIs in WeChat for crawling the mini-apps. With the crawled mini-apps, we then measure their resource consumption, API usage, library usage, obfuscation rate, app categorization, and app ratings at an aggregated level. The details of how we develop MiniCrawler and our measurement results are reported in this paper.

Safety compliance is paramount to the safe deployment of autonomous vehicle (AV) technologies in real-world transportation systems. As AVs will share road infrastructures with human drivers and pedestrians, it is an important requirement for AVs to obey standard driving rules. Existing AV software testing methods, including simulation and road testing, only check fundamental safety rules such as collision avoidance and safety distance. Scenario-dependent driving rules, including crosswalk and intersection rules, are more complicated because the expected driving behavior heavily depends on the surrounding circumstances. However, a testing framework is missing for checking scenario-dependent driving rules on various AV software.

In this paper, we design and implement a systematic framework AVChecker for identifying violations of scenario-dependent driving rules in AV software using formal methods. AVChecker represents both the code logic of AV software and driving rules in proposed formal specifications and leverages satisfiability modulo theory (SMT) solvers to identify driving rule violations. To improve the automation of systematic rule-based checking, AVChecker provides a powerful user interface for writing driving rule specifications and applies static code analysis to extract rule-related code logic from the AV software codebase. Evaluations on two open-source AV software platforms, Baidu Apollo and Autoware, uncover 19 true violations out of 28 real-world driving rules covering crosswalks, traffic lights, stop signs, and intersections. Seven of the violations can lead to severe risks of a collision with pedestrians or blocking traffic.

In networks with a minority and a majority community, it is well-studied that minorities are under-represented at the top of the social hierarchy. However, researchers are less clear about the representation of minorities from the lower levels of the hierarchy, where other disadvantages or vulnerabilities may exist. We offer a more complete picture of social disparities at each social level with empirical evidence that the minority representation exhibits two opposite phases: at the higher rungs of the social ladder, the representation of the minority community decreases; but, lower in the ladder, which is more populous, as you ascend, the representation of the minority community improves. We refer to this opposing phenomenon between the upper-level and lower-level as the chasm effect. Previous models of network growth with homophily fail to detect and explain the presence of this chasm effect. We analyze the interactions among a few well-observed network-growing mechanisms with a simple model to reveal the sufficient and necessary conditions for both phases in the chasm effect to occur. By generalizing the simple model naturally, we present a complete bi-affiliation bipartite network-growth model that could successfully capture disparities at all social levels and reproduce real social networks. Finally, we illustrate that addressing the chasm effect can create fairer systems with two applications in advertisement and fact-checks, thereby demonstrating the potential impact of the chasm effect on the future research of minority-majority disparities and fair algorithms.

This paper considers an online control problem involving two controllers. A central controller chooses an action from a feasible set that is determined by time-varying and coupling constraints, which depend on all past actions and states. The central controller's goal is to minimize the cumulative cost; however, the controller has access to neither the feasible set nor the dynamics directly, which are determined by a remote local controller. Instead, the central controller receives only an aggregate summary of the feasibility information from the local controller, which does not know the system costs. We show that it is possible for an online algorithm using feasibility information to nearly match the dynamic regret of an online algorithm using perfect information whenever the feasible sets satisfy a causal invariance criterion and there is a sufficiently large prediction window size. To do so, we use a form of feasibility aggregation based on entropic maximization in combination with a novel online algorithm, named Penalized Predictive Control (PPC) and demonstrate that aggregated information can be efficiently learned using reinforcement learning algorithms. The effectiveness of our approach for closed-loop coordination between central and local controllers is validated via an electric vehicle charging application in power systems.

Mean field models are a popular tool used to analyse load balancing policies. In some exceptional cases the waiting time distribution of the mean field limit has an explicit form. In other cases it can be computed as the solution of a set of differential equations. In this paper we study the limit of the mean waiting time E[Wλ] as the arrival rate λ approaches 1 for a number of load balancing policies in a large-scale system of homogeneous servers which finish work at a constant rate equal to one and exponential job sizes with mean 1 (i.e. when the system gets close to instability). As E[Wλ] diverges to infinity, we scale with -log(1-λ) and present a method to compute the limit limλ-> 1- -E[Wλ]/l(1-λ). We show that this limit has a surprisingly simple form for the load balancing algorithms considered. More specifically, we present a general result that holds for any policy for which the associated differential equation satisfies a list of assumptions. For the well-known LL(d) policy which assigns an incoming job to a server with the least work left among d randomly selected servers these assumptions are trivially verified. For this policy we prove the limit is given by 1/d-1. We further show that the LL(d,K) policy, which assigns batches of K jobs to the K least loaded servers among d randomly selected servers, satisfies the assumptions and the limit is equal to K/d-K. For a policy which applies LL(di) with probability pi, we show that the limit is given by 1/ ∑i pi di - 1. We further indicate that our main result can also be used for load balancers with redundancy or memory. In addition, we propose an alternate scaling -l(pλ) instead of -l(1-λ), where pλ is adapted to the policy at hand, such that limλ-> 1- -E[Wλ]/l(1-λ)=limλ-> 1- -E[Wλ]/l(pλ), where the limit limλ-> 0+ -E[Wλ]/l(pλ) is well defined and non-zero (contrary to limλ-> 0+ -E[Wλ]/l(1-λ)). This allows to obtain relatively flat curves for -E[Wλ]/l(pλ) for λ ∈ [0,1] which indicates that the low and high load limits can be used as an approximation when λ is close to one or zero. Our results rely on the earlier proven ansatz which asserts that for certain load balancing policies the workload distribution of any finite set of queues becomes independent of one another as the number of servers tends to infinity.

Application programs that exhibit strong locality of reference lead to minimized cache misses and better performance in different architectures. However, to maximize the performance of multithreaded applications running on emerging manycore systems, data movement in on-chip network should also be minimized. Unfortunately, the way many multithreaded programs are written does not lend itself well to minimal data movement. Motivated by this observation, in this paper, we target task-based programs (which cover a large set of available multithreaded programs), and propose a novel compiler-based approach that consists of four complementary steps. First, we partition the original tasks in the target application into sub-tasks and build a data reuse graph at a sub-task granularity. Second, based on the intensity of temporal and spatial data reuses among sub-tasks, we generate new tasks where each such (new) task includes a set of sub-tasks that exhibit high data reuse among them. Third, we assign the newly-generated tasks to cores in an architecture-aware fashion with the knowledge of data location. Finally, we re-schedule the execution order of sub-tasks within new tasks such that sub-tasks that belong to different tasks but share data among them are executed in close proximity in time. The detailed experiments show that, when targeting a state of the art manycore system, our proposed compiler-based approach improves the performance of 10 multithreaded programs by 23.4% on average, and it also outperforms two state-of-the-art data access optimizations for all the benchmarks tested. Our results also show that the proposed approach i) improves the performance of multiprogrammed workloads, and ii) generates results that are close to maximum savings that could be achieved with perfect profiling information. Overall, our experimental results emphasize the importance of dividing an original set of tasks of an application into sub-tasks and constructing new tasks from the resulting sub-tasks in a data movement- and locality-aware fashion.

The First-Come First-Served (FCFS) scheduling policy is the most popular scheduling algorithm used in practice. Furthermore, its usage is theoretically validated: for light-tailed job size distributions, FCFS has weakly optimal asymptotic tail of response time. But what if we don't just care about the asymptotic tail? What if we also care about the 99th percentile of response time, or the fraction of jobs that complete in under one second? Is FCFS still best? Outside of the asymptotic regime, only loose bounds on the tail of FCFS are known, and optimality is completely open.

In this paper, we introduce a new policy, Nudge, which is the first policy to provably stochastically improve upon FCFS. We prove that Nudge simultaneously improves upon FCFS at every point along the tail, for light-tailed job size distributions. As a result, Nudge outperforms FCFS for every moment and every percentile of response time. Moreover, Nudge provides a multiplicative improvement over FCFS in the asymptotic tail. This resolves a long-standing open problem by showing that, counter to previous conjecture, FCFS is not strongly asymptotically optimal.

The supermarket model is a popular load balancing model where each incoming job is assigned to a server with the least number of jobs among d randomly selected servers. Several authors have shown that the large scale limit in case of processor sharing servers has a unique insensitive fixed point, which naturally leads to the belief that the queue length distribution in such a system is insensitive to the job size distribution as the number of servers tends to infinity. Simulation results that support this belief have also been reported. However, global attraction of the unique fixed point of the large scale limit was not proven except for exponential job sizes, which is needed to formally prove asymptotic insensitivity. The difficulty lies in the fact that with processor sharing servers, the limiting system is in general not monotone.

In this paper we focus on the class of hyperexponential distributions of order 2 and demonstrate that for this class of distributions global attraction of the unique fixed point can still be established using monotonicity by picking a suitable state space and partial order. This allows us to formally show that we have asymptotic insensitivity within this class of job size distributions. We further demonstrate that our result can be leveraged to prove asymptotic insensitivity within this class of distributions for other load balancing systems.

Accurate prediction of network paths between arbitrary hosts on the Internet is of vital importance for network operators, cloud providers, and academic researchers. We present PredictRoute, a system that predicts network paths between hosts on the Internet using historical knowledge of the data and control plane. In addition to feeding on freely available traceroutes and BGP routing tables, PredictRoute optimally explores network paths towards chosen BGP prefixes. PredictRoute's strategy for exploring network paths discovers 4X more autonomous system (AS) hops than other well-known strategies used in practice today. Using a corpus of traceroutes, PredictRoute trains probabilistic models of routing towards prefixes on the Internet to predict network paths and their likelihood. PredictRoute's AS-path predictions differ from the measured path by at most 1 hop, 75% of the time. We expose PredictRoute's path prediction capability via a REST API to facilitate its inclusion in other applications and studies. We additionally demonstrate the utility of PredictRoute in improving real-world applications for circumventing Internet censorship and preserving anonymity online.

We study real-time routing policies in smart transit systems, where the platform has a combination of cars and high-capacity vehicles (e.g., buses or shuttles) and seeks to serve a set of incoming trip requests. The platform can use its fleet of cars as a feeder to connect passengers to its high-capacity fleet, which operates on fixed routes. Our goal is to find the optimal set of (bus) routes and corresponding frequencies to maximize the social welfare of the system in a given time window. This generalizes the Line Planning Problem, a widely studied topic in the transportation literature, for which existing solutions are either heuristic (with no performance guarantees), or require extensive computation time (and hence are impractical for real-time use). To this end, we develop a 1-1/e-ε approximation algorithm for the Real-Time Line Planning Problem, using ideas from randomized rounding and the Generalized Assignment Problem. Our guarantee holds under two assumptions: (i) no inter-bus transfers and (ii) access to a pre-specified set of feasible bus lines. We moreover show that these two assumptions are crucial by proving that, if either assumption is relaxed, the łineplanningproblem does not admit any constant-factor approximation. Finally, we demonstrate the practicality of our algorithm via numerical experiments on real-world and synthetic datasets, in which we show that, given a fixed time budget, our algorithm outperforms Integer Linear Programming-based exact methods.

Mean-field models are an established method to analyze large stochastic systems with N interacting objects by means of simple deterministic equations that are asymptotically correct when N tends to infinity. For finite N, mean-field equations provide an approximation whose accuracy is model- and parameter-dependent. Recent research has focused on refining the approximation by computing suitable quantities associated with expansions of order $1/N$ and $1/N^2$ to the mean-field equation. In this paper we present a new method for refining mean-field approximations. It couples the master equation governing the evolution of the probability distribution of a truncation of the original state space with a mean-field approximation of a time-inhomogeneous population process that dynamically shifts the truncation across the whole state space. We provide a result of asymptotic correctness in the limit when the truncation covers the state space; for finite truncations, the equations give a correction of the mean-field approximation. We apply our method to examples from the literature to show that, even with modest truncations, it is effective in models that cannot be refined using existing techniques due to non-differentiable drifts, and that it can outperform the state of the art in challenging models that cause instability due orbit cycles in their mean-field equations.

Ponzi schemes are financial scams that lure users under the promise of high profits. With the prosperity of Bitcoin and blockchain technologies, there has been growing anecdotal evidence that this classic fraud has emerged in the blockchain ecosystem. Existing studies have proposed machine-learning based approaches for detecting Ponzi schemes, i.e., either based on the operation codes (opcodes) of the smart contract binaries or the transaction patterns of addresses. However, state-of-the-art approaches face several major limitations, including lacking interpretability and high false positive rates. Moreover, machine-learning based methods are susceptible to evasion techniques, and transaction-based techniques do not work on smart contracts that have a small number of transactions. These limitations render existing methods for detecting Ponzi schemes ineffective. In this paper, we propose SADPonzi, a semantic-aware detection approach for identifying Ponzi schemes in Ethereum smart contracts. Specifically, by strictly following the definition of Ponzi schemes, we propose a heuristic-guided symbolic execution technique to first generate the semantic information for each feasible path in smart contracts and then identify investor-related transfer behaviors and the distribution strategies adopted. Experimental result on a well-labelled benchmark suggests that SADPonzi can achieve 100% precision and recall, outperforming all existing machine-learning based techniques. We further apply SADPonzi to all 3.4 million smart contracts deployed by EOAs in Ethereum and identify 835 Ponzi scheme contracts, with over 17 million US Dollars invested by victims. Our observations confirm the urgency of identifying and mitigating Ponzi schemes in the blockchain ecosystem.

This paper studies seeded graph matching for power-law graphs. Assume that two edge-correlated graphs are independently edge-sampled from a common parent graph with a power-law degree distribution. A set of correctly matched vertex-pairs is chosen at random and revealed as initial seeds. Our goal is to use the seeds to recover the remaining latent vertex correspondence between the two graphs. Departing from the existing approaches that focus on the use of high-degree seeds in $1$-hop neighborhoods, we develop an efficient algorithm that exploits the low-degree seeds in suitably-defined D-hop neighborhoods. Specifically, we first match a set of vertex-pairs with appropriate degrees (which we refer to as the first slice) based on the number of low-degree seeds in their D-hop neighborhoods. This approach significantly reduces the number of initial seeds needed to trigger a cascading process to match the rest of graphs. Under the Chung-Lu random graph model with n vertices, max degree Θ(√n), and the power-law exponent 2<β<3, we show that as soon as D> 4-β/3-β, by optimally choosing the first slice, with high probability our algorithm can correctly match a constant fraction of the true pairs without any error, provided with only Ω((log n)4-β) initial seeds. Our result achieves an exponential reduction in the seed size requirement, as the best previously known result requires n1/2+ε seeds (for any small constant ε>0). Performance evaluation with synthetic and real data further corroborates the improved performance of our algorithm.

This work presents the first-ever detailed and large-scale measurement analysis of storage consumption behavior of applications (apps) on smart mobile devices. We start by carrying out a five-year longitudinal static analysis of millions of Android apps to study the increase in their sizes over time and identify various sources of app storage consumption. Our study reveals that mobile apps have evolved as large monolithic packages that are packed with features to monetize/engage users and optimized for performance at the cost of redundant storage consumption.

We also carry out a mobile storage usage study with 140 Android participants. We built and deployed a lightweight context-aware storage tracing tool, called cosmos, on each participant's device. Leveraging the traces from our user study, we show that only a small fraction of apps/features are actively used and usage is correlated to user context. Our findings suggest a high degree of app feature bloat and unused functionality, which leads to inefficient use of storage. Furthermore, we found that apps are not constrained by storage quota limits, and developers freely abuse persistent storage by frequently caching data, creating debug logs, user analytics, and downloading advertisements as needed.

Finally, drawing upon our findings, we discuss the need for efficient mobile storage management, and propose an elastic storage design to reclaim storage space when unused. We further identify research challenges and quantify expected storage savings from such a design. We believe our findings will be valuable to the storage research community as well as mobile app developers.